delete tag in xml

 



perlmagix
Novice

Apr 30, 2016, 11:33 PM

Post #1 of 10 (2407 views)
delete tag in xml

I am working on an input XML file containing the following data:

inputfile.xml

<data>
<line> sdfe abc adsfefsdf </line>
<line> abc sdffedcfsdf sdf </line>
<line> sdfe </line><line> abc </line>
<line> sd sfefsdf </line>
<line> sdfe abc adsfefsdf </line>
<line> fhgh kk jj hjsda </line>
<line> abc </line>
..
..
..
</data>


My intention is to produce the following output file:

outputfile.xml


<data>
<line> sdfe </line>
<line> sd sfefsdf </line>
<line> fhgh kk jj hjsda </line>
..
..
..
</data>

Desired output:
Remove all the <line> tags whose content contains "abc".


I tried the following sed command, with no success. Can this be done in Perl with a regular expression (regex)?



Code
`sed '\|<line>*abc*| ,\|</line>|d' inputfile.xml > outputfile.xml`



(This post was edited by perlmagix on May 1, 2016, 4:26 AM)


BillKSmith
Veteran

May 1, 2016, 7:17 AM

Post #2 of 10 (2393 views)
Re: [perlmagix] delete tag in xml

The only reliable way to edit XML is to parse it with a module before attempting the edits. In special cases, such as your example, you can get away with editing the XML text directly. My "solution" will fail if any <line> element contains an embedded tag, or if a <line> element is spread over more than one line of input. There are likely other special conditions that I have not thought of.


Code
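use strict;
use warnings;

# Read the sample XML one line at a time from the __DATA__ section below.
while (my $record = <DATA>) {
    # Delete each <line>...</line> element that contains the whole word "abc".
    # [^<]*? cannot cross a "<", so elements with nested tags are not matched.
    $record =~ s/<line>[^<]*?\babc\b[^<]*?<\/line>//ig;
    # Skip lines that are empty after the deletion.
    print $record if $record =~ /\S/;
}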
__DATA__
<data>
<line> sdfe abc adsfefsdf </line>
<line> abc sdffedcfsdf sdf </line>
<line> sdfe </line><line> abc </line>
<line> sd sfefsdf </line>
<line> sdfe abc adsfefsdf </line>
<line> fhgh kk jj hjsda </line>
<line> abc </line>
..
..
..
</data>


Output:

Code
<data> 
<line> sdfe </line>
<line> sd sfefsdf </line>
<line> fhgh kk jj hjsda </line>
..
..
..
</data>
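
For comparison, here is a rough sketch of the parser-based approach mentioned above. It assumes the CPAN module XML::LibXML is installed. The helper name remove_lines_containing and the sample data (including the nested <b> tag and the element split over two lines) are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;    # CPAN module, assumed to be installed

# Hypothetical helper: drop every <line> element whose text contains
# $word as a whole word, and return the resulting XML as a string.
sub remove_lines_containing {
    my ($xml, $word) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    for my $node ($doc->findnodes('//line')) {
        # textContent also includes the text of any nested elements.
        $node->unbindNode if $node->textContent =~ /\b\Q$word\E\b/i;
    }
    return $doc->toString;
}

# Illustrative input: one element spans two lines, one has a nested tag.
my $xml = <<'END_XML';
<data>
<line> sdfe abc adsfefsdf </line>
<line> sdfe </line><line> abc </line>
<line> sd
 sfefsdf </line>
<line> fhgh <b>abc</b> jj </line>
</data>
END_XML

print remove_lines_containing($xml, 'abc');
```

Because the parser works on elements rather than on lines of text, the two cases that break the regex version are handled here. The removed elements leave their surrounding whitespace behind, so the output may contain blank lines.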

Good Luck,
Bill


perlmagix
Novice

May 1, 2016, 9:15 AM

Post #3 of 10 (2387 views)
Re: [BillKSmith] delete tag in xml

Cheers Bill,

:)

Mike F



perlmagix
Novice

May 1, 2016, 3:17 PM

Post #4 of 10 (2381 views)
Re: [BillKSmith] delete tag in xml

Also, for the given example, is it possible to do a similar operation for an array of values?

Array Sample:
abc
de
fghi
jkl


Laurent_R
Veteran / Moderator

May 1, 2016, 11:28 PM

Post #5 of 10 (2370 views)
Re: [perlmagix] delete tag in xml

If you want to filter out any array element containing "abc", one possible way is this:

Code
my @array = qw( sSDQ sdd abcsg dlk ssq );
# Keep only the elements that do not contain "abc".
my @filtered = grep { not /abc/ } @array;

Now, @filtered should contain all the elements of @array except "abcsg".


perlmagix
Novice

May 1, 2016, 11:59 PM

Post #6 of 10 (2367 views)
Re: [Laurent_R] delete tag in xml

Thank you Laurent,

I did not convey my question properly.

"inputfile.xml" is the input file.

The array holds the values to be checked for and removed from the input file:

@array = qw / abc de fghi jklm /;


"outputfile.xml" is the output file. It should omit every tag that contains any element of the array.

My question is how to incorporate this array into the regex provided by Bill. :)


Mike F


BillKSmith
Veteran

May 2, 2016, 8:50 PM

Post #7 of 10 (2345 views)
Re: [perlmagix] delete tag in xml

My previous warnings still apply.

Code
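use strict;
use warnings;

my @array = qw( abc de fghi jklm sdfe kk );

# Join the words into one alternation. qr// wraps it in a group, so the
# \b assertions below apply to every alternative. The /i is repeated here
# because flags on the outer s/// do not reach into a compiled qr// pattern.
my $filter = join '|', @array;
$filter = qr/$filter/i;

while (my $record = <DATA>) {
    # Delete each <line> element that contains any of the words as a whole word.
    $record =~ s/<line>[^<]*?\b$filter\b[^<]*?<\/line>//ig;
    print $record if $record =~ /\S/;
}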
__DATA__
<data>
<line> sdfe abc adsfefsdf </line>
<line> abc sdffedcfsdf sdf </line>
<line> sdfe </line><line> abc </line>
<line> sd sfefsdf </line>
<line> sdfe abc adsfefsdf </line>
<line> fhgh kk jj hjsda </line>
<line> abc </line>
..
..
..
</data>


Note: I added two array elements to remove two more lines from your original example.
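
One caveat with building the pattern from an array: each element is interpolated as regex source, so a value such as "a.c" would treat the dot as a wildcard. Below is a minimal sketch of one way to guard against that with Perl's built-in quotemeta. The helper name build_filter and the sample records are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build one compiled alternation from a word list,
# escaping regex metacharacters so that each word is matched literally.
sub build_filter {
    my @words = @_;
    my $alternation = join '|', map { quotemeta } @words;
    return qr/(?:$alternation)/i;
}

my $filter = build_filter(qw( abc de a.c ));

# Illustrative records: only the last one contains a listed word ("a.c").
for my $record ('<line> sdfe xyz </line>', '<line> axc </line>', '<line> a.c </line>') {
    (my $copy = $record) =~ s/<line>[^<]*?\b$filter\b[^<]*?<\/line>//g;
    print $copy =~ /\S/ ? "kept:    $record\n" : "removed: $record\n";
}
```

The same warnings as before apply: this is still a text-level edit, not real XML parsing.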
Good Luck,
Bill


perlmagix
Novice

May 2, 2016, 9:07 PM

Post #8 of 10 (2343 views)
Re: [BillKSmith] delete tag in xml

Simply amazing, Bill.

Can this be extended to the following data?

__DATA__
<data>
<line> sdfe abc adsfefsdf </line>
<line> abc sdffedcfsdf sdf </line>
<line> sdfe </line><line> abc </line>
<line> sd abcsfefsdf </line>
<line> sdfe abc adsfefsdf </line>
<line> fhgh kk dejj hjsda </line>
<line> abc </line>
..
..
..
</data>


Details of the modified data:

"abc" may be joined with other letters in a single word, for example abcsdfedf or zdfcfabc.

"de" may be joined with other letters in a single word, for example degfegf, fgsfvjde or gergdesdcdf.

:)


(This post was edited by perlmagix on May 2, 2016, 9:08 PM)


BillKSmith
Veteran

May 3, 2016, 5:18 AM

Post #9 of 10 (2330 views)
Re: [perlmagix] delete tag in xml

You should be able to figure this out yourself.

Read up on regex assertions.
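
One way to read this hint: the \b markers in the earlier pattern are word-boundary assertions, and leaving them out lets an array element match inside a longer word. A minimal sketch under that assumption follows. The helper name filter_record and its $whole_word switch are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove <line> elements containing any of @words.
# With $whole_word true the \b assertions are kept; otherwise a word may
# match anywhere inside a longer word.
sub filter_record {
    my ($record, $whole_word, @words) = @_;
    my $alternation = join '|', map { quotemeta } @words;
    my $filter = $whole_word ? qr/\b(?:$alternation)\b/i
                             : qr/(?:$alternation)/i;
    # ${filter} in braces so the following [ is read as a character class.
    $record =~ s/<line>[^<]*?${filter}[^<]*?<\/line>//g;
    return $record;
}

my @words = qw( abc de );
for my $record ('<line> sd abcsfefsdf </line>',
                '<line> fhgh kk dejj hjsda </line>',
                '<line> sd sfefsdf </line>') {
    my $strict = filter_record($record, 1, @words) =~ /\S/ ? 'kept' : 'removed';
    my $loose  = filter_record($record, 0, @words) =~ /\S/ ? 'kept' : 'removed';
    print "$record  whole-word: $strict, substring: $loose\n";
}
```

Note that substring matching with short values such as "de" will remove far more lines than the whole-word version, so it is worth checking the output against real data.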
Good Luck,
Bill


perlmagix
Novice

May 3, 2016, 9:34 AM

Post #10 of 10 (2322 views)
Re: [BillKSmith] delete tag in xml

Sure thing,

Cheers :)

 
 

