Parsing XML

 



mrominski
New User

Oct 21, 2009, 10:01 AM

Post #1 of 3 (2747 views)
Parsing XML

I'm trying to parse XML nodes where some nodes have node data and some do not.

For example:

<Instance_Group>

<Instance id="1" name="node_1" />

<Instance id="2" name="node_2">Node data</Instance>

</Instance_Group>



I want to grab each "Instance" node within the "Instance_Group", but am having issues because of the different node types.

Does anyone have a regex that can match each XML node, regardless of whether it has node data?



I have tried a few different ideas.

- The code:

while( ($tag =~ m#<Instance[\s\S]+?/>#mi)||($tag =~ m#<Instance[\s\S]+?/Instance>#mi))

was my first try. However, if there are multiple mixed nodes, the match runs all the way to the first "/>", even though the first node may end in "</Instance>". That is to say, it may grab two or more nodes rather than just one.



- The code:

while($tag =~ m#<Instance.+?(?:/>||/Instance>)#mi)

This attempt will grab a node with no node data successfully (i.e., <Instance id="1" name="node_1" />), but will only grab the node information up to the node data, if it exists. For example, parsing

<Instance id="2" name="node_2">Node data</Instance>

will only grab

<Instance id="2" name="node_2"> with a remainder of Node data</Instance>



Note: I did try writing the alternation with a single pipe in the non-capturing group, i.e., (?:/>|/Instance>). However, this produced no matches.



I've tried a few other solutions as well with no luck. Any suggestions would be appreciated.
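
One way the pattern asked about above might be written, as a minimal sketch only: match the opening tag once, then let a single-pipe alternation choose between the self-closing form and the form with node data. The doubled pipe in the second attempt adds an empty alternative, so the group can match nothing and the lazy .+? stops almost immediately. Here \b after "Instance" keeps the pattern from matching <Instance_Group>, /s lets . cross newlines, and /g makes each pass of the while loop continue where the previous match ended (without /g the loop would keep matching from the start). The variable names and the attribute regex are assumptions for illustration, and this only copes with simple, well-formed input like the example: no nested Instance nodes and no ">" inside attribute values.

```perl
use strict;
use warnings;

# Sample input copied from the example in the post.
my $xml = <<'XML';
<Instance_Group>
<Instance id="1" name="node_1" />
<Instance id="2" name="node_2">Node data</Instance>
</Instance_Group>
XML

my @nodes;

# $1 = raw attribute text, $2 = node data (undefined for self-closing nodes).
while ($xml =~ m{<Instance\b([^>]*?)(?:/>|>(.*?)</Instance>)}gis) {
    my ($attrs, $data) = ($1, $2);

    # Hypothetical record layout: attributes plus a "data" key.
    my %node = (data => defined $data ? $data : '');

    # Pull out name="value" pairs; assumes double-quoted values.
    while ($attrs =~ /(\w+)="([^"]*)"/g) {
        $node{$1} = $2;
    }
    push @nodes, \%node;
}

printf "id=%s name=%s data=[%s]\n", @{$_}{qw(id name data)} for @nodes;
```

On the sample input this prints one line per node: "id=1 name=node_1 data=[]" and "id=2 name=node_2 data=[Node data]".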


FishMonger
Veteran / Moderator

Oct 21, 2009, 10:38 AM

Post #2 of 3 (2744 views)
Re: [mrominski] Parsing XML

Don't use a regex to parse an XML file.

Use one of the XML parsers available on CPAN, such as XML::Simple.



http://search.cpan.org/~grantm/XML-Simple-2.18/lib/XML/Simple.pm
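
As a rough sketch of what that might look like for the example in the first post, assuming XML::Simple is installed from CPAN: XMLin accepts a string of XML, ForceArray keeps Instance as a list even when only one is present, and KeyAttr => [] turns off the default folding of elements by their name/id attributes. With these settings the node data of an element that also has attributes should appear under the "content" key. The option values and variable names are assumptions chosen for this example, not something given in the thread.

```perl
use strict;
use warnings;
use XML::Simple;    # exports XMLin by default

# Sample input copied from the first post.
my $xml = <<'XML';
<Instance_Group>
<Instance id="1" name="node_1" />
<Instance id="2" name="node_2">Node data</Instance>
</Instance_Group>
XML

# Parse the string; the root element itself is dropped by default.
my $group = XMLin($xml, ForceArray => ['Instance'], KeyAttr => []);

for my $inst (@{ $group->{Instance} }) {
    # Self-closing nodes have no "content" key.
    my $data = defined $inst->{content} ? $inst->{content} : '';
    print "id=$inst->{id} name=$inst->{name} data=[$data]\n";
}
```

The parser handles both node forms the same way, so no special case is needed for the self-closing one.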


ichi
User

Oct 22, 2009, 6:36 PM

Post #3 of 3 (2703 views)
Re: [mrominski] Parsing XML

Although I do not know exactly what output you want, here is a way without too much regex, by toggling a flag:

Code
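use strict;
use warnings;

my $in_group = 0;    # true while we are inside <Instance_Group>
while (my $line = <>) {
    $in_group = 0 if $line =~ m{</Instance_Group>};    # closing tag ends the group
    if ($line =~ /<Instance_Group>/) {                 # opening tag starts the group
        $in_group = 1;
        next;                                          # do not print the tag line itself
    }
    print $line if $in_group;    # do your stuff here
}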


 
 

