
mrominski
New User
Oct 21, 2009, 10:01 AM
Post #1 of 3
I'm trying to parse XML nodes where some nodes have node data and some do not. For example:

    <Instance_Group>
      <Instance id="1" name="node_1" />
      <Instance id="2" name="node_2">Node data</Instance>
    </Instance_Group>

I want to grab each "Instance" node within the "Instance_Group", but I'm having issues because of the different node types. Does anyone have a regexp that can parse each XML node, regardless of whether there is node data?

I have tried a few different ideas.

- The code:

    while( ($tag =~ m#<Instance[\s\S]+?/>#mi)||($tag =~ m#<Instance[\s\S]+?/Instance>#mi))

  was my first try. However, if there are multiple mixed nodes, the code will grab all the way to the first "/>", even though the first node may end in "</Instance>". That is to say, it may grab two or more nodes rather than just one.

- The code:

    while($tag =~ m#<Instance.+?(?:/>||/Instance>)#mi)

  This attempt successfully grabs a node with no node data (i.e., <Instance id="1" name="node_1" />), but when node data exists it only grabs the node up to the start of that data. For example, parsing

    <Instance id="2" name="node_2">Node data</Instance>

  will only grab

    <Instance id="2" name="node_2">

  with a remainder of

    Node data</Instance>

  Note: I did try using a single pipe for the alternation in the non-capturing group, i.e., (?:/>|/Instance>). However, this produced no matches.

I've tried a few other solutions as well, with no luck. Any suggestions would be appreciated.
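For reference, here is a minimal sketch of one pattern that handles both node forms in the example above. It is only an illustration under some assumptions: Instance nodes are never nested inside each other, attribute values never contain a ">" character, and the helper name extract_instances is made up for this sketch. The \b after the tag name stops the pattern from matching <Instance_Group>, [^>]*? keeps the attribute scan inside a single tag, and the /s flag lets "." match newlines in the node data. For anything beyond simple, regular input, a real XML parser such as XML::LibXML or XML::Twig is likely to be more robust.

```perl
use strict;
use warnings;

# Hypothetical helper: return every <Instance> element found in the
# given string, in document order.
# Assumes Instance elements are never nested inside one another.
sub extract_instances {
    my ($xml) = @_;
    my @nodes;
    while ($xml =~ m{
        (                            # capture the whole element
          <Instance\b                # tag name; the boundary rejects Instance_Group
          [^>]*?                     # attributes, staying inside this one tag
          (?: />                     # self-closing form
            | > .*? </Instance\s*>   # or node data followed by the closing tag
          )
        )
    }gisx) {
        push @nodes, $1;
    }
    return @nodes;
}

my $xml = <<'XML';
<Instance_Group>
  <Instance id="1" name="node_1" />
  <Instance id="2" name="node_2">Node data</Instance>
</Instance_Group>
XML

# Print one matched node per line.
print "$_\n" for extract_instances($xml);
```

Run against the sample, this should print the self-closing node and the node with data as two separate matches, without picking up the Instance_Group wrapper.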