algorithm for nested parsing needed



Aug 4, 2003, 3:20 PM

Post #1 of 3 (939 views)

Hi Folks,

I could use a hand coming up with a non-recursive algorithm for parsing nested "cells".

Here's an example source document which is to be parsed:

 ==== parse test ==== 
<!--Record: outer_record1 --><table border="2">
<!--Record: inner_record1--><tr><td><b>%%title%%</b></td></tr>
<!--End_Record: inner_record1-->
<tr><td>%%one%%, %%two%%, %%three%%</td></tr>
<!--Record: inner_record2--><tr><td bgcolor="#FFEEDD">%%type%%</td></tr>
<!--End_Record: inner_record2-->
</table><!--End_Record: outer_record1 -->
<!--Record: outer_record2--> <li>%%one%%</li>
<!--End_Record: outer_record2-->
<!--Record: outer_record3 --><table border="2">
<tr><td><b>Quantity</b> %%qty%%</td></tr>
<!--Record: inner_record3--><tr><td>%%retail_price%%</td></tr><!--End_Record: inner_record3--><!--Record: inner_record4--><tr><td>%%discount_price%%</td></tr><!--End_Record: inner_record4-->
</table><!--End_Record: outer_record3 -->
==== end of parse test ====

And here's what I'm using to parse out the records. As you can see, this only parses 1 level deep (the top or outer level).


# set record containers
# eg: <!--Record: inner_record--><tr><td bgcolor="%%bgcolor3%%">%%cell_content%%</td></tr><!--End_Record: inner_record-->
my $placeholder_prefix = $arg_ref->{'placeholder_prefix'} || '%%';
my $placeholder_suffix = $arg_ref->{'placeholder_suffix'} || '%%';
my $record_start_prefix = $arg_ref->{'record_start_prefix'} || '<!--Record:';
my $record_start_suffix = $arg_ref->{'record_start_suffix'} || '-->';
my $record_end_prefix = $arg_ref->{'record_end_prefix'} || '<!--End_Record:';
my $record_end_suffix = $arg_ref->{'record_end_suffix'} || '-->';

# parse records: $1 is the record name, $2 its content; \1 makes the end marker match the same name
my $record = qr/\Q$record_start_prefix\E\s*([a-zA-Z0-9_-]+?)\s*\Q$record_start_suffix\E(.*?)\Q$record_end_prefix\E\s*\1\s*\Q$record_end_suffix\E/s;
# each pass replaces one top-level record with a placeholder named after it
while ($file =~ s/$record/$placeholder_prefix$1$placeholder_suffix/) {
    # $1 and $2 still hold the captures from the substitution above
    my ($record_placeholder, $record_content) = ($1, $2);
    $Templates{$arg_ref->{'template'}}{'records'}{$record_placeholder} = $record_content;
    $Templates{$arg_ref->{'template'}}{'file'} = $file;
}

It would be easy enough to parse the inner records recursively, but I just can't get my head around how to do this iteratively.

Any thoughts on even a general approach to converting a recursive algorithm to an iterative one are appreciated :)

PS Sorry for the horizontal scroll...

>> If you can't control it, improve it, correlate it or disseminate it with PERL, it doesn't exist!


Aug 14, 2003, 11:52 PM

Post #2 of 3 (913 views)
Re: [kencl] algorithm for nested parsing needed

all recursive algorithms can be converted to iterative ones. the key is managing your own stack and state instead of having the language (perl in this case) do it for you.
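to make "managing your own stack" concrete, here is a minimal sketch on a toy problem (summing a nested list), not on the template parser itself. the names sum_recursive and sum_iterative are made up for illustration. the recursive version leans on perl's call stack; the iterative one keeps its pending work in a plain array.

```perl
use strict;
use warnings;

# Recursive version: perl's own call stack remembers where we are at each level.
sub sum_recursive {
    my ($list) = @_;
    my $total = 0;
    for my $item (@$list) {
        $total += ref($item) eq 'ARRAY' ? sum_recursive($item) : $item;
    }
    return $total;
}

# Iterative version: an ordinary array holds the work that is still pending.
sub sum_iterative {
    my ($list) = @_;
    my @stack = ($list);
    my $total = 0;
    while (@stack) {
        my $item = pop @stack;
        if (ref($item) eq 'ARRAY') {
            push @stack, @$item;    # defer the children instead of recursing
        }
        else {
            $total += $item;
        }
    }
    return $total;
}

my $nested = [1, [2, 3, [4]], 5];
print sum_recursive($nested), "\n";    # 15
print sum_iterative($nested), "\n";    # 15
```

here the only state worth saving is the list of items not yet visited, which is why the stack stays so simple.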

i won't get into your code since i would recommend a recursive solution as it will be much easier to code. in fact the module Parse::RecDescent would make this very easy.

but if you must keep it iterative, just create a stack (an array is perfect). each time you hit the start of a nested section, push onto it where you currently are in the string, what you have parsed so far (internal structure) and any other state info, then go around the parser loop again. when you parse an end of section, pop that stack info and continue where the enclosing section left off.
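a rough sketch of that stack approach applied to the record markers from the first post. this is one possible reading, not tested against the real templates: parse_records is a hypothetical helper, the default markers (<!--Record: ...-->, <!--End_Record: ...-->, %%) are hard-coded rather than taken from $arg_ref, and each stack frame only holds a record name plus the text collected for it so far.

```perl
use strict;
use warnings;

# Hypothetical helper: parses nested records without recursion by keeping
# an explicit stack of partially built records.
sub parse_records {
    my ($file) = @_;
    my %records;
    my @stack = ({ name => undef, text => '' });    # bottom frame = top-level file

    # split with a capturing group keeps the markers in the returned list
    for my $piece (split /(<!--(?:End_)?Record:\s*[\w-]+?\s*-->)/, $file) {
        if ($piece =~ /^<!--Record:\s*([\w-]+?)\s*-->$/) {
            # start of a record: open a new frame on top of the stack
            push @stack, { name => $1, text => '' };
        }
        elsif ($piece =~ /^<!--End_Record:\s*([\w-]+?)\s*-->$/) {
            # end of a record: close the top frame and resume its parent
            my $done = pop @stack;
            die "unexpected End_Record: $1\n"
                unless @stack && defined $done->{name} && $done->{name} eq $1;
            $records{ $done->{name} } = $done->{text};
            $stack[-1]{text} .= '%%' . $done->{name} . '%%';
        }
        else {
            # plain text belongs to whichever record is currently open
            $stack[-1]{text} .= $piece;
        }
    }
    die "unclosed record: $stack[-1]{name}\n" if @stack > 1;
    return ($stack[0]{text}, \%records);
}

my $file = join '',
    '<!--Record: outer --><table>',
    '<!--Record: inner--><tr><td>%%title%%</td></tr><!--End_Record: inner-->',
    '</table><!--End_Record: outer -->';
my ($top, $records) = parse_records($file);
print "$top\n";                 # %%outer%%
print "$records->{outer}\n";    # <table>%%inner%%</table>
print "$records->{inner}\n";    # <tr><td>%%title%%</td></tr>
```

the depth of @stack at any moment equals the current nesting level, so deeper templates cost a longer array rather than more subroutine calls.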

for simple recursive-to-iterative conversions this stack info is simple. but for your parser it can be very complex, which is why a real recursive solution is much easier to implement.


Aug 16, 2003, 5:09 PM

Post #3 of 3 (908 views)
Re: [uri] algorithm for nested parsing needed

Hi uri. Thanks for the reply. I ended up doing it recursively because, as you mention, keeping track of the stack would be a big job. Also, the templates I'm parsing will rarely go more than 3 levels deep so I don't have to worry about overusing system resources.
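The recursive code itself isn't posted in this thread. As a minimal sketch of what such a solution might look like, the version below reuses the record regex from the first post with the default markers hard-coded. The name extract_records and the flat %records hash are assumptions for illustration, not the actual implementation.

```perl
use strict;
use warnings;

# Same shape as the regex in the first post, with the default markers inlined.
my $record = qr/<!--Record:\s*([a-zA-Z0-9_-]+?)\s*-->(.*?)<!--End_Record:\s*\1\s*-->/s;

# Hypothetical helper: replaces each record in $text with a %%name%%
# placeholder, then recurses into the record's content to pull out any
# records nested inside it. All records end up in the one %$records hash.
sub extract_records {
    my ($text, $records) = @_;
    while ($text =~ s/$record/%%$1%%/) {
        # copy the captures before recursing, since the nested call
        # runs the same regex and would overwrite $1 and $2
        my ($name, $content) = ($1, $2);
        $records->{$name} = extract_records($content, $records);
    }
    return $text;
}

my $file = join '',
    '<!--Record: outer --><table>',
    '<!--Record: inner--><tr><td>%%title%%</td></tr><!--End_Record: inner-->',
    '</table><!--End_Record: outer -->';
my %records;
my $top = extract_records($file, \%records);
print "$top\n";               # %%outer%%
print "$records{outer}\n";    # <table>%%inner%%</table>
print "$records{inner}\n";    # <tr><td>%%title%%</td></tr>
```

Each level of nesting costs one extra call to extract_records, so templates that are 3 levels deep only ever have a handful of calls active at once.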


