Repeatedly reading information between two tags

 



alalleyn
Novice

May 19, 2002, 4:52 AM

Post #1 of 7 (1064 views)
Repeatedly reading information between two tags

I have been reading a few other posts that cover this area, but I am still having trouble handling the tags when they occur multiple times. For instance, the input file might read:

Anything in general
<start> This is important<end>
Anything in general again. More general sections.
<start> Here is another important section that <end>

That pattern could repeat any number of times. Most of the posts I have found deal with only one instance of the start and end tags in the file, whereas I am trying to do the same kind of thing with multiple instances of the tags.

Hopefully someone can help. Sorry if this has been posted in the wrong place; I was not sure whether I should reply to the original post that I found most interesting or start a new one.

Thanks
Andrea


mhx
Enthusiast / Moderator

May 19, 2002, 1:35 PM

Post #2 of 7 (1063 views)
Re: [alalleyn] Repeatedly reading information between two tags

I'm not sure if this is what you need, but it might point you in the right direction. You can use a regular expression with the /g modifier to grab all occurrences of text between two tags. Say you have your example text stored in $text. Then

[perl]
@tag = $text =~ /<start>(.*?)<end>/g;
[/perl]

will put all text between <start> and <end> into the @tag array, which would then look like this:


Code
@tag = ( ' This is important', ' Here is another important section that ' );


You can also use the regex in a while loop:

[perl]
while( $text =~ /<start>(.*?)<end>/g ) {
    print "$1\n"; # do something with $1
}
[/perl]
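
As a self-contained sketch of both forms, here is text modelled on the sample from the first post, run through the same regex. The names @seen and the here-document are just for illustration, and the /s modifier is my own addition, on the assumption that a tagged section may span more than one line. Note that the captures keep whatever whitespace sits between the tags.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input modelled on the first post.
my $text = <<'END';
Anything in general
<start> This is important<end>
Anything in general again.
<start> Here is another important section that <end>
END

# List context: /g returns every capture at once.
my @tag = $text =~ /<start>(.*?)<end>/gs;

# Scalar context: /g steps through the matches one at a time.
my @seen;
while ( $text =~ /<start>(.*?)<end>/gs ) {
    push @seen, $1;
}

# Brackets make the surrounding whitespace visible.
print "[$_]\n" for @tag;
```

If the surrounding spaces are unwanted, one option is to strip them from each element afterwards rather than complicating the regex.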

Feel free to ask if you need more specific information or some extra explanation.

-- mhx

At last with an effort he spoke, and wondered to hear his own words, as if some other will was using his small voice. "I will take the Ring," he said, "though I do not know the way."

-- Frodo



alalleyn
Novice

May 19, 2002, 2:10 PM

Post #3 of 7 (1060 views)
Re: [mhx] Repeatedly reading information between two tags

Mhx, I hope you don't mind me getting back in touch, but this might make it a little clearer exactly what my problem is. I have written the following Perl code, which should write all the programlisting sections to a given output file; however, it only seems to copy across the first section. Ultimately I would like all of them copied across, and I am tearing my hair out trying to see why they are not.


Code
#!C/usr/bin/perl 

#use strict;
use diagnostics;
use CGI();
use CGI::Carp qw(fatalsToBrowser);
use Fcntl qw(:flock);

my $file = $query -> param('file');
my $output = $query -> param ('output');

open (TXT, $file) or die "Can't open $file: $!";
#flock(TXT, LOCK_SH); # no one can edit the file now.
while (<TXT>) {
@code = ($text) = do { local $/=undef; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;
}
close TXT;

open (OUT, ">$output") or die "Cant write to $output: $!";
print OUT @code;
close (OUT);

print "Content-type: text/html\n\n";
print "<html><h1>A source file has been created!</h1>\n";
print "Please feel free to open $output and use at ones pleasure</html>\n";


The input file that I am using is as attached and the generated output is

-- This is smaple code in Java, using Hello World
-- From the first scrap
public class Welcome {


Why does it only do the first section? I think I am doing it correctly. Hopefully I have made myself a little clearer now, and I am eagerly awaiting any useful input.
Thanks again for any help that you can offer.
Andrea


(This post was edited by alalleyn on May 19, 2002, 2:22 PM)


mhx
Enthusiast / Moderator

May 19, 2002, 3:16 PM

Post #4 of 7 (1052 views)
Re: [alalleyn] Repeatedly reading information between two tags

See whether this piece of code does what you want:

[perl]
#!/usr/bin/perl -wT
use strict;
use CGI;
use CGI::Carp qw(fatalsToBrowser);

my $q = new CGI;
my $file = $q->param('file');
my($output) = $q->param('output') =~ /^([\w.]+)$/
    or die "invalid output file name";

open TXT, $file or die "cannot open $file: $!";
my @code = do { local $/; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;
close TXT;

open OUT, ">$output" or die "cannot open $output: $!";
print OUT @code;
close OUT;

print $q->header,
      $q->start_html,
      $q->h1('A source file has been created!'),
      $q->p("Please feel free to open $output and use at ones pleasure"),
      $q->end_html;
[/perl]

Hope this helps.

-- mhx




alalleyn
Novice

May 19, 2002, 3:45 PM

Post #5 of 7 (1047 views)
Re: [mhx] Repeatedly reading information between two tags

Thank you!! It now works. I know all the CGI header lines in my original script had to be changed to tidy it up. My question now is: why did the code for reading the information not work correctly? As far as I can tell, you have only really changed the way I handled input and output. I just need to know for my own benefit, so next time it will not take me as long.

Thanks for making the header changes though it will save much typing.

Andrea


mhx
Enthusiast / Moderator

May 20, 2002, 12:38 AM

Post #6 of 7 (1043 views)
Re: [alalleyn] Repeatedly reading information between two tags

The problem with your script is the following part:


Code
while( <TXT> ) { 
    @code = ($text) = do { local $/=undef; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;
}


First of all, the while loop is wrong, since the loop's body slurps the whole text file at once. The tricky part is that the while( <TXT> ) condition reads the first line of the text file, which is then never used. So, let's remove the while loop:


Code
@code = ($text) = do { local $/=undef; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;
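
To see why the loop gets in the way, here is a small sketch that reads from an in-memory filehandle instead of a real file. The handle $fh, the counters and the three-line input are made up for the example. The while condition reads one line before the body slurps the rest, so that first line never reaches the slurp; the condition then hits end of file and the loop stops after a single pass.

```perl
use strict;
use warnings;

# Three lines of stand-in input, opened as an in-memory file.
my $data = "line 1\nline 2\nline 3\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my ( $passes, $first, $rest ) = ( 0, '', '' );
while ( my $line = <$fh> ) {
    $passes++;
    $first = $line;                     # consumed by the while condition
    $rest  = do { local $/; <$fh> };    # slurps whatever is left
}
close $fh;

print "passes through the loop: $passes\n";
print "condition read: $first";
print "slurp got:\n$rest";
```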


The do { local $/=undef; <TXT> } will read the whole text file. You don't need to assign undef since localizing a variable will automatically undefine it locally. However, that part - including the regex - is absolutely correct. The problem is here:


Code
@code = ($text) = do { local $/; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;


You are first assigning the list of all matches to ($text), which will keep the first match in $text and throw away all other matches. Next, that single match is assigned to @code. If you had left out the assignment to ($text) (which you don't need anyway):


Code
@code = do { local $/; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;


or at least had written it the other way round:


Code
($text) = @code = do { local $/; <TXT> } =~ /<programlisting>(.*?)<\/programlisting>/gsi;


the solution would have worked.
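
The difference between the two orderings can be checked with a short sketch. The regex match is replaced here by a plain three-element list, an assumption made only to keep the example small; @code2, $text2, $lost and $kept are names invented for the comparison.

```perl
use strict;
use warnings;

# Stands in for the list of regex matches.
my @matches = ( 'first', 'second', 'third' );

# Original order: ($text) takes only the first element, and @code is
# then assigned from that one-element list.
my ( @code, $text );
@code = ($text) = @matches;
my $lost = scalar @code;      # 1

# Reversed order: @code2 receives every element first, then ($text2)
# takes the first element of @code2.
my ( @code2, $text2 );
($text2) = @code2 = @matches;
my $kept = scalar @code2;     # 3

print "original order kept $lost, reversed order kept $kept\n";
```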

All other changes I've made are more or less for style. I'm a use strict; fanatic, and I don't like embedded HTML code very much...

Hope this helps.

-- mhx




alalleyn
Novice

May 20, 2002, 2:55 AM

Post #7 of 7 (1039 views)
Re: [mhx] Repeatedly reading information between two tags

Thanks, that was exactly what I needed to know!

Andrea

 
 

