

Jun 27, 2009, 1:16 PM

Perl String Match Problem

I am writing a parser to extract information from web pages. For an example page, see the attachment.

I am trying to extract the post content from this page. I read the page source into $page_src and use a regular expression match to extract the corresponding portion of the source.

Below is my code to do the match. I identified the start tag:
<tr class="white">
and the end tag:
<td><div class="pad5x10">&nbsp;</div></td>
of the post content. The (.|\n)*? part matches any character, including newlines. My code works for other pages but fails when parsing the page linked above (or attached).

Can anyone point out the problem in my code? I would really appreciate it!

while ($page_src =~ /<tr class=\"white\">((.|\n)*?)<td><div class=\"pad5x10\">&nbsp;<\/div><\/td>\s+<\/tr>/g) {
    my $match_str = $1;
    print $match_str . "\n";
}
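
For comparison, here is a minimal sketch of the same loop written with the /s modifier, so that a plain (.*?) stands in for ((.|\n)*?). One possible cause of the failure, and this is only an assumption since the failing page is not shown here, is that older Perl versions cap the number of repetitions of a quantified group such as (.|\n)*? and give up on very long posts; a bare .*? under /s is not a quantified group and may avoid that. The sample $page_src below is a hypothetical stand-in for the real page source, and @posts is a name introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real page source (not the actual page).
my $page_src = join "\n",
    '<table>',
    '<tr class="white">',
    '<td>first line of the post',
    'second line of the post</td>',
    '<td><div class="pad5x10">&nbsp;</div></td>',
    '</tr>',
    '</table>';

my @posts;

# /s lets "." match newlines, so (.*?) can replace ((.|\n)*?).
# The m{...} delimiters avoid having to escape the slashes in closing tags.
while ($page_src =~ m{<tr class="white">(.*?)<td><div class="pad5x10">&nbsp;</div></td>\s+</tr>}gs) {
    push @posts, $1;
}

print "$_\n" for @posts;
```

If the real markup varies in whitespace or attribute order, the pattern would need to be loosened accordingly, or an HTML parsing module used instead of a regular expression.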

(This post was edited by langqinren on Jun 27, 2009, 1:17 PM)

