extract some text from a webpage

 



slekness
New User

Nov 17, 2011, 7:18 AM

Post #1 of 3 (5735 views)

Hey all,

I have this regex: /(%)((?:[a-z][a-z0-9_]*))/
I'm not sure it's correct, but what I'm trying to do is use it to get a piece of text from a webpage that begins with a % character and ends at a whitespace, so it would be like: whatever %Get_This123 whatever

I tried grep, but it wouldn't work.


rovf
Veteran

Nov 18, 2011, 4:25 AM

Post #2 of 3 (5640 views)

Whether the grep function is useful depends on how you want to process the result. A more common solution, however, is to loop through the file and apply your regex to each line.

Note that the pattern binding operator in Perl is =~ (see perldoc perlop).
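
To make the difference between grep and a loop concrete, here is a minimal sketch. The sample lines and the pattern /%(\S+)\s/ are assumptions made up for illustration, not something taken from the original page.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the lines of the downloaded page.
my @lines = (
    "nothing to see here",
    "whatever %Get_This123 whatever",
    "trailing token %last_one ",
);

# grep returns the whole lines that match, not the captured part.
my @matching_lines = grep { /%\S+\s/ } @lines;

# A loop with =~ lets you keep just the capture ($1) from each line.
my @found;
for my $line (@lines) {
    if ($line =~ /%(\S+)\s/) {
        push @found, $1;
    }
}

print "Lines with a token: ", scalar(@matching_lines), "\n";
print "Token: $_\n" for @found;
```

When reading from a real file, the for loop would typically become while (my $line = <$fh>) over a filehandle you have opened yourself.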

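BTW, your regexp doesn't fulfil the condition "ends in a whitespace": it never requires whitespace after the name, so the match simply stops at the first character outside [a-z0-9_], for instance a dot, a comma, or an uppercase letter. Also, because the character classes are lowercase only and there is no /i modifier, your example string ".... %Get_This123 ..." would not match at all: the G right after the % is uppercase.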
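
Both points can be checked with a short sketch. The variable names and the second test string are made up for illustration, and the "stricter" pattern is only one possible fix (adding /i and a whitespace lookahead), not necessarily what the original poster needs.

```perl
use strict;
use warnings;

# Pattern from the first post, unchanged.
my $original = qr/(%)((?:[a-z][a-z0-9_]*))/;
# Assumed fix: case-insensitive, and the name must be followed by whitespace.
my $stricter = qr/%([a-z][a-z0-9_]*)(?=\s)/i;

my @results;
for my $text ("whatever %Get_This123 whatever", "see %get_this, more") {
    # The original pattern keeps the name in $2, the stricter one in $1.
    my $old = $text =~ $original ? $2 : "(no match)";
    my $new = $text =~ $stricter ? $1 : "(no match)";
    push @results, [$text, $old, $new];
    print "$text\n  original: $old\n  stricter: $new\n";
}
```

The original pattern fails on the first string because of the uppercase G, and accepts the second one even though the name is followed by a comma. The stricter pattern behaves the other way round.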


warlock
New User

Nov 30, 2011, 12:37 AM

Post #3 of 3 (5365 views)

Try the following instead:

Code
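use strict;
use warnings;

my $string = "whatever %Get_This123 whatever";
# No ^ anchor: the % may appear anywhere in the string.
if ($string =~ /%(\S+)\s/)
{
    my $match = $1;
    print "Matched: $match\n";
}

Note that the above regexp is quite loose: it accepts any non-whitespace characters after the %, not just letters, digits, and underscores, and it will match anywhere in a string of the form "* %whatever * <space>".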
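
If that is too loose, one possible tightening is sketched below. It assumes a token is a letter followed by letters, digits, or underscores, and that it may end either at whitespace or at the end of the string. Both are assumptions about what the original poster wants. The test string is made up, and the /g modifier in list context collects every token rather than only the first.

```perl
use strict;
use warnings;

# Hypothetical input with valid and invalid tokens mixed together.
my $string = "a %first_1 b %Second2 c %bad-one d % e %end";

# Identifier characters only; the token must be followed by whitespace
# or by the end of the string. In list context, /g returns all captures.
my @tokens = $string =~ /%([A-Za-z][A-Za-z0-9_]*)(?=\s|\z)/g;

print "Matched: $_\n" for @tokens;
```

Here %bad-one is skipped because the name is followed by a hyphen, and the lone % is skipped because no letter follows it.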

 
 

