need help for perl web crawler

 



zhihe2
New User

Oct 6, 2010, 7:17 AM

Post #1 of 2 (342 views)

Hi all, I am a beginner at Perl and need help using Perl to get information from a website.

For example, the website is

http://www.almanac.com/weather/history/zipcode/21218/2008-09-02

I need to store the mean temperature, maximum temperature and minimum temperature.

Can anyone tell me how to do this?
I tried, but failed.


7stud
Enthusiast

Oct 6, 2010, 11:40 AM

Post #2 of 2 (335 views)
Re: [zhihe2] need help for perl web crawler


Code
use strict; 
use warnings;
use 5.010;

my $string =<<'END_OF_HTML';
<p class="explanation"></p></div><div class="weatherhistory_results_datavalue
temp"><h4>Mean Temperature</h4><p><span class="value">77.4</span> <span
class="units">&#176;F</span>
END_OF_HTML


# Only use $1 if the match succeeded; otherwise it is undefined or stale.
if ($string =~ /<h4>Mean.+?<span .*?>(.+?)</) {
    say $1;
}

--output:--
77.4




Code
use strict;  
use warnings;
use 5.010;

my $string =<<'END_OF_HTML';
<p class="explanation"></p></div><div class="weatherhistory_results_datavalue
temp"><h4>Mean Temperature</h4><p><span class="value">77.4</span> <span
class="units">&#176;F</span>
<span class="value">89.4</span>
END_OF_HTML

# [.] matches a literal period, so this captures decimals such as 77.4.
# With /x, the spaces inside the pattern are ignored.

while ($string =~ /(\d+ [.] \d+)/xmsg) {
    say $1;
}


--output:--
77.4
89.4



Code
use strict; 
use warnings;
use 5.010;

use LWP::Simple;
use HTML::TreeBuilder;

my $url = 'http://www.almanac.com/weather/history/zipcode/21218/2008-09-02';
my $html = get($url);
die "Could not fetch $url\n" unless defined $html;  # get() returns undef on failure


my $tree = HTML::TreeBuilder->new_from_content($html);

my @spans = $tree->look_down(
    _tag  => 'span',
    class => 'value',
);

for my $span (@spans) {
    say $span->as_trimmed_text();
}

$tree->delete();

--output:--
68.5
77.4
89.4
30.08
0.00
4.03
7.00
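
The original question asks to store the mean, maximum and minimum temperatures, not just print every value on the page. Here is a rough sketch of one way to do that with the same regex approach as the first example, keyed on the h4 heading. It assumes the maximum and minimum blocks use the same markup as the "Mean Temperature" snippet above, with headings "Maximum Temperature" and "Minimum Temperature"; I have not checked that against the live page. The sample HTML and the extract_temps helper are made up for illustration.

```perl
use strict;
use warnings;
use 5.010;

# Sample markup modelled on the "Mean Temperature" snippet above.
# ASSUMPTION: the Minimum and Maximum blocks follow the same pattern.
my $html = <<'END_OF_HTML';
<div class="weatherhistory_results_datavalue temp"><h4>Minimum Temperature</h4><p><span class="value">68.5</span> <span class="units">&#176;F</span></p></div>
<div class="weatherhistory_results_datavalue temp"><h4>Mean Temperature</h4><p><span class="value">77.4</span> <span class="units">&#176;F</span></p></div>
<div class="weatherhistory_results_datavalue temp"><h4>Maximum Temperature</h4><p><span class="value">89.4</span> <span class="units">&#176;F</span></p></div>
END_OF_HTML

# extract_temps is a hypothetical helper, not part of any module.
# It returns a hash mapping Mean/Maximum/Minimum to the number found
# in the first span with class "value" after the matching heading.
sub extract_temps {
    my ($text) = @_;
    my %temps;
    while ($text =~ m{<h4>\s*(Mean|Maximum|Minimum)\s+Temperature\s*</h4>.*?<span\s+class="value">\s*([-+]?\d+(?:[.]\d+)?)\s*</span>}gs) {
        $temps{$1} = $2;
    }
    return %temps;
}

my %temps = extract_temps($html);

for my $label (qw(Mean Maximum Minimum)) {
    say "$label: ", $temps{$label} // 'not found';
}
```

--output:--
Mean: 77.4
Maximum: 89.4
Minimum: 68.5

Once the values are in a hash, they can be written to a file or database as needed. The lazy .*? will skip ahead to the next value span if a heading has none, so for anything beyond a quick script the HTML::TreeBuilder approach is probably the safer base.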



(This post was edited by 7stud on Oct 6, 2010, 2:36 PM)

 
 

