


Zhris
Enthusiast

Feb 19, 2013, 1:07 PM


Re: [wisnoskij] Perl Documentation

Hey,

You could modify the data after you've scraped it, as I'm not sure whether it can be done with an XPath-style expression:

(untested)

Code
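#!/usr/bin/perl
use strict;
use warnings FATAL => qw(all);
use URI;
use Web::Scraper;

my $scrape_url = "http://www.example.com";

# Placeholder selectors: swap in the real tag and class names.
my $scraper = scraper {
    process "tag1.class1", key  => 'TEXT';
    process "tag2.class2", name => 'TEXT';
};

my $response = $scraper->scrape( URI->new( $scrape_url ) );

# Post-process the scraped text: keep 6 characters starting at offset 3.
# Guarded, because substr on an undefined or too-short value warns,
# and warnings are fatal here.
$response->{name} = substr $response->{name}, 3, 6
    if defined( $response->{name} ) && length( $response->{name} ) >= 3;

printf "%s, %s\n", $response->{key} // '', $response->{name} // '';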
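
If you'd rather do the trimming during the scrape itself, Web::Scraper also accepts a code ref in place of 'TEXT', which is handed the matched HTML::Element. Below is a minimal sketch of that idea. The markup, the selectors and the 3, 6 offsets are made up for illustration, and it scrapes an inline HTML string instead of a URL so it can be run offline:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup standing in for the real page.
my $html = '<html><body><p class="key">item-1</p>'
         . '<span class="name">ID:Widget-42</span></body></html>';

my $scraper = scraper {
    process "p.key", key => 'TEXT';

    # A code ref receives the matched HTML::Element, so the value
    # can be trimmed before it is stored in the result.
    process "span.name", name => sub {
        my $elem = shift;
        return substr $elem->as_text, 3, 6;
    };
};

# scrape() takes an HTML string as well as a URI object.
my $response = $scraper->scrape( $html );

my $line = sprintf "%s, %s", $response->{key}, $response->{name};
print "$line\n";
```

The result is the same as calling substr afterwards. It just keeps the clean-up next to the selector it belongs to.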


I've found documentation across CPAN to be generally clear and complete. Web::Scraper's documentation does not go into much detail, but there is enough to experiment with. It also states that "There are many examples in the eg/ dir packaged in this distribution. It is recommended to look through these".

Documentation should give its readers enough information to do what they need without having to study the source. That said, if you are comfortable reading the source, you will inevitably develop a deeper understanding of the module and its limitations. It's also a good idea to look at a module's dependencies. In the case of Web::Scraper, HTML::TreeBuilder::XPath and HTML::Selector::XPath appear to handle the XPath expressions, so they may provide additional syntax, documentation and examples.

If this is the first time you've approached web scraping in Perl, it would also be worth researching "rawer" techniques, e.g. HTML::Element. Web::Scraper has been designed to simplify the process, but the lower-level modules give you more control at every stage of the scraping process.
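
To give an idea of what a "rawer" approach looks like, here is a rough sketch that uses HTML::TreeBuilder and HTML::Element directly. The markup, the tag and class names and the 3, 6 offsets are invented for illustration, and the HTML is an inline string, so no page is fetched:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup standing in for the real page.
my $html = '<html><body><p class="key">item-1</p>'
         . '<span class="name">ID:Widget-42</span></body></html>';

my $tree = HTML::TreeBuilder->new_from_content( $html );

# In scalar context look_down returns the first matching HTML::Element,
# or undef when nothing matches.
my $key_elem  = $tree->look_down( _tag => 'p',    class => 'key' );
my $name_elem = $tree->look_down( _tag => 'span', class => 'name' );

my $key  = defined $key_elem  ? $key_elem->as_text  : '';
my $name = defined $name_elem ? $name_elem->as_text : '';

# Keep 6 characters starting at offset 3, but only if the text is long enough.
$name = substr( $name, 3, 6 ) if length( $name ) >= 3;

print "$key, $name\n";

# Free the parse tree once finished with it.
$tree->delete;
```

It's more typing than the Web::Scraper version, but each step (parsing, finding elements, extracting text) is yours to inspect and change.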

Best regards,

Chris


(This post was edited by Zhris on Feb 19, 2013, 1:18 PM)



