Help with HTML::TokeParser

 



StarkRavingCalm
User

Oct 12, 2015, 11:23 AM

Post #1 of 1 (595 views)

I have a script I am working on as a POC that will go out to a webpage and get a listing of files on the page.
For the POC, I am using the local machine.

I have two questions/issues.

I would ultimately like to create a hash out of the data, with the filename as key and the file size as value.
I was only able to get filename, so I changed the script to use an array instead of a hash.
Will this module give me filesize so I can use a hash?

Also, I am pleased with the output, but I get garbage at the beginning. How can I filter this out?
(In the output below, I only want the files.)
Is there a way to do it within TokeParser or should I use a regex?
My preference is to do it in TokeParser

Or, is there a better module for what I am trying to do?

Here is my code with an array:


Code
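use strict;
use warnings;
use HTML::TokeParser;
use LWP::Simple;
use File::Basename;
use List::Compare;

# Fetch the directory listing page (local machine for the POC).
my $page = get('http://localhost/docs');
die "Could not fetch the page\n" unless defined $page;

my %urlhash;     # intended: filename => file size (not filled yet)
my @urlfiles;

my $p = HTML::TokeParser->new(\$page);

# Collect the href of every <a> tag on the page.
while (my $token = $p->get_tag("a")) {
    my $href = $token->[1]{href} || "-";
    my $text = $p->get_trimmed_text("/a");   # link text, currently unused
    push @urlfiles, $href;
}

print join("\n", @urlfiles), "\n";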



Output:

?C=N;O=D
?C=M;O=A
?C=S;O=A
?C=D;O=A
/
pic1.jpg
pic2.jpg
pic3.jpg
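
The `?C=N;O=D` style entries and the lone `/` in the output above look like the column-sort links and the parent-directory link of an Apache-style index page. A minimal sketch of one way to skip them and build the filename => size hash with HTML::TokeParser alone is below. It assumes the listing is a `<pre>` formatted index where the size is the last word of the text that follows each link. The sample HTML, the sizes in it, and the helper name `listing_to_hash` are hypothetical, since the real markup of http://localhost/docs is not shown in the post.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample of an Apache-style <pre> listing (an assumption, not
# the actual page from the post); in the real script this would be $page
# as returned by get().
my $page = <<'HTML';
<html><body><pre>
<a href="?C=N;O=D">Name</a> <a href="?C=M;O=A">Last modified</a> <a href="?C=S;O=A">Size</a> <a href="?C=D;O=A">Description</a><hr>
<a href="/">Parent Directory</a>                             -
<a href="pic1.jpg">pic1.jpg</a>                12-Oct-2015 10:01   12K
<a href="pic2.jpg">pic2.jpg</a>                12-Oct-2015 10:02  1.4M
<a href="pic3.jpg">pic3.jpg</a>                12-Oct-2015 10:03  873
</pre></body></html>
HTML

# Hypothetical helper: returns a hash of filename => size string.
sub listing_to_hash {
    my ($html) = @_;
    my %size_for;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot parse the listing\n";

    while (my $token = $p->get_tag('a')) {
        my $href = $token->[1]{href};
        next unless defined $href;
        next if $href =~ /^\?/;    # column-sort links such as ?C=N;O=D
        next if $href =~ m{/$};    # parent directory and subdirectories

        $p->get_tag('/a');         # move past the link text
        my $rest = $p->get_text;   # text between </a> and the next tag

        # Assumption: the size is the last whitespace-separated word.
        my ($size) = $rest =~ /(\S+)\s*$/;
        $size_for{$href} = defined $size ? $size : 'unknown';
    }
    return %size_for;
}

my %size_for = listing_to_hash($page);
printf "%-10s %s\n", $_, $size_for{$_} for sort keys %size_for;
```

HTML::TokeParser only sees what is in the HTML, so the size is available this way only if the page prints it. If the listing does not include sizes, `head()` from LWP::Simple returns the document length as its second list element, at the cost of one extra request per file. If the server emits the listing as an HTML table instead of `<pre>` text, the size would sit in a later `<td>` and the text extraction step would need adjusting.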

Thanks in advance

 
 

