Login to SSL Site and click on a download link using Perl

 



u356137
New User

Jan 11, 2015, 7:52 AM

Post #1 of 4 (2967 views)

Hi,

I am trying to create a crawler that logs into an SSL site and downloads a CSV file.

I used the WWW::Mechanize module to log in to the SSL site and then tried to download the file. But whenever I call follow_link, I do not get any result:

$mech->follow_link(tag => 'a', text_regex => qr/Download/i );

The link should actually download the file.

I went ahead and printed out the link's url like this:
$link->url
And all I got as a result was:
#
I was expecting the link to be something like https://.../filename.csv,
but it seems to be a hidden link of some kind. I checked the HTML source and it does in fact have # as the href.

I previously automated this process with VBScript, but that would physically open Internet Explorer and click on the download button.
I was able to do that in VBScript using the following code:
IE.Document.All.tags("a").item(17).click()
I had found out that item 17 of the "a" tags was the Download link.

My question is: can I do the same in Perl with follow_link?
Or is there any other way to do it?


Zhris
Enthusiast

Jan 11, 2015, 8:56 AM

Post #2 of 4 (2963 views)
Re: [u356137] Login to SSL Site and click on a download link using Perl

Hi,

Inevitably I can only go on the description you have provided, but it is common to give a hyperlink a dummy placeholder # href when a JavaScript function handles the action instead, so I assume that is the case here.

If it is, you will probably want to look into modules with JavaScript capabilities, such as WWW::Mechanize::Firefox, which uses "Firefox as if it were WWW::Mechanize".

Regards,

Chris


(This post was edited by Zhris on Jan 11, 2015, 8:57 AM)


u356137
New User

Jan 11, 2015, 2:42 PM

Post #3 of 4 (2952 views)
Re: [Zhris] Login to SSL Site and click on a download link using Perl

Hi Zhris,

Thanks for your response. I think you are right about it being a JavaScript function call.

I am pasting the JavaScript so that you can advise whether this JavaScript call can in fact be automated from Perl:

# View Source Result --> Function Call

<a href="#" onclick="if(typeof jsfcljs == 'function'){
jsfcljs(document.forms['j_id62'],'j_id62:j_id63:4:j_id65:j__id67:0:j__id67:1::j_id69,j_id62:j_id63:4:j_id65:j__id67:0:j__id67:1::j_id69','');
}return false">Download complete register <br/>(CSV)</a>

# Functions Being Called

function jsfcljs(f, pvp, t)
{
    apf(f, pvp);

    if (t)
    {
        f.target = t;
    }

    f.submit();
    dpf(f);
};


function dpf(f)
{
    var adp = f.adp;
    if (adp != null)
    {
        for (var i = 0; i < adp.length; i++) {
            f.removeChild(adp[i]);
        }
    }
};

function apf(f, pvp)
{
    var adp = new Array();
    f.adp = adp;
    var ps = pvp.split(',');
    for (var i = 0, ii = 0; i < ps.length; i++, ii++)
    {
        var p = document.createElement("input");
        p.type = "hidden";
        p.name = ps[i];
        p.value = ps[i + 1];
        f.appendChild(p);
        adp[ii] = p;
        i += 1;
    }
};

Thanks again for looking into this.
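
Reading the JavaScript above, the click appears to do nothing more than add hidden inputs to the form j_id62 (names and values taken alternately from the comma-separated second argument) and then submit that form. If that reading is right, plain WWW::Mechanize may be enough, with no JavaScript engine at all. The following is only a sketch under that assumption. The helper names parse_onclick, pvp_to_pairs and download_register are made up for illustration, and download_register expects a WWW::Mechanize object that is already logged in and sitting on the page with the link. It has not been run against the real site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the form name and the parameter string out of an onclick attribute
# shaped like the one in the post. Returns an empty list when nothing matches.
sub parse_onclick {
    my ($onclick) = @_;
    return unless defined $onclick;
    return ( $1, $2 )
        if $onclick =~ /jsfcljs\(\s*document\.forms\['([^']+)'\]\s*,\s*'([^']*)'/;
    return;
}

# Mirror apf(): the comma-separated list alternates name, value, name, value.
# Assumption: a missing trailing value is sent as an empty string.
sub pvp_to_pairs {
    my ($pvp) = @_;
    my @items = split /,/, $pvp, -1;
    my @pairs;
    while (@items) {
        my $name  = shift @items;
        my $value = @items ? shift @items : '';
        push @pairs, [ $name, $value ];
    }
    return @pairs;
}

# Hypothetical helper: $mech is a WWW::Mechanize object that is already
# logged in and currently on the page containing the download link.
sub download_register {
    my ( $mech, $outfile ) = @_;

    my $link = $mech->find_link( tag => 'a', text_regex => qr/Download/i )
        or die "Download link not found\n";
    my ( $form_name, $pvp ) = parse_onclick( $link->attrs->{onclick} )
        or die "onclick does not look like a jsfcljs() call\n";

    # document.forms['x'] matches on id or name, so check both.
    my ($form) = grep {
        my $id   = $_->attr('id');
        my $name = $_->attr('name');
        ( defined $id && $id eq $form_name )
            || ( defined $name && $name eq $form_name )
    } $mech->forms;
    die "Form '$form_name' not found\n" unless $form;

    # Add the hidden inputs that apf() would have created, then submit the
    # form without clicking any button, as f.submit() does.
    $form->push_input( 'hidden', { name => $_->[0], value => $_->[1] } )
        for pvp_to_pairs($pvp);
    my $response = $mech->request( $form->make_request );
    die "Download request failed: ", $response->status_line, "\n"
        unless $response->is_success;

    $mech->save_content($outfile);
    return $outfile;
}

# Offline demonstration using the onclick value from the post.
my $id      = 'j_id62:j_id63:4:j_id65:j__id67:0:j__id67:1::j_id69';
my $onclick = "if(typeof jsfcljs == 'function'){ "
    . "jsfcljs(document.forms['j_id62'],'$id,$id',''); }return false";

my ( $form_name, $pvp ) = parse_onclick($onclick);
print "form: $form_name\n";
print "field: $_->[0] = $_->[1]\n" for pvp_to_pairs($pvp);
```

Run as-is, the script only prints the form name and the single name/value pair that would be added. To use it for real, load WWW::Mechanize, log in as you already do, and then call download_register($mech, 'register.csv'), where the output file name is just an example. Because the request is built from the form itself, any hidden state fields the page already contains are sent along with the added one.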



(This post was edited by u356137 on Jan 11, 2015, 2:42 PM)


Zhris
Enthusiast

Jan 12, 2015, 7:03 AM

Post #4 of 4 (2934 views)
Re: [u356137] Login to SSL Site and click on a download link using Perl

Hi,

I can't speak from personal experience, as I have never had to deal with JavaScript when scraping pages. From researching this in the past, though, my advice is to try out one of the modules I linked to and see whether you can achieve the result you want. If you are unable to solve it, post your code and we can perhaps work out how to proceed.

Regards,

Chris

 
 

