From beginner To Advanced: Get a multilanguage sit

 



pbuks
Deleted

Jan 5, 2001, 3:14 AM

Post #1 of 6 (1453 views)

Hello,

I want to thank BigRich for helping me so far.
My question was how to get the HTML source from a remote site. Unfortunately I didn't mention that it was a multilanguage site (stupid me :))). I will post the last message BigRich sent me; maybe someone can help me download the site with a Perl/CGI script.
Here is the URL: http://games.skynet.be/page.html?channel=arena&pagelang=nl&subject=scores&sid=2&sort=fph&offset=50

This is what BigRich wrote :
__________________________________________
It didn't work because in your original post you asked how to retrieve the content from a simple html page when in fact the site you are trying to access is a multi-language site that is done in frames and uses cookies.

You have to have the proper cookie to access the page you are trying to access. If not, you get sent to index2.html, where you get a cookie based on the menu selection you choose. It doesn't matter if it's a browser or a CGI script, you still need the proper cookie.

You may be able to construct a user agent with LWP::UserAgent that can accept cookies, but more than likely you'll need a bot/spider to access the information you want to get at.

You need to study the docs that came with Perl. The docs you need to concentrate on are HTTP (Cookies, Headers, Request, Response) and LWP (UserAgent, RobotUA, lwpcook, etc).

You'll also want to do a search for "perl bots" for sites and info pertaining to bots and spiders.

You could also re-post in the "Advanced" section of this forum, where someone with more experience with bots/spiders may be able to help. Be sure to give the URL that you are trying to access, not a simple example as you did here, and explain that the site is done in frames and uses cookies.

Good luck,

BigRich
__________________________________________



BigRich
Novice

Jan 5, 2001, 9:56 PM

Post #2 of 6 (1444 views)
Re: From beginner To Advanced: Get a multilanguage sit

The reason I mentioned that it's multi-language is that the pages are dynamically generated (en, nl, fr, etc) based on the contents of a cookie so you'll need a script that sends the appropriate cookie when accessing the page.
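As a rough illustration of what "sending the appropriate cookie" means, here is a minimal offline sketch using HTTP::Cookies and HTTP::Request. The cookie name `pagelang` and its value `nl` are hypothetical placeholders (the thread never states what the site's cookie is actually called), and nothing is sent over the network; the script only shows the header a request would carry.

```perl
#!/usr/bin/perl -w
use strict;
use HTTP::Cookies;
use HTTP::Request;

# Target URL from the thread.
my $url = 'http://games.skynet.be/page.html?channel=arena&pagelang=nl'
        . '&subject=scores&sid=2&sort=fph&offset=50';

# In-memory cookie jar (no file argument, so nothing is written to disk).
my $jar = HTTP::Cookies->new;

# ASSUMPTION: the cookie name and value below are hypothetical placeholders.
$jar->set_cookie(
    0,                  # version (0 = Netscape-style cookie)
    'pagelang', 'nl',   # key, value (hypothetical)
    '/',                # path
    'games.skynet.be',  # domain
    undef,              # port (any)
    1,                  # path_spec
    0,                  # secure
    3600,               # max age in seconds
    0,                  # discard
);

# Build the request and let the jar attach any matching cookies.
my $req = HTTP::Request->new(GET => $url);
$jar->add_cookie_header($req);

# Show the request line and headers that would be sent.
print $req->as_string;
```

When the jar is attached to an LWP::UserAgent with `cookie_jar`, this header step happens automatically on every request.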

Hope that helps.

Good luck,

BigRich



pbuks
Deleted

Jan 6, 2001, 2:39 PM

Post #3 of 6 (1439 views)
Re: From beginner To Advanced: Get a multilanguage sit

Hmm, this cookie stuff is complicated.
I have to set the cookie in the HTTP header, then I will be able to get the HTML source from this link >>> http://games.skynet.be/page.html?channel=arena&pagelang=nl&subject=scores&sid=2&sort=fph&offset=50

I found the cookie that was set on my system by their webserver; it is attached to this post.

I can't figure out how to get this cookie from my directory and send it back to their server.
I hope you will understand my problem :))



BigRich
Novice

Jan 7, 2001, 6:04 AM

Post #4 of 6 (1435 views)
Re: From beginner To Advanced: Get a multilanguage sit

I didn't have much experience using LWP::UserAgent so I thought I would see if I could come up with something to help you out.

I came up with a simple script that uses LWP::UserAgent and HTTP::Cookies to get the page you want.

It checks your cookie file (creates one if you don't already have one) to see if you have the cookie.

If not, it requests the page that sets the cookie, gets and stores the cookie, then sends another request for the page you are trying to access.

If you already have the cookie, it just requests the page you are trying to access.

I've attached it here as a text file. Just rename it to .cgi, set the first line to your path/to/perl -w, and it should work.
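The attachment is not reproduced in the thread, so what follows is only a minimal sketch of the flow described above, not the attached script. It assumes that the cookie-setting page lives at `http://games.skynet.be/index2.html` and that a plain GET of that page is enough to receive the cookie; the file name `cookies.txt` and the helper `jar_has_cookie_for` are likewise made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Cookies;

# Target URL from the thread.
my $target = 'http://games.skynet.be/page.html?channel=arena&pagelang=nl'
           . '&subject=scores&sid=2&sort=fph&offset=50';

# ASSUMPTIONS: the full URL of the cookie-setting page and the cookie
# file name are guesses for illustration only.
my $entry       = 'http://games.skynet.be/index2.html';
my $cookie_file = 'cookies.txt';

# Hypothetical helper: true if the jar holds any cookie whose domain
# is $suffix or ends in ".$suffix".
sub jar_has_cookie_for {
    my ($jar, $suffix) = @_;
    my $found = 0;
    $jar->scan(sub {
        my $domain = $_[4];    # fifth callback argument is the domain
        $found = 1 if $domain =~ /(?:^|\.)\Q$suffix\E$/;
    });
    return $found;
}

# Load the cookie file if it exists; it is created on save otherwise.
my $jar = HTTP::Cookies->new(
    file           => $cookie_file,
    autosave       => 1,
    ignore_discard => 1,    # also keep session cookies
);
my $ua = LWP::UserAgent->new(cookie_jar => $jar, timeout => 30);

# No cookie yet: request the page that sets it, so the jar stores it.
$ua->get($entry) unless jar_has_cookie_for($jar, 'skynet.be');

# Request the page we actually want; the jar sends the cookie along.
my $res = $ua->get($target);

print "Content-type: text/html\n\n";
if ($res->is_success) {
    print $res->decoded_content;
}
else {
    print 'Request failed: ', $res->status_line, "\n";
}
```

If the site only hands out the cookie after a menu selection, the first request would need to target whatever URL that selection leads to.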

BigRich



pbuks
Deleted

Jan 7, 2001, 7:26 AM

Post #5 of 6 (1430 views)
Re: From beginner To Advanced: Get a multilanguage sit

Haha, yes, it works. The site was down today :(( but the script gets the page I want, so thank you for your time, BigRich. When I have a problem I will find you :)))
You are the best.

Greetz Michael




BigRich
Novice

Jan 7, 2001, 11:05 AM

Post #6 of 6 (1427 views)
Re: From beginner To Advanced: Get a multilanguage sit

Glad to help. I learned a little myself.

BigRich


 
 

