Jan 5, 2001, 3:14 AM
Post #1 of 6
From beginner To Advanced: Get a multilanguage sit
I want to thank BigRich for helping me so far.
My question was how to get the HTML source from a remote site. Unfortunately I didn't mention that it was a multilanguage site (stupid me :))). I will post the last message BigRich sent me; maybe someone can help me download the site with a Perl/CGI script.
Here is the URL: http://games.skynet.be/page.html?channel=arena&pagelang=nl&subject=scores&sid=2&sort=fph&offset=50
This is what BigRich wrote:
You have to have the proper cookie to access the page you are trying to access. If not, you get sent to index2.html, where you get a cookie based on the menu selection you choose. It doesn't matter if it's a browser or a CGI script, you still need the proper cookie.
You may be able to construct a user agent with LWP::UserAgent that accepts cookies, but more than likely you'll need a bot/spider to get at the information you want.
You need to study the docs that came with Perl. The docs you need to concentrate on are HTTP (Cookies, Headers, Request, Response) and LWP (UserAgent, RobotUA, lwpcook, etc).
You'll also want to do a search for "perl bots" for sites and info pertaining to bots and spiders.
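For anyone trying the LWP::UserAgent route mentioned above, here is a rough sketch of a cookie-aware user agent. It is only a starting point, not a tested solution for this site. The helper names build_agent and fetch_with_cookie are made up for illustration, and the full address of index2.html in the usage comment is an assumption. Since the cookie depends on a menu selection, simply loading the entry page may not be enough; you may have to request whatever link the menu actually points to.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Build a user agent that keeps cookies in memory between requests.
# (build_agent is a hypothetical helper name, not part of LWP.)
sub build_agent {
    my $ua = LWP::UserAgent->new(timeout => 30);
    $ua->cookie_jar(HTTP::Cookies->new);
    return $ua;
}

# Request the entry page first so the server gets a chance to set its
# cookie, then request the real page with the same agent (same jar).
# Returns the page content, or undef if the second request failed.
sub fetch_with_cookie {
    my ($ua, $entry_url, $target_url) = @_;
    $ua->get($entry_url);
    my $response = $ua->get($target_url);
    return $response->is_success ? $response->decoded_content : undef;
}

# Example usage (the entry URL is an assumption about where index2.html lives):
#
#   my $ua   = build_agent();
#   my $html = fetch_with_cookie(
#       $ua,
#       'http://games.skynet.be/index2.html',
#       'http://games.skynet.be/page.html?channel=arena&pagelang=nl'
#         . '&subject=scores&sid=2&sort=fph&offset=50',
#   );
#   print defined $html ? $html : "Request failed\n";
```

If the page still redirects to index2.html, printing $ua->cookie_jar->as_string after the first request shows whether any cookie was received at all, which tells you if the menu link needs to be fetched instead.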