Help With Meta Search

 



mike
User

Jun 29, 2001, 7:31 PM

Post #1 of 8 (2202 views)
Help With Meta Search

Hello,

I was wondering if someone could help me out. I have had this meta search engine running on my site for a while now, and all of a sudden it won't work anymore. This is the code I am using. I would appreciate some help; it has been driving me insane :)

This is what I have without the Form html
-------------------------------------------------


&data_parse;
&main;

sub main{
    $| = 1;
    if ($FORM{'q'} eq '')
    {
        &show();
        exit;
    }
    &IOSocket;
    $output = join(" ", @output);
    $output =~ /([^ ]*?) pages found\./i;
    $total = $1;
    $total =~ s/[^\d+]//g;
    ($rsltpgs, @results, $footer) = split(/<dl>/i, $output);

    $FORM{'nh'} = 0 unless($FORM{'nh'});
    $prev = $FORM{'nh'} - 10;
    $next = $FORM{'nh'} + 10;
    $rsltpgs = qq~<p><font size=-1><b>Results Pages:\ </b>~;
    $rsltpgs .= qq~<a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$prev&user=$FORM{'user'}">[<b>\<\< Prev</b>]</a>~ if ($prev >= 0);
    $Remainder = $total % 10;
    $Remainder = 10 - $Remainder;
    $Results_Pages = ($total + $Remainder) / 10;
    $Results_Pages = 20 if ($Results_Pages > 20);
    for ($i = 1; $i <= $Results_Pages; $i++)
    {
        $nh = ($i - 1) * 10;
        if ($nh == $FORM{'nh'})
        {
            $rsltpgs .= qq~\ \ <b>$i</b>\ ~;
        }
        else
        {
            $rsltpgs .= qq~\ \ <a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$nh&user=$FORM{'user'}">$i</a>\ ~;
        }
    }
    $rsltpgs .= qq~\ \ <a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$next&user=$FORM{'user'}">[<b>Next \>\></b>]</a>~ if ($next < $total && $FORM{'nh'} < 190);
    $rsltpgs .= qq~</font>
~;

    $start = $FORM{'nh'} + 1;
    $end = $start + 9;
    $end = $total if ($end > $total);
    $output = "<font face=arial size=2><b>Showing results $start - $end of $total</b>
Results Provided By <a href=http://www.altavista.com>Altavista</a></font>$rsltpgs<ul>\n";
    my $last = $#results - 1;
    for(0..$last){
        ($result) = split(/<\/dl>/i, $results[$_]);
        $result =~ s/\[[^\]]\]//ig;
        $result =~ /<a\s+href=\"([^\"]*?)\">/ig;
        $url = $1;
        ($tmp, $title) = split(/\">/i, $result);
        ($title, $tmp) = split(/<\/a>/i, $title);
        ($ttp, $desc) = split(/<dd>/i, $tmp);
        ($desc, $tmp) = split(/
/i, $desc);
        $output .= qq~<li><a href="$url">$title</a> - $desc<p>\n~;
    }
    $output .= qq~</ul>$rsltpgs~;
    &show($start,$stop,$command,$input,$output,$special);
}

sub IOSocket{
    use IO::Socket;
    $method = "GET";
    my @f = split(/\//, $input);
    $host = "www.altavista.com";
    $path = "/sites/search/web?q=$query&pg=q&text=yes&kl=XX&stq=$FORM{'nh'}&search=Search";
    $socket = new IO::Socket::INET( PeerAddr => $host,
        PeerPort => 80,
        Proto => 'tcp',
        Type => SOCK_STREAM, ) or die &show($start,$stop,$command,$input,"<p>Error: Unable to connect to search server<p>",$special);
    print $socket "$method $path HTTP/1.0\nReferer: $host\n";
    print $socket "User-Agent: $ENV{'HTTP_USER_AGENT'}\n\n";
    @output = <$socket>;
    close ($socket);
}


sub data_parse{
    if ($ENV{"REQUEST_METHOD"} eq 'GET') {
        $buffer = $ENV{'QUERY_STRING'};
    }
    else {
        read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    }

    @pairs = split(/&/, $buffer);
    foreach $pair (@pairs) {
        ($name, $value) = split(/=/, $pair);
        if ($name eq 'q'){ $query = $value; }
        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $value =~ s/>//g;
        $value =~ s/<//g;
        $FORM{$name} = $value;
    }
    $FORM{'q'} = $FORM{'search'} if $FORM{'search'};
}




Thanks for the help.

Mike





Rivotti
User

Jul 3, 2001, 6:49 AM

Post #2 of 8 (2184 views)
Re: Help With Meta Search

Hi Mike,

I think everything looks right... did you check the error_log file? What happens when you have the problem?

Rivotti




mike
User

Jul 3, 2001, 9:11 PM

Post #3 of 8 (2174 views)
Re: Help With Meta Search

Hi,

OK, there are no 500 errors or any other server errors at all. What happens is that it just doesn't display anything. For instance, it is supposed to print how many pages it found, but instead no results come up and it says

Page 1 of


It doesn't come up with another number. It just says that and doesn't print results. It was working fine then just quit.

Mike



mike
User

Jul 10, 2001, 12:06 AM

Post #4 of 8 (2164 views)
Re: Help With Meta Search

Hello,

I was wondering if anyone found anything that could help me yet. I never got a reply. Thanks guys.

Mike



Jasmine
Administrator / Moderator

Jul 10, 2001, 10:07 AM

Post #5 of 8 (2158 views)
Re: Help With Meta Search

The program fetches the results just fine, but the data parsing routines are off. Perhaps AltaVista changed the HTML that the program was relying on for parsing. There are a lot of warnings when running under -w (which you should :). Here are the messages that may give you a clue about the parsing portion of the program:


Code
Use of uninitialized value in split at postid11456.pl line 75. 
Use of uninitialized value in concatenation (.) at postid11456.pl line 77.
Use of uninitialized value in substitution (s///) at postid11456.pl line 22.
Use of uninitialized value in modulus (%) at postid11456.pl line 30.
Use of uninitialized value in addition (+) at postid11456.pl line 32.
Use of uninitialized value in numeric lt (<) at postid11456.pl line 46.
Use of uninitialized value in numeric gt (>) at postid11456.pl line 52.
Use of uninitialized value in concatenation (.) at postid11456.pl line 53.
Use of uninitialized value in concatenation (.) at postid11456.pl line 53.
Undefined subroutine &main::show called at postid11456.pl line 69.

To confirm that the fetch works and the problem is with the parsing, just add this line to the end of your IOSocket subroutine.


Code
print @output;

You'll see the output of the sockets call. So, focus on the parts after the socket call.
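Most of those "uninitialized value" warnings trace back to a pattern match that silently failed, which leaves $1 (and then $total) undefined. Here's a rough sketch, not your actual code, of checking the match before using the capture. The helper name extract_total and both sample strings are made up for illustration; the real AltaVista page may look nothing like either of them.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the hit count out of the fetched page.
# Returns nothing (undef in scalar context) when the expected
# "N pages found." marker is missing, instead of passing undef along.
sub extract_total {
    my ($html) = @_;
    return unless $html =~ /([\d,]+) pages found\./i;
    (my $total = $1) =~ tr/0-9//cd;    # keep digits only
    return $total;
}

# Made-up samples: one in the layout the script expects, one that has changed.
my $old_page = '<b>1,234 pages found.</b><dl>...</dl>';
my $new_page = '<b>We found 1,234 results</b>';

for my $page ($old_page, $new_page) {
    my $total = extract_total($page);
    if (defined $total) {
        print "total: $total\n";
    }
    else {
        print "marker not found; the page layout probably changed\n";
    }
}
```

With a check like this the script can report that the layout changed rather than printing an empty "Page 1 of".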

Hope this helps!



abstracts
Novice

Jul 10, 2001, 10:20 AM

Post #6 of 8 (2158 views)
Re: Help With Meta Search

Hello,
Just a question: why are you not using the CGI, LWP, and HTML::Parser modules? You seem to be doing many things that these modules specifically do.
I can see you're calling almost all your subs without parameters. All your data is global, which is OK for 3-liners but not for a program like this.
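To give a rough idea, here is a minimal sketch of the fetching side with CGI, URI and LWP::UserAgent, passing values as arguments instead of through globals. The sub names build_search_url and fetch_page are made up for illustration. The host, path and parameter names are copied from the script in post #1 and may no longer be what the search engine expects. A fixed query string stands in for a real request so this runs from the command line, and fetch_page is defined but not called so nothing goes out over the network.

```perl
use strict;
use warnings;
use CGI;
use URI;
use LWP::UserAgent;

# Build the search URL from explicit arguments instead of globals.
# URI takes care of escaping the query for us.
sub build_search_url {
    my ($query, $offset) = @_;
    my $uri = URI->new('http://www.altavista.com/sites/search/web');
    $uri->query_form(
        q      => $query,
        pg     => 'q',
        text   => 'yes',
        kl     => 'XX',
        stq    => $offset,
        search => 'Search',
    );
    return $uri->as_string;
}

# Fetch a page with LWP; returns the body, or undef on any HTTP failure.
sub fetch_page {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new(timeout => 15);
    my $res = $ua->get($url);
    return $res->is_success ? $res->decoded_content : undef;
}

# CGI.pm decodes the form data, replacing the hand-written data_parse.
my $cgi    = CGI->new('q=perl+cgi&nh=10');
my $query  = $cgi->param('q');
my $offset = $cgi->param('nh') || 0;

print build_search_url($query, $offset), "\n";
```

In the real script you would call CGI->new with no argument and pass the URL to fetch_page, checking for undef before parsing.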

Also, there are many modules on CPAN for searching the web (do a search for altavista). Using these modules should make your code more concise, which makes it easier to spot errors and problems.
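If you do end up parsing result pages yourself, a tokenizing parser tends to hold up better than chains of split. Here is a small sketch using HTML::TokeParser, which ships with the HTML::Parser distribution. The sub name parse_results and the sample HTML are made up; the sample only mimics the dl/dd shape that the script in post #1 splits on, so treat the tag names as an assumption about the real page.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Made-up sample in the <dl>/<dd> shape the original script assumes.
my $html = <<'HTML';
<dl><dt><a href="http://www.cpan.org/">CPAN</a><dd>Comprehensive Perl Archive Network.</dl>
<dl><dt><a href="http://www.perl.com/">Perl.com</a><dd>Perl news and articles.</dl>
HTML

# Return a list of { url, title, desc } hashes, one per result link.
sub parse_results {
    my ($doc) = @_;
    my $p = HTML::TokeParser->new(\$doc) or die "cannot parse document";
    my @results;
    while (my $tag = $p->get_tag('a')) {
        my $url   = $tag->[1]{href} or next;    # skip anchors without href
        my $title = $p->get_trimmed_text('/a'); # text up to </a>
        my $desc  = $p->get_tag('dd') ? $p->get_trimmed_text('/dl') : '';
        push @results, { url => $url, title => $title, desc => $desc };
    }
    return @results;
}

printf "%s - %s\n  %s\n", @{$_}{qw(title desc url)} for parse_results($html);
```

When the page layout changes, only the tag names in parse_results need to change, and nothing else in the program has to know about the markup.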

Hope this helps,

Aziz,,,



mike
User

Jul 11, 2001, 3:14 PM

Post #7 of 8 (2148 views)
Re: Help With Meta Search

Hello,

OK, I am a complete moron with this stuff. I am going to need some help figuring out how to parse it all out. I think AltaVista changed their search database, because the address to it changed as well. I would really appreciate some help.

Thanks

Mike



mhx
Enthusiast

Jul 11, 2001, 10:36 PM

Post #8 of 8 (2143 views)
Re: Help With Meta Search

Hi Mike,

There's really no need for you to parse out all that stuff, simply because loads of other people have already done it for you! There are existing Perl modules on CPAN (as mentioned in previous posts) for searching dozens of different engines. Here's a page with a list of all WWW::Search modules. The WWW::Search manpage describes in detail how these packages are used. If you have some experience with Perl (which I infer from your post in the advanced forum), using this module should be no problem.

I've never worked with those packages before, but downloading and installing the modules and creating the following example took me about half an hour:

Code
#!/bin/perl -w
use strict;
use WWW::Search qw(strip_tags);
use Text::Wrap;

my $search = new WWW::Search('AltaVista');

$search->native_query( WWW::Search::escape_query('CPAN') );

while( my $res = $search->next_result ) {
    print "\n[ ", strip_tags( $res->{title} ), " ]\n\n";
    print wrap( ' ', ' ', strip_tags( $res->{description} ) ), "\n\n";
    print " $_\n" for @{$res->{urls}};
}

This will print all AltaVista search results for the query CPAN, nicely formatted and with URLs, just like this:

Code
[ CPAN ] 

Comprehensive Perl Archive Network. Welcome to CPAN! Here you will find
All Things Perl. Browsing. modules. scripts. binary distributions
("ports")...

http://www.cpan.org/

[ CPAN Module documentation ]

CPAN Module documentation. The pod files for many of the modules and
packages available under CPAN have been collected below, including the
manual...

http://theoryx5.uwinnipeg.ca/CPAN/

[ CPAN - query, download and build perl modules from CPAN sites ]

Contained in perl-5.7.1. NAME. SYNOPSIS. DESCRIPTION. Interactive Mode.
CPAN Shell. autobundle. recompile. The four CPAN Classes Author Bundle
Module...

http://theoryx5.uwinnipeg.ca/CPAN/perl/CPAN.html

I hope this will help.

-- Marcus


 
 

