Help With Meta Search



Jun 29, 2001, 7:31 PM

Post #1 of 8 (2819 views)
Help With Meta Search


I was wondering if someone could help me out. I have had this meta search engine running on my site for a while now, and all of a sudden it won't work anymore. This is the code I am using. I would appreciate some help; it has been driving me insane :)

This is what I have, without the form HTML:


sub main {
    $| = 1;    # unbuffered output
    if ($FORM{'q'} eq '') {
        # (form HTML omitted from this post)
    }
    # Join the fetched lines and pull the hit count out of "N pages found."
    $output = join(" ", @output);
    $output =~ /([^ ]*?) pages found\./i;
    $total = $1;
    $total =~ s/[^\d+]//g;
    # @results takes every remaining chunk, so $footer stays undefined
    ($rsltpgs, @results, $footer) = split(/<dl>/i, $output);

    # Build the "Results Pages" navigation bar, 10 hits per page
    $FORM{'nh'} = 0 unless($FORM{'nh'});
    $prev = $FORM{'nh'} - 10;
    $next = $FORM{'nh'} + 10;
    $rsltpgs = qq~<p><font size=-1><b>Results Pages:\ </b>~;
    $rsltpgs .= qq~<a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$prev&user=$FORM{'user'}">[<b>\<\< Prev</b>]</a>~ if ($prev >= 0);
    $Remainder = $total % 10;
    $Remainder = 10 - $Remainder;
    $Results_Pages = ($total + $Remainder) / 10;
    $Results_Pages = 20 if ($Results_Pages > 20);
    for ($i = 1; $i <= $Results_Pages; $i++) {
        $nh = ($i - 1) * 10;
        if ($nh == $FORM{'nh'}) {
            $rsltpgs .= qq~\ \ <b>$i</b>\ ~;
        }
        else {
            $rsltpgs .= qq~\ \ <a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$nh&user=$FORM{'user'}">$i</a>\ ~;
        }
    }
    $rsltpgs .= qq~\ \ <a href="$ENV{'SCRIPT_NAME'}?q=$query&nh=$next&user=$FORM{'user'}">[<b>Next \>\></b>]</a>~ if ($next < $total && $FORM{'nh'} < 190);
    $rsltpgs .= qq~</font>~;    # closing delimiter restored; any trailing tag was lost in the post

    $start = $FORM{'nh'} + 1;
    $end = $start + 9;
    $end = $total if ($end > $total);
    $output = "<font face=arial size=2><b>Showing results $start - $end of $total</b>
Results Provided By <a href=>Altavista</a></font>$rsltpgs<ul>\n";

    # Pull the URL, title and description out of each <dl> chunk
    my $last = $#results - 1;
    for (0 .. $last) {
        ($result) = split(/<\/dl>/i, $results[$_]);
        $result =~ s/\[[^\]]\]//ig;
        $result =~ /<a\s+href=\"([^\"]*?)\">/ig;
        $url = $1;
        ($tmp, $title) = split(/\">/i, $result);
        ($title, $tmp) = split(/<\/a>/i, $title);
        ($ttp, $desc) = split(/<dd>/i, $tmp);
        ($desc, $tmp) = split(/<br>/i, $desc);    # pattern was rendered as a line break in the post; <br> is a guess
        $output .= qq~<li><a href="$url">$title</a> - $desc<p>\n~;
    }
    $output .= qq~</ul>$rsltpgs~;
}

sub IOSocket {
    use IO::Socket;
    $method = "GET";
    my @f = split(/\//, $input);
    $host = "";    # host name is blank in the post
    $path = "/sites/search/web?q=$query&pg=q&text=yes&kl=XX&stq=$FORM{'nh'}&search=Search";
    $socket = new IO::Socket::INET( PeerAddr => $host,
                                    PeerPort => 80,
                                    Proto    => 'tcp',
                                    Type     => SOCK_STREAM, ) or die &show($start,$stop,$command,$input,"<p>Error: Unable to connect to search server<p>",$special);
    # Send a bare HTTP/1.0 request and slurp the whole response (headers included)
    print $socket "$method $path HTTP/1.0\nReferer: $host\n";
    print $socket "User-Agent: $ENV{'HTTP_USER_AGENT'}\n\n";
    @output = <$socket>;
    close ($socket);
}

sub data_parse {
    # Read the raw form data from the query string (GET) or from STDIN (POST)
    if ($ENV{"REQUEST_METHOD"} eq 'GET') {
        $buffer = $ENV{'QUERY_STRING'};
    }
    else {
        read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    }

    @pairs = split(/&/, $buffer);
    foreach $pair (@pairs) {
        ($name, $value) = split(/=/, $pair);
        if ($name eq 'q'){ $query = $value; }    # keep the still-encoded query for building URLs
        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $value =~ s/>//g;
        $value =~ s/<//g;
        $FORM{$name} = $value;
    }
    $FORM{'q'} = $FORM{'search'} if $FORM{'search'};
}

Thanks for the help.



Jul 3, 2001, 6:49 AM

Post #2 of 8 (2801 views)
Re: Help With Meta Search

Hi Mike,

I think everything is right... did you check the error_log file? What happens when you have the problem?



Jul 3, 2001, 9:11 PM

Post #3 of 8 (2791 views)
Re: Help With Meta Search


OK, nothing happens as far as 500 errors go, and there are no server errors at all. It just doesn't display anything: it is supposed to print how many pages it found, but instead no results come up and it says

Page 1 of

It doesn't come up with another number. It just says that and doesn't print results. It was working fine then just quit.



Jul 10, 2001, 12:06 AM

Post #4 of 8 (2781 views)
Re: Help With Meta Search


I was wondering if anyone found anything that could help me yet. I never got a reply. Thanks guys.


Administrator / Moderator

Jul 10, 2001, 10:07 AM

Post #5 of 8 (2775 views)
Re: Help With Meta Search

The program fetches the results just fine, but the data parsing routines are off. Perhaps AltaVista changed the HTML that the program was relying on for parsing. There are a lot of warnings when running under -w (which you should :). Here are the messages that may give you a clue about the parsing portion of the program:

Use of uninitialized value in split at line 75. 
Use of uninitialized value in concatenation (.) at line 77.
Use of uninitialized value in substitution (s///) at line 22.
Use of uninitialized value in modulus (%) at line 30.
Use of uninitialized value in addition (+) at line 32.
Use of uninitialized value in numeric lt (<) at line 46.
Use of uninitialized value in numeric gt (>) at line 52.
Use of uninitialized value in concatenation (.) at line 53.
Use of uninitialized value in concatenation (.) at line 53.
Undefined subroutine &main::show called at line 69.
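
Most of these warnings are consistent with a single failed match: if the "pages found." text no longer appears in the fetched page, $1 is left undefined and every later use of $total warns. Below is a minimal sketch of guarding that match. The helper name extract_total and the sample strings are made up for illustration and are not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the hit count, or undef when the
# "N pages found." marker is missing from the fetched HTML.
sub extract_total {
    my ($html) = @_;
    if ($html =~ /([\d,]+) pages found\./i) {
        (my $total = $1) =~ s/\D//g;    # drop thousands separators
        return $total;
    }
    return;    # marker not found; the page layout has probably changed
}

# Made-up sample inputs: one in the old format, one in an unknown new format.
for my $html ('About 1,234 pages found.', '<html>new layout</html>') {
    my $total = extract_total($html);
    print defined $total
        ? "total = $total\n"
        : "could not find the result count; check the page format\n";
}
```

Checking the result of the match before using $1 turns a silent blank page into an explicit message that the remote format changed.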

To prove that the fetch works and the problem is in the parsing, just add this line to the end of your IOSocket subroutine:

print @output;

You'll see the output of the sockets call. So, focus on the parts after the socket call.
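
Because the program talks to the server over a raw socket, that output also contains the HTTP status line and headers, not just the HTML. Looking at the status separately can show whether the server answered with a redirect or an error instead of a results page. Here is a rough sketch, assuming a hypothetical helper named split_response and a made-up sample response:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: take the lines read from the socket and return
# the numeric HTTP status plus the body that follows the blank line.
sub split_response {
    my (@lines) = @_;
    my $raw = join '', @lines;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;
    my ($status) = $head =~ m{\AHTTP/\d\.\d\s+(\d{3})};
    return ($status, $body);
}

# Made-up sample standing in for @output from the socket read.
my @output = (
    "HTTP/1.0 200 OK\r\n",
    "Content-Type: text/html\r\n",
    "\r\n",
    "<html><body>12 pages found.</body></html>\n",
);
my ($status, $body) = split_response(@output);
print "status: $status\n";
print "body:   $body";
```

A status other than 200 would suggest looking at the request (host or path) before touching the parsing code.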

Hope this helps!


Jul 10, 2001, 10:20 AM

Post #6 of 8 (2775 views)
Re: Help With Meta Search

Just a question: why are you not using the CGI, LWP, and HTML::Parser modules? You seem to be doing many things that these modules specifically do.
I can see you're calling almost all your subs without parameters. All your data is global, which is OK for 3-liners but not for a program like this.

Also, there are many modules on CPAN for searching the web (do a search for AltaVista). Using these modules should make your code more concise, which makes it easier to spot errors and problems.
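
To illustrate the point about globals, here is a sketch of a form parser that takes its input as a parameter and returns a hash reference instead of filling global variables. The name parse_query is hypothetical; in a real script the CGI module would do this job, and this hand-rolled version only shows the parameter-passing style.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode a query string and return the fields
# as a hash reference, touching no global variables.
sub parse_query {
    my ($buffer) = @_;
    my %form;
    for my $pair (split /&/, $buffer) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                 # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;     # decode %XX escapes
        }
        $value =~ s/[<>]//g;    # same tag stripping as the posted data_parse
        $form{$name} = $value;
    }
    return \%form;
}

my $form = parse_query('q=perl+cgi&nh=10');
print "query: $form->{q}, offset: $form->{nh}\n";
```

With "use strict" in place and every variable declared with "my", a typo in a variable name becomes a compile-time error rather than a silent empty value.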

Hope this helps,



Jul 11, 2001, 3:14 PM

Post #7 of 8 (2765 views)
Re: Help With Meta Search


OK, I am a complete moron with this stuff. I am going to need some help figuring out how to parse it all out. I think AltaVista changed their search database, because the address to it changed as well. I would really appreciate some help.




Jul 11, 2001, 10:36 PM

Post #8 of 8 (2760 views)
Re: Help With Meta Search

Hi Mike,

There's really no need for you to parse out all that stuff, simply because loads of other people have already done it for you! There are existing Perl modules on CPAN (as mentioned in previous posts) for searching dozens of different engines. Here's a page with a list of all WWW::Search modules. The WWW::Search manpage describes in detail how these packages are used. If you have some experience with Perl (which I infer from your posting in the advanced forum), using this module should be no problem.

I've never worked with those packages before, but downloading and installing the modules and creating the following example took me about half an hour:

#!/bin/perl -w
use strict;
use WWW::Search qw(strip_tags);
use Text::Wrap;

# Create a search object for the AltaVista backend
my $search = new WWW::Search('AltaVista');

$search->native_query( WWW::Search::escape_query('CPAN') );

# Print title, wrapped description and URLs for each result
while( my $res = $search->next_result ) {
    print "\n[ ", strip_tags( $res->{title} ), " ]\n\n";
    print wrap( ' ', ' ', strip_tags( $res->{description} ) ), "\n\n";
    print " $_\n" for @{$res->{urls}};
}

This will print all AltaVista search results for query CPAN nicely formatted and with URLs, just like this:

[ CPAN ] 

Comprehensive Perl Archive Network. Welcome to CPAN! Here you will find
All Things Perl. Browsing. modules. scripts. binary distributions

[ CPAN Module documentation ]

CPAN Module documentation. The pod files for many of the modules and
packages available under CPAN have been collected below, including the

[ CPAN - query, download and build perl modules from CPAN sites ]

Contained in perl-5.7.1. NAME. SYNOPSIS. DESCRIPTION. Interactive Mode.
CPAN Shell. autobundle. recompile. The four CPAN Classes Author Bundle

I hope this will help.

-- Marcus

