Oct 4, 2002, 8:14 PM
Post #1 of 1
Looking for search engine script
I am looking for a search engine script that runs on a Unix server. We have no illusions about becoming another Yahoo or Google, so we aren't after a top-of-the-line custom product ... in fact, ideally we'd find a reasonably priced "off-the-shelf" one.
Here is what I'd like it to do:
 Almost all the pages it would index are located outside of our own directory, so the script must be capable of crawling/indexing specifically defined external URLs;
 I'd prefer the option to add all new URLs myself, so we could review a website's content before adding it to the index (i.e., inclusion is not automatic);
 It would be useful if we could set how deep the crawl would be on these external sites (to keep the size of the database manageable);
 Going along with the previous point, the spider would *only* crawl pages located on the submitted domain. In other words, it would not continue on to spider other sites linked from the submitted URL;
 Since I cannot write Perl, I'd like to be able to set the "look" of the results pages via HTML, and ideally we could adjust other settings through a built-in control panel (or, at the very least, clearly explained admin pages);
 While it is difficult to predict how many URLs would participate (and thus how many pages would ultimately be indexed), it should initially be able to handle at least a couple thousand URLs (perhaps 20,000 to 30,000 pages?). Ideally, its capacity could be expanded later, as it became necessary, by purchasing additional scripting "power";
 We'd ideally want to set an automatic re-indexing schedule to update content and remove dead links on a regular basis (though doing this manually would be OK);
 It would rank the results of a search query by relevance.
 Ideally, we could set which parts of a page to index (meta tags, body text, alt tags, etc.);
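To make the crawl-depth and same-domain points above concrete, here is a rough Python sketch of the behavior I have in mind. The URLs, the toy link graph, and the get_links helper are all made up for illustration; a real script would fetch and parse live pages instead.

```python
from collections import deque
from urllib.parse import urlparse

def crawl(start_url, get_links, max_depth):
    """Breadth-first crawl that stays on the start URL's domain and
    stops at max_depth. get_links(url) returns the links found on a
    page (here a stand-in; a real crawler would fetch and parse HTML)."""
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    indexed = []
    while queue:
        url, depth = queue.popleft()
        indexed.append(url)
        if depth == max_depth:
            continue  # depth limit: don't follow links any further
        for link in get_links(url):
            # Same-domain rule: ignore links that leave the submitted domain.
            if urlparse(link).netloc != domain:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return indexed

# Hypothetical link graph standing in for fetched pages.
pages = {
    "http://example.com/": ["http://example.com/a", "http://other.com/x"],
    "http://example.com/a": ["http://example.com/b"],
    "http://example.com/b": [],
}
print(crawl("http://example.com/", lambda u: pages.get(u, []), max_depth=1))
# ['http://example.com/', 'http://example.com/a']
```

With a depth limit of 1, the crawl indexes the submitted page and the pages it links to on the same domain, but neither the off-site link nor anything deeper, which is exactly how I'd want the database kept manageable.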
--- These next features would be nice to have, but not essential ---
 Some type of filtering feature (to block spamming, for example);
 The script could import URLs from a delimited file, and export them in the same manner.
--------- Additional notes ---------
* It is not important for this script to categorize its results from these external sites;
* It is not important that this search engine be able to search the web at large when it cannot find results in our own database of indexed pages;
* People from external sites will not be accessing any kind of account, so it is not necessary to have password protected access.
I would appreciate any guidance on where we can find a script with these features.