searching txt files

 



xargon
Deleted

Mar 22, 2001, 2:11 AM

Post #1 of 4 (621 views)
searching txt files

I've written a search script that looks for terms in txt files.
Each line of a txt file looks like this:
Title of Screensaver|Download Url|Description|......(and a few more properties)...

The script reads the file line by line and splits each line into variables such as $title, $download_url, etc.
It then has to search for the term entered in a small search form, but only in
$title and $description.
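
For reference, the parsing and plain (non-exact) search described above might look roughly like this. It is only a minimal sketch, not the poster's actual script: the sample record, the example.com URLs, the lowercase variable names and the helper matches_term are all made up for illustration, and the field order is assumed to follow the push() shown later in the post.

```perl
use strict;
use warnings;

# Hypothetical sample record; the real files have a few more properties.
my $line = join '|',
    'Fast Cars',
    'http://example.com/cars.zip',
    'A screensaver for people who care about cars.',
    'http://example.com/cars.jpg',
    '1.2 MB',
    'Someone',
    'http://example.com/';

# Split on a literal pipe; the -1 limit keeps trailing empty fields.
my ($title, $download_url, $desc, $screen_url, $file_size, $author, $author_url)
    = split /\|/, $line, -1;

# Case-insensitive substring search in the given fields.
# \Q...\E treats the user's term as literal text, not as a regex.
sub matches_term {
    my ($term, @fields) = @_;
    return (grep { $_ =~ /\Q$term\E/i } @fields) ? 1 : 0;
}

print "match\n" if matches_term('car', $title, $desc);
```

Because this is a substring search, "car" also matches "care" and "cars", which is the behaviour described next.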

Everything works fine. If you enter "car" (without the quotes) in the form,
it returns results like "car", " car" and "care".
No problem so far. My plan was to add a checkbox to the search form labelled "Exact search".
If it is checked, the script should return only the EXACT word.
To do that, I added a space before and after the search term: "car" becomes " car ".
That worked fine; it no longer returned results like "cars".
Now I also want the script to return results like " car.", since the term can also appear at
the end of a line/sentence in the $description variable.

I did that like this:

------
if ($FORM{'boolean'} eq "exact") {
    $term  = " $termtemp ";   # add spaces to match the exact word in the middle of a sentence
    $term2 = " $termtemp\.";  # add a space before and a dot after the word for the end of a sentence
}

# Search for the term
if ($DESC =~ m/$term/ig || $TITLE =~ m/$term/ig || $DESC =~ m/$term2/ig) {
    push(@results_tmp, "$TITLE|$DOWNLOAD_URL|$DESC|$SCREEN_URL|$FILE_SIZE|$AUTHOR|$AUTHOR_URL");
}
------

Unfortunately, this doesn't work, and I can't figure out why. It returns results like
"care" again, which it didn't do before I added the last option (word at the end of a sentence).

Any ideas?
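
A likely cause of the symptom described above: inside a double-quoted Perl string, "\." is just ".", so the backslash never reaches the regex and the trailing dot matches any character. The sketch below tries to show this. The sample description and the variable names $term2_as_posted and $term2_escaped are made up for illustration; the rest of the script may of course contain other issues.

```perl
use strict;
use warnings;

my $termtemp = 'car';

# Same assignment as in the post. Warnings are switched off only for this
# expression, because Perl reports "\." as an unrecognized escape.
my $term2_as_posted = do { no warnings; " $termtemp\." };

# Escaped variant: quotemeta protects the term, and the single-quoted '\.'
# keeps its backslash, so the regex sees a literal period.
my $term2_escaped = ' ' . quotemeta($termtemp) . '\.';

my $desc = 'For people who care a lot';

print "pattern as posted: '$term2_as_posted'\n";
print "as posted: ", ($desc =~ /$term2_as_posted/i ? 'match' : 'no match'), "\n";
print "escaped:   ", ($desc =~ /$term2_escaped/i   ? 'match' : 'no match'), "\n";
```

With the pattern as posted, " car." matches " care" because the dot stands for the "e". With the escaped pattern it only matches a real period.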



Jean
User


Mar 25, 2001, 11:07 AM

Post #2 of 4 (604 views)
Re: searching txt files

Instead of using $term and $term2, try
m/$term(\s|\.)/ig
i.e. the term must be followed by one of these characters: any whitespace (space, tab) or a period.
You can extend the list by adding more characters, e.g. a comma: (\s|\.|,)
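
A minimal sketch of the alternation idea above, under a few assumptions: $term is taken to be the raw term without added spaces, the helper name followed_by_delimiter is hypothetical, and the leading (?:^|\s) and the trailing end-of-string case are additions that are not part of the suggested pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $term appears in $text preceded by the start
# of the string or whitespace, and followed by whitespace, a period, a comma,
# or the end of the string.
sub followed_by_delimiter {
    my ($term, $text) = @_;
    # \Q...\E keeps characters in the user's term from acting as regex syntax.
    return $text =~ /(?:^|\s)\Q$term\E(?:\s|\.|,|$)/i ? 1 : 0;
}

for my $text ('a fast car.', 'car, bus, train', 'my car', 'we care', 'two cars') {
    printf "%-18s %s\n", $text,
        followed_by_delimiter('car', $text) ? 'match' : 'no match';
}
```

Every new punctuation mark (question mark, closing bracket, quote) has to be added to the list by hand, which is the main drawback of this approach.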

Jean Spector
QA Engineer @ Extent Technologies, Ltd.
mage@lycosmail.com


freddo
User

Mar 25, 2001, 11:49 AM

Post #3 of 4 (603 views)
Re: searching txt files

Hello

You may also want to use \b (for word boundaries). This is from "perldoc perlre":


"A word boundary (\b) is a spot between two characters that
has a \w on one side of it and a \W on the other side of it
(in either order), counting the imaginary characters off the
beginning and end of the string as matching a \W. (Within
character classes \b represents backspace rather than a word
boundary, just as it normally does in any double-quoted
string.) The \A and \Z are just like ``^'' and ``$'', except
that they won't match multiple times when the /m modifier is
used, while ``^'' and ``$'' will match at every internal
line boundary. To match the actual end of the string and not
ignore an optional trailing newline, use \z."


Here's my try from the perl debugger:


Code
$ perl -de 1 
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.

main::(-e:1): 1
DB<1> $_ = "hello, this is a test! quite useful for testing, isnt it\n";

DB<2> print $1 if /\b(hello)\b/; # match
hello
DB<3> print $1 if /\b(test)\b/; # match
test
DB<4> print $1 if /\b(ful)\b/; # dont match

DB<5> print $1 if /\b(it)\b/; # match
it
DB<6> q # bye bye

I hope this helps...
see you
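
Applied to the exact-search checkbox from the first post, the \b approach could look something like the sketch below. The helper name exact_match and the sample descriptions are made up for illustration. It also assumes the search term starts and ends with a word character; for a term such as "c++" the trailing \b would not behave as expected.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $term occurs in $text as a whole word.
sub exact_match {
    my ($term, $text) = @_;
    # \b matches between a \w and a \W character (or at either end of the
    # string); \Q...\E treats the user's term as literal text.
    return $text =~ /\b\Q$term\E\b/i ? 1 : 0;
}

for my $desc ('A fast car.', 'Car (red)', 'We care', 'Two cars') {
    printf "%-12s %s\n", $desc, exact_match('car', $desc) ? 'match' : 'no match';
}
```

Unlike the space-padding approach, this needs no separate patterns for a word at the start of the text, at the end, or next to punctuation.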



xargon
Deleted

Mar 25, 2001, 1:58 PM

Post #4 of 4 (600 views)
Re: searching txt files

Thanks Jean and freddo!
I tested both ways, and they worked :)



 
 

