comparing two arrays which contain strings

 



msds
User

Aug 6, 2002, 12:37 AM

Post #1 of 6 (1597 views)
comparing two arrays which contain strings

Hi! I'm new to Perl, and I'm going nuts with this script! Someone please help.

This is what I'm trying to do (I've pasted the code I tried below):

1. Read a text file, split the text into tokens using whitespace as the delimiter, and store the tokens in an array.

2. Do the same for another file (this one has a single, long column of words): split it into tokens and store the tokens in a hash.

3. Compare each token from the array with each token in the hash, and display those words which did not find a match in the hash.



#Read text from flat file database, and
#split into tokens,store in a hash

open(FH, "<spip.txt");
$lastt=0;
while($read2=<FH>)
{
    @wordlist=split " ", $read2;

    foreach $word(@wordlist){
        chomp($word);
        %lexicon=($last=>$word);

        #The variable $last is auto. initialised to 0,and is used
        #for key values

        #Print contents of lexicon

        #print $lexicon{$last};
        #print "\n";

        $lastt++;
    }
}

#print $lastt;

#Match each token(whole word) from input file with all tokens in database

print "The following words were not found in the lexicon:";
print"\n";

open(ER, "<spip.txt");

$lasttoken=0;

while($read=<ER>)
{
    @tokens=split " ", $read;

    foreach $token(@tokens){
        chomp($token);
        $lasttoken++;
    }

    for(my $j=0;$j<=$lasttoken;$j++)
    {
        #for(my $i=0;$i<=$lastt;$i++)
        #{
        if ($tokens[$j] ne /$lexicon{$i}/)
        {
            print $tokens[$j];
            print "\n";
        }
        #}
    }
    #print $j;
}

Will storing the tokens from the input file in a hash instead of
an array speed up the search? (Then I would need to compare values in two hashes.)
After this works, I also need to do a binary search on the flat file database (the values in the hash).

e.g. The input text file can contain:

blue
house

The database file can have:
blue
green
red
yellow
......
.....

So the program should return 'house' as not having found a match in the database ('blue' exists in the database).

I hope you can suggest some help. My code is returning
both 'house' and 'blue' as not having found a match, when it should be returning only 'house'.

Thanks,
msds


fashimpaur
User

Aug 6, 2002, 5:12 AM

Post #2 of 6 (1593 views)
Re: [msds] comparing two arrays which contain strings

msds,

First of all, I have not gone through your complete code. I have only gone
so far as to look for noticeable errors and point them out. You can learn
more if I do not make the script work for you.

First, your chomp method was performed in a separate loop.
Try using it as you read the data to help performance.

Use it like this:


Code
while (my $read2 = <FH>) {
    chomp $read2;
    # ... do something here with the data ...
}


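Next, the split function takes a pattern and a scalar string to split.

Your code passes the string " " as the pattern. Perl treats that as a special case: it splits on any run of whitespace and skips leading whitespace. To split on a single literal space instead, write the pattern as a regex.

Try using the function like this: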


Code
 @wordlist = split(/ /, $read2);
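
A note on the two pattern forms: in Perl, a pattern given as the single-space string ' ' is a special case of split, while the regex / / matches exactly one space. The sketch below compares them on a made-up sample line (the variable names and the sample text are for illustration only, not taken from the original script).

```perl
use strict;
use warnings;

# A sample line with a leading space and a run of two spaces (made up for illustration).
my $line = " blue  house\n";

# The special-case pattern ' ' skips leading whitespace and splits on any
# run of whitespace, including the trailing newline.
my @awk_style = split ' ', $line;

# The regex / / splits on every single space, so empty fields appear.
chomp(my $copy = $line);
my @single_space = split / /, $copy;

print scalar(@awk_style),    " fields: [", join('][', @awk_style),    "]\n";
print scalar(@single_space), " fields: [", join('][', @single_space), "]\n";
# prints:
# 2 fields: [blue][house]
# 4 fields: [][blue][][house]
```

If the input may contain tabs or repeated spaces, the ' ' form (or /\s+/ after trimming) is usually the safer choice for tokenizing words.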

Also, when you are filling the @wordlist, you keep overwriting the contents of
the array, not adding to it. Use the push function to add the new elements to
the array. Like this:


Code
   
while (my $read2 = <FH>) {
    chomp $read2;
    push @wordlist, (split(/ /, $read2));
}

Now, all words read from the filehandle will be in @wordlist.

You also have a problem with your variable names. Try using strict. It forces
you to declare variables before they are used and prevents namespace pollution.
This would have kept you from trying to use the variable $last instead of $lastt
when you are loading the hash %lexicon.
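
To illustrate the point about strict, here is a minimal sketch of the $last / $lastt mix-up being caught at compile time. The snippet is compiled from a string only so that this demonstration can print the error instead of refusing to start; in a real script you would simply put use strict at the top. The hash contents are a made-up example.

```perl
use strict;
use warnings;

# $lastt is declared, $last is the typo. Under strict, the undeclared
# variable is a compile-time error.
my $snippet = <<'END_SNIPPET';
use strict;
my $lastt = 0;
my %lexicon = ($last => 'blue');
1;
END_SNIPPET

# eval returns undef and sets $@ when the snippet fails to compile.
my $ok = eval $snippet;
print $ok ? "compiled\n" : "strict caught it: $@";
```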

Now, as constructive criticism, you do not need to read the same file again.
You did not change the contents, so the words in @wordlist will be the same
as the words in @tokens.

Now that you have the words in @wordlist, you can search @wordlist like this:


Code
   
my $word2find = "house";
my $wordfound = 0;

foreach my $word (@wordlist) {
    next if $word ne $word2find;
    $wordfound = 1;
    last;    # stop at the first match
}

if ($wordfound) {
    # ... do something
}
else {
    # ... do something else
}

As far as which is faster, searching an array or searching a hash, I can only say
that it depends on how the hash is built. If the words are stored as hash values,
as in your code, you still have to scan every value, just as you would scan the
array. If the words are stored as hash keys, a single lookup tells you whether a
word is present.
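
One way to approach the original problem is to store the database words as hash keys and test each input word with exists. The sketch below is a minimal illustration under stated assumptions: it uses the sample words from the question in place of the two files, and the names @database_words, @input_words and @missing are made up for the example.

```perl
use strict;
use warnings;

# Sample data from the question; in the real script these lists would be
# read from the database file and the input file.
my @database_words = qw(blue green red yellow);
my @input_words    = qw(blue house);

# Store each database word as a hash KEY so membership is a single lookup.
my %lexicon = map { $_ => 1 } @database_words;

# Keep only the input words that have no matching key.
my @missing = grep { !exists $lexicon{$_} } @input_words;

print "The following words were not found in the lexicon:\n";
print "$_\n" for @missing;    # prints: house
```

With this layout the input words can stay in a plain array; only the side being searched needs to be a hash.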

I hope this helps. Please post back if you have further questions or just to
say that the help you got at PerlGuru solved the problem. Also, if this did
help, please spread the word about what a great site this is.

Good Luck,
Dennis

$a="c323745335d3221214b364d545".
"a362532582521254c3640504c3729".
"2f493759214b3635554c3040606a0",
print unpack"u*",pack "h*",$a,"\n\n";


msds
User

Aug 6, 2002, 10:09 PM

Post #3 of 6 (1588 views)
Re: [fashimpaur] comparing two arrays which contain strings

Thanx a lot for your time, Dennis. The tips were helpful for
a newbie like me, and I'll definitely spread the word around.
I goofed up a bit though: I'm actually trying to read two files, not one (typing mistake! I named both files
"spip.txt" in the code I posted).

open(FH, "<dicword.txt");

and

open(ER, "<spip.txt");

i.e. I need to check which words are present in the input file "spip.txt" but not in the database "dicword.txt", and return all words that did not find a match in the database.

So my search code will be different from what you've suggested. Anyway, I'm trying the rest of your suggestions, and hope you will post me something on this too.
Thanx,
msds


fashimpaur
User

Aug 7, 2002, 7:14 AM

Post #4 of 6 (1585 views)
Re: [msds] comparing two arrays which contain strings

msds,

Let me ask a question before I post my suggestion.

Can you describe in words the purpose of the two files used?

Thanks,
Dennis



davorg
Thaumaturge

Aug 7, 2002, 8:36 AM

Post #5 of 6 (1582 views)
Re: [msds] comparing two arrays which contain strings

Is there any good reason why we're discussing this in two places - both here and in the beginners forum?

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


msds
User

Aug 7, 2002, 9:51 PM

Post #6 of 6 (1574 views)
Re: [fashimpaur] comparing two arrays which contain strings

Hi Dennis,
thanx for the reply.

Actually, I have to find words which are present in one file, but not in the other.

However, I've been discussing this problem in another forum too (guess you know), and have more or less got the hang of it. I did learn from your tips, though. My apologies for the mess.

But I have a different question for you: I need a program which reads each word from a text file, and stores some properties of that word in a record, like a "C" structure.
Is it possible to do that in Perl? How do I go about it?

Thanx,
msds
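
On the record question: a common Perl idiom for a C-style structure is a hash reference per record, collected in a hash keyed by word. The sketch below is only an illustration under assumptions: the sample text and the two properties (length and count) are hypothetical, chosen because the post does not say which properties are needed.

```perl
use strict;
use warnings;

# Made-up sample text; in practice this would be read from the text file.
my $text = "blue house blue";

# One record (an anonymous hash) per distinct word, keyed by the word.
my %record_for;
for my $word (split ' ', $text) {
    # Create the record the first time the word is seen.
    $record_for{$word} ||= { length => length($word), count => 0 };
    $record_for{$word}{count}++;
}

# Fields are read back with the arrow syntax, much like struct members.
for my $word (sort keys %record_for) {
    my $rec = $record_for{$word};
    print "$word: length=$rec->{length} count=$rec->{count}\n";
}
# prints:
# blue: length=4 count=2
# house: length=5 count=1
```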

 
 

