word frequency

 



viddy
Novice

Jun 24, 2008, 10:06 AM

Post #1 of 7 (1577 views)
word frequency

I am trying to write a program that greps through a text file for a user-specified word and counts how many times it appears in the file. However, I cannot get the grep command to work properly (at least I think so; it might be a logic error too). I was hoping someone could point me in the right direction and tell me whether I am approaching this program the wrong way. Thanks!

Here's my script:


Code
#!/usr/bin/perl
use strict;
use warnings;

print "
===========================================================================
This program searches through a file and counts how many times a word is in it\n
===========================================================================
";
print "Which file would you like to use?\n";
my $file = <>;
chomp($file);
print "Which word do you want to search for?\n";
my $word = <>;
chomp($word);
open (FILE, $file) or die $!;
my @file = <FILE>;
while (<FILE>)
{
    my @grep = grep($word,@file);
    print my $grep;
}

print "This is your file: $file\n";
print "This is your word: $word\n";



PGScooter
stranger

Jun 24, 2008, 10:27 AM

Post #2 of 7 (1572 views)
Re: [viddy] word frequency

An easier way would be to use a regex directly.

Instead of reading the file into an array (of lines), read everything in as one string.

Then do something like:

Code
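# "() =" forces list context, so $count gets the number of matches
# (in scalar context the match would only report true or false)
my $count = () = $filestring =~ /\Q$word\E/g;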


I have not tested this code, but it or at least something like it should do the trick!
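
For what it's worth, here is a minimal sketch of that idea. The sub name count_in_string and the sample text are made up for illustration. One catch to be aware of: in scalar context `=~ /.../g` only reports whether there was a match, so the sketch forces list context with `() =` to get the number of matches.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often $word occurs in $text.
sub count_in_string {
    my ($text, $word) = @_;
    # "() =" forces list context, so the match returns every hit;
    # assigning that list to a scalar yields the number of hits.
    my $count = () = $text =~ /\Q$word\E/g;
    return $count;
}

# Sample data made up for this sketch. To use a real file, slurp it first:
#   my $filestring = do { local $/; open my $fh, '<', $file or die $!; <$fh> };
my $filestring = "the cat sat on the mat\nthe end\n";
print count_in_string($filestring, 'the'), "\n";
```

Note that this counts substrings, so searching for "the" would also count the "the" inside "other".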

Good luck, and if you have trouble, post back; when I have more time I'll help you figure it out.
The more you teach me, the more I learn. The more I learn, the more I teach.


meloyelo
User

Jun 24, 2008, 10:32 AM

Post #3 of 7 (1571 views)
Re: [viddy] word frequency

First problem:

Code
open (FILE, $file) or die $!;  
my @file = <FILE>;
while (<FILE>) { ... }

The while loop will be executed exactly zero times, because the assignment my @file = <FILE> has already read every line up to end-of-file. Omit that assignment.
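
If you want to see this for yourself, here is a small sketch. It uses an in-memory filehandle and made-up sample text (both my own assumptions) so that it runs without a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text standing in for the file; an in-memory filehandle
# keeps the sketch self-contained.
my $text = "one\ntwo\nthree\n";
open(my $fh, '<', \$text) or die $!;

my @lines = <$fh>;       # list context: reads every line up to end-of-file
my $loops = 0;
while (<$fh>) {          # nothing left to read, so the body never runs
    $loops++;
}
close($fh);

print "lines read by the assignment: ", scalar(@lines), "\n";
print "times the while loop ran: $loops\n";
```

The assignment gets all three lines and the loop counter stays at zero.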

Secondly, do you want to count the number of occurrences of the word or do you want to print the lines on which the word is found? The following code will count occurrences:

Code
... 
while (<FILE>) {
my $n = grep(qr/\Q$word\E/, $_);
$count += $n; # $n = number of matches on this line
}
print "Total matches: $count\n";



viddy
Novice

Jun 24, 2008, 10:57 AM

Post #4 of 7 (1569 views)
Re: [meloyelo] word frequency

Thanks for the replies. I omitted the broken loop and tried meloyelo's fix, but it just prints the total number of lines in the file. I imagine the grep statement is slightly off, but I don't understand (qr/\Q$word\E/, $_) well enough to attempt to fix it. Did I do something wrong, or is it set up so that it counts all the lines in the file? Here is my updated script:

Code
#!/usr/bin/perl
use strict;
use warnings;

print "
===========================================================================
This program searches through a file and counts how many times a word is in it\n
===========================================================================
";
print "Which file would you like to use?\n";
my $file = <>;
chomp($file);
print "Which word do you want to search for?\n";
my $word = <>;
chomp($word);
open (FILE, $file) or die $!;
while (<FILE>)
{
    my $n = grep(qr/\Q$word\E/, $_);
    $count += $n;
}

print "This is your file: $file\n";
print "This is your word: $word\n";
print "Total matches: $count\n";


Also, I had to take off strict and warnings because it was complaining about the global symbol $count, since it is used in a loop and then again outside of it. How can I fix this? Sorry, I'm quite new to programming. Thanks a lot!


meloyelo
User

Jun 24, 2008, 11:05 AM

Post #5 of 7 (1567 views)
Re: [viddy] word frequency

Sorry, I got confused by the use of grep. Try this:

Code
... 
my @list = m/\Q$word\E/g;
my $n = scalar(@list);
$count += $n;
...
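
To spell out the difference, here is a small self-contained sketch (the sample lines and variable names are made up). As far as I can tell, grep(qr/.../, $_) never matches anything against the line; it only tests whether the qr object itself is true, which it always is, so it adds 1 for every line. m/.../g in list context returns every match on the line, and scalar(@list) is how many there were. \Q...\E just escapes any regex metacharacters in $word so that it is matched literally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word  = 'the';
# Sample lines made up for this sketch; the second one has no match.
my @lines = ("the cat, the dog and the bird\n", "no match here\n");

my ($grep_total, $match_total) = (0, 0);
for (@lines) {
    # The qr// object is never matched against $_ here; grep only tests
    # it for truth, and it is always true, so this adds 1 per line.
    $grep_total += grep(qr/\Q$word\E/, $_);

    # List context plus /g: @list holds one entry per match on this line.
    my @list = m/\Q$word\E/g;
    $match_total += scalar(@list);
}

print "grep version:  $grep_total\n";    # number of lines
print "match version: $match_total\n";   # number of occurrences
```

With these two lines the grep version reports 2 (the line count) and the match version reports 3.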




viddy
Novice

Jun 24, 2008, 11:20 AM

Post #6 of 7 (1566 views)
Re: [meloyelo] word frequency

That works like a charm. Thank you very much! Could you briefly explain what these two lines do?
my @list = m/\Q$word\E/g;
my $n = scalar(@list);

I would greatly appreciate it! Thanks again


KevinR
Veteran


Jun 24, 2008, 11:56 AM

Post #7 of 7 (1562 views)
Re: [viddy] word frequency



The program posted above will not compile. $count was never declared with "my".
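
A minimal sketch of the fix, using an in-memory filehandle and made-up sample text so that it runs on its own: declare $count with "my" before the loop, so it is still in scope when it is printed afterwards.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word = 'perl';
# Stand-in for the file contents (made up for this sketch).
my $text = "perl is fun\nlearning perl with perl\n";
open(my $fh, '<', \$text) or die $!;

my $count = 0;               # declared outside the loop, visible after it
while (my $line = <$fh>) {
    my @list = $line =~ /\Q$word\E/g;   # every match on this line
    $count += scalar(@list);
}
close($fh);

print "Total matches: $count\n";
```

With strict and warnings back on, this compiles cleanly because every variable is declared before it is used.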

Are you trying to count whole words or substrings? A substring would be "air" in "airplane". If you are counting words then a regexp is probably the least efficient way to count the frequency. Regexps are for patterns, not things like words, which are strings. It would be better coded using string operators.
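
As a rough sketch of what that could look like (the sub names and the sample text are my own, not from the posts above): index can count substrings without a regexp, and for whole words you can split the text on whitespace and compare each piece with eq. Punctuation attached to a word is not stripped here, so treat it as a starting point only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count non-overlapping occurrences of $needle as a substring of $text.
sub count_substring {
    my ($text, $needle) = @_;
    return 0 if $needle eq '';          # avoid an endless loop
    my ($count, $pos) = (0, 0);
    while (($pos = index($text, $needle, $pos)) != -1) {
        $count++;
        $pos += length $needle;         # continue searching after this hit
    }
    return $count;
}

# Count whitespace-separated words that are exactly equal to $word.
sub count_word {
    my ($text, $word) = @_;
    return scalar grep { $_ eq $word } split ' ', $text;
}

my $text = "air travel by airplane needs air\n";
print "substring: ",  count_substring($text, 'air'), "\n";
print "whole word: ", count_word($text, 'air'), "\n";
```

Here the substring count includes the "air" in "airplane", while the whole-word count does not.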