How can I print out a word-frequency or line-frequ



Jasmine
Administrator

Mar 15, 2001, 6:05 AM



How can I print out a word-frequency or line-frequency summary?

To do this, you have to parse out each word in the input stream. We'll pretend that by word you mean a chunk of alphabetics, hyphens, or apostrophes, rather than the chunk-of-non-whitespace notion of a word used in the previous question:


Code
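    my %seen;
    while (<>) {
        # A word starts with a letter and continues with word
        # characters, apostrophes, or hyphens.
        while ( /(\b[^\W_\d][\w'-]+\b)/g ) {   # misses "`sheep'" and one-letter words
            $seen{$1}++;
        }
    }
    while ( my ($word, $count) = each %seen ) {
        print "$count $word\n";
    }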
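To see what the pattern actually captures, here is a minimal sketch that runs it over two sample lines instead of standard input. The helper name `count_words` and the sample text are made up for illustration; only the regular expression comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: count words in the given lines using the
# same pattern as the FAQ answer. Returns a reference to the hash.
sub count_words {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        while ( $line =~ /(\b[^\W_\d][\w'-]+\b)/g ) {
            $seen{$1}++;
        }
    }
    return \%seen;
}

# Sample input (an assumption, not from the original FAQ).
my $seen = count_words(
    "The well-known sheep don't bleat;\n",
    "the sheep sleep.\n",
);

# Print in key order so the output is stable from run to run.
for my $word ( sort keys %$seen ) {
    print "$seen->{$word} $word\n";
}
```

Hyphenated and apostrophized words such as "well-known" and "don't" are kept whole, and "sheep" is counted twice. The match is case-sensitive, so "The" and "the" get separate entries; lowercase the captured word with `lc` before counting if you want them merged.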

If you wanted to do the same thing for lines, you wouldn't need a regular expression:


Code
    my %seen;
    while (<>) {
        $seen{$_}++;
    }
    while ( my ($line, $count) = each %seen ) {
        print "$count $line";   # $line still ends with its newline
    }

If you want these output in a sorted order, see the section on Hashes.
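As a rough illustration of one common ordering, the sketch below lists the most frequent entries first and breaks ties alphabetically. The helper name `sorted_summary` and the sample counts are hypothetical; the section on Hashes covers other orderings.

```perl
use strict;
use warnings;

# Hypothetical helper: given a reference to a %seen-style hash,
# return "count word" strings, highest count first, ties in
# alphabetical order.
sub sorted_summary {
    my ($seen) = @_;
    return map  { "$seen->{$_} $_" }
           sort { $seen->{$b} <=> $seen->{$a} or $a cmp $b }
           keys %$seen;
}

# Sample counts (an assumption, standing in for a real %seen).
my %seen = ( sheep => 3, wolf => 1, dog => 3 );
print "$_\n" for sorted_summary( \%seen );
```

With the sample data this prints "3 dog", "3 sheep", and "1 wolf", one per line. For the line-frequency case, the keys already end in a newline, so drop the extra "\n" when printing.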