Suggestions needed!!!



dianazhengdz
Novice

Jun 19, 2005, 4:39 PM


Views: 4896
Suggestions needed!!!

Hi,

I have input that is like this


Code
testing test test 
This is a testing program
test test test test test test




What methods can I use to produce output like this?


Code
1 This 
1 is
1 a
1 program
2 testing
8 test



KevinR
Veteran


Jun 19, 2005, 6:50 PM


Views: 4894
Re: [dianazhengdz] Suggestions needed!!!

Use a hash to count the instances of unique words.
-------------------------------------------------


dianazhengdz
Novice

Jun 19, 2005, 7:06 PM


Views: 4891
Re: [KevinR] Suggestions needed!!!

Besides using a hash, is there any other way to do that, like using an array instead?

Can you give me an example?


KevinR
Veteran


Jun 19, 2005, 10:50 PM


Views: 4889
Re: [dianazhengdz] Suggestions needed!!!

Assuming the lines you posted are lines in a file, you can open and read the file and count the occurrences of each unique word like this:


Code
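#!perl
use strict;
use warnings;

my %hash = ();    # word => number of occurrences
open(my $fh, '<', 'path/to/file.txt') or die "Can't open the file: $!";
while (<$fh>) {
    chomp;
    my @words = split ' ';    # split $_ on whitespace, skipping leading blanks
    $hash{$_}++ for @words;
}
close($fh);

# Sort by count, lowest first; words with equal counts come out in no fixed order.
foreach my $key (sort { $hash{$a} <=> $hash{$b} } keys %hash) {
    print "$hash{$key} = $key\n";
}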


When I tested the code, it printed:


Code
1 = a 
1 = is
1 = This
1 = program
2 = testing
8 = test

-------------------------------------------------


dianazhengdz
Novice

Jun 20, 2005, 6:56 AM


Views: 4886
Re: [KevinR] Suggestions needed!!!

Yes, I get that result too.

I was wondering why the given output prints "This" first instead of "a" first.


KevinR
Veteran


Jun 20, 2005, 10:38 AM


Views: 4882
Re: [dianazhengdz] Suggestions needed!!!

There is no easy way to get the printed output to look like this:

1 This
1 is
1 a
1 program
2 testing
8 test

unless you know all the words and print them in that order manually. Not even an alternate sort would produce that result.
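
For reference, here is a minimal sketch of what a secondary sort key does to the ties. It does not give the ordering asked for above: words with equal counts come out in string order ("This" before "a", because uppercase letters sort first), not in order of appearance. The in-memory @lines array is an assumption standing in for the file.

```perl
#!perl
use strict;
use warnings;

# Hypothetical in-memory input standing in for the file.
my @lines = (
    'testing test test',
    'This is a testing program',
    'test test test test test test',
);

# Count every whitespace-separated word.
my %count;
for my $line (@lines) {
    $count{$_}++ for split ' ', $line;
}

# Primary key: count (numeric). Secondary key: the word itself (string),
# so words with equal counts always come out in the same order.
my @sorted = sort { $count{$a} <=> $count{$b} or $a cmp $b } keys %count;

print "$count{$_} = $_\n" for @sorted;
```

Without the second key, the order of the tied words is whatever keys %count happens to return, which is why it can differ from one run or Perl version to another.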
-------------------------------------------------


dianazhengdz
Novice

Jun 20, 2005, 5:37 PM


Views: 4881
Re: [KevinR] Suggestions needed!!!

Oh, I see.

Thanks for your help.

Because I am a beginner in Perl, I don't think I can do it that way.

If you could, can you show me how that can be done?

Thanks.


(This post was edited by dianazhengdz on Jun 20, 2005, 6:43 PM)


KevinR
Veteran


Jun 20, 2005, 7:51 PM


Views: 4875
Re: [dianazhengdz] Suggestions needed!!!

Thanks for your help.

>> You're welcome

Because I am a beginner in Perl, I don't think I can do it that way.

>> Do it what way?

If you could, can you show me how that can be done?

>> How what can be done?

I have shown you how to count the instances of each unique word using a hash. Did you wish to have some of the code I posted explained in more detail?
-------------------------------------------------


dianazhengdz
Novice

Jun 20, 2005, 10:20 PM


Views: 4873
Re: [KevinR] Suggestions needed!!!

Oh, I have worked through the code, no further explanation needed.

What I mean is, what should the code look like to produce the sample output that I posted earlier? For example, "This" should be at the top.


KevinR
Veteran


Jun 20, 2005, 10:48 PM


Views: 4871
Re: [dianazhengdz] Suggestions needed!!!


In Reply To
What I mean is, what should the code look like to produce the sample output that I posted earlier? For example, "This" should be at the top.


If the words form a sentence like your example, and you know the sentence beforehand then you can dictate the order of the words to be printed. But if you do not know the order that the words should be in to form a sentence then there is no way to get the order correct. I will assume you must know what the sentence is and are just counting words for some unknown reason:




Code
#!perl
use strict;
use warnings;

my %hash = ();    # word => number of occurrences
open(my $fh, '<', 'path/to/file.txt') or die "Can't open the file: $!";
while (<$fh>) {
    chomp;
    my @words = split ' ';    # split $_ on whitespace
    $hash{$_}++ for @words;
}
close($fh);

# The words are hard-coded in the order they should be printed.
print qq~
$hash{'This'} = This
$hash{'is'} = is
$hash{'a'} = a
$hash{'program'} = program
$hash{'testing'} = testing
$hash{'test'} = test
~;

-------------------------------------------------


dianazhengdz
Novice

Jun 20, 2005, 11:00 PM


Views: 4869
Re: [KevinR] Suggestions needed!!!

Oh, what you mean is to hard-code the words and print them out in the given order.

I see, I see.

Thanks.


KevinR
Veteran


Jun 20, 2005, 11:04 PM


Views: 4868
Re: [dianazhengdz] Suggestions needed!!!


In Reply To
Oh, what you mean is to hard-code the words and print them out in the given order.

I see, I see.

Thanks.


That's right.
-------------------------------------------------


rork
User

Jun 22, 2005, 12:04 AM


Views: 4859
Re: [KevinR] Suggestions needed!!!

I agree there is no easy way to do it, but it isn't impossible. (Few things, if any, are impossible with Perl.)


Code
#!perl

use strict;
use warnings;

my %count = ();   # word => number of occurrences
my @order;        # words in order of first appearance
my %sorted;       # count => array ref of the words with that count

# open(DATA,'path/to/file.txt') or die "Can't open the file: $!";
while (<DATA>) {
    chomp;
    my (@words) = split(/\s+/);
    foreach my $word (@words) {
        unless (exists $count{$word}) {
            push @order, $word;
        }
        $count{$word}++;
    }
}

# Group the words by count, keeping order of first appearance within each group.
foreach my $word (@order) {
    unless (exists $sorted{$count{$word}}) {
        $sorted{$count{$word}} = [];
    }
    push @{$sorted{$count{$word}}}, $word;
}

# Sort the counts numerically so that, for example, 10 comes after 9.
foreach my $i (sort { $a <=> $b } keys %sorted) {
    foreach my $word (@{$sorted{$i}}) {
        print "$count{$word} = $word\n";
    }
}

__DATA__
testing test test
This is a testing program
test test test test test test


Remove the # before open and everything from __DATA__ to the end to adapt it for opening a file.

I'll try to explain.
I use an array (@order) to remember the order in which words were found; if a word already appears in the hash (and therefore in the array), it is not added again. Arrays have a fixed order, the keys of a hash don't.

Then I iterate over the array, storing the data in a hash (%sorted). The key is the number of times a word is found; the value is an array reference containing all the words that are found that many times.

Now I sort the keys of %sorted and print the output:

1 = This
1 = is
1 = a
1 = program
2 = testing
8 = test
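
The same idea can also be sketched as a single sort with two keys, which may be easier to follow. This is a variation rather than the script above: the sub name words_by_count_then_appearance and the %first_seen hash are made up for the illustration, and the input lines are passed in as a list instead of being read from DATA.

```perl
#!perl
use strict;
use warnings;

# Returns the distinct words of the given lines, ordered by count
# (lowest first) and then by first appearance, plus the counts.
sub words_by_count_then_appearance {
    my @lines = @_;
    my (%count, %first_seen);
    my $position = 0;
    for my $line (@lines) {
        for my $word (split ' ', $line) {
            # Remember where each word showed up for the first time.
            $first_seen{$word} = $position++ unless exists $first_seen{$word};
            $count{$word}++;
        }
    }
    my @ordered = sort {
        $count{$a} <=> $count{$b} or $first_seen{$a} <=> $first_seen{$b}
    } keys %count;
    return (\@ordered, \%count);
}

my ($ordered, $count) = words_by_count_then_appearance(
    'testing test test',
    'This is a testing program',
    'test test test test test test',
);
print "$count->{$_} = $_\n" for @$ordered;
```

Both keys are compared numerically, so a count of 10 sorts after a count of 9.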
--
Don't reinvent the wheel, use it, abuse it or hack it.


KevinR
Veteran


Jun 22, 2005, 10:04 AM


Views: 4858
Re: [rork] Suggestions needed!!!

That is impressive, but it seems the answer is too contrived. If it were a random sentence with random quantities per word, the output would be very different.
-------------------------------------------------


rork
User

Jun 22, 2005, 2:01 PM


Views: 4855
Re: [dianazhengdz] Suggestions needed!!!

Thanks,

Well, as far as I can see, dianazhengdz wanted to sort all words of the example by number of occurrences first, and then by order of appearance.

This is what the script does with any text.
--
Don't reinvent the wheel, use it, abuse it or hack it.


dianazhengdz
Novice

Jun 23, 2005, 12:40 AM


Views: 4851
Re: [rork] Suggestions needed!!!

Wow... That was really impressive.

Thanks...


dianazhengdz
Novice

Jun 23, 2005, 12:49 AM


Views: 4849
Re: [dianazhengdz] Suggestions needed!!!

Now I have another problem.

The problem is that when we enter a number, we have to print out every subset in the power set for that number. For example, for the input

Code
2



My output is like this

Code
NULL 
1
2
1,2

Because the powerset of 2 is P(2) = { NULL, {1}, {2}, {1,2} }
Any suggestions on how I can do that?
Thanks
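
One possible approach, sketched under the assumption that the number n stands for the set {1, ..., n}, as the example above suggests: count from 0 to 2**n - 1 and treat each bit of the counter as "element in" or "element out". The sub name power_set is made up for this illustration.

```perl
#!perl
use strict;
use warnings;

# Returns every subset of {1..$n} as an array ref of array refs.
sub power_set {
    my ($n) = @_;
    my @subsets;
    for my $mask (0 .. 2**$n - 1) {
        # Bit ($i - 1) of $mask decides whether element $i is in the subset.
        my @subset = grep { $mask & (1 << ($_ - 1)) } 1 .. $n;
        push @subsets, \@subset;
    }
    return \@subsets;
}

for my $subset (@{ power_set(2) }) {
    my $text = @$subset ? join(',', @$subset) : 'NULL';
    print "$text\n";
}
```

For an input of 2 this prints the four lines shown above in the same order. For larger numbers the subsets come out in counter order rather than grouped by size, so an extra sort would be needed if that matters.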


(This post was edited by dianazhengdz on Jun 23, 2005, 8:36 AM)


davorg
Thaumaturge / Moderator

Jun 23, 2005, 9:32 AM


Views: 4837
Re: [dianazhengdz] Suggestions needed!!!

Please start a new discussion thread for a new question.

And, when you do, please include the definition of a "powerset".

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks