HELP

 



yaroba
Novice

Mar 29, 2013, 6:34 PM

Post #1 of 11 (1127 views)
HELP

I need to write a program that counts words separated by spaces, double spaces, commas and dots. I wrote it, but it does not work right. Can anyone help me? Here is the code.

open(FILE, "1z.txt") or die "Negaliu atidaryt: $!";
my ($words) = (0,0,0);
while (<FILE>) {
chomp;
$a=$words = split(/ {2,}/, $_);
$b=$words = split(/ /, $_);
$c=$words = split(/./, $_);
$d=$words = split(/,/, $_);
$sum= $a+$b+$c+$d
}
print("$a, $b, $c, $d\n");
print("words=$sum\n");


(This post was edited by yaroba on Mar 29, 2013, 6:34 PM)


BillKSmith
Veteran

Mar 29, 2013, 9:11 PM

Post #2 of 11 (1122 views)
Re: [yaroba] HELP

I am not clear what you want to do. I do see some mistakes.

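split returns a list of words, so store its result in an array (@words), not a scalar ($words). Assigning it to a scalar keeps only the number of fields.

Special characters in your regex have to be escaped with a backslash. An unescaped dot, as in /./, matches any character, not just a period.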

You are printing data only from the last line.

If you provide a sample of your input and the output that you expect, I can probably provide more help.
Good Luck,
Bill
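
To make the escaping point concrete, here is a minimal sketch. The helper name count_fields and the sample string are made up for illustration; they are not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: count the fields split() produces for a pattern.
sub count_fields {
    my ($pattern, $text) = @_;
    my @fields = split $pattern, $text;
    return scalar @fields;
}

my $line = "one.two.three";

# Unescaped dot: every character acts as a separator, so split finds
# only empty fields and discards all of them.
my $unescaped = count_fields(qr/./, $line);

# Escaped dot: only literal periods separate the fields.
my $escaped = count_fields(qr/\./, $line);

print "unescaped: $unescaped, escaped: $escaped\n";
```

With the unescaped pattern the count is 0, with the escaped one it is 3, which is why the /./ split in the original code never contributes anything useful.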


Laurent_R
Veteran / Moderator

Mar 30, 2013, 3:19 AM

Post #3 of 11 (1113 views)
Re: [yaroba] HELP

You may try something along these lines:


Code
my $c = "The quick brown     fox. The    lazy dog. One, two, three."; 
my @e;
my $d = (@e = split /[, ]+/, $c);
print "There are $d words\n";


It will print that there are 10 words in the input string.

You could also split on all non-word characters, repeated or not:


Code
my $d = (@e = split /\W+/, $c);
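
One thing to watch with either split: if the line starts with a separator, split returns an empty leading field, which would be counted as a word. A minimal sketch that guards against this follows; count_words is a hypothetical helper name, and note that \W also treats apostrophes and hyphens as separators.

```perl
use strict;
use warnings;

# Hypothetical helper: count words by splitting on runs of non-word
# characters. grep { length } drops the empty leading field that split
# returns when the text begins with a separator.
sub count_words {
    my ($text) = @_;
    my @words = grep { length } split /\W+/, $text;
    return scalar @words;
}

print count_words("The quick brown     fox. The    lazy dog. One, two, three."), "\n";
print count_words("  leading spaces, then words."), "\n";
```

The first call reports 10 words, the second reports 4 despite the leading spaces.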



BillKSmith
Veteran

Mar 30, 2013, 8:00 AM

Post #4 of 11 (1108 views)
Re: [Laurent_R] HELP

Can we define a word as a sequence of non-separator characters?

Code
use strict; 
use warnings;
local $_ = "The quick brown fox. The lazy dog. One, two, three.";
print scalar((my @matches) = /[^., ]+/g);

Good Luck,
Bill


yaroba
Novice

Mar 30, 2013, 8:28 AM

Post #5 of 11 (1103 views)
Re: [yaroba] HELP

I need a program that reads a file from disk. The file contains words separated by spaces, double spaces, commas and dots. The program selects only the words that start with the letter a or b and writes them to another file in alphabetical order. It also finds the longest and shortest words.
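
As a rough sketch of the file handling these requirements imply: the sub name filter_words, the script name filter.pl and the output name out.txt are all hypothetical, and the sketch assumes that "longest" and "shortest" refer to the selected a/b words and that sorting should ignore case.

```perl
use strict;
use warnings;

# Hypothetical helper: read $in_path, keep words starting with a or b,
# write them sorted (ignoring case) to $out_path, and return the
# shortest and longest of the kept words.
sub filter_words {
    my ($in_path, $out_path) = @_;

    open my $in, '<', $in_path or die "Cannot open $in_path: $!";
    my @words;
    while (my $line = <$in>) {
        # A word here is a run of characters other than dot, comma or whitespace.
        push @words, grep { /^[ab]/i } $line =~ /[^.,\s]+/g;
    }
    close $in;

    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    print {$out} "$_\n" for sort { lc $a cmp lc $b } @words;
    close $out or die "Cannot close $out_path: $!";

    my @by_length = sort { length $a <=> length $b } @words;
    return @by_length ? ($by_length[0], $by_length[-1]) : ();
}

# Hypothetical usage: perl filter.pl 1z.txt out.txt
if (@ARGV == 2) {
    my ($shortest, $longest) = filter_words(@ARGV);
    print "Shortest: $shortest\nLongest: $longest\n" if defined $shortest;
}
```

The three-argument open with lexical filehandles avoids surprises with file names that contain special characters.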


Kenosis
User

Mar 30, 2013, 9:28 AM

Post #6 of 11 (1096 views)
Re: [yaroba] HELP

Perhaps the following will get you started:


Code
use strict; 
use warnings;

my @words;
my $shortest = my $longest = '';

while (<DATA>) {
    for ( grep { $_ = uc; /^(?:A|B)/ } split /[\x20,.]+/ ) {
        push @words, $_;
        $shortest = $_ if length $_ < length $shortest or !$shortest;
        $longest = $_ if length $_ > length $longest;
    }
}

print "$_\n" for sort @words;
print "\nShortest: $shortest\nLongest: $longest";

__DATA__
I need a program that read file from disk, file contains of words that are separated by spaces, double spaces, commas and dots.
And the program selects only those words who starts with a or b letter and puts them to other file alphabeticaly.
Also finds longest, shortest words.


Output:


Code
A 
A
ALPHABETICALY
ALSO
AND
AND
AND
ARE
B
BY

Shortest: A
Longest: ALPHABETICALY


The code in the for() explained:


Code
grep { $_ = uc; /^(?:A|B)/ } split /[\x20,.]+/
^      ^        ^            ^
|      |        |            |
|      |        |            + - Split line on space (\x20), comma or period to get 'words'
|      |        + - Regex match: word begins with either "A" or "B"
|      + - Convert to all upper-case
+ - If the regex successfully matches, let the 'word' pass from the split
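
For readers who find the one-liner dense, the same pipeline can be spelled out as three explicit steps. This is only an illustrative rewrite; the sub name a_b_words is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: the grep/split pipeline written step by step.
sub a_b_words {
    my ($line) = @_;
    my @tokens = split /[\x20,.]+/, $line;   # 1. split on spaces, commas, periods
    my @upper  = map { uc($_) } @tokens;     # 2. upper-case each token
    return grep { /^[AB]/ } @upper;          # 3. keep tokens starting with A or B
}

print join(' ', a_b_words("And the program selects words with a or b letter.")), "\n";
```

This prints "AND A B". Separating the upper-casing into its own map also avoids assigning to $_ inside grep.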


Your specs say "words," and the above generates a list of word tokens, so you see some words repeated. However, if you want word types (each distinct word listed once), you can use a hash instead of an array to keep track of them:


Code
use strict; 
use warnings;

my %words;
my $shortest = my $longest = '';

while (<DATA>) {
    for ( grep { $_ = uc; /^(?:A|B)/ } split /[\x20,.]+/ ) {
        $words{$_}++;
        $shortest = $_ if length $_ < length $shortest or !$shortest;
        $longest = $_ if length $_ > length $longest;
    }
}

print "$_\n" for sort keys %words;
print "\nShortest: $shortest\nLongest: $longest";

__DATA__
I need a program that read file from disk, file contains of words that are separated by spaces, double spaces, commas and dots.
And the program selects only those words who starts with a or b letter and puts them to other file alphabeticaly.
Also finds longest, shortest words.


Output:


Code
A 
ALPHABETICALY
ALSO
AND
ARE
B
BY

Shortest: A
Longest: ALPHABETICALY
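
The hash version already counts how often each word occurs, although it only prints the keys. A small sketch of printing those counts follows, under the assumption that they are of interest; count_types is a hypothetical name, and the split here uses \s so that trailing newlines also act as separators.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each a/b word occurs,
# keyed by the upper-cased word.
sub count_types {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        $count{ uc $_ }++ for grep { /^[ab]/i } split /[\s,.]+/, $line;
    }
    return %count;
}

my %count = count_types("and a bee, and a bird.", "Also an ant");
printf "%-5s %d\n", $_, $count{$_} for sort keys %count;
```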


Hope this helps!


(This post was edited by Kenosis on Mar 30, 2013, 12:34 PM)


yaroba
Novice

Mar 30, 2013, 12:07 PM

Post #7 of 11 (1087 views)
Re: [Kenosis] HELP

Thanks ! :)


Kenosis
User

Mar 30, 2013, 12:20 PM

Post #8 of 11 (1084 views)
Re: [yaroba] HELP

You're most welcome, yaroba!


BillKSmith
Veteran

Mar 30, 2013, 12:57 PM

Post #9 of 11 (1076 views)
Re: [yaroba] HELP

With my previous definition of word, the solution is much shorter and it preserves the original case.

Code
#!perl -n  
use strict;
use warnings;
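BEGIN{
    use vars qw(@words);
}

# Runs once per input line (-n): collect every word starting with a or b.
push @words, /(?<![^., ])[ab][^., ]*/gi;

END{
    # Alphabetical order, ignoring case.
    @words = sort {uc $a cmp uc $b} @words;
    print join( "\n", @words), "\n";
    # Order by length, then print the shortest and the longest word.
    @words = sort {length $a <=> length $b} @words;
    print $words[0], ' ', $words[-1], "\n";
}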


Usage:

Code
perl script.pl infile.txt [>outfile.txt]

Good Luck,
Bill
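
A caveat worth checking before relying on this: under -n each line arrives in $_ with its newline still attached, and the class [^., ] also matches a newline, so a word at the very end of a line with no trailing punctuation would be captured together with the "\n". The sketch below treats any whitespace as a separator instead. That is a small change from the regex in the post, and ab_words is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: extract words starting with a or b, treating
# dot, comma and any whitespace (including "\n") as separators.
sub ab_words {
    my ($line) = @_;
    return $line =~ /(?<![^.,\s])[ab][^.,\s]*/gi;
}

my @lines = ("Bring an apple\n", "about a cab\n");
my @words = map { ab_words($_) } @lines;
print join(',', @words), "\n";
```

This prints "Bring,an,apple,about,a". The "ab" inside "cab" is skipped because the lookbehind requires a separator or the start of the line before the first letter.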


yaroba
Novice

Mar 30, 2013, 1:20 PM

Post #10 of 11 (1071 views)
Re: [Kenosis] HELP

Kenosis, check your PM :)


Kenosis
User

Mar 30, 2013, 11:50 PM

Post #11 of 11 (1059 views)
Re: [BillKSmith] HELP

Excellent regex for getting the OP's words! I prefer to uc all, but appreciate you preserving case. An option using your regex:


Code
use strict; 
use warnings;

my @len;

print "$_\n"
for sort @len = map "\U$_",
sort { length $a <=> length $b }
map /(?<![^., ])[ab][^., ]*/gi, <>;

print "\n$len[0] - $len[-1]\n";


 
 

