Using search and replace

 



mykolg
User

Nov 22, 2011, 2:13 PM

Post #1 of 21 (2093 views)
Using search and replace

I have the following line of code:


Code
print FILE $ParsedTitle[$i] =~ s/\n/ /g;


and for some reason it is displaying a value. Not sure what I'm doing wrong, but I need another set of eyes to see my simple mistake.

I'd greatly appreciate it! :)


(This post was edited by mykolg on Nov 22, 2011, 2:24 PM)


BillKSmith
Veteran

Nov 22, 2011, 3:06 PM

Post #2 of 21 (2083 views)
Re: [mykolg] Using search and replace

This substitution operator replaces each newline in the string with a space character and then returns the number of substitutions it made. (It returns an empty string if there were none; see the section s/PATTERN/REPLACEMENT/ in perldoc perlop.) The print function prints that return value, which is probably not what you intend.
Good Luck,
Bill
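
A minimal sketch of one way around this, assuming the goal is to print the modified string rather than the count: run the substitution as its own statement, then print the variable. The helper name flatten_newlines and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every newline in a copy of the text with a space.
# Returns the modified copy and the number of replacements made.
sub flatten_newlines {
    my ($text) = @_;
    # s/// returns the count, or an empty string when nothing matched
    my $count = ($text =~ s/\n/ /g) || 0;
    return ($text, $count);
}

my ($flat, $replaced) = flatten_newlines("Chapter 1\nHappy families\n");
print "$replaced newline(s) replaced\n";
print "[$flat]\n";
```

On Perl 5.14 or later the /r modifier returns the changed copy instead of the count, so something like print FILE $ParsedTitle[$i] =~ s/\n/ /gr; should also do the job in one statement.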


mykolg
User

Nov 22, 2011, 4:17 PM

Post #3 of 21 (2077 views)
Re: [BillKSmith] Using search and replace

Hmm, OK, so it must be finding the \n in the paragraph. So now my question is: how do I correct this? I'm still not sure why it's not working the way I think it should.


BillKSmith
Veteran

Nov 22, 2011, 5:02 PM

Post #4 of 21 (2073 views)
Re: [mykolg] Using search and replace

We do not know what you are trying to do. That is why I tried to explain what you are doing.
Good Luck,
Bill


mykolg
User

Nov 22, 2011, 5:23 PM

Post #5 of 21 (2072 views)
Re: [BillKSmith] Using search and replace

OK, I figured out what it was doing and corrected it. It was counting the newlines and printing just that count. I separated the operation from the print statement, putting the print on its own line, and it worked beautifully.

Now I have another issue: this piece of code is supposed to count the number of words in a string, but it's only printing

Quote
1


Here is the code:

Code
$wordCt = $ParsedTitle[$i] =~ s/((^|\s)\S)/$1/g;



BillKSmith
Veteran

Nov 22, 2011, 6:40 PM

Post #6 of 21 (2064 views)
Re: [mykolg] Using search and replace

You should be using the match operator (m/.../), not the substitution operator (s/.../.../), because you only want to count the matches, not change the string. Also, I would prefer a different regular expression.


Code
my @words = $string =~ m/\b\w+\b/g; 
my $count = @words;


Good Luck,
Bill
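
To see what this counts, here is a small self-contained sketch. count_words is a made-up helper name, and the sketch assumes a "word" is simply a run of \w characters, so an apostrophe splits a word in two.

```perl
use strict;
use warnings;

# Hypothetical helper: count runs of word characters in a string.
sub count_words {
    my ($string) = @_;
    # In list context, m//g returns every match at once
    my @words = $string =~ m/\b\w+\b/g;
    # An array in scalar context gives its number of elements
    return scalar @words;
}

print count_words("This is a sentence.\nHere is the rest.\n"), "\n";
print count_words("don't"), "\n";   # counted as "don" and "t"
```

If contractions should count as one word, the pattern would need to allow apostrophes; that is a design choice the thread does not settle.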


mykolg
User

Nov 22, 2011, 7:14 PM

Post #7 of 21 (2060 views)
Re: [BillKSmith] Using search and replace

I did exactly this:

Code
 
@words = $ParsedTitle[$i] =~ m/\b\w+\b/g;
$wordCt = @words;
print FILE $wordCt;

and still my output is:

Quote
1



BillKSmith
Veteran

Nov 22, 2011, 8:13 PM

Post #8 of 21 (2054 views)
Re: [mykolg] Using search and replace

Check the content of $ParsedTitle[$i]. Everything works fine with my data.


Code
use strict; 
use warnings;
my @ParsedTitle;
my $i = 0;
$ParsedTitle[$i] =
"This is a sentence.\n"
."Here is the rest of the paragraph.\n";
my @words = $ParsedTitle[$i] =~ m/\b\w+\b/g;
my $count = @words;
print $count;

Good Luck,
Bill


mykolg
User

Nov 22, 2011, 8:16 PM

Post #9 of 21 (2054 views)
Re: [mykolg] Using search and replace

Gah, I retract my last statement. That is a valid piece of code; I was looking at the wrong output.

:) Please bear with me, I'm a noob with the regex

I'm trying to also pick the first word out of the string, I'm using the following code:

Code
$firstWord = $ParsedTitle[$i] =~ /^(.*?)\s/;
print FILE $firstWord;


and I even tried:

Code
        $firstWord = $ParsedTitle[$i] =~ /\b\w+\b/;


and this is what both of them output:

Quote
1



(This post was edited by mykolg on Nov 22, 2011, 8:26 PM)


BillKSmith
Veteran

Nov 23, 2011, 4:41 AM

Post #10 of 21 (2046 views)
Re: [mykolg] Using search and replace

Your problem has nothing to do with the regular expression. In scalar context, the match operator returns a true/false value. I introduced the array variable @words to force list context.

$words[0] contains the word that you want.

Refer to Perl's own documentation of m/PATTERN/ (in perldoc perlop) for a better example. You could also use the special variable $& (see perldoc perlvar).
Good Luck,
Bill
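
A short sketch of the difference between the two contexts. The sample string and the helper name first_word are invented for illustration.

```perl
use strict;
use warnings;

my $title = "Happy families are all alike";

# Scalar context: the match only reports success (1) or failure (empty string)
my $matched = $title =~ /(\S+)/;

# List context (note the parentheses around the variable): the captures come back
my ($captured) = $title =~ /(\S+)/;

print "scalar context: $matched\n";
print "list context:   $captured\n";

# Hypothetical helper wrapping the list-context form.
# Returns undef when the string holds no non-whitespace characters.
sub first_word {
    my ($text) = @_;
    my ($word) = $text =~ /(\S+)/;
    return $word;
}

print first_word("  leading spaces are skipped"), "\n";
```

The parentheses in my ($captured) are what switch the assignment to list context; without them the variable receives the true/false result, which is where the stray 1 comes from.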


mykolg
User

Nov 23, 2011, 6:24 AM

Post #11 of 21 (2038 views)
Re: [BillKSmith] Using search and replace

Ahh, ok, I got that working, now I'm getting a stream of warnings for some reason, stating the following:


Quote
Use of uninitialized value $firstWord in print at p2.pl line 35.
(the same warning appears 17 times in total)


It comes from the following bit of code:

Code
my $i; 
my $posPID = 1;
my $firstWord;
my @words;
my $wordCt;
my $lineCt;

for($i = 0; $i < scalar(@ParsedTitle); $i++){
    print FILE $posPID;
    print FILE ",";
    @words = $ParsedTitle[$i] =~ m/^(.*?)\s/;
    $firstWord = $words[0];
    print FILE $firstWord;
    print FILE ",";
    @words = $ParsedTitle[$i] =~ m/\b\w+\b/g;
    $wordCt = @words;
    print FILE $wordCt;
    print FILE ",\"";
    $lineCt = $ParsedTitle[$i] =~ s/\n/ /g;
    print FILE $ParsedTitle[$i];
    print FILE "\"\n";
    $posPID++;
}


The lines that seem to be causing a problem are here:

Code
@words = $ParsedTitle[$i] =~ m/^(.*?)\s/;
$firstWord = $words[0];
print FILE $firstWord;



Chris Charley
User

Nov 23, 2011, 7:46 AM

Post #12 of 21 (2030 views)
Re: [mykolg] Using search and replace

Worked ok for me. Here is your code with some changes to look cleaner (I didn't print out to FILE but you probably will).

Code
#!/usr/bin/perl 
use strict;
use warnings;
use 5.014;


my $posPID = 1;
my $lineCt;
my @ParsedTitle = ("Here are some words\nalong with 2\n embedded newlines\n");

for(my $i = 0; $i < @ParsedTitle; $i++){
    my ($firstWord) = $ParsedTitle[$i] =~ m/(\S+)/;
    print $firstWord;
    print ",";
    my $wordCt = () = $ParsedTitle[$i] =~ m/\b\w+\b/g;
    print $wordCt;
    print ",\"";
    $lineCt = $ParsedTitle[$i] =~ s/\n/ /g;
    print $ParsedTitle[$i];
    print "\"\n";
    $posPID++;
}
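
The my $wordCt = () = ... line may look odd. A sketch of what it does, using a hypothetical helper count_matches: assigning to the empty list forces list context on the match, and a list assignment evaluated in scalar context yields the number of items on its right-hand side.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times a pattern matches,
# without storing the matches and without changing the string.
sub count_matches {
    my ($text, $regex) = @_;
    my $count = () = $text =~ /$regex/g;
    return $count;
}

my $sample = "Here are some words\nalong with 2\n embedded newlines\n";
print "words:    ", count_matches($sample, qr/\b\w+\b/), "\n";
print "newlines: ", count_matches($sample, qr/\n/), "\n";
```

As for use 5.014; it makes the script refuse to run on a Perl older than 5.14 and turns on that version's feature bundle (for example say).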



mykolg
User

Nov 23, 2011, 8:02 AM

Post #13 of 21 (2027 views)
Re: [Chris Charley] Using search and replace

Try these files out, because that warning is weird. Also, what is the use 5.014 about?

See if it gives the same error...
Attachments: p2.pl (0.79 KB)
  anna.book (187 KB)


BillKSmith
Veteran

Nov 23, 2011, 11:12 AM

Post #14 of 21 (2018 views)
Re: [Chris Charley] Using search and replace

The message is telling you that the regular expression failed to match. I do not understand what you intend it to match.
Good Luck,
Bill
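
One way to see this, as a sketch: the pattern ^(.*?)\s needs at least one whitespace character, so an element with no whitespace at all (a lone word without a trailing newline, or an empty string) gives no match and $words[0] stays undefined. Whether anna.book really produces such elements is an assumption; the sample strings and the helper name leading_word are invented.

```perl
use strict;
use warnings;

# Hypothetical helper using the pattern from the loop above.
# Returns the captured text, or undef when the pattern fails to match.
sub leading_word {
    my ($text) = @_;
    my @words = $text =~ m/^(.*?)\s/;
    return $words[0];
}

for my $sample ("Chapter 1\n", "Chapter", "") {
    my $result = leading_word($sample);
    # Make newlines visible in the printed label
    (my $label = $sample) =~ s/\n/\\n/g;
    print "'$label' => ", (defined $result ? "'$result'" : "undef"), "\n";
}
```

Printing the offending element together with its index when the match fails is probably the quickest way to find out what the real data looks like.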


Chris Charley
User

Nov 23, 2011, 11:36 AM

Post #15 of 21 (2017 views)
Re: [mykolg] Using search and replace

Funny, I got the same message, "Use of uninitialized value $firstWord in print at t2.pl line 35.", but I can't see exactly why with your code. Here is my code that did what you intended (I believe).

Code
 #!/usr/bin/perl  
use warnings;
use strict;

my $bookTitle = "anna.book";
my $out_file = "anna.csv";

open BOOK, "<", $bookTitle or die "Could not open $bookTitle!. $!";
open FILE, ">", $out_file or die "Could not open $out_file for writing. $!";
my $posID = 1;

{
    local $/ = "";
    while (<BOOK>) {
        chomp;
        my ($first_word) = /(\w+)/;
        my $word_count = () = /\w+/g;
        my $line_count = s/\n/ /g || 0;
        print FILE join(",", $posID++, $first_word, $word_count, $line_count, "\"$_\""), "\n";
    }
}

close BOOK or die "Could not close $bookTitle. $!";
close FILE or die "Could not close $out_file. $!";

I got output like:

Quote

Code
 1,Chapter,2,0,"Chapter 1"  
2,Happy,14,1,"Happy families are all alike; every unhappy fa
3,Everything,200,17,"Everything was in confusion in the Oblo
4,Three,102,8,"Three days after the quarrel, Prince Stepan A
5,Yes,81,7,""Yes, yes, how was it now?" he thought, going ov
6,Stepan,156,13,"Stepan Arkadyevitch's eyes twinkled gaily,
7,Ah,39,3,""Ah, ah, ah! Oo!..." he muttered, recalling ever
8,Yes,65,5,""Yes, she won't forgive me, and she can't forgiv
9,Most,67,5,"Most unpleasant of all was the first minute whe
10,She,39,3,"She, his Dolly, forever fussing and worrying ov
11,What,10,0,""What's this? this?" she asked, pointing to th
12,And,34,2,"And at this recollection, Stepan Arkadyevitch,
13,There,96,9,"There happened to him at that instant what do
14,This,47,4,"This idiotic smile he could not forgive himsel
15,It,15,1,""It's that idiotic smile that's to blame for it
16,But,21,1,""But what's to be done? What's to be done?" he
17,Chapter,2,0,"Chapter 2"
18,Stepan,214,18,"Stepan Arkadyevitch was a truthful man in
19,Oh,155,12,""Oh, it's awful! oh dear, oh dear! awful!" Ste
20,There,73,6,"There was no solution, but that universal sol
21,Then,107,9,""Then we shall see," Stepan Arkadyevitch said
22,Are,20,2,""Are there any papers from the office?" asked S
23,On,30,2,""On the table," replied Matvey, glancing with in
24,Stepan,49,4,"Stepan Arkadyevitch made no reply, he merely
25,Matvey,24,2,"Matvey put his hands in his jacket pockets,
26,I,27,2,""I told them to come on Sunday, and till then not
27,Stepan,37,3,"Stepan Arkadyevitch saw Matvey wanted to mak
28,Matvey,31,2,""Matvey, my sister Anna Arkadyevna will be h
29,Thank,40,3,""Thank God!" said Matvey, showing by this res


Quote
PS line_count is off by one (since I chomped off the newline at the end of each paragraph).


(This post was edited by Chris Charley on Nov 23, 2011, 11:39 AM)
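
The local $/ = ""; line in the code above switches readline to paragraph mode, where each read returns one block of text separated from the next by one or more empty lines. A small sketch of that behaviour, reading from an in-memory string instead of anna.book; read_paragraphs is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: split a text into paragraphs using paragraph mode.
sub read_paragraphs {
    my ($text) = @_;
    # Open a read-only filehandle on the string itself
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "";    # paragraph mode
    my @paragraphs;
    while (my $chunk = <$fh>) {
        chomp $chunk;    # in paragraph mode, chomp strips all trailing newlines
        push @paragraphs, $chunk;
    }
    close $fh;
    return @paragraphs;
}

my @found = read_paragraphs("Chapter 1\n\nHappy families\nare all alike\n\n\nChapter 2\n");
print scalar(@found), " paragraphs\n";
print "[$_]\n" for @found;
```

Note that a line holding only spaces does not count as empty in this mode, so such a line would not split two paragraphs.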


mykolg
User

Nov 23, 2011, 12:34 PM

Post #16 of 21 (2008 views)
Re: [Chris Charley] Using search and replace

This is what I'm getting:

Quote
Use of uninitialized value $firstWord in join or string at p2.pl line 19, <BOOK> chunk 4667.


Using the following code:


Code
#!/usr/bin/perl 
use warnings;
use strict;

my $bookTitle = "anna.book";
my $outFile = "anna.csv";

open BOOK, "<", $bookTitle or die "Could not open file: $bookTitle!. $!";
open FILE, ">", $outFile or die("Could not open file: $outFile. $!");

my $pID = 1;

local $/ = "";
while (<BOOK>){
    chomp;
    my ($firstWord) = /(\w+)/;
    my $wordCt = () = /\w+/g;
    my $lineCt = s/\n/ /g || 0;
    $lineCt++;
    print FILE join(",", $pID++, $firstWord, $wordCt, $lineCt, "\"$_\""), "$
}

close BOOK or die "Could not close $bookTitle. $!";
close FILE or die "Could not close $outFile. $!";



Chris Charley
User

Nov 23, 2011, 2:17 PM

Post #17 of 21 (2004 views)
Re: [mykolg] Using search and replace


Quote
Use of uninitialized value $firstWord in join or string at p2.pl line 19, <BOOK> chunk 4667.


I'm not sure why. There must be something in the file you're reading, possibly spaces in between newlines.

You could check for this condition with the following.

Code
while (<BOOK>){
    chomp;
    next if /^\W+$/; # check for embedded spaces or non word items
    .....



mykolg
User

Nov 24, 2011, 8:54 AM

Post #18 of 21 (1982 views)
Re: [Chris Charley] Using search and replace

Well I changed my code into a function and have it running multiple books, also I have it removing commas because I'm trying to make it into a .csv file, but I'm getting the same error as before:

Quote
Use of uninitialized value $firstWord in join or string at p2.pl line 30, <BOOK> chunk 4667.


I'm not even sure what it's complaining about.

I think it might be the fact that the value is NULL and doesn't join well. Is there any way to fix the following line of code so that it defaults to NULL if nothing is loaded?

Code
my ($firstWord) = /(\w+)/;



(This post was edited by mykolg on Nov 24, 2011, 9:17 AM)
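
A sketch of one way to supply a default, assuming Perl 5.10 or later for the // (defined-or) operator. Perl has no NULL as such; the undefined value plays that role, and here it is swapped for whatever default is passed in. first_word_or is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: first run of word characters, or a default
# when the text contains no word characters at all.
sub first_word_or {
    my ($text, $default) = @_;
    my ($first) = $text =~ /(\w+)/;   # undef when nothing matches
    return $first // $default;        # defined-or: keep "0", replace only undef
}

print "[", first_word_or("Happy families", ""), "]\n";
print "[", first_word_or("* * *", ""), "]\n";
print "[", first_word_or("0 is a word too", "none"), "]\n";
```

Using // rather than || matters when the first word could be "0", which is false but still defined.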


Chris Charley
User

Nov 24, 2011, 11:46 AM

Post #19 of 21 (1975 views)
Re: [mykolg] Using search and replace


Code
next if /^\W+$/;

Did you try adding this where I suggested? If it were there, you wouldn't get that message, I believe. There is something odd about your file that I can't find from here. Your file, anna.book, parsed OK here, so I don't know why it isn't working for you.

As I was thinking about this, a comma is a poor choice of separator for anna.csv. It would be better to choose a character that isn't in any of the text, such as a caret, '^', or a tilde, '~'. There are a few more thoughts, but I don't have the time now. I'll update this post later.

Chris


mykolg
User

Nov 24, 2011, 2:28 PM

Post #20 of 21 (1970 views)
Re: [Chris Charley] Using search and replace

I put that line of code in there, and I also have double quotes around the actual text, so that should be fine. I'm not sure what the glitch is.
anna.book is WAY longer, so it might be failing elsewhere. It's about 1.6 MB. But I think I have done everything correctly.


Chris Charley
User

Nov 26, 2011, 6:06 PM

Post #21 of 21 (1945 views)
Re: [mykolg] Using search and replace


Quote
also I have it removing commas because I'm trying to make it into a .csv file

If there are quotes at the beginning and end of the string, then any commas in the text are not treated as separators but as part of the text. The parser I used was Text::CSV_XS (http://search.cpan.org/~hmbrand/Text-CSV_XS-0.85/CSV_XS.pm). It was necessary to escape the double quotes as documented for Text::CSV; that's just one line in the program.

Code
   #!/usr/bin/perl    
use warnings;
use strict;

my $bookTitle = "anna.book";
my $out_file = "anna.csv";

open BOOK, "<", $bookTitle or die "Could not open $bookTitle!. $!";
open FILE, ">", $out_file or die "Could not open $out_file for writing. $!";
my $posID = 1;

{
    local $/ = "";    # paragraph mode: each read returns one paragraph
    while (<BOOK>) {
        chomp;
        my ($first_word) = /(\w+)/ or next;    # skip paragraphs with no word characters
        my $word_count = () = /\w+/g;
        my $line_count = 1 + s/\n/ /g;
        s/"/""/g; # escape embedded quotes
        print FILE join(",",
            $posID++, $first_word, $word_count, $line_count, "\"$_\""), "\n";
    }
}

close BOOK or die "Could not close $bookTitle. $!";
close FILE or die "Could not close $out_file. $!";

Then later when I want to get the data I just use the parsing program.

Code
   #!/usr/bin/perl    
use strict;
use warnings;
use Text::CSV_XS;

open my $fh, "<", 'anna.csv' or die $!;

# The text contains a character outside printable ASCII, so binary is needed to parse it.
# Without binary turned on, this program quit running when
# it encountered that character (somewhere in the middle).
my $csv = Text::CSV_XS->new({binary => 1});

while (my $row = $csv->getline ($fh)) {
    #if (5 != @$row) {
    #    die "bad csv: number of fields not = 5. $!";
    #}
    print join("\t", @$row), "\n";
}

close $fh or die $!;

I did use the comma as my separator and the program performed well. (There was no need to remove commas embedded in the text).
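
The quoting rule relied on here can be sketched without any module: wrap the field in double quotes and double every quote inside it. csv_field is a hypothetical helper name, and the sample row is made up in the style of the output shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one text field for a CSV line.
# Embedded double quotes are doubled; commas need no special handling
# because the whole field sits inside quotes.
sub csv_field {
    my ($text) = @_;
    (my $escaped = $text) =~ s/"/""/g;   # work on a copy, leave the argument alone
    return qq{"$escaped"};
}

my $paragraph = '"Yes, yes, how was it now?" he thought';
print join(",", 5, "Yes", 8, 1, csv_field($paragraph)), "\n";
```

This covers only the quoting of a single field; for anything beyond that, a real CSV module such as the Text::CSV_XS parser used above is the safer route.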

The .csv file may also be treated as a database.

Code
 #!/usr/bin/perl  
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect(qq{DBI:CSV:});
$dbh->{'csv_tables'}->{'books'} = {
    'file'      => 'anna.csv',
    'col_names' => [qw/posID first_word word_count line_count paragraph/],
};

my $sth = $dbh->prepare(qq{
    SELECT posID, word_count, paragraph
    FROM books
    WHERE word_count < 12
    ORDER BY word_count DESC
});

$sth->execute() or die "Cannot execute: " . $sth->errstr();
while (my @row = $sth->fetchrow_array) {
    print "@row\n";
}
$sth->finish();
$dbh->disconnect();


Provides output like

Code
 783 9 Kitty perceived that Anna knew what answer would follow.  
34 8 "Darya Alexandrovna?" Matvey repeated, as though in doubt.
112 8 "Dolly, one word more," he said, following her.
281 8 "I'll put them on directly," he said.
354 8 "Oh, yes, Parmesan. Or would you like another?"
390 8 "I think it's possible. Why not possible?"
417 8 "No, I don't. Why do you ask?"
437 8 Stepan Arkadyevitch's eyes sparkled more than usual.
506 8 The Countess Nordston pounced upon Levin at once.
542 8 "Oh, then you don't believe in it?"
578 8 "Why, you've..." The prince was crying wrathfully.
644 8 "You got my telegram? Quite well? Thank God."
649 8 "Well, well, allow me to kiss your hand."
652 8 Vronsky understood now that this was Madame Karenina.
702 8 "It's an omen of evil," she said.
709 8 And Stepan Arkadyevitch began to tell his story.
718 8 "Dolly, how glad I am to see you!"
781 8 "How can _you_ be dull at a ball?"
787 8 "Are you coming to this ball?" asked Kitty.
791 8 "I imagine you at the ball in lilac."
30 7 "Alone, or with her husband?" inquired Matvey.
39 7 "Eh, Matvey?" he said, shaking his head.
65 7 "Mamma? She is up," answered the girl.
114 7 And she went out, slamming the door.
136 7 And the sitting of the board began.
142 7 "To be sure we shall!" said Nikitin.
166 7 Stepan Arkadyevitch gave a scarcely perceptible smile.

Hope this provides additional help,
Chris


(This post was edited by Chris Charley on Nov 26, 2011, 6:55 PM)

 
 

