Case insensitive postmodifier "i" makes pl crash?!

 



garrafa
Novice

Oct 20, 2006, 8:28 AM

Post #1 of 5 (458 views)

Hi, I have a problem with the following code:

Code
sub busca_styles {
    my $htm = shift;
    open HTM, $htm or return;
    while (<HTM>) {
        my $linea1 = $_;
        while ($linea1 =~ s/class=\"([^\"]+)\"//i) {
            $styles{$1} = 1;
        }
    }
    close HTM;
}

The idea is to collect in a global hash (%styles) every style I can find in a huge repository of *.htm files, each of which this sub receives as $htm.
The script worked perfectly without the "i" modifier, but then I realised that "class" in the htm files could be "class", "CLASS" or even "Class", so I added the "i". And that is where the problem starts: it just crashes (actually it keeps on running, but does nothing).
I used SysInternals' Process Explorer to check on perl.exe and found that it was consuming all my physical memory as well as my virtual memory, and that was why it hung.
The thing is, the only modification made to the script was that "i". Before I added it, the script worked like a charm, and now it just crashes my computer.
Any hints, tips, or ideas? All replies will be really appreciated.
Thanks a lot,
Garrafa


KevinR
Veteran


Oct 20, 2006, 1:36 PM

Post #2 of 5 (452 views)
Re: [garrafa] Case insensitive postmodifier "i" makes pl crash?!

Adding the 'i' modifier probably caused the while loop to never end because the substitution is always true. A 'while' loop needs a condition that eventually becomes false; otherwise it keeps going until the script is terminated by the operating system or the server, or something worse happens. You probably need the 'g' modifier, so that the while loop ends and also captures multiple matches of the search pattern. Use m// instead of s/// if all you need to do is find the patterns:


Code
sub busca_styles {
    my $htm = shift;
    open HTM, $htm or return;
    while (<HTM>) {
        my $linea1 = $_;
        while ($linea1 =~ m/class="([^"]+)"/ig) {
            $styles{$1} = 1;
        }
    }
    close HTM;
}


Double-quotes do not need escaping in a regexp.
-------------------------------------------------
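To illustrate the m//g suggestion above in isolation, here is a minimal sketch. The sub name classes_in_line and the sample line are made up for the example and are not from the original script. In scalar context, each m//g match resumes where the previous one ended, so the loop stops after the last match on the line.

```perl
use strict;
use warnings;

# Collect every class="..." value found in one line of text.
# With /g in scalar context, each match starts where the previous
# one ended, so the loop ends once no further match is found.
sub classes_in_line {
    my ($line) = @_;
    my %seen;
    while ($line =~ m/class="([^"]+)"/ig) {
        $seen{$1} = 1;
    }
    return sort keys %seen;
}

my @found = classes_in_line(
    '<p CLASS="intro"><span class="note">x</span><b Class="intro">y</b>'
);
print "@found\n";    # prints "intro note"
```

Unlike the s/// version, the line itself is left unchanged, which is usually what you want when you only need to read the values.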


davorg
Thaumaturge / Moderator

Oct 23, 2006, 4:24 AM

Post #3 of 5 (444 views)
Re: [garrafa] Case insensitive postmodifier "i" makes pl crash?!

It's a bad idea to describe something as "crashing" when it doesn't crash. You're probably going to confuse the people who are trying to help you.

As KevinR points out, the problem is probably that your match is going into an infinite loop.

But I'm puzzled as to why you'd use the substitution operator (s///). Wouldn't it make more sense to use the match operator (m//) instead?


Code
while (<HTM>) {
    if (my @styles = /class="([^"]+)"/ig) {
        $styles{@styles}=(1) x @styles;
    }
}


But really, if you're parsing HTML, you should do it with an HTML parser.

Update: Added /i option to match.

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


(This post was edited by davorg on Oct 23, 2006, 7:37 AM)
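As a rough sketch of the "use an HTML parser" advice above, the following uses HTML::TokeParser, a CPAN module from the HTML-Parser distribution (it is not part of core Perl, so it must be installed). The sub name classes_in_html is invented for this example. Like the regex versions in the thread, it stores the whole attribute value as one key, but it also copes with single-quoted or unquoted attribute values, and attribute names are lowercased by the parser by default.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module, from the HTML-Parser distribution

# Return the distinct class attribute values used in an HTML string.
sub classes_in_html {
    my ($html) = @_;
    my %seen;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    # get_tag() with no arguments returns the next start or end tag.
    while (my $tag = $parser->get_tag) {
        my $attr = $tag->[1];
        next unless ref($attr) eq 'HASH';    # end tags carry no attribute hash
        next unless defined $attr->{class};
        $seen{ $attr->{class} } = 1;
    }
    return sort keys %seen;
}

print join(', ', classes_in_html(q{<p CLASS="intro">a</p><div class='box'>b</div>})), "\n";
```

To read from a file instead, HTML::TokeParser->new also accepts a file name in place of the scalar reference.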


garrafa
Novice

Oct 23, 2006, 7:05 AM

Post #4 of 5 (439 views)
Re: [davorg] Case insensitive postmodifier "i" makes pl crash?!

Thanks for your answers. It's true, it's not really crashing, it's just doing nothing.
I appreciate the solutions you both (Davorg and KevinR) gave me, and I'll try them now. In the case of what Davorg wrote, I guess I'll have to add the "i" to make it case insensitive.
I'd never used "m//g" because what I did with "s///" in the while loop always worked for me.

Still, I don't think the script enters an endless loop, because every time it matches, it replaces what it has captured, so that text won't match on the next pass. No matter whether there is 1 match or 1000 matches per line, eventually it will have replaced them all and will match no more.
Besides, the process hangs after processing about 749100 files correctly, so the only way it could enter an endless loop is under very particular conditions in a specific file (which I can't really imagine). Before you ask, I'm not logging the processed files, so I don't know whether it hangs on the same file every time. (Maybe I should do that.)

Anyway, I'll try your suggestions, so thanks a lot, really.
Garrafa


(This post was edited by garrafa on Oct 23, 2006, 7:09 AM)
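The logging idea mentioned above could look something like the following sketch. The sub name process_with_log, the log file name and the START/DONE line format are all assumptions made for the example. Each file name is written to the log before the file is processed, and the log handle is autoflushed, so if the script hangs, the last START line without a matching DONE line names the file involved.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush method on file handles

# Hypothetical wrapper: log each file name before and after processing it.
sub process_with_log {
    my ($log_path, $worker, @files) = @_;
    open my $log, '>>', $log_path or die "Cannot open $log_path: $!";
    $log->autoflush(1);    # write every entry at once, even if we hang later
    for my $file (@files) {
        print {$log} "START $file\n";
        $worker->($file);
        print {$log} "DONE  $file\n";
    }
    close $log;
    return;
}
```

With the sub from the first post, a call might look like process_with_log('busca_styles.log', \&busca_styles, @htm_files), where @htm_files is whatever list of files the script already builds.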


garrafa
Novice

Oct 23, 2006, 10:17 AM

Post #5 of 5 (422 views)
Re: [garrafa] Case insensitive postmodifier "i" makes pl crash?!

It worked like a charm, as davorg suggested, but with this small modification:

Code
while (<HTM>) {
    if (my @styles = /class="([^"]+)"/ig) {
        @styles{@styles} = (1) x @styles;
    }
}

Thanks a lot
Garrafa
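For anyone wondering why the leading sigil matters in that modification, here is a small sketch with made-up data (the names @found, %slice and %single are only for illustration). With @ the expression is a hash slice that assigns to several keys at once; with $ it addresses a single element, so only one entry is created no matter how many values were matched.

```perl
use strict;
use warnings;

my @found = ('intro', 'note', 'intro');

# Hash slice: the leading @ assigns to several keys at once,
# one key per element of @found (duplicates collapse).
my %slice;
@slice{@found} = (1) x @found;
print join(',', sort keys %slice), "\n";    # prints "intro,note"

# Single element: the leading $ addresses exactly one entry,
# so only one key is created however many values @found holds.
my %single;
$single{@found} = 1;
print scalar(keys %single), "\n";           # prints "1"
```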

 
 

