match, but do not match

 



JasperD
Novice

Dec 16, 2010, 4:59 AM

Post #1 of 6 (3814 views)

Hi everyone,

Trying to substitute some matched strings does not seem to work out for me.

Consider the following text: ABC"DEF"GHI"JKL"

I remove the "DEF" and "JKL" as follows:

s/".*?"//g or s/"[^"]*"//g nicely gives: ABCGHI

However, my problem is more complicated: an @ in front of a " marks a quote that should not match:

eg: ABC@"DEF@"GHI"JKL"

The result I am looking for is: ABC@"DEF@"GHI (or better if possible ABC"DEF"GHI)

I have tried stuff like:

s/[^@]".*?"//g (ungreedy) or s/([^@]"|")[^"]*"//g or s/([^@]"&")[^"]*"//g

One of the main problems with the [^@] notation in combination with s///g is that the character matched by [^@] (anything other than @) is part of the match, so it is replaced with nothing as well.
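To make that side effect concrete, here is a minimal sketch (variable names are illustrative and not from the original post). It applies the first attempted pattern to the sample string, then tries a hypothetical workaround that captures the extra character and puts it back.

```perl
use strict;
use warnings;

my $text = 'ABC@"DEF@"GHI"JKL"';

# [^\@] is part of the match, so the "I" before the quote is deleted too.
(my $consumed = $text) =~ s/[^\@]".*?"//g;

# Hypothetical workaround: capture that character and put it back.
(my $kept = $text) =~ s/([^\@])"[^"]*"/$1/g;

print "$text -> $consumed\n";   # ABC@"DEF@"GH
print "$text -> $kept\n";       # ABC@"DEF@"GHI
```

The capture workaround still needs a character in front of the quote, so it cannot match a quoted string at the very start of the input.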

How do I tell the regex engine to match " but not @" (and, if possible, to remove the @)?

Thanks


shawnhcorey
Enthusiast


Dec 16, 2010, 5:59 AM

Post #2 of 6 (3809 views)
Re: [JasperD] match, but do not match

You need to use a negative lookbehind assertion. It also works when the double quote is the first character of a line, because a negative lookbehind succeeds when nothing precedes the match position.


Code
#!/usr/bin/perl 

use strict;
use warnings;

while ( <DATA> ) {
    chomp;
    print "$_ -> ";
    s{ (?<!\@) \" [^"]* \" }{}gmsx;
    print "$_\n";
}

__DATA__
ABC"DEF"GHI"JKL"
ABC@"DEF@"GHI"JKL"


See `perldoc perlre`.
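
How the lookbehind behaves when a double quote is the first character of a line can be checked directly. In this minimal sketch (the two input strings are made up for illustration), the first input starts with a plain quote and the second with an @-prefixed quote.

```perl
use strict;
use warnings;

# Hypothetical inputs, not taken from the thread.
my @lines = ('"ABC"DEF', '@"ABC@"DEF"GHI"');

my @results;
for my $line (@lines) {
    # Same pattern as above; (?<!\@) succeeds when nothing precedes the quote.
    (my $result = $line) =~ s{ (?<!\@) " [^"]* " }{}gx;
    push @results, $result;
    print "$line -> $result\n";
}
```

The leading "ABC" in the first input is removed, while both @-prefixed quotes in the second input are left alone and only "GHI" is removed.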

__END__

I love Perl; it's the only language where you can bless your thingy.

Perl documentation is available at perldoc.perl.org. The list of standard modules and pragmatics is available in perlmodlib.

Get Markup Help. Please note the markup tag of "code".


JasperD
Novice

Dec 17, 2010, 2:16 AM

Post #3 of 6 (3736 views)
Re: [shawnhcorey] match, but do not match

Thanks for the answer.

The look-behind pattern is probably going to help me, and the code you gave seems to work for me.
However, your post leads me to ask a couple more questions, since I would like to know what I am doing.
I am not familiar with this syntax:


Code
 s{ (?<!\@) \" [^"]* \" }{}gmsx;



Where can I find info about "gmsx"? A search on perldoc does not seem to give any results.
What is the {}?
Can the syntax you use be optimized (disable backtracking, disable buffering)?

Like the double quotes string match example from perlre:


Code
 /"([^"\\]+|\\.)*"/



Which according to perldoc can be most efficiently performed when written as:


Code
 /"(?:[^"\\]++|\\.)*+"/
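
For comparison, a minimal sketch that runs both forms of the perlre pattern against the same input (the sample string and variable names are made up for illustration). The possessive quantifiers `++` and `*+` need Perl 5.10 or later. They do not change what is matched here; they only stop the engine from giving characters back once they have been matched.

```perl
use strict;
use warnings;

# Hypothetical sample containing backslash-escaped quotes.
my $input = 'say "a \"quoted\" word" now';

# Plain form (made non-capturing inside so only the whole string is captured).
my ($plain)      = $input =~ /("(?:[^"\\]+|\\.)*")/;

# Possessive form: requires Perl 5.10+.
my ($possessive) = $input =~ /("(?:[^"\\]++|\\.)*+")/;

print "plain:      $plain\n";
print "possessive: $possessive\n";
```

Both captures hold the same quoted string; the difference only shows up in how much work a failing match costs.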



So, can my original question also be written as an optimized single regex pattern to be used in s///g?


shawnhcorey
Enthusiast


Dec 17, 2010, 6:32 AM

Post #4 of 6 (3722 views)
Re: [JasperD] match, but do not match


In Reply To
Where can I find info about "gmsx"? A search on perldoc does not seem to give any results.


See `perldoc perlre` and search for /Modifiers/. The /g modifier is described in `perldoc perlop` under "Regexp Quote-Like Operators".
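
As a rough illustration of what the four modifier letters do individually, here is a minimal sketch (the sample strings and variable names are made up for this example):

```perl
use strict;
use warnings;

# /g: replace every match instead of only the first one.
(my $all = 'a-b-c') =~ s/-/+/g;

# /m: ^ and $ match at the start and end of every line in the string.
my @starts = "one\ntwo\n" =~ /^(\w)/mg;

# /s: the dot also matches a newline.
my $dot_spans_newline = ("one\ntwo" =~ /one.two/s) ? 1 : 0;

# /x: whitespace in the pattern is ignored, so it can be spaced out.
my ($digits) = 'abc123' =~ / ( \d+ ) /x;

print "g: $all\n";
print "m: @starts\n";
print "s: $dot_spans_newline\n";
print "x: $digits\n";
```

The lookbehind pattern from the earlier post contains no ^, $ or ., so /m and /s do not change what it matches; /g and /x are the ones doing the work there.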



In Reply To
What is the {}?


They are delimiters for the pattern and the replacement in s///.
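
A minimal sketch of the same substitution written with three different delimiter styles (variable names are illustrative). With bracketing delimiters such as {}, the pattern and the replacement each get their own pair.

```perl
use strict;
use warnings;

my $input = 'ABC"DEF"GHI"JKL"';

# Three spellings of the same substitution.
(my $slashes = $input) =~ s/"[^"]*"//g;
(my $braces  = $input) =~ s{"[^"]*"}{}g;
(my $bangs   = $input) =~ s!"[^"]*"!!g;

print "$slashes $braces $bangs\n";   # ABCGHI ABCGHI ABCGHI
```

Braces are mostly a readability choice; they are handy when the pattern itself contains slashes or is spread over several lines with /x.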


In Reply To
The syntax you use, can this be optimized (disable backtracking, disable buffering)?



Shawn's second Rule of Programming: First you make it work, and only then you make it better.



JasperD
Novice

Dec 17, 2010, 7:28 AM

Post #5 of 6 (3717 views)
Re: [shawnhcorey] match, but do not match

Your replies are sincerely appreciated.


In Reply To
Shawn's second Rule of Programming: First you make it work, and only then you make it better.

I disagree: I am not here to make it work, but to learn. (Besides, it already works.)

My last question remains unanswered.
I would like to know how to write this as a single statement and how to optimize it.

Does anyone else have thoughts on this?


JasperD
Novice

Dec 19, 2010, 4:38 AM

Post #6 of 6 (3657 views)
Re: [JasperD] match, but do not match

After a bit more puzzling I have decided to use the following:


Code
#!/usr/bin/perl 
use strict;
use warnings;
while ( <DATA> ) {
    chomp;
    print "$_ -> ";
    s/(?:(?<!\@)"[^"]*+"|@(?="))//g;
    print "$_\n";
}
__DATA__
"ABC"DEF"GHI"JKL"MNO
ABC"DEF"GHI"JKL"
ABC@"DEF@"GHI"JKL"
A@BC@@"D@EF@"G@HI"JKL"


This also removes the @ in front of a ":

Code
"ABC"DEF"GHI"JKL"MNO -> DEFJKL"MNO
ABC"DEF"GHI"JKL" -> ABCGHI
ABC@"DEF@"GHI"JKL" -> ABC"DEF"GHI
A@BC@@"D@EF@"G@HI"JKL" -> A@BC@"D@EF"G@HI


Thanks for pointing me to look-around assertions (which I have now also used to remove the @ sign).
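
For readers who find the one-line pattern dense, here is a sketch of the same substitution spelled out with /x and comments. The `qr{}` wrapper and the variable names are additions for illustration; the pattern and the test inputs are the ones from the post above.

```perl
use strict;
use warnings;

# Same two alternatives as the one-liner, spaced out with /x.
my $strip = qr{
    (?<!\@) " [^"]*+ "   # quoted string whose opening quote has no at-sign before it
  |
    \@ (?=")             # or an at-sign that is directly followed by a quote
}x;

my @inputs = (
    '"ABC"DEF"GHI"JKL"MNO',
    'ABC"DEF"GHI"JKL"',
    'ABC@"DEF@"GHI"JKL"',
    'A@BC@@"D@EF@"G@HI"JKL"',
);

my @outputs;
for my $in (@inputs) {
    (my $out = $in) =~ s/$strip//g;
    push @outputs, $out;
    print "$in -> $out\n";
}
```

The lookbehind is evaluated against the original string, so a quote whose @ has just been removed by the second alternative is still treated as protected.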

 
 

