Match characters in middle and end string

 



Stefanik
User

Jan 10, 2013, 1:54 AM

Post #1 of 31 (14683 views)
Match characters in middle and end string

Hi,

I have a file like the following:


Code
 anystring1 
anystring2SUB:anystring3:anystring4;
anystring5SUB:
:anystring6:
anystring7;



I have to perform two kinds of matching:

1) Find all the lines that contain "SUB:" and have ";" as their last character, and print them.

2) Find all the lines that contain "SUB" but do not end with ";"; from each such line onward, remove the trailing "\n" of every line until I reach a line that ends with ";".

The second point is meant to "normalize" those lines so that they look like the ones in point 1.



Now, I have started writing the regexp for point 1:


Code
 if ($qpar =~ /^\w+SUB:\w+\;$/) {  

print $qpar;

}



Stefanik
User

Jan 10, 2013, 4:43 AM

Post #2 of 31 (14675 views)
Re: [Stefanik] Match characters in middle and end string

I modified the regexp, and now it works:

Quote
if ($qpar =~ (/^.*SUB:.*\;$/m)){print $qpar;}



What's the difference between "\w" and "."?

Do both of them represent any alphanumeric character?


(This post was edited by Stefanik on Jan 10, 2013, 5:05 AM)


Stefanik
User

Jan 10, 2013, 6:07 AM

Post #3 of 31 (14665 views)
Re: [Stefanik] Match characters in middle and end string

I am now trying to match the second point:

Code
($qpar =~ (/^.*SUB:.*[^;]$/m))


I want to find all the lines that contain "SUB" but don't end with ";".
But the code doesn't seem to honour "[^;]": it prints all the lines containing SUB, including the ones ending with ";".
Any suggestions?


(This post was edited by Stefanik on Jan 10, 2013, 6:08 AM)


BillKSmith
Veteran

Jan 10, 2013, 6:18 AM

Post #4 of 31 (14663 views)
Re: [Stefanik] Match characters in middle and end string

We usually think of /./ as meaning "match any character" (strictly, any character except a newline). /\w/ means match any word character (/[a-zA-Z_0-9]/).

In your example this does make a difference, because your lines contain ":" after "SUB:", and \w does not match ":". Note also that in your second case you use .* rather than \w+. The "+" requires at least one match. The "*" does not.
Good Luck,
Bill
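
Whether the two patterns behave the same on the sample data can be checked directly. The sketch below runs both regexps from posts #1 and #2 against one of the sample lines from post #1; the variable names are only illustrative.

```perl
use strict;
use warnings;

# Sample line taken from post #1.
my $line = 'anystring2SUB:anystring3:anystring4;';

# \w matches only letters, digits and underscore, so it cannot cross a ':'.
my $with_w   = $line =~ /^\w+SUB:\w+;$/ ? 1 : 0;

# . matches any character except a newline, so the ':' is not a problem.
my $with_dot = $line =~ /^.*SUB:.*;$/ ? 1 : 0;

print "\\w+ version matches: $with_w\n";
print ".*  version matches: $with_dot\n";
```

On this line the \w+ version fails because of the ":" between "anystring3" and "anystring4", which would also explain why switching to \w* did not help.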


Stefanik
User

Jan 10, 2013, 8:08 AM

Post #5 of 31 (14658 views)
Re: [BillKSmith] Match characters in middle and end string

I also tried "\w*", but again I didn't get any output.
Anyway, I solved it with ".*".

Can you help me with:


Code
($qpar =~ (/^.*SUB:.*[^;]$/m))


Thanks again


rovf
Veteran

Jan 10, 2013, 11:07 PM

Post #6 of 31 (14584 views)
Re: [Stefanik] Match characters in middle and end string

The pattern ^.* at the beginning of a regexp is redundant, so you basically match

SUB:.*[^;]$

Since you are using the m-modifier for your regexp, the $ changes its meaning from matching only at the end of the string to also matching at the end of each line. That is, your pattern matches if $qpar contains the text SUB:, and somewhere later a \n which is not immediately preceded by a semicolon. For instance, the following string would match:

"xxxxSUB:yyyy\n\nSUB:\nbbbbbb"

In this case, the matched substring would be

SUB:yyyy\n

because the dot stops at the first newline, [^;] then consumes that newline, and $ matches before the second one. If you had used .*? instead of .*, the matched substring would be

SUB:yyyy

Does this answer your question?
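
A quick way to see which substring is actually matched is to capture it and print it with the newlines made visible. This is only a sketch: `matched_part` is a hypothetical helper, and the string is the example from the post above.

```perl
use strict;
use warnings;

# The example string from the post above.
my $text = "xxxxSUB:yyyy\n\nSUB:\nbbbbbb";

# Return the substring matched by a compiled pattern, or undef if no match.
sub matched_part {
    my ($string, $re) = @_;
    return $string =~ /($re)/ ? $1 : undef;
}

my $greedy = matched_part($text, qr/SUB:.*[^;]$/m);
my $lazy   = matched_part($text, qr/SUB:.*?[^;]$/m);

# Show newlines as a visible \n so the end of each match is easy to see.
for my $m ($greedy, $lazy) {
    (my $shown = $m) =~ s/\n/\\n/g;
    print "matched: <$shown>\n";
}
```

The point to watch is that the dot never crosses a newline, while the negated class [^;] can.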


Stefanik
User

Jan 11, 2013, 6:15 AM

Post #7 of 31 (14574 views)
Re: [rovf] Match characters in middle and end string

Hi rovf, thanks, you're right about my question.
I tried to change the regexp the way you suggested:


Code
if ($qpar =~ (/SUB:.?[^;]$/m)){print $qpar;}


But no lines are printed.

The file I am checking contains the following lines:





Code
SET:TESTSUB:TRANSID,t1:NUM,428:PARAMETERS,other; 
GET:TESTSUB:TRANSID,t2:

SET:TESTSUB:TRANSID,t3:NUM,428:PARAMETERS,other;no
NUM,327
:PARAMETERS,other;

<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?><Request MO="OSUB" O
peration="get"> <num>456</num></Request>
<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?>
<Response>
<errorid>051</errorid>
</Response>



(This post was edited by Stefanik on Jan 11, 2013, 6:26 AM)


rovf
Veteran

Jan 11, 2013, 6:18 AM

Post #8 of 31 (14572 views)
Re: [Stefanik] Match characters in middle and end string

You wrote .?, while I suggested .*?


Stefanik
User

Jan 11, 2013, 6:23 AM

Post #9 of 31 (14568 views)
Re: [rovf] Match characters in middle and end string

Sorry..

Code
if ($qpar =~ (/SUB:.*?[^;]$/m)){print $qpar;}


In this way I match:

Code
SET:TESTSUB:TRANSID,t1:NUM,428:PARAMETERS,other;  
GET:TESTSUB:TRANSID,t2:
SET:TESTSUB:TRANSID,t3:NUM,428:PARAMETERS,other;no

while the first line shouldn't be printed out


rovf
Veteran

Jan 11, 2013, 6:32 AM

Post #10 of 31 (14563 views)
Re: [Stefanik] Match characters in middle and end string

Write the print statement like this:


Code
print "FOUND: <$qpar>\n";



Stefanik
User

Jan 11, 2013, 12:32 PM

Post #11 of 31 (14546 views)
Re: [rovf] Match characters in middle and end string

The output:


Code
FOUND: <SET:TESTSUB:TRANSID,t1:NUM,458:PARAMETERS,other; 
>
FOUND: <GET:TESTSUB:TRANSID,t2:
>
FOUND: <SET:TESTSUB:TRANSID,t3:NUM,458:PARAMETERS,other;NO
>



rovf
Veteran

Jan 12, 2013, 12:46 AM

Post #12 of 31 (14527 views)
Re: [Stefanik] Match characters in middle and end string

I see, my suggested change to the print statement was not a good one (I wanted to verify that there is no white space before the semicolon), so maybe you had better do:


Code
use Data::Dumper qw(Dumper);


and then


Code
print(Dumper($qpar),"\n");



Stefanik
User

Jan 12, 2013, 7:22 AM

Post #13 of 31 (14510 views)
Re: [rovf] Match characters in middle and end string

The new code:

Code
if ($qpar =~ (/SUB:.*?[^;]$/m)){print(Dumper($qpar),"\n");}


here the new output:

Code
$VAR1 = 'SET:TESTSUB:TRANSID,t1:NUM,458:PARAMETERS,other; 
';

$VAR1 = 'GET:TESTSUB:TRANSID,t2:
';

$VAR1 = 'SET:TESTSUB:TRANSID,t3:NUM,458:PARAMETERS,other;NO
';



(This post was edited by Stefanik on Jan 12, 2013, 7:27 AM)


rovf
Veteran

Jan 12, 2013, 8:41 AM

Post #14 of 31 (14500 views)
Re: [Stefanik] Match characters in middle and end string

In the input file, there *MUST* be a space after the semicolon in the first line, otherwise your regexp wouldn't have matched. Maybe you should hexdump your input?


Stefanik
User

Jan 13, 2013, 12:45 PM

Post #15 of 31 (14441 views)
Re: [rovf] Match characters in middle and end string

I've just run hexdump on the log file, but no space is present. :(


rovf
Veteran

Jan 14, 2013, 1:25 AM

Post #16 of 31 (14424 views)
Re: [Stefanik] Match characters in middle and end string

Oops, I must have been absent-minded. Forget my silly argument with the space. This is not relevant.

Of course the reason is that your regexp requires that no semicolon comes just before the newline. But your t1 line ends in a semicolon, and hence doesn't match.
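
One detail that may explain why the t1 line is still printed: when a line is read from a file it still carries its trailing newline, and [^;] is free to match that newline. The sketch below, with illustrative variable names, captures whatever [^;] consumed and repeats the test after chomp.

```perl
use strict;
use warnings;

# A line as read from a file: it ends with ";" followed by a newline.
my $line = "SET:TESTSUB:TRANSID,t1:NUM,428:PARAMETERS,other;\n";

# The pattern under discussion; $1 captures whatever [^;] ended up matching.
my $hit = $line =~ /SUB:.*?([^;])$/m;
my $last_char = $hit ? $1 : undef;

printf "matched: %s, [^;] consumed: %s\n",
    $hit ? 'yes' : 'no',
    defined $last_char && $last_char eq "\n" ? 'the newline' : 'something else';

# After chomp there is no newline left for [^;] to consume.
chomp(my $bare = $line);
my $hit_after_chomp = $bare =~ /SUB:.*?[^;]$/m ? 1 : 0;
print "matched after chomp: $hit_after_chomp\n";
```

If this is what happens with the real file, removing the newline with chomp before matching would be one way to make the original pattern behave as intended.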


Stefanik
User

Jan 14, 2013, 5:44 AM

Post #17 of 31 (14414 views)
Re: [rovf] Match characters in middle and end string

My intention is to select just the lines that contain "SUB:" but don't end with ";".

So, do you mean there is a problem with the regexp? Is it wrong?


rovf
Veteran

Jan 14, 2013, 6:03 AM

Post #18 of 31 (14412 views)
Re: [rovf] Match characters in middle and end string

If you want to select only lines containing SUB and not ending with a semicolon, you can equally well match against


Code
/(SUB.*[^;]$)/m


since the dot doesn't match a newline.


Stefanik
User

Jan 15, 2013, 5:27 AM

Post #19 of 31 (14379 views)
Re: [rovf] Match characters in middle and end string

Thanks a lot for your support, rovf :)


Stefanik
User

Jan 16, 2013, 2:02 PM

Post #20 of 31 (14297 views)
Re: [rovf] Match characters in middle and end string

I tried the solution but it doesn't work.
It seems the regex still matches the "\n".

It works if I write:

Code
 
/SUB.*[^;]\n/m



(This post was edited by Stefanik on Jan 16, 2013, 2:16 PM)


rovf
Veteran

Jan 16, 2013, 11:37 PM

Post #21 of 31 (14271 views)
Re: [Stefanik] Match characters in middle and end string

Of course it matches the newline. After all, you wrote the newline into the regexp.

Your regexp says: Line containing SUB and which has no semicolon in front of the newline.


Stefanik
User

Jan 17, 2013, 12:04 AM

Post #22 of 31 (14269 views)
Re: [rovf] Match characters in middle and end string

OK, but this way the regexp doesn't match the last line if it has a "\n".

So, what I need is to match "....SUB....[^;]", regardless of whether there is a \n at the end. I don't know if that's possible in just one regexp or whether I should check with two regexps (one with "\n" at the end, another without "\n").
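
One possible way to cover both cases in a single regexp is to exclude the newline from the negated class and allow an optional newline before the true end of the string. The sketch below is an assumption about what is wanted, not something proposed in the thread: `is_unterminated` is a hypothetical helper, and it assumes there is no trailing whitespace after the ";".

```perl
use strict;
use warnings;

# True when the line contains "SUB" and its last real character is not ";".
# [^;\n] keeps the class from consuming the newline, and \n?\z allows an
# optional final newline before the true end of the string.
sub is_unterminated {
    my ($line) = @_;
    return $line =~ /SUB.*[^;\n]\n?\z/ ? 1 : 0;
}

for my $line ("ISUBstring3\n", "ISUBstring3", "youSUBstring2;\n", "youSUBstring2;", "string4;\n") {
    (my $shown = $line) =~ s/\n/\\n/;
    printf "%-18s %s\n", $shown, is_unterminated($line) ? 'continues' : 'complete or no SUB';
}
```

Calling chomp on each line first and then matching /SUB.*[^;]$/ should give the same answer and may be easier to read.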


FishMonger
Veteran / Moderator

Jan 17, 2013, 8:29 AM

Post #23 of 31 (14262 views)
Re: [Stefanik] Match characters in middle and end string

/SUB.*[^;]\n?/m


(This post was edited by FishMonger on Jan 17, 2013, 8:29 AM)


Stefanik
User

Jan 17, 2013, 1:37 PM

Post #24 of 31 (14248 views)
Re: [FishMonger] Match characters in middle and end string

It doesn't work :(

Input:

Code
meSUBstring1; 
youSUBstring2;
ISUBstring3
string4;
noSUBstring5;


code:

Code
#!/usr/bin/perl 

use strict;
use warnings FATAL => qw(all);
use diagnostics;

my $qnso="C:/Users/me/Desktop/Perl_Test/temp/test.log";
my $qpar="x";

open (NSOFILE, "<", $qnso) or die "No file!";
while ($qpar = <NSOFILE>){
    if ($qpar =~ /SUB.*[^;]\n?/m){print $qpar;}
}
close (NSOFILE);


Output:

Code
meSUBstring1; 
youSUBstring2;
ISUBstring3
noSUBstring5;



(This post was edited by Stefanik on Jan 17, 2013, 1:38 PM)
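
To see why every line containing SUB is printed, it can help to look at what the pattern /SUB.*[^;]\n?/m actually matches. In the sketch below, `matched_text` is a hypothetical helper that captures the matched text for a line with and without its trailing newline.

```perl
use strict;
use warnings;

# Return the text matched by the pattern from the script above, or undef.
sub matched_text {
    my ($line) = @_;
    return $line =~ /(SUB.*[^;]\n?)/m ? $1 : undef;
}

# With the newline still attached, [^;] consumes the newline itself.
my $with_newline = matched_text("youSUBstring2;\n");

# Without a newline the engine backtracks and the match stops before the ";".
my $without_newline = matched_text("youSUBstring2;");

for my $m ($with_newline, $without_newline) {
    (my $shown = $m) =~ s/\n/\\n/g;
    print "matched: <$shown>\n";
}
```

Since the pattern has no end anchor, it succeeds as soon as any character other than ";" follows "SUB", so it cannot tell terminated lines from unterminated ones.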


Laurent_R
Veteran / Moderator

Jan 18, 2013, 3:44 PM

Post #25 of 31 (14177 views)
Re: [Stefanik] Match characters in middle and end string

Just a quick try, syntax could be cleaner, but it works and might show you the way:


Code
#!/usr/bin/perl  

use strict;
use warnings FATAL => qw(all);
use diagnostics;

my $qnso="C:/Users/me/Desktop/Perl_Test/temp/test.log";

# open (NSOFILE, "<", $qnso) or die "No file!";
while (my $qpar = <DATA>){
    chomp $qpar;
    # Keep appending lines until the record ends with ";" (plus optional whitespace).
    $qpar .= <DATA> and chomp $qpar while ($qpar !~ /;\s*$/);
    if ($qpar =~ /SUB.*[^;]\n?/m){print $qpar, "\n";}
}
# close (NSOFILE);

__DATA__
meSUBstring1;
youSUBstring2;
ISUBstring3
string4;
noSUBstring5;


This prints out this:


Code
$ perl  qpar.pl 
meSUBstring1;
youSUBstring2;
ISUBstring3 string4;
noSUBstring5;
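
The same joining idea can also be written as a small function, which makes it easier to test on its own. This is a sketch rather than a drop-in replacement: `join_records` is a hypothetical name, the input is passed as a list instead of being read from a filehandle, and a final record that never reaches its ";" is kept rather than dropped.

```perl
use strict;
use warnings;

# Join physical lines into logical records: keep appending lines until the
# accumulated text ends with ";" (optionally followed by whitespace).
sub join_records {
    my @lines = @_;
    my (@records, $current);
    for my $line (@lines) {
        chomp $line;
        $current = defined $current ? $current . $line : $line;
        if ($current =~ /;\s*$/) {
            push @records, $current;
            undef $current;
        }
    }
    # Keep a trailing record that never reached its ";" instead of dropping it.
    push @records, $current if defined $current;
    return @records;
}

my @input = ("meSUBstring1;\n", "youSUBstring2;\n", "ISUBstring3\n", "string4;\n", "noSUBstring5;\n");

# Once the records are joined, a plain /SUB/ test is enough to select them.
my @records = grep { /SUB/ } join_records(@input);
print "$_\n" for @records;
```

With the records normalized first, the selection no longer needs to care about newlines or about where the ";" sits.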

