how to get exact match

 



kukkelikuu
New User

Dec 9, 2012, 11:45 AM

Post #1 of 5 (9190 views)
how to get exact match

All my function names have the form "aa_yy_zz_ttt (" (at least one underscore), and I need to find them in a file. A line can also contain several functions.

Why doesn't this:

$str =~ /^[a-z]+\_{1,}+\w+\s{0,}\(/;
give the exact match?

For example, when the input value is
$str =" return ( *( int* ) MACRO_FUNCTION(";
the output is the same as the input, which is wrong: it should notice that "return" doesn't have an underscore, and it should return only "MACRO_FUNCTION(" and not " return ( *( int* ) MACRO_FUNCTION(".

$str =" main_sub_function ("; should be found, but it doesn't work for me.


Laurent_R
Veteran / Moderator

Dec 9, 2012, 3:05 PM

Post #2 of 5 (9174 views)
Re: [kukkelikuu] how to get exact match

You are not saying enough about how you use your regex, but, clearly, '/^[a-z]+\_{1,}+\w+\s{0,}\(/' cannot match the expression " return ( *( int* ) MACRO_FUNCTION(" for several reasons.

To start with, your regex starts with '/^[a-z]+', meaning that a match has to start with one or several lower case characters. But your expression starts with a space, not a [a-z] character, so no match.

Even if you remove the '^' at the beginning of the regex, so that a match can start anywhere in the string, it still will not match anything in your expression, because it looks for a run of lowercase letters, followed by one or several '_', followed by a run of word characters. That will not match "MACRO_FUNCTION" because MACRO is not in lowercase.

Concerning the " main_sub_function (" expression, it will again not match because of the leading space in the expression, whereas your regex says the string has to start with a [a-z] character, not a space. If you remove either the leading space in the expression or the leading '^' in the regex, it does match: '\w' also matches '_', so '\w+' covers "sub_function" and all three groups of letters separated by '_' characters are consumed.
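The cases discussed above are easy to check directly. The following is only a minimal sketch: it runs the regex from post #1, with and without the leading '^', against both example strings. The helper name matched_text is made up for this illustration, and the possessive quantifier '{1,}+' assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# The regex from post #1, with and without the leading '^' anchor.
my $anchored   = qr/^[a-z]+\_{1,}+\w+\s{0,}\(/;
my $unanchored = qr/[a-z]+\_{1,}+\w+\s{0,}\(/;

# Hypothetical helper: returns the matched text, or undef if there is no match.
sub matched_text {
    my ($re, $str) = @_;
    return $str =~ $re ? $& : undef;
}

my @inputs = (
    ' return ( *( int* ) MACRO_FUNCTION(',
    ' main_sub_function (',
);

for my $str (@inputs) {
    for my $case ([anchored => $anchored], [unanchored => $unanchored]) {
        my ($label, $re) = @$case;
        my $hit = matched_text($re, $str);
        printf "%-10s on '%s': %s\n", $label, $str,
            defined $hit ? "matched '$hit'" : 'no match';
    }
}
```

Only the unanchored regex on " main_sub_function (" reports a match; the leading space defeats '^' in both strings, and the uppercase MACRO defeats [a-z]+ in the first one.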


kukkelikuu
New User

Dec 10, 2012, 11:16 AM

Post #3 of 5 (9147 views)
Re: [Laurent_R] how to get exact match

I managed to get the if check working, but when I read the match, it is not the complete match, only part of it.

$test =" mainsub_funct_ion (";

if( $test =~ /([a-z]+\_{1}){1,}[a-z]+\s{0,1}\($/ )
{
print "FOUND \n";
print "$1 \n";
}

This prints "funct_", so "mainsub_" and "ion (" are missing, even though the match itself should be correct.


BillKSmith
Veteran

Dec 10, 2012, 1:16 PM

Post #4 of 5 (9141 views)
Re: [kukkelikuu] how to get exact match

This will match all of your examples. I am still not sure of your real requirement.

Code
use strict;
use warnings;
use Readonly;

Readonly::Scalar my $FUNCTION => qr /
    (           # Start function name
        (?:     # Start field
            \w+ # word characters
            _   # match underscore
        )+      # end field
        \w+     # Final field
    )           # end function name
    \s*         # Optional space
    \(          # Match req'd paren
/x;

while (my $test = <DATA>) {
    chomp $test;
    my (@functions) = $test =~ /$FUNCTION/g;
    print "@functions\n";
}
__DATA__
aa_yy_zz_ttt(arg) xxx_yyy()
return ( *( int* ) MACRO_FUNCTION(
mainsub_funct_ion (

Good Luck,
Bill
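For anyone without the CPAN Readonly module installed, the same idea can be sketched with core Perl only. This is an illustration rather than a replacement for the code above: the pattern is the same one written on a single line, and the helper name function_names is hypothetical. Note that '\w+' also accepts uppercase letters and digits, which is why MACRO_FUNCTION is reported.

```perl
use strict;
use warnings;

# Same pattern as above, written on one line: one or more "word_" fields,
# a final field, optional whitespace, then a required opening parenthesis.
my $FUNCTION = qr/((?:\w+_)+\w+)\s*\(/;

# Hypothetical helper: return every function name found in one line of text.
sub function_names {
    my ($line) = @_;
    my @names = $line =~ /$FUNCTION/g;   # list context with /g returns all group-1 captures
    return @names;
}

for my $line (
    'aa_yy_zz_ttt(arg) xxx_yyy()',
    ' return ( *( int* ) MACRO_FUNCTION(',
    ' mainsub_funct_ion (',
) {
    print join(', ', function_names($line)), "\n";
}
```

Because the match is evaluated in list context with /g, a line holding several functions yields all of them, and "return" is skipped because it contains no underscore.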


Laurent_R
Veteran / Moderator

Dec 10, 2012, 2:24 PM

Post #5 of 5 (9140 views)
Re: [kukkelikuu] how to get exact match


In Reply To
$test =" mainsub_funct_ion (";

if( $test =~ /([a-z]+\_{1}){1,}[a-z]+\s{0,1}\($/ )



Just change your regex to:


Code
print $1 if $test =~  /(([a-z]+_){1,}[a-z]+\s{0,1}\()$/;


and it will print :


Code
mainsub_funct_ion (


The changes I made:
- No need for the '\' before the _, as '_' is not a special character in regexes;
- No need for the {1} quantifier after the "_": it is useless here, because without a quantifier the regex looks for exactly one occurrence anyway;
- I have added a pair of parentheses to enclose the whole regular expression, so that $1 captures the whole thing rather than only the part matched by the first set of parens (which, being repeated, keeps only its last iteration, "funct_"). Only this 3rd change was really necessary; the other two are just cosmetic simplifications of what you had, with no effect on matched results.

BTW, speaking of simplifications, {1,} could be replaced by '+' and {0,1} by '?', so that the regex can be further simplified into:


Code
print $1 if $test =~  /(([a-z]+_)+[a-z]+\s?\()$/;

and still print what you want.
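To see the effect of the extra parentheses side by side, here is a small sketch that runs the regex from post #3 and the two regexes from this post against the same string and reports $1 for each. The helper name first_capture is made up for the illustration.

```perl
use strict;
use warnings;

my $test = ' mainsub_funct_ion (';

# Hypothetical helper: return $1 for a given regex, or undef if there is no match.
sub first_capture {
    my ($re, $str) = @_;
    return $str =~ $re ? $1 : undef;
}

# Regex from post #3: the only capture group is the repeated inner one.
my $inner_only  = qr/([a-z]+\_{1}){1,}[a-z]+\s{0,1}\($/;
# Regexes from this post: an outer group wraps the whole match.
my $outer_long  = qr/(([a-z]+_){1,}[a-z]+\s{0,1}\()$/;
my $outer_short = qr/(([a-z]+_)+[a-z]+\s?\()$/;

print "inner only : '", first_capture($inner_only,  $test), "'\n";
print "outer long : '", first_capture($outer_long,  $test), "'\n";
print "outer short: '", first_capture($outer_short, $test), "'\n";
```

A repeated capture group keeps only the text of its last iteration, so the first regex leaves "funct_" in $1, while both wrapped versions leave "mainsub_funct_ion (" there.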

 
 

