using variable to select for match length

 



jcolosi
stranger

Sep 27, 2001, 4:26 PM

Post #1 of 5 (6135 views)

Imagine the following:

$x = "5aaaaa";
$x =~ /(\d)a{\1}/; # This does NOT match

I would like $x to match only if there are as many 'a' characters as the number in the beginning of the string. I can print out the number in the string like so:

$x = "5aaaaa";
$x =~ /(\d)(?{print "$1\n";})/;

... but I can't seem to use the number to select the length of the following characters. Any help or pointers are greatly appreciated.

-- John







mhx
Enthusiast

Sep 29, 2001, 10:28 AM

Post #2 of 5 (6130 views)

I could have come up with a quick solution using two pattern matches, but I didn't like that. ;-)
So I played around with some regex features I hadn't used before and found a way to express what you want in a single regex:

Code
$str = '5aaaaa'; 
if( $str =~ /^(\d+)(??{"a{$1}"})$/ )
{ print "MATCH!!!\n" }
else
{ print "No Match...\n" }

The (??{...}) construct evaluates the enclosed code at match time and uses the result as a pattern to match.
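As a rough sketch of the same idea, the pattern can be wrapped in a small helper and tried against a few strings. The helper name count_matches_length is made up for illustration and is not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): true when the
# leading number equals the number of 'a' characters that follow it.
sub count_matches_length {
    my ($str) = @_;
    # (??{ ... }) runs its code at match time and matches the returned
    # string as a sub-pattern; by then $1 holds the digits captured so far.
    return $str =~ /^(\d+)(??{ "a{$1}" })$/ ? 1 : 0;
}

for my $s ('5aaaaa', '3aaaaa', '2aa') {
    printf "%-8s %s\n", $s, count_matches_length($s) ? 'MATCH' : 'No Match';
}
```

Only the first and third strings should report a match, since '3aaaaa' has five 'a' characters after the 3.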
Hope this helps.

-- Marcus


Code
s$$ab21b8d15c3d97bd6317286d$;$"=547269736;split'i',join$,,map{chr(($*+= 
($">>=1)&1?-hex:hex)+0140)}/./g;$"=chr$";s;.;\u$&;for@_[0,2];print"@_,"



jcolosi
stranger

Sep 30, 2001, 6:52 AM

Post #3 of 5 (6127 views)

Pretty cool Marcus. I gave this a try and it worked on my example... BUT (there's always a but) I found a couple of problems:

1) Perl 5.005_03 would not run the code. Perl 5.6.0 had no problem. Which version of Perl were you using?

2) I'm writing a report parser. The number of records in the report appears at the top of the report. I thought it would be great if I could check the real number of records against the record number at the top of the report ALL from within a single regexp. Ultimately, when I placed the more complicated record regexp in the (??{}) expression the match failed. The record regexp has things like \s* and grouped expressions like (a|b|c)...

I've been trying to sort of "factor out" the 'a' expression, as I think it would make your technique more powerful. I've tried things like:

(\d+)a(??{"{$1}"})
(\d+)a{(??{"$1"})}

If there was a way to evaluate just the part inside the brackets which actually denotes the multiplier then we'd have it... so close.

This internal check would be a real bonus to my little program, but I'm running out of things to try. Thanks again for your help. Let me know if you have any other ideas.

-- John



mhx
Enthusiast

Sep 30, 2001, 8:06 AM

Post #4 of 5 (6126 views)


In Reply To
1) Perl 5.005_03 would not run the code. Perl 5.6.0 had no problem. Which version of Perl were you using?

I'm using 5.6.0, 5.6.1 and 5.7.2. I've tested the code with 5.6.0. Perl 5.6.0 introduced some new (experimental) regex features, two of which are (?{...}) and (??{...}). So this won't work with any version older than 5.6.0.
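If the newer Perl is not available, a two-step match is one possible fallback. The sketch below assumes the same '5aaaaa' style of input; the helper name is invented here. It relies only on ordinary variable interpolation, so it does not depend on (??{...}).

```perl
use strict;
# The warnings pragma is left out on purpose: it first appeared in 5.6.

# Hypothetical helper (name invented for this sketch) that uses two
# plain matches instead of (??{...}).
sub count_matches_length_two_step {
    my ($str) = @_;
    # Step 1: split the leading number from the rest of the string.
    my ($n, $rest) = $str =~ /^(\d+)(.*)$/s or return 0;
    # Step 2: $n is interpolated before this pattern is compiled, so it
    # can serve as an ordinary repetition count.
    return $rest =~ /^a{$n}$/ ? 1 : 0;
}

print count_matches_length_two_step('5aaaaa') ? "MATCH\n" : "No Match\n";
print count_matches_length_two_step('3aaaaa') ? "MATCH\n" : "No Match\n";
```

The trade-off is the one mentioned earlier in the thread: it takes two pattern matches rather than one.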

In Reply To
2) I'm writing a report parser. The number of records in the report appears at the top of the report. I thought it would be great if I could check the real number of records against the record number at the top of the report ALL from within a single regexp. Ultimately, when I placed the more complicated record regexp in the (??{}) expression the match failed. The record regexp has things like \s* and grouped expressions like (a|b|c)...

Sorry, I don't really get it.
Perhaps you could post the source data and the regex (or even a larger code snippet) you're trying to use.

In Reply To
I've been trying to sort of "factor out" the 'a' expression, as I think it would make your technique more powerful. I've tried things like:

Code
(\d+)a(??{"{$1}"}) 
(\d+)a{(??{"$1"})}

If there was a way to evaluate just the part inside the brackets which actually denotes the multiplier then we'd have it... so close.

There's no way to dynamically insert just a repetition count. Only complete regexes can be inserted dynamically, as I did in my last post. I don't know how familiar you are with regular expressions, but you may find the regex engine's debug output useful while playing around with regexes. Just insert

Code
use re 'debug';

at the top of your script and you will see lots of additional information about the regexes you're using.
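A minimal sketch of limiting the debug output to one regex, assuming a Perl recent enough (5.10 or later, as far as the re documentation states) that the pragma is lexically scoped. On older Perls the pragma may affect the whole script instead.

```perl
use strict;
use warnings;

my $str = '5aaaaa';
my $matched;
{
    # Assumption: on Perl 5.10 and later this pragma is lexically scoped,
    # so only regexes compiled inside this block print their compilation
    # and execution details to STDERR.
    use re 'debug';
    $matched = $str =~ /^(\d+)a{5}$/ ? 1 : 0;
}
print $matched ? "MATCH\n" : "No Match\n";
```

The debug text goes to STDERR, so the normal output of the script on STDOUT stays readable.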

Hope this helps.

-- Marcus





jcolosi
stranger

Oct 1, 2001, 10:47 AM

Post #5 of 5 (6120 views)

Marcus,

The debugging technique was pretty helpful. I realized that the expression inside the '(??{})' must use double escapes instead of single ones. This is why my regexp with whitespace was failing. For instance:

$x = "3a xb xc x";
$x =~ /^(\d+)(??{"((a|b|c)\\s*x){$1}"})$/;

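This regexp works, but the '\\' is necessary: the pattern is built in a double-quoted string, where a single '\s' is reduced to a plain 's', so the expression would otherwise try to match the 's' character.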
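One way to sidestep the doubled backslash, sketched under the assumption of a reasonably recent Perl (5.18 or later, where the code block is parsed as ordinary Perl), is to build the pattern from single-quoted pieces. The helper name record_count_ok is invented here, and the record pattern is the simplified one from this post rather than a real report format.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): checks that the
# leading count equals the number of records, where a record is a, b or c,
# optional whitespace, then x.
sub record_count_ok {
    my ($report) = @_;
    # Single-quoted strings keep \s intact, so no doubled backslash is
    # needed; the count in $1 is joined in by concatenation.
    return $report =~ /^(\d+)(??{ '(?:(?:a|b|c)\s*x){' . $1 . '}' })$/ ? 1 : 0;
}

print record_count_ok('3a xb xc x') ? "MATCH\n" : "No Match\n";
print record_count_ok('5a xb xc x') ? "MATCH\n" : "No Match\n";
```

The second call should not match, because the string holds three records while the leading count says five.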

This works really well. Thanks for your help. Now I just have to see if they'll let me use 5.6.0 in production!?!

thanks,
-- John


 
 

