PCRE backtracking

 



hwnd
User

May 21, 2014, 12:32 PM

Post #1 of 11 (11454 views)
PCRE backtracking

I am curious: is there a way to find the length of backtracking in PCRE? I want to match strings that start with letters and are followed by numbers, but fail if the length of the numbers is greater than the length of the preceding letters.

For example these would return true on a match.

foobar12345
foob123
foo12

These would fail because there are more numbers than letters.

foo1234
fo123


FishMonger
Veteran / Moderator

May 21, 2014, 12:42 PM

Post #2 of 11 (11447 views)
Re: [recruiter] PCRE backtracking

Please post your code that demonstrates the problem.

Also, post an example string to be matched and what part of it you need to match.


(This post was edited by FishMonger on May 21, 2014, 12:45 PM)


hwnd
User

May 21, 2014, 12:58 PM

Post #3 of 11 (11438 views)
Re: [FishMonger] PCRE backtracking

FishMonger,

I don't have the regex I want yet. I could simply use


Code
([a-zA-Z]+)[0-9]+


But this is not what I am asking.

I am wondering whether PCRE can use backtracking to check the length. I want to match a string that starts with letters, followed by numbers, but only if the length of the letters is greater than the length of the numbers.

For example, this would pass.

foo12

Simply because the length of the numbers is 2 and the length of the letters is 3

But this would fail:

foo1234

Because the length of the numbers is greater than the length of the letters.


FishMonger
Veteran / Moderator

May 21, 2014, 2:04 PM

Post #4 of 11 (11410 views)
Re: [recruiter] PCRE backtracking

As far as I know the answer would be no, but you could read over the man page to see if I'm wrong.
http://www.pcre.org/pcre.txt

To me, it sounds like you have an XY problem.


BillKSmith
Veteran

May 22, 2014, 5:19 AM

Post #5 of 11 (11093 views)
Re: [recruiter] PCRE backtracking

I agree with FishMonger that it is probably not possible to do this with a regular expression alone. Note, however, that in native Perl very little additional code is required.


Code
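use strict;
use warnings;

while (my $case = <DATA>) {
    chomp $case;    # drop the newline so the output has no blank lines
    # Capture the letters and the digits, then compare the two lengths
    # numerically (">", not the string operator "gt").
    if ($case =~ /^([a-zA-Z]+)([0-9]+)$/ and length($1) > length($2)) {
        print "Pass: $case\n";
    }
    else {
        print "Fail: $case\n";
    }
}
__DATA__
foobar12345
foob123
foo12
foo1234
fo123


OUTPUT:

Code
Pass: foobar12345
Pass: foob123
Pass: foo12
Fail: foo1234
Fail: fo123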


UPDATE:
A review of the PCRE documentation (http://www.pcre.org/pcre.txt) shows that capturing parentheses are available. You should be able to write C code equivalent to my Perl code.
Good Luck,
Bill

(This post was edited by BillKSmith on May 22, 2014, 7:47 AM)


Laurent_R
Veteran / Moderator

May 22, 2014, 2:38 PM

Post #6 of 11 (10894 views)
Re: [recruiter] PCRE backtracking

PCRE is not used by Perl; it is a package that emulates Perl's built-in regular expressions. So this is not a Perl question, and it is therefore somewhat off-topic here.

Don't get me wrong: this is not meant to say that I don't want to help you with your question. But, by definition, Perl developers don't use PCRE. They have the original built-in version and don't need a copy. Bill has given you an answer using Perl's original built-in REs, and there is a reasonable chance that it will work under PCRE, but we Perl users don't really know PCRE. Try Bill's solution and, if it does not work, you would do better to ask your question on forums about languages using PCRE such as, I would think, Python, Ruby, PHP, possibly Javascript, Scala and some others.


hwnd
User

May 23, 2014, 9:48 PM

Post #7 of 11 (10248 views)
Re: [BillKSmith] PCRE backtracking

Bill, the PCRE man page was very interesting. It suggests something like this could be done:


Code
 (?| (?=[\x00-\x7f])(\C) | 
(?=[\x80-\x{7ff}])(\C)(\C) |
(?=[\x{800}-\x{ffff}])(\C)(\C)(\C) |
(?=[\x{10000}-\x{1fffff}])(\C)(\C)(\C)(\C))


The issue, I suppose, is that I'm not sure how to account for the length of the first group here.


(This post was edited by recruiter on May 23, 2014, 9:55 PM)


BillKSmith
Veteran

May 24, 2014, 1:30 PM

Post #8 of 11 (9940 views)
Re: [recruiter] PCRE backtracking

My advice remains the same. Capture the two substrings with parentheses, then compute their lengths and compare them in whatever language you use to call PCRE.
Good Luck,
Bill


Zhris
Enthusiast

May 24, 2014, 2:12 PM

Post #9 of 11 (9920 views)
Re: [recruiter] PCRE backtracking

It could also be done with an "irregular" extended expression (experimental feature). E.g.:


Code
m/^([a-zA-Z]+)(??{my $len = length($1); qr([0-9]{0,$len})})$/


I also noted that there is some information with regards to PCRE support in the Perl regex documentation: http://perldoc.perl.org/perlre.html#PCRE%2fPython-Support

Chris


(This post was edited by Zhris on May 25, 2014, 3:51 PM)
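
A note on the `(??{ })` pattern above: as posted, it allows up to as many digits as letters, so it also accepts equal counts (foo123) and no digits at all (foo). The sketch below is a stricter variant, assuming the goal is at least one digit and strictly fewer digits than letters. The helper name `letters_outnumber_digits` is made up for illustration, and the construct remains Perl-only.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the string is letters followed by at least
# one digit, with strictly fewer digits than letters. The digit quantifier is
# built at match time from the length of $1. With a single letter no digit
# count can qualify, so the block returns (?!), which always fails.
sub letters_outnumber_digits {
    my ($str) = @_;
    return $str =~ /^([a-zA-Z]+)(??{
        my $max = length($1) - 1;
        $max >= 1 ? "[0-9]{1,$max}" : '(?!)';
    })$/ ? 1 : 0;
}

for my $case (qw(foobar12345 foob123 foo12 foo1234 fo123 foo123)) {
    my $verdict = letters_outnumber_digits($case) ? 'Pass' : 'Fail';
    print "$verdict: $case\n";
}
```

With this variant the first three strings pass and the last three fail, including the equal-length case foo123.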


hwnd
User

May 24, 2014, 8:41 PM

Post #10 of 11 (9772 views)
Re: [Zhris] PCRE backtracking

I suppose that only Perl can run code like this inside regular expressions?


Zhris
Enthusiast

May 25, 2014, 3:55 PM

Post #11 of 11 (9375 views)
Re: [recruiter] PCRE backtracking

I'm familiar with only a few programming languages, and I have not come across this notation in the others. The regex above would not be portable across languages, since it uses Perl-specific syntax.

Chris


(This post was edited by Zhris on May 25, 2014, 3:56 PM)
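
On the portability question: one construct documented for both PCRE and Perl (5.10 and later) is the recursive subpattern call `(?1)`, which needs no embedded code. The following is a minimal sketch, assuming the target engine supports that call and that the goal is at least one digit and strictly more letters than digits. It is shown in Perl; the variable name is hypothetical, and behavior under a particular PCRE build should be checked rather than assumed.

```perl
use strict;
use warnings;

# Group 1 matches one letter, optionally calls itself, then matches one digit,
# so it consumes exactly k letters followed by k digits (k >= 1).
# The leading [a-zA-Z]+ demands at least one extra letter in front.
my $more_letters_than_digits = qr/^[a-zA-Z]+([a-zA-Z](?1)?[0-9])$/;

for my $case (qw(foobar12345 foob123 foo12 foo1234 fo123)) {
    my $verdict = $case =~ $more_letters_than_digits ? 'Pass' : 'Fail';
    print "$verdict: $case\n";
}
```

Because the recursion pairs every digit with a letter and the prefix needs one more letter, equal counts such as foo123 do not match.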

 
 

