Precompiled Regex

 



blackeagle225
Novice

Jul 27, 2011, 10:12 AM

Post #1 of 5 (3725 views)
Precompiled Regex

Hello,

I have a program that uses two regular expressions. The two expressions are identical (I'm using them to extract the values from two of the columns). I have tried precompiling the regex with qr to make it run faster, but either I'm doing it wrong or it's not actually any faster. I would like to speed up this program and would love any help. Thanks in advance.


Code
# Compile the pattern once; both matches below reuse it.
$RegexString = qr/\w+\t(\d+)\t(\d+)\t\w+/;
...
if ($LineSox =~ $RegexString) {    # On a match, save the two captured columns.
    $ChrStartSox = $1;
    $ChrEndSox   = $2;
...

if ($Exact =~ $RegexString) {      # Same pattern applied to the second line.
    $ChrStartExact = $1;
    $ChrEndExact   = $2;
...



BillKSmith
Veteran

Jul 27, 2011, 7:53 PM

Post #2 of 5 (3680 views)
Re: [blackeagle225] Precompiled Regex

I do not think you are doing anything wrong. Regular expressions that do not require interpolation are compiled at the same time as the rest of your Perl code, so the qr operator is unlikely to offer any speed advantage here.
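
One way to check this on your own machine is the core Benchmark module. The following is only a sketch: the sample line and the sub names match_literal and match_qr are made up for illustration, and the timings will vary by Perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample line; the real data is only known from the regex.
my $line = "chr1\t100\t200\tgeneA";

# The pattern from the original post, compiled once with qr//.
my $re = qr/\w+\t(\d+)\t(\d+)\t\w+/;

# Pattern written inline. It is constant, so Perl compiles it only once.
sub match_literal {
    my ($text) = @_;
    return $text =~ /\w+\t(\d+)\t(\d+)\t\w+/ ? ($1, $2) : ();
}

# Same pattern, used through the qr// object.
sub match_qr {
    my ($text) = @_;
    return $text =~ $re ? ($1, $2) : ();
}

# Run each variant 200,000 times and print a comparison table.
cmpthese(200_000, {
    literal => sub { my @fields = match_literal($line) },
    qr      => sub { my @fields = match_qr($line) },
});
```

If the two rows come out close to each other, the regex compilation is not where the time is going.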

The book "Mastering Regular Expressions" addresses the issue of faster regular expressions. In general, I find most of its advice to be too complicated to be practical. If the issue is really important, it may be worth the effort.

I would experiment with different approaches. Perhaps you can parse your data by splitting on tabs.
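
A minimal sketch of the split approach, assuming the two numbers are always the second and third tab-separated columns (the posted regex is not anchored, so that is a guess about the data). The sub name extract_start_end is made up for illustration.

```perl
use strict;
use warnings;

# Returns (start, end) taken from columns 2 and 3 of a tab-separated
# line, or an empty list if the line does not have that shape.
sub extract_start_end {
    my ($line) = @_;
    chomp $line;

    # A limit of 4 stops split from cutting up columns we never read.
    my (undef, $start, $end) = split /\t/, $line, 4;

    return () unless defined $end;    # fewer than three columns
    return () unless $start =~ /\A\d+\z/ && $end =~ /\A\d+\z/;
    return ($start, $end);
}

# Hypothetical sample line.
my ($chr_start, $chr_end) = extract_start_end("chr1\t100\t200\tgeneA\n");
print "start=$chr_start end=$chr_end\n";
```

The digit checks keep the behavior close to the regex version. Drop them if the columns are known to be clean.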
Good Luck,
Bill


blackeagle225
Novice

Jul 28, 2011, 4:51 AM

Post #3 of 5 (3512 views)
Re: [BillKSmith] Precompiled Regex

Yeah, I've already tried splitting on \t, and it wasn't much faster. It saved maybe 10 seconds out of ~100 hours. I've also tried removing the unnecessary parts, so the regex became


Code
\t(\d+)\t(\d+)\t


That didn't seem to help either. Any other ideas?


BillKSmith
Veteran

Jul 28, 2011, 6:21 AM

Post #4 of 5 (3489 views)
Re: [blackeagle225] Precompiled Regex

Sorry, I cannot offer any real help. If most of your strings do not contain useful data, it may be faster to test without capturing parens. Retest only the matching strings with the parens. Specify the possible number of digits with \d{min,max} rather than \d+.
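
Roughly what I mean, as a sketch only. The 1 to 10 digit bounds are an assumption, so adjust them to your data, and the sub name extract_if_match is made up. This can only pay off when most lines fail the first test.

```perl
use strict;
use warnings;

# Assumed bounds: each number has between 1 and 10 digits.
my $probe   = qr/\t\d{1,10}\t\d{1,10}\t/;        # no capture groups
my $capture = qr/\t(\d{1,10})\t(\d{1,10})\t/;    # same pattern, with captures

# Returns the two numbers, or an empty list if the line does not match.
sub extract_if_match {
    my ($line) = @_;

    # First pass without captures to reject non-matching lines.
    return () unless $line =~ $probe;

    # Second pass only on lines that matched, this time capturing.
    return $line =~ $capture ? ($1, $2) : ();
}

# Hypothetical sample line.
my @fields = extract_if_match("chr1\t100\t200\tgeneA");
print "@fields\n" if @fields;
```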
Good Luck,
Bill


blackeagle225
Novice

Jul 28, 2011, 6:29 AM

Post #5 of 5 (3486 views)
Re: [BillKSmith] Precompiled Regex

No, I appreciate your help. The thing is, all my lines look exactly like that (except maybe a few out of 10,000+ lines). I guess it just needs time to do the job. Thanks for the advice.

 
 

