
YAPHacker
Deleted
Jan 2, 2001, 10:52 PM
Post #1 of 1
(1032 views)
using precompiled regexes with the //s modifier
Question: how do you precompile a regex so that it still behaves as if the (//s) modifier were applied, as in s/$regex_ref/.../s?

According to Damian Conway, qr/^regex$/; returns a reference to a pre-compiled regular expression. That makes it excellent for foreach or while constructs, where it can be interpolated without the overhead of recompiling the pattern on each iteration through the loop.

In one of the modules I'm working on, I use regular expressions to parse template files for in-line perl code embedded in these documents. I use tags like so:

%{ $object->{'base'} }%

...in order to interpolate values for variables defined in my programs. Because database-driven programs often call the method in this module that is responsible for parsing through files, using a pre-compiled regular expression can speed them up dramatically. I have a problem when I do so, though. To better explain, I'll start by showing the code without pre-compiled regexes:

<perlcode>
$_[0] =~ s/%\{(.*?)\}%/$this->evalu(\$1)/gsex;
</perlcode>

In perlretut (the perldoc tutorial for regular expressions) you can read that when a regular expression is followed by the (//s) modifier, as in /regex$/s, it matches everything up to the end of the block in question (because of the '$' anchor) and ignores \n (newline) boundaries. Here's a snippet from the docs:

<perldoc snippet>
Perl allows us to choose between ignoring and paying attention to newlines by using the (//s) and (//m) modifiers. (//s) and (//m) stand for single line and multi-line and they determine whether a string is to be treated as one continuous string, or as a set of lines. The two modifiers affect two aspects of how the regexp is interpreted: 1) how the '.' character class is defined, and 2) where the anchors ^ and $ are able to match. Here are the four possible combinations:

no modifiers (//): Default behavior. '.' matches any character except "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.

s modifier (//s): Treat string as a single long line. '.' matches any character, even "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.

m modifier (//m): Treat string as a set of multiple lines. '.' matches any character except "\n". ^ and $ are able to match at the start or end of any line within the string.

both s and m modifiers (//sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n". ^ and $, however, are able to match at the start or end of any line within the string.
</perldoc snippet>

The problem I have when I pre-compile my regex is that I can't compile the (//s) modifier's functionality into it. When I try something like...

<perlcode>
my $regex_ref = qr/%\{(.*?)\}%/;
$_[0] =~ s/$regex_ref/$this->evalu(\$1)/gsex;
</perlcode>

...the behavior of the (//s) modifier is lost, and the regex only matches embedded perl constructs that fit on one line. Constructs that span several lines, such as the one below, are skipped over...

<embedded perlcode>
%{
($!)
? qq{<blockquote> <strong>Process number $processes->[$_] was completed, but warnings were generated:</strong> $! </blockquote> }
: qq{<blockquote> <strong>Process number $processes->[$_] was successfully completed!</strong> </blockquote>}
}%
</embedded perlcode>

...while ones like this get evaluated just fine...

<embedded perlcode>
Welcome back, %{ $user->{'name'} }%!
</embedded perlcode>

I've even tried making the regex reference like qr/regex/s; but that causes the perl interpreter to throw an exception stating that there is a syntax error near 'qr/regex/s'.

And just in case you are wondering what $this->evalu(); does, here's the code for that, even though I believe it is of no consequence to the current problem...

<perlcode>
sub evalu {
    my $this = shift;
    # The caller passes a reference to the captured tag contents.
    my $tmp = ${$_[0]};
    # Rewrite $name-> into $this->{objects}->{name}-> before evaluating.
    $tmp =~ s/\$(.*?)->/\$this->{objects}->\{$1\}->/gs;
    return(eval($tmp));
}
</perlcode>

All help is very much appreciated. I've looked through all the documentation I can find, but with no luck on this problem. Please excuse the length of this posting.

- tommy, yetanother.perlhacker@atrixnet.com
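
One thing that may be worth trying: a pattern built with qr// carries its own modifiers, so flags on the enclosing s/// (such as the s in /gsex) do not reach inside it. The perlre documentation describes an inline (?s) group that switches on single-line mode from within the pattern itself. The sketch below assumes a perl that supports qr// and (?s). The names $template, $plain and $dotall are made up for illustration, and a bare eval stands in for $this->evalu(\$1).

```perl
use strict;
use warnings;

# Hypothetical template text; the %{ ... }% tag spans two lines.
my $template = "Hello, %{ 1 +\n 2 }%!";

# Modifiers are fixed when qr// compiles the pattern, so the /s on the
# later s/// does not apply inside it: '.' still stops at "\n".
my $plain = qr/%\{(.*?)\}%/;

# The inline (?s) group turns on single-line mode inside the pattern.
my $dotall = qr/(?s)%\{(.*?)\}%/;

# Copy the template, then substitute on each copy.
(my $with_plain  = $template) =~ s/$plain/eval $1/gse;
(my $with_dotall = $template) =~ s/$dotall/eval $1/gse;

print "plain:  $with_plain\n";   # tag left untouched
print "dotall: $with_dotall\n";  # dotall: Hello, 3!
```

If this holds for the perl in use, only the qr// line in the module would need to change; the s/// call that interpolates $regex_ref could stay as it is.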