
jryan
User
Jul 20, 2002, 2:31 PM
Post #2 of 4
(9119 views)
Re: [dmittner] Substitution within tags, multiple times
First of all, using symrefs (symbolic references) like that is terrible. Symrefs are evil; they'll do nothing but cause you loads of problems down the road. You probably want to use a hash instead. Please see http://perl.plover.com/varvarname.html for more details.

Next, your code above is waaaaay too complex in some respects, and hardly working in others. Let's take a look:

    while ( $$text =~ /\[code\].*?\[*?\[\/code\]/i || $$text =~ /\[code\].*?\].*?\[\/code\]/i ){
        $$text =~ s!\[code\](.*?)\[(.*?)\](.*?)\[/code\]!\[code\]$1\&\#091;$2\&\#093;$3\[/code\]!i
    }

Notice that you are matching against the same text twice: once in the loop condition, and again in the substitution. Too much work, if you ask me. Let the /g modifier do most of the work for you.

You are also trying to do too much at once. The general gist of what you need to do is: find some code, then escape the brackets. That's two steps, and you are trying to do them in one. That's going to lead to some pretty confusing code. Let's break it up a bit: first we'll find the code, and then we'll do the substitutions on that code. But first, let's add a few regex definitions to make your code more readable:

    my $header = qr { \[ code \] }xi;
    my $footer = qr { \[ \/ code \] }xi;

Now let's work on structuring your regex. What we really want to work with is the data inside the code tags, so let's start with that. This "code" is described the following way:

    Captured code
      (preceded by a header)
      (followed by a footer)

or

    (header) <-- (captured code) --> (footer)

translated into a regex:

    $text =~ s/
        (?<= $header )   # preceded by a header
        ($code)          # captured code
        (?=  $footer )   # followed by a footer
    /process_code($1)/gex;

Next, we need to describe what the code really is. Code is:

    A string of:
      non-brackets
      or backslashed brackets
      or bracketed text that's not [ /code]

or, translated into a regex:

    my $code = qr {
        (?:
            # string of non-brackets
            (?> [^\[]* )
          |
            # backslashed brackets
            (?: (?<= \\) . )
          |
            # bracketed text that's not [ /code]
            (?: (?! $footer ) \[ )
        )*
    }x;

The only thing to do now is process the text. Pretty trivial - just a simple substitution of brackets for their escaped values.

    sub process_code {
        my ($subst) = @_;
        $subst =~ s! \[ !&#091;!gx;   # [ becomes its HTML entity
        $subst =~ s! \] !&#093;!gx;   # ] becomes its HTML entity
        return $subst;
    }

To sum that up, we end up with:

    # definitions
    my $header = qr { \[ code \] }xi;
    my $footer = qr { \[ \/ code \] }xi;

    my $code = qr {
        (?:
            # string of non-brackets
            (?> [^\[]* )
          |
            # backslashed brackets
            (?: (?<= \\) . )
          |
            # bracketed text that's not [ /code]
            (?: (?! $footer ) \[ )
        )*
    }x;

    # globally substitute our text
    $text =~ s/
        (?<= $header )
        ($code)
        (?=  $footer )
    /process_code($1)/gex;

    # "processing"; substitute escaped values for brackets
    sub process_code {
        my ($subst) = @_;
        $subst =~ s! \[ !&#091;!gx;
        $subst =~ s! \] !&#093;!gx;
        return $subst;
    }

Of course, this code doesn't allow for nested [code] tags. Your markup language seems very similar to an SGML-type language; you might be better off using something like HTML::Parser or one of its clones.
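
To try the approach out end to end, here is a minimal, self-contained sketch run against a made-up sample string. The wrapper name escape_code_brackets and the sample text are hypothetical, added only for illustration. The sketch assumes the escaped values are the HTML entities &#091; and &#093; used in the original loop, and it uses + rather than * for the run of non-brackets so that every pass through the group consumes at least one character; that is a small deviation from the regex in the post, not part of it.

```perl
use strict;
use warnings;

# Delimiters of a code region (case-insensitive), as in the post.
my $header = qr{ \[ code \] }xi;
my $footer = qr{ \[ / code \] }xi;

# Body of a code region: everything up to, but not including, the footer.
my $code = qr{
    (?:
        (?> [^\[]+ )              # run of non-bracket characters
      | (?: (?<= \\ ) . )         # a character preceded by a backslash
      | (?: (?! $footer ) \[ )    # a [ that does not start the footer
    )*
}x;

# Replace literal brackets with their HTML entities.
sub process_code {
    my ($subst) = @_;
    $subst =~ s/\[/&#091;/g;
    $subst =~ s/\]/&#093;/g;
    return $subst;
}

# Hypothetical wrapper: returns a copy of the text with brackets
# escaped inside every [code]...[/code] region.
sub escape_code_brackets {
    my ($text) = @_;
    $text =~ s/ (?<= $header ) ($code) (?= $footer ) /process_code($1)/gex;
    return $text;
}

my $sample = 'See [b]this[/b]: [code]$a[0] = $b[1];[/code] done.';
print escape_code_brackets($sample), "\n";
# prints: See [b]this[/b]: [code]$a&#091;0&#093; = $b&#091;1&#093;;[/code] done.
```

Only the text between the tags is rewritten; the [b] tags outside the code region and the [code] tags themselves are left alone, because the lookbehind and lookahead assert the header and footer without consuming them.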