Dec 4, 2001, 5:42 AM

Please look at this. Why does $l start being initialized only at the first match? (My reasoning when constructing this regex was: I need to find (and mark as *x*) everything between > and <: ([^<>]+?), unless it is nothing but newlines and blank spaces: ($l !~ /\w/) ? ">$1<" : ">*$l*<" )


use strict;
# my $str = qq{<html><b> <c><p>aaaaaaa</p>};  # if a blank space is present between <b> and <c>, aaaaaaa matches

my $str = qq{<html><b><c><p>aaaaaaa</p>};

my $l = '';
$str =~ s{>([^<>]+?)<(?{ $l = $1;})}{($l !~ /\w/) ?">$1<":">*$l*<"}gexsmi;

print "Content-type: text/html\n\n";
print $str."\n";


Dec 4, 2001, 6:34 AM

You haven't made it clear what you are trying to do. Can you clarify your desired result?



Dec 4, 2001, 6:41 AM

I want to mark all the text in an HTML document, i.e. to find everything in <tag>mark this</tag>.
And yes, sorry, I meant:
Why does $l start being initialized only AFTER the first match?


Aug 9, 2002, 7:18 AM

You're using the wrong tool for the job.
Use HTML::Parser or HTML::TokeParser
or something similar (XML::Parser or what have you).

$l starts being "initialized" only after the
first match because you're assigning $1 to it.
$1 only contains a value once a match has succeeded,
so until you match something, $1 is not defined
(it is uninitialized).
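
Since $1 is already set by the time the replacement side of s///e runs, one option is to drop the (?{ ... }) block and $l altogether and test $1 directly. A minimal sketch, assuming the goal is only to wrap text that contains a word character in asterisks; mark_text is a made-up name for illustration, and this is still a regex, so it can be fooled by real-world HTML:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every run of text between '>' and '<'
# in asterisks, but only if it contains at least one word character.
sub mark_text {
    my ($html) = @_;
    # The replacement expression runs after the match has succeeded,
    # so $1 holds the captured text and no extra variable is needed.
    $html =~ s{>([^<>]+)<}{ $1 =~ /\w/ ? ">*$1*<" : ">$1<" }ge;
    return $html;
}

print mark_text(qq{<html><b><c><p>aaaaaaa</p>}), "\n";
# prints: <html><b><c><p>*aaaaaaa*</p>
```

Whitespace-only text such as the blank between <b> and <c> is put back unchanged.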

Add

use re 'debug';

to the top of your script to see extra debugging output.

You've chosen the wrong approach, and that
hairy regex ain't gonna cut it (it can be fooled).
Please go to CPAN and get yourself an HTML parser;
they're very easy to use, easier than regular expressions.
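
For what it's worth, here is roughly what the parser route could look like with HTML::Parser (version 3 API, installed from CPAN). This is only a sketch under the assumption that "marking" means wrapping text in asterisks; mark_text_with_parser is a name invented for the example:

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: copy the document through unchanged, except that
# text containing a word character is wrapped in asterisks.
sub mark_text_with_parser {
    my ($html) = @_;
    my $out = '';
    my $p = HTML::Parser->new(
        api_version => 3,
        # Text between tags: mark it only if it has a word character.
        text_h    => [ sub { my ($t) = @_; $out .= $t =~ /\w/ ? "*$t*" : $t }, 'text' ],
        # Tags, comments and declarations are passed through as-is.
        default_h => [ sub { $out .= shift }, 'text' ],
    );
    $p->unbroken_text(1);   # deliver each text run as a single piece
    $p->parse($html);
    $p->eof;
    return $out;
}

print mark_text_with_parser(qq{<html><b><c><p>aaaaaaa</p>}), "\n";
```

The parser keeps track of what is a tag and what is text, so characters like > inside attribute values or comments do not confuse it the way they confuse the regex.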