regex involving multiple lines

 



artperl
Novice

Sep 26, 2015, 8:48 AM

Post #1 of 4 (1868 views)
regex involving multiple lines

This is not really a Perl question but a regexp one, and I hope someone can advise me.

Say I have the strings below in an HTML file:


<td width="160" height="21" bgcolor="#CCECFF"><b>LIST</b></td>
<td width="590" height="21" bgcolor="#F8F8F8">NUMBER
ABCD
EFG
HIJ</td>

Lines 2-5 of the sample were created that way, intentionally or by mistake.
I'm frying my brain over which regexp I could use to replace ALL the text, including newlines, between > and <.

So the expected output would be:

<td width="160" height="21" bgcolor="#CCECFF"><b>LIST</b></td>
<td width="590" height="21" bgcolor="#F8F8F8">NEW STRING</td>


BillKSmith
Veteran

Sep 27, 2015, 8:19 AM

Post #2 of 4 (1831 views)
Re: [artperl] regex involving multiple lines

Your first example violates your own specification because "<b>LIST</b>" is clearly a string

Quote
between > & <

and you did not replace it. We cannot help you until you tell us exactly what you want.
Good Luck,
Bill


artperl
Novice

Sep 28, 2015, 2:32 AM

Post #3 of 4 (1801 views)
Re: [BillKSmith] regex involving multiple lines

Hi Bill,

Apologies if I was not clear in my original post.
What I am trying to achieve is a working regexp that lets me replace any string or number between > and <, where that text may include newlines.

For example:
INPUT:
<td width="160" height="21" bgcolor="#CCECFF"><b>INFO</b></td>
<td width="160" height="21" bgcolor="#CCECFF"><b>STRING
AND
NUMBER
</b></td>

TARGET OUTPUT:
<td width="160" height="21" bgcolor="#CCECFF"><b>NEWINFO</b></td>
<td width="160" height="21" bgcolor="#CCECFF"><b>NEWSTRINGANDNUMBER</b></td>


BillKSmith
Veteran

Sep 28, 2015, 6:36 AM

Post #4 of 4 (1789 views)
Re: [artperl] regex involving multiple lines

This is probably not a job for regular expressions. The following code meets your specification. It does not do what you intend for any of your examples. I post it only to illustrate the problem with your specification. It is impossible to write a correct regex without a correct specification. You have to create that, and it is far more difficult than you realize.


Code
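use strict;
use warnings;

# Sample input from post #3, built one line at a time.
my $input_string =
q(<td width="160" height="21" bgcolor="#CCECFF"><b>INFO</b></td>) . "\n"
.q(<td width="160" height="21" bgcolor="#CCECFF"><b>STRING) . "\n"
.q(AND) . "\n"
.q(NUMBER) . "\n"
.q(</b></td>) . "\n"
;
my $output_string = $input_string;
# Literal reading of the specification: rewrite whatever lies between a '>' and the next '<'.
# /g repeats the match and /s lets '.' match newlines; /m has no effect here (no ^ or $).
# The '>' and '<' delimiters are consumed by the match and not put back.
$output_string =~ s/>(.+?)</NEW$1/gms;
print $output_string;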

OUTPUT:

Code
<td width="160" height="21" bgcolor="#CCECFF"NEW<b>INFO/bNEW</td>
td width="160" height="21" bgcolor="#CCECFF"NEW<b>STRING
AND
NUMBER
/b></td>
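
To show how much the result depends on the wording of the specification, here is a minimal sketch under one narrower reading of it: "replace each run of text that sits between a > and the next <, contains no angle brackets itself, and has at least one non-whitespace character". That reading is an assumption, not something stated in the thread, and the helper name replace_text_runs and the "NEW plus the text with whitespace removed" rule are made up for illustration. It is still not an HTML parser and will misfire on comments, scripts, or attribute values that contain > or <.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): find every text run that is
# preceded by '>' and followed by '<', contains no angle brackets, and has
# at least one non-whitespace character, then swap it for whatever the
# callback returns. Whitespace-only gaps between tags are left alone.
sub replace_text_runs {
    my ($html, $make_replacement) = @_;
    # [^<>] also matches "\n", so multi-line runs work without /s.
    # The lookbehind and lookahead keep the '>' and '<' in place.
    $html =~ s{(?<=>)([^<>]*[^<>\s][^<>]*)(?=<)}{ $make_replacement->($1) }ge;
    return $html;
}

# Same sample input as in post #3.
my $input = join "\n",
    q(<td width="160" height="21" bgcolor="#CCECFF"><b>INFO</b></td>),
    q(<td width="160" height="21" bgcolor="#CCECFF"><b>STRING),
    q(AND),
    q(NUMBER),
    q(</b></td>),
    q();

# Assumed replacement rule: "NEW" followed by the old text, whitespace removed.
my $output = replace_text_runs($input, sub {
    my ($text) = @_;
    $text =~ s/\s+//g;
    return "NEW$text";
});

print $output;
```

For this particular input the sketch prints the two lines given as TARGET OUTPUT in post #3. Passing sub { 'NEW STRING' } instead gives every matching run the same fixed text, which is closer to the example in post #1 but would also rewrite LIST.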


A much better approach is to search CPAN for a module to parse your markup. Perhaps someone else can help you choose a module.
Good Luck,
Bill
