Substitutions in an html file



kvn
New User

Jun 25, 2010, 11:36 AM



Hi All,

I have an HTML file which has lines like this:

.....

<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li>

.....

I need to replace the above lines with the following:

<li><a href="#tab3"><em><?=NLS('Daily')?> - <?=NLS('RPO Graph')?></em></a></li>
That is, I need to substitute
<?=$_['.TEXT.']?> with
<?=NLS(.TEXT.)?>

I have used the following code, but it is not working.

Thanks and Regards,
kvn


Zhris
Enthusiast

Jun 26, 2010, 12:55 AM


Re: [kvn] Substitutions in an html file

Hi,

I think you forgot to include your code. However here is an example (untested):


Code
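#!/usr/bin/perl
use strict;
use warnings;

my $htmlfile = 'htmlfile.html';
my $tempfile = 'tempfile.html';

# Single-quoted, so the backslashes reach the regex engine and $_ is not interpolated.
my $search  = q{<\?=\$_\['\.TEXT\.'\]\?>};
my $replace = '<?=NLS(.TEXT.)?>';

# Read the original file and write the edited copy to a temporary file.
open(my $in,  '<', $htmlfile) or die "Cannot open $htmlfile - $!";
open(my $out, '>', $tempfile) or die "Cannot open $tempfile - $!";

while (my $line = <$in>) {
    $line =~ s/$search/$replace/g;
    print {$out} $line;
}

close($in);
close($out) or die "Cannot close $tempfile - $!";

# Replace the original with the edited copy.
rename($tempfile, $htmlfile) or die "Cannot rename $tempfile - $!";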


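Note that characters which have a special meaning in a pattern must be escaped in the search text. Alternatively, you could keep the search text unescaped and use /\Q$search\E/. The $ only needs escaping where the text is written in a double-quoted string or directly in the pattern, because that is where $_ would otherwise be interpolated.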
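To illustrate the \Q idea, here is a minimal sketch. The sample line and the hard-coded 'Daily' text are assumptions for the demonstration only, not taken from the real file. It keeps the search text in a single-quoted string, so nothing is interpolated, and lets quotemeta() do the escaping, which is what \Q...\E does inside a pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes: nothing is interpolated, so $_ stays as literal text.
my $literal = q{<?=$_['Daily']?>};

# quotemeta() backslash-escapes every non-word character in the string,
# the same thing \Q...\E does inside a pattern.
my $pattern = quotemeta($literal);

# Hypothetical sample line, for illustration only.
my $line = q{<em><?=$_['Daily']?></em>};

(my $changed = $line) =~ s/$pattern/<?=NLS('Daily')?>/;

print "$changed\n";    # prints <em><?=NLS('Daily')?></em>
```

This only handles one fixed piece of text; matching any text between the quotes needs a capture group instead.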

Chris


(This post was edited by Zhris on Jun 26, 2010, 12:29 PM)


kvn
New User

Jun 26, 2010, 9:59 PM


Re: [Zhris] Substitutions in an html file

Hi Chris,
Yes, I forgot to include the code. Following your suggestion, I have used the following code.

#! /usr/bin/perl
$search = '<\?=\$\_\[(.*)\]\?>' ;
$replace = '<?=NLS($1)?>';
while ($line = <>)
{
    if ($line =~ m/($search)/ )
    {
        print $line;
        $line =~ s/$search/$replace/;
        print $line;
    }
}

Here I want to capture whatever .TEXT. is with the group (.*) and put the captured text into the replacement on each matching line, using the following substitution:
$line =~ s/$search/$replace/;

But the captured text is not being substituted. For example, log1 contains the following text:
# cat log1
<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li>

# perl test.pl log1
<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li>
<li><a href="#tab3"><em><?=NLS($1)?></em></a></li>

Also, if there are two matches on the same line, as above, the second one is not matched.

Thanks and Regards,
Vivek


Zhris
Enthusiast

Jun 27, 2010, 12:08 PM


Re: [kvn] Substitutions in an html file

Hey,

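The $search expression has an unnecessary escape (\_ does not need a backslash) and, more importantly, (.*) is greedy: it keeps matching past the "closing" single quote ' up to the last ]?> on the line, which is why both tags were swallowed by a single match. You also shouldn't put the replacement in a variable: the replacement side of s/// is interpolated only once, so $replace expands to its text and the $1 inside that text is inserted literally instead of being replaced by the captured group.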
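Both effects can be seen in a small sketch. The variable names here are made up for the demonstration, and the line is a shortened version of the one in log1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = q{<em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em>};

# Greedy (.*) runs to the LAST "]?>" on the line, so one match swallows both tags.
my ($greedy) = $line =~ /<\?=\$_\[(.*)\]\?>/;

# ([^']*) cannot cross a single quote, so each tag is matched separately.
my @each = $line =~ /<\?=\$_\['([^']*)'\]\?>/g;

# The replacement is interpolated only once: $replace expands to its text,
# and the "$1" inside that text is not expanded again.
my $replace = '<?=NLS($1)?>';
(my $literal = $line) =~ s/<\?=\$_\['([^']*)'\]\?>/$replace/g;

print "greedy capture:       $greedy\n";
print "separate captures:    ", join(' | ', @each), "\n";
print "variable replacement: $literal\n";
```

The last line printed still contains a literal $1 in both tags, which is the behaviour seen in the test.pl output above.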

I have tested our scripts, made some changes, and here is a working example:


Code
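#!/usr/bin/perl
use strict;
use warnings;

# \Q...\E quote the literal parts; \$ stops $_ from being interpolated.
my $search = "\Q<?=\$_['\E([^']*)\Q']?>\E";

while (my $line = <DATA>) {
    if ($line =~ m/$search/) {
        print $line;                          # line before the substitution
        $line =~ s/$search/<?=NLS($1)?>/g;
        print $line;                          # line after the substitution
    }
}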

__DATA__
<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li>


Output:


Code
<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li> 
<li><a href="#tab3"><em><?=NLS(Daily)?> - <?=NLS(RPO Graph)?></em></a></li>


Things to note:
- I have used \Q and \E around the literal parts of the pattern so that special characters there are not treated as regex metacharacters, and I have escaped the $ in $_ to stop $_ from being interpolated.
- I no longer put the replacement in a variable, because $1 must be written directly in the replacement for it to be filled in after each match. I have kept "search" in a variable, otherwise the single quotes create an issue.
- Instead of (.*) I have used ([^']*), which matches any run of characters except a single quote ', so the match cannot run past the closing quote.
- To ensure all occurrences on a line are replaced, I used the g modifier at the end of the substitution.
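One thing to be aware of: the output above drops the single quotes, giving NLS(Daily), whereas the first post asked for NLS('Daily'). If the quotes should be kept, a possible variant (a sketch only, using qr// instead of the \Q...\E string) is to move the quotes inside the capture group:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same idea as the pattern above, but the capture group now includes the quotes.
my $search = qr/<\?=\$_\[('[^']*')\]\?>/;

my $line = q{<li><a href="#tab3"><em><?=$_['Daily']?> - <?=$_['RPO Graph']?></em></a></li>};

# $1 is written directly in the replacement, so it is filled in for each match.
(my $changed = $line) =~ s/$search/<?=NLS($1)?>/g;

print "$changed\n";
# prints <li><a href="#tab3"><em><?=NLS('Daily')?> - <?=NLS('RPO Graph')?></em></a></li>
```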

Chris


(This post was edited by Zhris on Jun 27, 2010, 1:45 PM)


kvn
New User

Jun 29, 2010, 7:48 PM


Re: [Zhris] Substitutions in an html file

Hi Chris,
I have modified the code as you suggested, and it works as expected.
Many thanks for your help.

Thanks and Regards,
KVN