Jun 30, 2014, 7:00 AM
Post #1 of 6
Search criteria not matching when iterating through multiple files
I am searching for the keywords <body> and body> inside an HTML file and extracting the content between them. I have uploaded my sample HTML file. It is not a real HTML file; I just created an example with body tags.
I am searching all the files inside the directory.
The search for the string <data> succeeds for the first file in the directory. When the next file is picked up, the search string <data> is not found. I don't know what is wrong with the program below.
The files inside the directory are exactly the same but have different file names (created for test purposes).
HTML File content:
Below is my program.
#Reading a html file
my $dir = 'c:\sunil';
#open (OUTFILE, '>>c:\test\x.xml');
opendir(DIR, $dir) or die $!;
while (my $file = readdir(DIR))
{
    # Use a regular expression to ignore files beginning with a period
    next if ($file =~ m/^\./);
    $output_filename = "$dir\\output.xml\n";
    open OUTFILE, ">>$output_filename" or die $!;
    print OUTFILE "Title: $input_filename\n";
    if("$_" =~ "/<body>/")
    {
        # Moment <body tag is found, extract all the values between
        $start_reading = "read";
    }
    if ("$_\n" =~ /body>/)
    {
        # body> tag reached. Stop reading the file and exit
    }
    if ( $start_reading eq "read" )
    {
        # writing out to a file
        print OUTFILE "$_\n";
    }
}
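
For comparison, here is a minimal sketch of the same extraction with the read state reset for every file. It is not a diagnosis of the program above. The helper names extract_body and process_dir are hypothetical, and the sketch assumes that the opening and closing tags each sit on their own line, that the closing tag is written </body>, and that the output file lives outside the directory being scanned so it is never picked up as input.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): returns the lines
# found between a line containing <body> and a line containing </body>.
sub extract_body {
    my @lines   = @_;
    my $reading = 0;    # local to each call, so it starts fresh for every file
    my @body;
    for my $line (@lines) {
        if ($line =~ m{</body>}) {    # closing tag
            last if $reading;         # stop once the body has been read
            next;
        }
        if ($line =~ /<body>/) {      # opening tag: start with the next line
            $reading = 1;
            next;
        }
        push @body, $line if $reading;
    }
    return @body;
}

# Hypothetical driver: appends the body of every regular file in $dir to
# $out_path. Both paths are supplied by the caller (assumed values).
sub process_dir {
    my ($dir, $out_path) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    open(my $out, '>>', $out_path) or die "Cannot open $out_path: $!";
    while (my $file = readdir($dh)) {
        next if $file =~ /^\./;       # skip entries beginning with a period
        my $path = "$dir/$file";
        next unless -f $path;         # skip anything that is not a plain file
        open(my $in, '<', $path) or die "Cannot open $path: $!";
        chomp(my @lines = <$in>);
        close($in);                   # close each input before the next file
        print {$out} "Title: $file\n";
        print {$out} "$_\n" for extract_body(@lines);
    }
    close($out);
    closedir($dh);
}
```

A call such as process_dir('c:/sunil', 'c:/test/output.xml') would then process each file independently. Using a lexical filehandle and a lexical flag per file keeps state from one file from carrying over into the next.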