NULL bit in a file written with Perl

 



Benjy
New User

Jun 22, 2010, 6:26 AM

Post #1 of 3 (588 views)
NULL bit in a file written with Perl

Hi everyone,

I'm writing a script that converts an XLS file to an HTML one, but I've run into a strange problem.

The generated HTML file opens in Firefox without any problem, and I can view its source code there, but when I try to open it with Geany, Geany says the file can't be opened entirely because of the possible presence of a "NULL bit".

With Gedit, I get the usual error for binary files: "Could not open the file name.html using the Unicode (UTF-8) character coding. Please check that you are not trying to open a binary file. Select a different character coding from the menu and try again."
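One way to confirm what the editors are complaining about is to scan the generated file for NUL (0x00) bytes and see where they sit. The following is only a minimal sketch: the sub name `find_nul_offsets` is made up for illustration, and the file to check is taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the byte offsets of every NUL (0x00) byte in a file.
sub find_nul_offsets {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    local $/;                            # slurp mode: read the whole file at once
    my $data = <$fh>;
    close($fh);
    $data = '' unless defined $data;     # guard against an empty read

    my @offsets;
    while ($data =~ /\0/g) {
        push @offsets, pos($data) - 1;   # pos() points just past the match
    }
    return @offsets;
}

# Only print a report when a file name was given on the command line.
if (@ARGV) {
    my @offsets = find_nul_offsets($ARGV[0]);
    printf "%d NUL byte(s) found\n", scalar @offsets;
    if (@offsets) {
        my $last = $#offsets < 9 ? $#offsets : 9;
        print "first offsets: ", join(', ', @offsets[0 .. $last]), "\n";
    }
}
```

If the offsets cluster inside the cell text rather than in the HTML tags, the NULs most likely arrive with the cell values and not from the print statements.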

How can I end up with a binary file when all I wrote was HTML? I thought it might come from the encoding of the Excel file (I assumed it was ANSI, cp1252), so I used Encode to decode it from cp1252 and then encode it to iso-8859-1, but that didn't change anything.
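A small experiment suggests why the cp1252 to iso-8859-1 conversion could not help: both encodings map byte 0x00 to the NUL character and back, so a NUL survives the round trip untouched. This is a sketch with a made-up sample string, not the actual cell data.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up sample bytes: 0xE9 is "é" in cp1252, and \0 stands in
# for a stray NUL byte coming from the spreadsheet.
my $raw = "caf\xE9\0";

my $text = decode('cp1252', $raw);        # bytes -> Perl characters
my $iso  = encode('iso-8859-1', $text);   # characters -> ISO-8859-1 bytes

# tr/\0// with an empty replacement list only counts matches.
printf "NUL bytes before: %d, after: %d\n",
    ($raw =~ tr/\0//), ($iso =~ tr/\0//);
```

The count is the same before and after, so the conversion changes how accented characters are stored but leaves any NUL bytes in place.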

Here is the main part of my code. Since it is pretty big (well, I bet you've seen worse), I have marked with ####### the places where I print to the output file.

Code
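#!/usr/bin/perl

use strict;
use warnings;
use Spreadsheet::ParseExcel;
use Encode;

# Input .xls file and output .html file, taken from the command line
my $fileIn = shift;
my $fileOut = shift;

my $parser = Spreadsheet::ParseExcel -> new();
my $workbook = $parser -> parse($fileIn);
# parse() returns undef on failure; error() gives the reason
die $parser -> error(), ".\n" unless defined $workbook;
my $worksheet = $workbook -> worksheet(0);
# row_range() and col_range() return (min, max) pairs
my ($ligmin, $ligmax) = $worksheet -> row_range();
my ($colmin, $colmax) = $worksheet -> col_range();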

# #############################################
# #############################################
# File where to write the result
open(FO, '>', $fileOut) || die "Cannot open file $fileOut: $!\n";
# #############################################
# #############################################

# #############################################
# #############################################
# Beginning of the HTML file
# The input file name is used as the page title (an assumption:
# $filename is not declared anywhere in the posted code)
print FO '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'."\n".
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" >'."\n".
    ' <head>'."\n".
    ' <title>'.$fileIn.'</title>'."\n".
    ' <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />'."\n".
    ' </head>'."\n".
    ' <body>'."\n".
    ' <table>'."\n";
# #############################################
# #############################################

# Iteration on each line and each column of the worksheet, writing down each cell from left to right and from top to bottom
my $bloc = '';
for(my $i = 1 ; $i <= $ligmax ; $i++)
{
    # #############################################
    # #############################################
    print FO " <tr>\n";
    # #############################################
    # #############################################
    for(my $j = 0 ; $j <= $colmax ; $j++)
    {
        my $cell = $worksheet -> get_cell($i,$j);
        if(defined($cell))
        {
            # #############################################
            # #############################################
            # HERE I GET THE CELL VALUE
            my $contenu = $cell -> value();
            # Conversion described above: cp1252 bytes to iso-8859-1 bytes
            # (assumed; the line defining $contenuIso is not in the posted code)
            my $contenuIso = encode("iso-8859-1", decode("cp1252", $contenu));
            # #############################################
            # #############################################

            my $format = $cell -> get_format();
            my $gras = $format->{Font}->{Bold} ? "bold" : "normal";
            my $italique = $format->{Font}->{Italic} ? "italic" : "normal";
            # The font colour is stored as a palette index, so convert it to RGB
            my %font = ("nom" => $format->{Font}->{Name},
                "gras" => $gras,
                "italique" => $italique,
                "taille" => $format->{Font}->{Height},
                "couleur" => '#'.$parser -> ColorIdxToRGB($format->{Font}->{Color})
            );
            my $alignH = $format -> {AlignH};

            # Map the Excel horizontal alignment code to a CSS text-align value
            if($alignH == 3)
            {
                $alignH = "right";
            }
            elsif($alignH == 2 || $alignH == 6)
            {
                $alignH = "center";
            }
            elsif($alignH == 5 || $alignH == 4)
            {
                $alignH = "justify";
            }
            else
            {
                $alignH = "left";
            }
            my @fill = $format -> {Fill};
            my $couleur = $parser -> ColorIdxToRGB($fill[0][1]);

            # #############################################
            # #############################################
            print FO ' <td style="font-family:'.$font{"nom"}.';'.
                'font-size:'.$font{"taille"}.'px;'.
                'color:'.$font{"couleur"}.';'.
                'font-weight:'.$font{"gras"}.';'.
                'font-style:'.$font{"italique"}.';'.
                'text-align:'.$alignH.';'.
                'background:#'.$couleur.';'.
                '">'."\n";
            # #############################################
            # #############################################
            # HERE I PRINT THE CELL VALUE IN THE OUTPUT FILE
            print FO ' '.$contenuIso."\n";
            # #############################################
            # #############################################
            print FO " </td>\n";
            # #############################################
            # #############################################

        }
    }
    # #############################################
    # #############################################
    print FO " </tr>\n";
    # #############################################
    # #############################################

}

# #############################################
# #############################################
# Close the table opened in the header before closing the body
print FO " </table>\n </body>\n</html>\n";

close(FO) || die "An error occurred while closing $fileOut: $!\n";



Benjy
New User

Jun 22, 2010, 8:56 AM

Post #2 of 3 (580 views)
Re: [Benjy] NULL bit in a file written with Perl

Well, it obviously came from my Excel file, which was created with Gnumeric and not Excel... Problem solved.


kencl
User

Jul 3, 2010, 10:35 AM

Post #3 of 3 (509 views)
Re: [Benjy] NULL bit in a file written with Perl

Never trust input. I would just strip nulls from the content before writing:

Code
# HERE I GET THE CELL VALUE  
my $contenu = $cell -> value();
$contenu =~ s/\0//g;
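
The same idea can be wrapped in a small sub so that it is easy to test on its own. This is just a sketch: `strip_nuls` is a hypothetical name, and it uses `tr///d`, which deletes the same characters as the substitution above.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of a cell value without NUL characters.
# An undefined value (for example from an empty cell) becomes an empty string.
sub strip_nuls {
    my ($value) = @_;
    return '' unless defined $value;
    (my $clean = $value) =~ tr/\0//d;   # delete every NUL character from the copy
    return $clean;
}

print strip_nuls("T\0o\0t\0a\0l\0"), "\n";   # prints "Total"
```

Working on a copy leaves the original value intact, which may be handy if it is still needed for debugging. Note that stripping is lossy if the NULs are part of a wider encoding of the text, so it treats the symptom and not the input file.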


>> If you can't control it, improve it, correlate it or disseminate it with PERL, it doesn't exist!

 
 

