parsing html, yet another utf8 problem :(



orange
User

Jan 17, 2013, 4:56 AM



I've got another problem with UTF-8. This part:


Code
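use HTML::Parser;

my $p = HTML::Parser->new(
    text_h => [\&text_rtn, 'dtext'],
);
$p->utf8_mode(1);
$p->parse_file($file);

# Text handler: receives the decoded text ('dtext') of each text segment.
sub text_rtn {
    foreach (@_) {
        progress("\tParsed: >$_<\n");    # progress() is defined elsewhere in the script
    }
}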


doesn't work. The output comes out garbled instead of the expected characters.

:(


(This post was edited by orange on Jan 17, 2013, 4:57 AM)


orange
User

Jan 17, 2013, 5:14 AM


Re: [orange] parsing html, yet another utf8 problem :(

Er, I found the solution: the file needs to be opened as UTF-8 first, and the filehandle passed to parse_file:


Code
open(my $fh, "<:utf8", "foo.html") or die "Cannot open foo.html: $!";
$p->parse_file($fh);
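
For anyone landing here later, here is a minimal self-contained sketch of that fix. It is only an illustration under a few assumptions: the file name sample_utf8.html, its contents, and the anonymous handler that collects text into @chunks are all made up for the demo. It also leaves utf8_mode off, because as far as I understand the HTML::Parser docs that mode is meant for raw, undecoded UTF-8 bytes, while here the I/O layer has already decoded the input. The stricter :encoding(UTF-8) layer is used in place of :utf8, which should behave the same for valid input.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample file, written here only so the sketch runs on its own.
my $file = 'sample_utf8.html';
open(my $out, '>:encoding(UTF-8)', $file) or die "Cannot write $file: $!";
print {$out} "<html><body><p>caf\x{e9} &eacute; \x{263a}</p></body></html>\n";
close($out);

my @chunks;    # decoded text segments collected by the handler

my $p = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { push @chunks, shift }, 'dtext' ],
);
$p->unbroken_text(1);    # report each text segment in one piece

# Decode while reading, so the parser sees characters rather than raw bytes.
open(my $fh, '<:encoding(UTF-8)', $file) or die "Cannot open $file: $!";
$p->parse_file($fh);
close($fh);

# Encode again on the way out, otherwise Perl warns about wide characters.
binmode(STDOUT, ':encoding(UTF-8)');
print "\tParsed: >$_<\n" for @chunks;

unlink $file;
```

The design choice is to decode once at the input boundary and encode once at the output boundary, so everything in between (including entities such as &eacute; expanded by 'dtext') is handled as Perl character strings.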



(This post was edited by orange on Jan 17, 2013, 5:15 AM)