Jul 5, 2002, 3:59 AM
Post #1 of 8
(cgi) UNICODE postings
On my forum, I use a module that escapes special characters into HTML escape sequences. For example, < becomes &lt;. Besides the usual markup characters, I also convert all other non-ASCII characters, like é Ÿ and £.
This approach clashes with Unicode languages, like Russian and Arabic. Their characters take two bytes each, and as far as I can tell Perl's s/// only looks at one byte at a time.
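To illustrate the problem, here is a minimal sketch. It is not the attached module: the sub name escape_html is made up for this post, and I'm assuming the posted text arrives as raw UTF-8 bytes that were never decoded.

```perl
use strict;
use warnings;

# Hypothetical helper (not the attached file): escape &, < and >,
# then turn every non-ASCII character into a numeric entity.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or the entities below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# Latin-1 input: one byte per character, so the entity is correct.
my $latin1 = "caf\xE9";                  # "café" in ISO-8859-1
print escape_html($latin1), "\n";        # caf&#233;

# Raw UTF-8 input: the Cyrillic letter U+0416 arrives as bytes D0 96,
# and each byte is escaped separately instead of giving &#1046;.
my $utf8_bytes = "\xD0\x96";
print escape_html($utf8_bytes), "\n";    # &#208;&#150;
```

The second result is two unrelated Latin-1 entities rather than one Cyrillic letter, which is the clash I mean.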
What can I do about this? Should I limit the replacement to &, < and > (not my favourite idea)? Or is there a way to detect that someone posted in Unicode, so that my s/// can handle it?
As an attachment, I've included the file that converts the characters. (Windows users: open the file in WordPad, which handles the Unix LF line endings.)
Yet Another Perl Programmer
~~> [url=http://www.codingdomain.com]www.codingdomain.com[/url] <~~
More than 3500 X-Forum [url=http://www.codingdomain.com/cgi-perl/downloads/x-forum]Downloads[/url]!