
yapp
User
Jul 5, 2002, 3:59 AM
Post #1 of 8
(1182 views)
|
(cgi) UNICODE postings
|
Hello,

At my forum, I use a module that escapes characters into HTML escape sequences. For example, < becomes &lt;. Besides the usual characters, I also convert all other non-ASCII characters, like é, Ÿ and £.

This approach clashes with Unicode character sets, such as Russian and Arabic. Those languages use two bytes per character, and Perl's s/// only checks one byte at a time.

What can I do about this? Limit the replacement to &, < and > (not my favourite idea)? Or is there a way to detect that someone posted in Unicode, so that my s/// can handle it?

As an attachment, I've included the file that converts the characters. (Windows users: use WordPad to read the file, so that the Unix 'lf' line endings are handled.)

Yet Another Perl Programmer
_________________________________
~~> [url=http://www.codingdomain.com]www.codingdomain.com[/url] <~~
More than 3500 X-Forum [url=http://www.codingdomain.com/cgi-perl/downloads/x-forum]Downloads![/url]
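For reference, one possible way to make s/// work on whole characters instead of single bytes is to decode the submitted bytes first. This is only a minimal sketch under stated assumptions: it assumes Perl 5.8 or later with the core Encode module, and it assumes the form data arrives as UTF-8. The helper name escape_html is hypothetical and is not taken from the attached file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from the attached file): takes the raw bytes
# submitted by the browser, assumed here to be UTF-8 encoded.
sub escape_html {
    my ($bytes) = @_;

    # Decode the bytes into a character string, so that the s/// calls
    # below match whole characters rather than single bytes.
    my $text = decode('UTF-8', $bytes);

    # Escape the HTML-significant characters; & has to be handled first,
    # otherwise the & in the entities added later would be escaped again.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;

    # Replace every non-ASCII character with a numeric character reference.
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;

    return $text;
}

# "caf\xC3\xA9" is "café" in UTF-8; "\xD0\x9F" is the Cyrillic letter П.
print escape_html("caf\xC3\xA9 <b> \xD0\x9F"), "\n";
# prints: caf&#233; &lt;b&gt; &#1055;
```

Because the result contains only ASCII, it displays the same regardless of the charset the page is served in. Whether the input really is UTF-8 depends on the setup: browsers generally submit form data in the encoding of the page that contains the form, so declaring a UTF-8 charset on that page is one way to make the assumption hold.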
|