Regex, ShiftJIS and locale

 



whitejm12
New User

Feb 12, 2007, 7:36 PM


My script parses pathnames, some of which contain Shift-JIS characters. (I'm working on native Japanese Windows 2000 with ActivePerl 5.6.1.) I have attached an example of the input file to be parsed. We parse each line with:

if($line=~m/^(\w.+)\b\/\b(\w.+)\b\/\b(\w.+)\b\/\b(\w.+)\b\/\b(\w.+?\.htm)$/)

With ASCII characters, this parses the pathnames with no problems. In fact, it even works fine with Shift-JIS chars in the middle of words in the path (e.g., the first five lines of the attached file).

The problem occurs when there are Shift-JIS chars at the beginning or end of the word (e.g., all of the other lines in the attached file). I believe that using \j with

use ShiftJIS::Regexp qw(:all);

would fix the problem, but I've tried replacing \w and \b with \j with no luck. Am I using ShiftJIS::Regexp correctly?
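
To make the symptom concrete, here is a minimal sketch that runs the pattern above against three made-up paths. The paths and the helper name path_matches are hypothetical and are not taken from the attached file. The sketch assumes byte strings and no locale pragma, in which case \w only recognizes ASCII letters, digits and underscore, so a Shift-JIS byte pair sitting next to a "/" satisfies neither \w nor \b.

```perl
use strict;
use warnings;

# The pattern from the post, unchanged.
my $re = qr/^(\w.+)\b\/\b(\w.+)\b\/\b(\w.+)\b\/\b(\w.+)\b\/\b(\w.+?\.htm)$/;

# Hypothetical helper: returns 1 if the pattern accepts the path, else 0.
sub path_matches {
    my ($path) = @_;
    return $path =~ $re ? 1 : 0;
}

# "\x82\xA0" is the two-byte Shift-JIS encoding of HIRAGANA LETTER A.
# Both bytes are outside ASCII, so plain \w treats neither as a word character.
my %samples = (
    middle   => "api/ref/a\x82\xA0b/dir/page.htm",   # Shift-JIS inside a word
    leading  => "api/ref/\x82\xA0ab/dir/page.htm",   # Shift-JIS right after "/"
    trailing => "api/ref/ab\x82\xA0/dir/page.htm",   # Shift-JIS right before "/"
);

for my $where (qw(middle leading trailing)) {
    printf "%-8s %s\n", $where,
        path_matches($samples{$where}) ? "match" : "no match";
}
```

Only the "middle" sample should be accepted under these assumptions. Results can differ for other characters, because the second byte of a Shift-JIS character may fall in the ASCII range and then happens to look like a word character to \w.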

It also occurs to me that I could use

use POSIX qw(locale_h);
setlocale(LC_CTYPE, "ja_JA.Shift_JIS");

as shown in perllocale.pod, so that \w behaves appropriately for the Japanese locale. However, I've tried that too, without success.
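
One way to narrow down the locale attempt is to check what setlocale actually returns, since perllocale documents that it returns undef when the system does not know the requested name. The sketch below is only a diagnostic under that assumption. The helper name try_ctype_locale is hypothetical, and the locale name is copied verbatim from the snippet above; locale names are platform-specific, so the spelling a given Windows system accepts may well be different.

```perl
use strict;
use warnings;
use POSIX qw(locale_h);

# Hypothetical helper: returns the locale string that was set,
# or undef if the platform rejected the requested name.
sub try_ctype_locale {
    my ($name) = @_;
    return setlocale(LC_CTYPE, $name);
}

my $name = "ja_JA.Shift_JIS";    # name taken verbatim from the post
my $got  = try_ctype_locale($name);

print defined $got
    ? "LC_CTYPE is now '$got'\n"
    : "setlocale rejected '$name'\n";

{
    # Per perllocale, \w only consults LC_CTYPE inside the scope of this pragma.
    use locale;
    print "\\w accepts byte 0x82 under the current locale\n"
        if "\x82" =~ /\w/;
}
```

Even when the call succeeds, \w inside "use locale" still classifies one byte at a time as far as I understand, so it is not clear that a multi-byte encoding such as Shift-JIS can be handled this way.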

I would be grateful for any ideas you might have.

Many thanks!
Attachments: BREWAPIReferencetoc.txt (3.53 KB)

 
 

