Regular expressions on text read from UTF-16LE file


Jun 22, 2011, 9:56 AM


I have recently been working on a script to localize string files for an iPhone application. These files (automatically generated by a localization tool in Xcode) are encoded as 16-bit little-endian Unicode (UTF-16LE). Since I am on a Mac, I have updated Perl to 5.12 to get at least some of the more modern Unicode support. Nonetheless, I am having significant difficulty matching regular expressions against text read from these files. As an example, I have attached a tiny localized strings file.

The regular expression that I am trying (but failing) to match is:

$result =~ /(.*?)(\/\*|\")/;

In this file, the expected result would be $2 = /*

I used non-greedy quantification for the preceding text and wanted to terminate on either a quote or a /* (whichever comes first), but $2 always ends up being the quote. I tried the same text in ASCII, and sure enough, the pattern matches as expected. Any help would be appreciated... I have written a nice state machine parser to automate localization but just can't deal with these UTF-16LE regexes.
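
One likely explanation, offered as a sketch and not a confirmed diagnosis: if the file is read without an encoding layer, Perl sees raw UTF-16LE bytes, where every ASCII character is followed by a NUL byte. The "/" and "*" are then no longer adjacent, so \/\* can never match, while the single-byte quote still can. The example below reproduces that with a made-up sample string ($text is hypothetical, not the attached file) and shows the same pattern behaving as expected once the bytes are decoded with the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample content standing in for the attached strings file.
my $text = qq{/* Title label */\n"title" = "Welcome";\n};

# Simulate a raw (undecoded) read of a UTF-16LE file:
# each ASCII character becomes that byte followed by "\0".
my $bytes = encode('UTF-16LE', $text);

# Raw bytes: "/" and "*" are separated by "\0", so \/\* cannot match
# and the quote alternative is the only one that can succeed.
my $raw_delim = $bytes =~ /(.*?)(\/\*|\")/ ? $2 : undef;

# Decode the bytes into Perl characters, then apply the same pattern.
# When reading from disk, the equivalent is an encoding layer:
#   open my $fh, '<:encoding(UTF-16LE)', $path or die $!;
my $decoded = decode('UTF-16LE', $bytes);
my $decoded_delim = $decoded =~ /(.*?)(\/\*|\")/ ? $2 : undef;

printf "raw bytes: %s\n", $raw_delim;
printf "decoded:   %s\n", $decoded_delim;
```

If the real file starts with a byte order mark, decoding as UTF-16LE leaves it in the string as U+FEFF; decoding as 'UTF-16' instead uses the mark to detect the byte order and strips it.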

(This post was edited by mamacken on Jun 22, 2011, 10:21 AM)

