Finding unicode.

 



ferulebezel
New User

Dec 17, 2011, 8:13 PM

Post #1 of 6

My googling has failed me. I can't find out how to search for Unicode characters, especially those with code points above 255.

From what I've been able to find


Code
s/\x{e9}/&eacute;/g;


or


Code
s/\x{2014}/&mdash;/g;


should work

but when I print $_ the substitutions haven't happened.

Clearly, I'm doing something wrong. What is it?


rovf
Veteran

Dec 21, 2011, 3:13 AM

Post #2 of 6
Re: [ferulebezel] Finding unicode.

How did you verify that your string really contains the Unicode character you were looking for?

BTW, your goal seems to be to convert the Unicode characters to the corresponding HTML entities. It may be easier to use HTML::Entities, a widely used module available from CPAN.


ferulebezel
New User

Dec 21, 2011, 10:40 AM

Post #3 of 6
Re: [rovf] Finding unicode.

I used :ascii in Vim to get the value.

I tried using HTML::Entities and had some problems with it. It doesn't distinguish between characters in markup and characters in the text.
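
One possible workaround, sketched here with core Perl only rather than HTML::Entities: convert just the characters outside the ASCII range to numeric character references. The markup characters (<, > and &) are ASCII, so tags and existing entities pass through untouched. The sub name entify_non_ascii and the sample string are invented for illustration, and the sketch assumes the input has already been decoded into characters.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-ASCII character with a numeric
# character reference, leaving ASCII (including <, > and &) alone.
sub entify_non_ascii {
    my ($html) = @_;
    $html =~ s/([^\x00-\x7f])/sprintf('&#%d;', ord($1))/ge;
    return $html;
}

# Assumed sample input, already decoded into characters.
my $html = "<p class=\"note\">caf\x{e9} \x{2014} R&amp;D</p>";
print entify_non_ascii($html), "\n";
# <p class="note">caf&#233; &#8212; R&amp;D</p>
```

If named entities are preferred, encode_entities in HTML::Entities also accepts an optional second argument listing the characters to encode, which may give a similar result.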


rovf
Veteran

Dec 21, 2011, 11:46 AM

Post #4 of 6
Re: [ferulebezel] Finding unicode.

I would dump the string you have in Perl as hexadecimal values, just to make sure you have the right data. Maybe the problem already occurs when reading the data into your program...
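
A minimal sketch of that check, assuming the input is UTF-8 encoded and was read without a decoding layer (nothing in the thread confirms this). The helper name hex_dump and the sample data are made up for illustration. If the dump shows c3 a9 rather than e9, the string holds raw bytes, and \x{e9} cannot match until the data is decoded.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: show each element of a string as a hex value.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf('%x', ord($_)) } split //, $string;
}

# Assumed sample data: "caf\x{e9} \x{2014}" as raw UTF-8 bytes, the way it
# would arrive from a file opened without an encoding layer.
my $bytes = "caf\xc3\xa9 \xe2\x80\x94";
print hex_dump($bytes), "\n";    # 63 61 66 c3 a9 20 e2 80 94

# Decoding turns the multi-byte sequences into single characters.
my $text = decode('UTF-8', $bytes);
print hex_dump($text), "\n";     # 63 61 66 e9 20 2014

# Only now do the code point escapes match.
(my $converted = $text) =~ s/\x{e9}/&eacute;/g;
$converted =~ s/\x{2014}/&mdash;/g;
print "$converted\n";            # caf&eacute; &mdash;
```

When reading from a file, opening it with an encoding layer such as '<:encoding(UTF-8)' should do the decoding step up front.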


BillKSmith
Veteran

Dec 21, 2011, 1:04 PM

Post #5 of 6
Re: [ferulebezel] Finding unicode.

I think you mean \x{} instead of \X{}. Refer to perldoc perlre.
Good Luck,
Bill
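
For reference, a short sketch of the difference between the two escapes: \x{...} names a single code point in hex, while \X matches one extended grapheme cluster (a base character plus any combining marks). The sample string is made up for illustration.

```perl
use strict;
use warnings;

# Assumed sample: "e" followed by U+0301 COMBINING ACUTE ACCENT, then U+2014.
my $text = "e\x{301}\x{2014}";

# \x{2014} matches exactly the code point U+2014.
my $dashes = () = $text =~ /\x{2014}/g;

# \X matches whole grapheme clusters: "e" plus its accent counts as one.
my $clusters = () = $text =~ /\X/g;

print "dashes: $dashes, clusters: $clusters, code points: ", length($text), "\n";
# dashes: 1, clusters: 2, code points: 3
```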


rickb
New User

Dec 22, 2011, 6:22 AM

Post #6 of 6
Re: [ferulebezel] Finding unicode.

I have run into this same issue and am stumped. I use BBEdit on OS X Lion and find that using \x{} in a search/replace simply doesn't work. I also tried with TextMate; same result.

I can do a manual search/replace in BBEdit, or use a Text Factory, and it works perfectly. This leads me to believe that it is a Perl issue, since BBEdit implements PCRE internally.

Any tips?
