
7stud
Mar 28, 2010, 4:29 PM
Re: [JonathanPool] hex metacharacters for characters below x100
I can assure you it's not. I copied and pasted the character whose UTF-8 encoding is C3 84 into my string. That character's Unicode code point is U+00C4, and its official name is "LATIN CAPITAL LETTER A WITH DIAERESIS". So I'm not sure what you are talking about.
If on your system Perl converts characters into UTF-8, then I understand why it finds no match. But what makes Perl do that? I don't believe I have seen that behavior. From perlunicode:
Regular Expressions
The regular expression compiler produces polymorphic opcodes. That is, the pattern adapts to the data and automatically switches to the Unicode character scheme when presented with data that is internally encoded in UTF-8 -- or instead uses a traditional byte scheme when presented with byte data.
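To make the byte-versus-character distinction concrete, here is a minimal sketch. It is not code from this thread: the variable names are made up for illustration, and it assumes the string reaches Perl as the two raw bytes C3 84 (no "use utf8" and no decoding layer on the input). Under that assumption the pattern \xC4 sees two separate bytes and cannot match, while the same pattern does match once the bytes are decoded into the single character U+00C4.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical names, for illustration only.
my $bytes = "\xC3\x84";               # two raw bytes: the UTF-8 encoding of U+00C4
my $chars = decode('UTF-8', $bytes);  # one character: U+00C4

print "length as bytes: ", length($bytes), "\n";
print "length as chars: ", length($chars), "\n";

# The same pattern behaves differently depending on what the string holds.
print "bytes =~ /\\xC4/:      ", ($bytes =~ /\xC4/     ? "match" : "no match"), "\n";
print "bytes =~ /\\xC3\\x84/: ", ($bytes =~ /\xC3\x84/ ? "match" : "no match"), "\n";
print "chars =~ /\\xC4/:      ", ($chars =~ /\xC4/     ? "match" : "no match"), "\n";
```

If the sketch reflects what is happening here, the fix would be either to decode the input before matching or to write the pattern in terms of the encoded bytes.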
(This post was edited by 7stud on Mar 28, 2010, 4:35 PM)