Replacing accented characters

 



newera
Novice

May 21, 2016, 5:41 AM

Post #1 of 10 (3683 views)
Replacing accented characters

I have a script that needs to replace French characters with English equivalents to use in an ID number.

The ID is generated by taking the first and last names, then adding a system generated number.

This is the code I have but it doesn't work:


Code
  $ID = lc(substr($first_name, 0, 1)) . lc(substr($last_name, 0, 1)) . $NEW{'intid'};


Then do the substitution:

Code
$ID =~ s/é/e/gi;
$ID =~ s/ù/u/gi;
$ID =~ s/è/e/gi;
$ID =~ s/à/a/gi;
$ID =~ s/â/a/gi;
$ID =~ s/û/u/gi;
$ID =~ s/ê/e/gi;
$ID =~ s/î/i/gi;
$ID =~ s/ô/o/gi;
$ID =~ s/ç/c/gi;
$ID =~ s/É/e/gi;
$ID =~ s/Ù/u/gi;
$ID =~ s/È/e/gi;
$ID =~ s/À/a/gi;
$ID =~ s/Â/a/gi;
$ID =~ s/Û/u/gi;
$ID =~ s/Ê/e/gi;
$ID =~ s/Î/i/gi;
$ID =~ s/Ô/o/gi;



BillKSmith
Veteran

May 21, 2016, 8:52 AM

Post #2 of 10 (3677 views)
Re: [newera] Replacing accented characters

You do not say what does not work. Your code for ID does not match your description. The code uses only the first character of the first and last names. It converts them to lower case and appends the number. If this is what you intend, there is no need to include upper case in your regex. (You probably should use the translation operator (tr///) instead of all those regex.)

Please post a small, but complete script that demonstrates your problem.
Good Luck,
Bill


newera
Novice

May 21, 2016, 1:59 PM

Post #3 of 10 (3672 views)
Re: [BillKSmith] Replacing accented characters

Yes, I need the first character of the first and last names, then append the number. The complete script is a large one.

When French affiliates join us, their names often have accented first characters. The code I have now does not substitute them the way it is.

If no accented first characters, the script works fine.


Zhris
Enthusiast

May 21, 2016, 4:11 PM

Post #4 of 10 (3667 views)
Re: [newera] Replacing accented characters

Could you expand on "doesn't work"? How are you determining this, and what is the result?

I suspect all you need to do is use the utf8 pragma to tell the compiler that your source code contains UTF-8 encoded characters.


Code
use utf8;
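
To illustrate what the pragma changes, here is a minimal sketch, assuming the script file itself is saved as UTF-8. The variable names are made up for the example. With `use utf8` in effect a literal 'é' is one character; without it, Perl sees the two bytes of its UTF-8 encoding, so a pattern or tr/// list built from such literals would not match decoded text.

```perl
use strict;
use warnings;

# The utf8 pragma is lexically scoped, so both cases fit in one file.
# Assumption: this file is saved as UTF-8.
my $chars = do { use utf8; length('é') };   # literal parsed as one character
my $bytes = do { no utf8;  length('é') };   # literal parsed as raw bytes

print "with use utf8:    $chars\n";
print "without use utf8: $bytes\n";
```

Note that the pragma only affects literals in the source file. It does not decode data that arrives from a form, a file, or a database.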


Chris


newera
Novice

May 21, 2016, 4:51 PM

Post #5 of 10 (3664 views)
Re: [Zhris] Replacing accented characters

I do use

use utf8;

Doesn't work means no ID is generated because of the accents. No accents in the names, and it generates the ID.

Example:

Bruce Therrien

ID generated is bt2 (2 is the next incremental number)

Example 2:

Ève Courneyer

ID should be ec3, but no ID is generated.


Zhris
Enthusiast

May 21, 2016, 5:05 PM

Post #6 of 10 (3661 views)
Re: [newera] Replacing accented characters

Taking what code you have provided, I cannot replicate your issue:

http://codepad.org/GWdDjYdq

Code
use utf8; 

my $first_name = 'Ève';
my $last_name = 'Courneyer';
my %NEW = ( 'intid' => 2 );

$ID = lc(substr($first_name, 0, 1)) . lc(substr($last_name, 0, 1)) . $NEW{'intid'};

$ID =~ s/é/e/gi;
$ID =~ s/ù/u/gi;
$ID =~ s/è/e/gi;
$ID =~ s/à/a/gi;
$ID =~ s/â/a/gi;
$ID =~ s/û/u/gi;
$ID =~ s/ê/e/gi;
$ID =~ s/î/i/gi;
$ID =~ s/ô/o/gi;
$ID =~ s/ç/c/gi;
$ID =~ s/É/e/gi;
$ID =~ s/Ù/u/gi;
$ID =~ s/È/e/gi;
$ID =~ s/À/a/gi;
$ID =~ s/Â/a/gi;
$ID =~ s/Û/u/gi;
$ID =~ s/Ê/e/gi;
$ID =~ s/Î/i/gi;
$ID =~ s/Ô/o/gi;

print $ID;


As Bill suggested, could you post a short complete example that demonstrates your problem?

It is highly recommended that you use both the strict and warnings pragmas. They will often help you identify problems that would otherwise be difficult to discover manually. For example, without these pragmas, undefined values are printed as blanks.
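
As a rough sketch of that last point, assume a hypothetical mistake where `$NEW{'intid'}` was never set. Without warnings the ID would silently come out as "ec"; with them, Perl reports the undefined value. The `%NEW` hash mirrors the thread; the warning capture via `$SIG{__WARN__}` is only there so the message can be printed in a controlled way.

```perl
use strict;
use warnings;

my %NEW;          # hypothetical mistake: 'intid' was never assigned
my @warnings;

# Collect warnings instead of letting them go straight to STDERR.
local $SIG{__WARN__} = sub { push @warnings, @_ };

# The undefined value is treated as an empty string in the concatenation.
my $ID = 'ec' . $NEW{'intid'};

print "ID: $ID\n";
print "warning: $_" for @warnings;
```

The captured message names the uninitialized hash element, which points straight at the missing value.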

Chris


(This post was edited by Zhris on May 21, 2016, 5:11 PM)


BillKSmith
Veteran

May 21, 2016, 8:37 PM

Post #7 of 10 (3642 views)
Re: [newera] Replacing accented characters

We did not ask for your production script, but rather, the shortest script that you can make that demonstrates the problem. I cannot reproduce it.

Code
use strict; 
use warnings;
use utf8;
$\="\n";
my $first_name;
my $last_name;
my $ID;
my %NEW = ( 'intid' => 1 );

$NEW{intid}++;
$first_name = 'Bruce';
$last_name = 'Therrien';
$ID = lc(substr($first_name, 0, 1))
. lc(substr($last_name, 0, 1))
. $NEW{'intid'}
;
$ID =~ tr/éùèàâûêîôç/eueaaueioc/;
print $ID;

$NEW{intid}++;
$first_name = 'Ève';
$last_name = 'Courneyer';
$ID = lc(substr($first_name, 0, 1))
. lc(substr($last_name, 0, 1))
. $NEW{'intid'}
;
$ID =~ tr/éùèàâûêîôç/eueaaueioc/;
print $ID;


OUTPUT:

Code
bt2 
ec3

Good Luck,
Bill


newera
Novice

May 22, 2016, 3:09 PM

Post #8 of 10 (3626 views)
Re: [BillKSmith] Replacing accented characters

Bill's code works OK; it prints the correct ID to the screen before the rest of the script continues. The problem occurs when the data is entered into the MySQL database.
Nothing is entered at all.
So I'll need to look at my code for database entry of the ID.


BillKSmith
Veteran

May 23, 2016, 12:24 PM

Post #9 of 10 (3612 views)
Re: [newera] Replacing accented characters

My best guess is that you do not have a Perl problem. Are you sure that your DB is configured to accept UTF-8 data? Check its documentation. If everything seems OK, write a small program to INSERT one record of ASCII data and one of UTF-8. (Do not expect Chris or me to do it for you again.) Did one work and not the other? Enable all error messages for Perl and for the database. Read them! If you still need help, post your example, your expected output, the actual output, and all error messages in the database section of this forum.
Good Luck,
Bill


newera
Novice

May 23, 2016, 4:40 PM

Post #10 of 10 (3607 views)
Re: [BillKSmith] Replacing accented characters

I fixed it by adding the following code that I found with a little research:


Code
  $first_name = Encode::decode_utf8( $first_name ); 
$last_name = Encode::decode_utf8( $last_name );


Now I get both the ID and the names entered in the database correctly. Thanks for all the help.
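
For anyone who finds this later, here is a minimal sketch of why the decoding step probably mattered. It assumes the names reached the script as raw UTF-8 bytes (for example from a web form), which the thread does not confirm. The `$raw` value and the other variable names are made up for illustration. On undecoded bytes, `substr(..., 0, 1)` returns only the first byte of a two-byte character, which is not valid UTF-8 on its own and never matches an accented letter.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical input: the UTF-8 bytes for "Ève" (È is 0xC3 0x88 in UTF-8).
my $raw = "\xC3\x88ve";

# On raw bytes, substr() cuts the two-byte character in half.
my $first_byte = substr($raw, 0, 1);

# After decoding, the string holds characters, so substr() and lc() behave.
my $name       = Encode::decode_utf8($raw);
my $first_char = lc substr($name, 0, 1);       # U+00E8, a lower-case e-grave

# Map a few accented e's to plain "e" (escapes keep this file pure ASCII).
(my $initial = $first_char) =~ tr/\x{E8}\x{E9}\x{EA}/eee/;

printf "raw length: %d, decoded length: %d\n", length $raw, length $name;
printf "first byte: %02X, initial: %s\n", ord $first_byte, $initial;
```

Under these assumptions the raw string is four bytes long, the decoded one is three characters, and the initial comes out as a plain "e".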

 
 

