New to Perl Need Help

 



Jester_48
New User

Mar 26, 2002, 11:50 AM

Post #1 of 10 (11577 views)
New to Perl Need Help

I am having trouble understanding regular expressions. I just can't seem to get them into my brain, so I was hoping someone could look these over and tell me whether I am right, wrong, close, not a freakin' clue... etc.

this should match a name in the formats J, Joe, John, John Paul or John-Paul with a minimum of 1 character entered and a maximum of 30

if (param("name") !~ /([A-Z][a-z]*-? ?[A-Z]*[a-z]*){1,30}/) { then output error msg }

this is supposed to match an email address, I am trying to allow for a 4 character name for the extension (ie new .name addresses)

if (param("email") !~ /^\w*\@\w*\.[A-Za-z]{3,4}$/) { then output error msg }



Please help if you can as I am really terrible at regexp and would like to know what I am doing right/wrong.

TIA


mhx
Enthusiast

Mar 26, 2002, 11:11 PM

Post #2 of 10 (11572 views)
Re: [Jester_48] New to Perl Need Help


In Reply To
this should match a name in the formats J, Joe, John, John Paul or John-Paul with a minimum of 1 character entered and a maximum of 30

if (param("name") !~ /([A-Z][a-z]*-? ?[A-Z]*[a-z]*){1,30}/) { then output error msg }


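That's not what it does: the pattern is not anchored, so it succeeds as soon as it finds a single uppercase letter anywhere in the input, and {1,30} repeats the whole group rather than counting characters. You can't (at least I believe you can't ;) accomplish the task of checking the total number of characters and matching against your pattern in a single regex. So here's what I would do: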

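[perl]
#!/usr/bin/perl -w
use strict;

# One name part: a capital letter followed by any number of lowercase letters
my $name = '(?:[A-Z][a-z]*)';
# A full name: one part, optionally followed by a blank or hyphen and a second part
my $regex = qr/^$name(?:[\s-]$name)?$/;

while( <DATA> ) {
    chomp;
    s/^\s+//; s/\s+$//; # remove leading/trailing blanks
    # the length limit is checked separately from the pattern
    print "'$_' => FAIL\n" if length($_) > 30 or $_ !~ $regex;
}

__DATA__

Thisnamematches Butitisfartoooooolong
J
Joe
John
John Paul
John-Paul
JASON
Judy J J
[/perl]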

where the essential part is the regular expression that only tests if the name itself is valid, while the length check is done with a separate length call. The check for a minimum length of 1 is implicitly included in the regex.
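To make the difference concrete, here is a minimal sketch that runs the pattern from the question next to a single-regex variant. The reply hedges on whether one regex can do both jobs. A lookahead is one way it could be done, though the separate length call is arguably easier to read. The helper names `original_accepts` and `combined_accepts` and the `$combined` pattern are illustrative assumptions, not code from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from the original question, unchanged (note: no anchors).
my $original = qr/([A-Z][a-z]*-? ?[A-Z]*[a-z]*){1,30}/;

# Hypothetical single-regex variant: the lookahead caps the total length
# at 30 characters, then the name pattern from the reply checks the format.
my $name     = '(?:[A-Z][a-z]*)';
my $combined = qr/\A(?=.{1,30}\z)$name(?:[\s-]$name)?\z/;

# Illustrative helpers: return 1 if the pattern accepts the string, else 0.
sub original_accepts { my ($s) = @_; return $s =~ $original ? 1 : 0 }
sub combined_accepts { my ($s) = @_; return $s =~ $combined ? 1 : 0 }

for my $input ('John-Paul', '123 Joe 456', 'JASON',
               'Thisnamematches Butitisfartoooooolong') {
    printf "%-42s original=%d combined=%d\n",
        "'$input'", original_accepts($input), combined_accepts($input);
}
```

The unanchored original accepts any input that contains an uppercase letter, which is why it never reports an error for strings such as '123 Joe 456'.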


In Reply To
this is supposed to match an email address, I am trying to allow for a 4 character name for the extension (ie new .name addresses)

if (param("email") !~ /^\w*\@\w*\.[A-Za-z]{3,4}$/) { then output error msg }


Don't try to invent your own email-matching-regex. The only true regex that will check an email-address for validity is over 6000 bytes and looks more or less like this:


Code
[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\ 
xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xf
f\n\015()]*)*\)[\040\t]*)*(?:(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\x
ff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\015
"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\


... so I guess you won't hack it yourself. ;)

Fortunately, it is available through the Email::Valid module (http://search.cpan.org/search?dist=Email-Valid), which has a nice interface.
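As a rough sketch of what calling the module can look like: `Email::Valid->address` returns the address when it passes the module's checks and undef otherwise. Email::Valid is a CPAN module rather than part of core Perl, so this example loads it defensively. The helper names `have_email_valid` and `check_address` are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Email::Valid comes from CPAN, so try to load it at run time and
# remember whether that worked instead of dying when it is missing.
my $loaded = eval { require Email::Valid; 1 } ? 1 : 0;

# Illustrative helper: reports whether the module could be loaded.
sub have_email_valid { return $loaded }

# Illustrative helper: returns the address as accepted by Email::Valid,
# or undef if it is rejected (or if the module is not installed).
sub check_address {
    my ($candidate) = @_;
    return undef unless have_email_valid();
    my $address = Email::Valid->address($candidate);   # scalar context
    return $address;
}

for my $candidate ('foo@aol.com', 'sdufsdfusd') {
    my $result = check_address($candidate);
    print "$candidate => ", (defined $result ? 'accepted' : 'rejected'), "\n";
}
```

The module's documentation also describes optional checks beyond pure syntax, such as a DNS lookup via the `-mxcheck` option, which is presumably the kind of thing meant by "a nice interface".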

-- mhx

At last with an effort he spoke, and wondered to hear his own words, as if some other will was using his small voice. "I will take the Ring," he said, "though I do not know the way."

-- Frodo



Paul
Enthusiast

Mar 27, 2002, 2:27 AM

Post #3 of 10 (11570 views)
Re: [mhx] New to Perl Need Help

Forget that ;)

This is good enough for a basic check:

\S+@\S+\.\S+


(This post was edited by RedRum on Mar 27, 2002, 2:28 AM)


mhx
Enthusiast

Mar 27, 2002, 3:27 AM

Post #4 of 10 (11567 views)
Re: [RedRum] New to Perl Need Help


In Reply To
Forget that

This is good enough for a basic check:

\S+@\S+\.\S+


Well, then I'll give you my perfectly valid email address:

Code
!@#@#$#%.2414


Besides, Email::Valid can do a lot more than just testing if a string matches a certain pattern.
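The point can be checked with a short sketch that applies the basic pattern, exactly as posted and without anchors, to a few of the strings mentioned in this thread. The helper name `basic_check` is an assumption for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The "basic check" from the earlier post; the @ is escaped so that Perl
# does not try to interpolate an array inside the pattern.
my $basic = qr/\S+\@\S+\.\S+/;

# Illustrative helper: 1 if the basic pattern accepts the string, else 0.
sub basic_check {
    my ($text) = @_;
    return $text =~ $basic ? 1 : 0;
}

for my $text ('foo@aol.com', 'sdufsdfusd', '!@#@#$#%.2414') {
    printf "%-16s %s\n", $text, basic_check($text) ? 'accepted' : 'rejected';
}
```

The pattern only asks for non-blank characters around an @ and a dot, so it filters out obviously empty input but says nothing about whether the address could exist.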




Paul
Enthusiast

Mar 27, 2002, 4:00 AM

Post #5 of 10 (11565 views)
Re: [mhx] New to Perl Need Help

I did say basic check. People won't know what regex is being used within a script and will just enter:

sdufsdfusd

...or something like that for an email address and when it tells them it is invalid they'll enter a real one or give up.

>>Besides, Email::Valid can do a lot more than just testing if a string matches a certain pattern<<

Indeed it can


(This post was edited by RedRum on Mar 27, 2002, 4:03 AM)


mhx
Enthusiast

Mar 27, 2002, 4:56 AM

Post #6 of 10 (11560 views)
Re: [RedRum] New to Perl Need Help


In Reply To
People won't know what regex is being used within a script


At least I know now. ;)

But since I usually use foo@aol.com when I don't want to enter my address, it really doesn't make a difference...




freddo
User

Mar 31, 2002, 5:02 AM

Post #7 of 10 (11548 views)
Re: [mhx] New to Perl Need Help

Hehe,


In Reply To
But since I usually use foo@aol.com when I don't want to enter my address, it really doesn't make a difference...


Same for me. I generally use em@i.l, and when that one isn't accepted, I just enter webmaster@127.0.0.1 :D

freddo
;---


yapp
User

Apr 2, 2002, 1:11 AM

Post #8 of 10 (11539 views)
Re: [freddo] New to Perl Need Help

The O'Reilly book "CGI Programming with Perl" discusses a regexp that validates commonly used e-mail addresses.

The 400-character e-mail check regexp is an experiment by the author of the book "Mastering Regular Expressions" to find a regexp that validates the e-mail addresses allowed by the SMTP RFC specs. That also includes comments, whitespace and user-group addresses, like:

Alfred Neuman <Neuman@BBN-TENEXA>
":sysmail" @ Some-Group . Some-Org
Muhammed.(I am the Greatest) Ali @(the)Vegas.WBA

Not something you want to accept. The internet has changed since ARPANET. ;)

Here is the version I used in my Test::Input module (http://www.codingdomain.com/perl/downloads/x-modules/test_input.html):

[perl]
my $EmailTestRegExp;
sub IsEmail ($)
{
    my($Email) = @_;

    if(! defined $EmailTestRegExp)
    {
        # This regular expression is more useful than Email::Valid.
        # Email::Valid allows all types of e-mail addresses allowed
        # by the RFCs. That includes grouping and meta-characters.
        # This regexp, however, only allows 'normal' names, followed
        # by a domain name or IP address.

        my $esc = '\\\\';
        my $space = '\040';
        my $ctrl = '\000-\037';
        my $dot = '\.';
        my $nonASCII = '\x80-\xff';
        my $CRlist = '\012\015';
        my $letter = 'a-zA-Z';
        my $digit = '\d';

        my $atom_char = qq{ [^$space<>\@,;:".\\[\\]$esc$ctrl$nonASCII] };
        my $atom = qq{ $atom_char+ };
        my $byte = qq{ (?: 1?$digit?$digit |
                           2[0-4]$digit |
                           25[0-5] ) };

        my $qtext = qq{ [^$esc$nonASCII$CRlist"] };
        my $quoted_pair = qq{ $esc [^$nonASCII] };
        my $quoted_str = qq{ " (?: $qtext | $quoted_pair )* " };

        my $word = qq{ (?: $atom | $quoted_str ) };
        my $ip_address = qq{ \\[ $byte (?: $dot $byte ){3} \\] };
        my $sub_domain = qq{ [$letter$digit]
                             [$letter$digit-]{0,61} [$letter$digit]};
        my $top_level = qq{ (?: $atom_char ){2,4} };
        my $domain_name = qq{ (?: $sub_domain $dot )+ $top_level };
        my $domain = qq{ (?: $domain_name | $ip_address ) };
        my $local_part = qq{ $word (?: $dot $word )* };

        $EmailTestRegExp = qq{ $local_part \@ $domain };
    }

    # Strip blanks and tabs that appear outside quoted strings
    $Email =~ s/("(?:[^"\\]|\\.)*"|[^\t "]*)[ \t]*/$1/g;

    return 0 if($Email !~ /^$EmailTestRegExp$/ox);
    return 1;
}
[/perl]
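The function above assembles its pattern from small named pieces. A cut-down sketch of the same idea, limited to the bracketed IP address piece, might look like the following. It uses qr//x objects instead of qq strings and lists the longest alternative first. Both are substitutions made for this illustration and not the posted code, and `is_ip_literal` is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One octet in the range 0-255, written with /x so the parts stay readable.
my $byte = qr/(?: 25[0-5] | 2[0-4][0-9] | 1?[0-9]?[0-9] )/x;

# A bracketed IP literal such as [127.0.0.1], built from the piece above.
my $ip_literal = qr/\A \[ $byte (?: \. $byte ){3} \] \z/x;

# Hypothetical helper: 1 if the whole string is a bracketed IP literal.
sub is_ip_literal {
    my ($text) = @_;
    return $text =~ $ip_literal ? 1 : 0;
}

for my $text ('[127.0.0.1]', '127.0.0.1', '[256.0.0.1]') {
    printf "%-14s %s\n", $text, is_ip_literal($text) ? 'matches' : 'no match';
}
```

Naming each piece keeps a long pattern reviewable: a mistake in the octet rule can be fixed in one place and every pattern built from it picks up the change. As in the posted code, an IP address is only accepted here in its bracketed form.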

Yet Another Perl Programmer

_________________________________
~~> [url=http://www.codingdomain.com]www.codingdomain.com <~~
More then 3500 X-Forum [url=http://www.codingdomain.com/cgi-perl/downloads/x-forum]Downloads! Cool


mhx
Enthusiast

Apr 2, 2002, 5:03 AM

Post #9 of 10 (11534 views)
Re: [yapp] New to Perl Need Help


In Reply To
The 400-character e-mail check regexp is an experiment by the author of the book "Mastering Regular Expressions" to find a regexp that validates the e-mail addresses allowed by the SMTP RFC specs.


The regex by Jeffrey Friedl (the author of MRE) is also the one that's used in the Email::Valid module I mentioned earlier in this thread. So there's no need to build your own code for regex'ing an email address...




yapp
User

Apr 2, 2002, 9:23 AM

Post #10 of 10 (11531 views)
Re: [mhx] New to Perl Need Help

Sorry if this was my mistake... :P

I thought the regexp you posted was that MRE expression, and not the "common internet" expression I posted here.


 
 

