Being selective with email addresses.

 



AlanBell
Deleted

Jan 8, 2000, 5:27 AM

Post #1 of 18 (4342 views)
Being selective with email addresses.

I need a form that checks the validity of email addresses. I have seen scripts that check to ensure a "@" is present and that spaces and other forbidden characters are not included.
I also want to be able to block "hotmail", "eudoramail", etc. addresses.
Maybe also check that they end in either
".com", ".com.au", ".net", ".net.au", ".org" or ".org.au".
Can anyone point me in the right direction?
Thanks,
Alan


Borderline
Deleted

Jan 8, 2000, 5:39 AM

Post #2 of 18 (4342 views)
Re: Being selective with email addresses.

Checking an email address is a very complicated thing. Have you looked up the RFC for email formats yet?
Emails can have things you would never expect in them.
For example, this is a valid email address:
"Scott Beck"@mydomain.com
Can you believe it? You can actually put spaces in an email address if the left side is quoted!
I have not honestly sat down and attempted to write an email checker. It would probably be like a page-long regex.
Here is a direct quote from the Perl FAQ entry entitled "How do I check a valid email address":

Without sending mail to the address and seeing whether it bounces (and even then you face the halting problem), you cannot
determine whether an email address is valid. Even if you apply the email header standard, you can have problems, because
there are deliverable addresses that aren't RFC-822 (the mail header standard) compliant, and addresses that aren't
deliverable which are compliant.


Hope this helps...
Scott


AlanBell
Deleted

Jan 13, 2000, 2:34 AM

Post #3 of 18 (4342 views)
Re: Being selective with email addresses.

I have a line of code that checks to see if the E-mail field is blank or if the "@" is present. However, I don't understand the format of all the other bits and pieces.

if ($data{'E-mail'} ne ""
&& $data{'E-mail'} !~ /^[\w\.-]+@[\w\.-]+$/)

The "." and "\w" interest me.

How could I easily add "hotmail" to the exclusion list while ensuring that "mail" by itself was still acceptable?
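For readers puzzling over the same pieces, here is a minimal sketch that spreads the pattern from the post above over several lines with the /x modifier so each part can carry a comment. The names $simple_address and looks_like_address are made up for illustration; only the pattern itself comes from the post.

```perl
use strict;
use warnings;

# The pattern from the post above, written with /x so each piece can be annotated.
my $simple_address = qr/
    ^          # start of the string
    [\w.-]+    # one or more of: word characters (letters, digits, underscore), dot, hyphen
    \@         # a literal at-sign (escaped so Perl never treats it as an array)
    [\w.-]+    # the same set of characters again, for the part after the at-sign
    $          # end of the string
/x;

# Hypothetical helper: returns 1 if the address fits the pattern, 0 otherwise.
sub looks_like_address {
    my ($address) = @_;
    return $address =~ $simple_address ? 1 : 0;
}

for my $candidate ('alan@example.com', 'alan@mail.com',
                   'no-at-sign.example.com', 'two words@example.com') {
    printf "%-26s %s\n", $candidate,
        looks_like_address($candidate) ? 'accepted' : 'rejected';
}
```

Inside the square brackets the dot is just a dot, and \w stands for any letter, digit or underscore. Note that this pattern does not insist on a dot after the "@", so something like a@b is accepted too.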



darian
Deleted

Jan 14, 2000, 3:33 AM

Post #4 of 18 (4342 views)
Re: Being selective with email addresses.

You can also try a module called Email::Valid.
It is available at search.cpan.org.
I think this is more of what you are looking for.


AlanBell
Deleted

Jan 15, 2000, 3:58 AM

Post #5 of 18 (4342 views)
Re: Being selective with email addresses.

Darian,
Thanks for your link.
I checked it out but it sounds far too complicated for me.
Maybe one day it will all make sense.
Alan


Jasmine
Administrator / Moderator

Jan 15, 2000, 12:11 PM

Post #6 of 18 (4342 views)
Re: Being selective with email addresses.

While it is quite difficult (impossible) to use only one single check to ensure an email is valid, here's a regex that checks for the most common email addresses.

code:

    if ($email =~ /^\s*[\w.-]+\@[\w-]+(?:\.[\w-]+)+\s*$/) {
        # it's a good email
    }
    else {
        # it's a "bad" email
    }

The above will correctly match you@yourdomain.com as well as you.you@yourdomain.co.uk, the most common email forms. It tolerates whitespace before and after the address, but it will not pass addresses with spaces inside them or other uncommon email address formats.
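The pattern above only checks the general shape of an address. The original question also asked about limiting addresses to a handful of endings, so here is a minimal sketch of that check. The list of endings is the one from the first post; the names @allowed_endings and has_allowed_ending are invented for this example.

```perl
use strict;
use warnings;

# Endings taken from the original question; edit the list to suit.
my @allowed_endings = qw(.com .com.au .net .net.au .org .org.au);

# Returns 1 if the domain part of $email ends in one of the allowed endings.
sub has_allowed_ending {
    my ($email) = @_;

    # Everything after the first "@", lower-cased; undef if there is no "@".
    my (undef, $domain) = split /\@/, lc($email), 2;
    return 0 unless defined $domain;

    for my $ending (@allowed_endings) {
        # Compare the tail of the domain with the ending; the length test
        # makes sure something comes before the ending.
        return 1 if length($domain) > length($ending)
                 && substr($domain, -length($ending)) eq $ending;
    }
    return 0;
}

for my $candidate ('alan@example.com.au', 'you.you@yourdomain.co.uk') {
    printf "%-26s %s\n", $candidate,
        has_allowed_ending($candidate) ? 'allowed ending' : 'other ending';
}
```

This only looks at the ending, so it would normally run after the format check, not instead of it.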

As for checking to see if it's a hotmail account, you can use:

code:

    $domain = (split(/\@/, lc($email)))[1];

$domain now holds the domain name part of the email address (everything after the @).

The lc($email) makes the email address lower case. Now, all you have to do is match $domain to whatever domain you want to disallow. Example:

code:

    $forbidden = "hotmail.com";

    if ($forbidden eq $domain) {
        # it's a forbidden email domain
    }
    else {
        # it's an allowed email domain
    }

Now the above works just fine for a single forbidden domain. If you have several domains that you want to disallow, you may wish to place them all in a text file, depending on how many you have.
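For just a handful of domains, a hash kept inside the program is one simple option. The sketch below is an illustration, not part of the original reply: the two listed domains and the names %forbidden and is_forbidden are assumptions. Because the whole domain is compared, "hotmail.com" is blocked while "mail.com" still gets through.

```perl
use strict;
use warnings;

# A short, hypothetical block list kept in the program itself.
my %forbidden = map { $_ => 1 } qw(hotmail.com eudoramail.com);

# Returns 1 if the address belongs to a listed domain, 0 otherwise.
sub is_forbidden {
    my ($email) = @_;
    my $domain = (split /\@/, lc $email)[1];   # everything after the "@"
    return 0 unless defined $domain;           # no "@" at all
    return exists $forbidden{$domain} ? 1 : 0; # exact match on the whole domain
}

for my $candidate ('someone@HOTMAIL.com', 'someone@mail.com') {
    printf "%-22s %s\n", $candidate,
        is_forbidden($candidate) ? 'forbidden' : 'allowed';
}
```

A hash lookup stays fast however long the list grows, but every new domain means editing the program, which is the reason a data file is suggested for longer lists.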

Let's assume that you want to disallow all of the free email addresses you can find (good luck in finding them all! -- check the bottom of this reply for a good start). You don't want to edit your program each time you find a new one, so a data file would be ideal in this situation.

As always, the best way to develop a program or a subroutine is to plan what you want it to do. Planning helps make your program more "logical" and efficient.

So, a visitor submits an email address. First, you want to make sure that it's a "valid" email address. If it is, then you want to check to see if it's a forbidden email address. Consider the following:

code:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $email_db = "/path/to/freeemail.txt";
    # server path to forbidden email database

    my %FORM;
    # assumed to be filled in by your form-parsing code (not shown)

    my $visitoremail = $FORM{'email'};
    # form input from your visitor

    my $passedcheck = check_email($visitoremail);
    # The above line calls the check_email subroutine, and
    # passes the email address to be checked.
    # The subroutine will return a 0 for good emails, and 1
    # for bad email addresses.

    my $passfail = $passedcheck ? "Failed Email Check" : "Passed Email Check";
    # This line takes the results from $passedcheck, and assigns
    # $passfail with an English pass/fail message (instead of 0
    # for pass and 1 for fail)

    print "$passfail\n\n";
    # Simply prints if the address passed or failed. Add more code here.

    exit;


    ############################
    # SUB - CHECK EMAIL
    #
    # Usage - check_email($emailtobechecked);
    #
    # Description: This subroutine takes the
    # email address passed to it, and checks
    # for validity of common email formats.
    #
    # If the address doesn't pass the common
    # email format, it returns a 1 to the
    # line calling the subroutine.
    #
    # If the address passes the common email
    # format, it grabs the domain name from
    # the email address, and invokes the
    # freemailcheck subroutine to prevent
    # defined forbidden email addresses from
    # passing.

    sub check_email {
        my $email = shift;
        # get the email address that was passed to the subroutine

        $email = '' unless defined $email;
        $email =~ s/^\s+|\s+$//g;
        # strip surrounding whitespace so it can't end up in the domain

        my $error = 0;
        # innocent until proven guilty

        if ($email =~ /^[\w.-]+\@[\w-]+(?:\.[\w-]+)+$/) {
            my $domain = (split(/\@/, $email))[1];
            $error = freemailcheck($domain);
        }
        else {
            $error = 1;
        }
        return $error;
    }

    ############################
    # SUB - FREE EMAIL CHECK
    #
    # Usage - freemailcheck($domaintobechecked);
    #
    # Description: This subroutine takes the
    # domain name passed to it, and checks
    # the domain name against a database
    # of free email addresses. The path to this
    # database needs to be defined in a variable
    # named $email_db.
    #
    # If the domain is listed in the database,
    # it returns a 1 to the line calling the
    # subroutine.
    #
    # If the domain is not listed in the
    # database, the initialized free_error = 0
    # is returned to the calling subroutine.

    sub freemailcheck {
        my $domain = lc(shift);
        my $free_error = 0; # innocent until proven guilty

        open(my $db, '<', $email_db) or die "Couldn't open database $email_db - $!\n";
        while (my $line = <$db>) {
            chomp $line;
            $line =~ s/\s+$//;
            # drop trailing whitespace, such as a stray carriage return
            if (lc($line) eq $domain) {
                $free_error = 1;
                last;
            }
        }
        close($db);
        return $free_error;
    }

If your goal is to prohibit all of the free email addresses you can find, you're welcome to grab our free email address file to get "started" -- there are already over 4,600 free email address domains in that database. If you choose to grab it, please be patient -- it's a 70k file.

If you'd like any clarification on the code above, please feel free to post your question. This topic will be addressed in painstaking detail in the March issue of the Learning Center with line-by-line explanations of the code.

Please let me know if this helped.


brian.hayes
User

Jan 15, 2000, 1:50 PM

Post #7 of 18 (4342 views)
Re: Being selective with email addresses.

What would be the reason for such a denial? I'm not trying to pry into why, but what would be the benefit of something like this? The only thing I can think of is to prevent online email addresses from being allowed to use a program that lets someone check their POP3 email account from a web site.

Brian Hayes


Jasmine
Administrator / Moderator

Jan 15, 2000, 2:39 PM

Post #8 of 18 (4342 views)
Re: Being selective with email addresses.

Easy...

You're an online vendor and don't want customers to hide behind a free email address.

Because a significant portion of credit card fraud originates from those using free email addresses, it's in the merchant's best interest to require "real" email addresses.

You wouldn't feel comfortable selling your products/services to someone who walked into your office with a ski mask on.

You wouldn't accept a credit card in your store if the customer didn't have a photo id, proving their identity (if you're diligent).

So then it's logical to ask why you would accept an online payment from someone who's intentionally masking their identity.

It's just called "risk reduction".

There are a few other good reasons, but that's probably the best.

For the "average" web site, there's no obvious reason to disallow free email.


brian.hayes
User

Jan 15, 2000, 9:41 PM

Post #9 of 18 (4342 views)
Re: Being selective with email addresses.

Never looked at it that way... But wouldn't some people not have a POP email account? I know about 20 or so right now who do not have a computer at home and use online email, because for some reason or another their company would not purchase and support a mail server. Just a proxy... So given this, wouldn't denying online email addresses be like saying you do not want their business? Thus not good for any business.

Also there are companies out there who will verify credit cards as a service for a % of sale...

But you do have a valid point...


Brian Hayes


brian.hayes
User

Jan 16, 2000, 6:36 AM

Post #10 of 18 (4342 views)
Re: Being selective with email addresses.

Now that you mention it, those people that I know are even afraid to turn their computer on when they hear of a virus on the news... or get an email about one.


That has to be the best example I have ever read... You have my vote... Not to mention I will be doing this as well NOW...

By the way, what is IMHO?

Thanks for the informative answer on this.

Brian Hayes


brian.hayes
User

Jan 16, 2000, 6:39 AM

Post #11 of 18 (4342 views)
Re: Being selective with email addresses.

In case you need it brian.hayes@devitcom.com


Jasmine
Administrator / Moderator

Jan 16, 2000, 7:42 AM

Post #12 of 18 (4342 views)
Re: Being selective with email addresses.

Thanks for the vote, and for questioning it at all -- I think the article will be well rounded now :)

IMHO = in my humble opinion

Sorry about that -- I slip every once in a while :)


Jasmine
Administrator / Moderator

Jan 16, 2000, 10:42 AM

Post #13 of 18 (4342 views)
Re: Being selective with email addresses.

At what cost are increased sales worthwhile? It's a matter of weighing the pros and the cons.

Our parent company's hosting division, YourDomainHost, used to accept free email accounts for new clients. They also looked for credit card address matches, etc. 90% of all credit card chargebacks and bounced checks were by clients with free email accounts. Once real email accounts were required (along with other validation methods), fraud dropped significantly.

Chargebacks and bounced checks can do serious damage to a business. Merchants should definitely read the fine print on their merchant account. Chargebacks in excess of x% of total monthly sales flag your company as a risk.

I personally know one company whose significant chargebacks caused them to pay a $5,000+ "interest-free deposit" to the merchant provider to continue their merchant account. I also know one company who had one large-sum check bounce -- their bank completely froze the checking account for 10 business days.

So, would you rather make it difficult for someone who doesn't have a real email address to buy something from you? Or would you rather make it difficult for you to do business at all?

So yes, some people do not have real email accounts -- it's a sad fact that probably will change slowly. But that's why you want to conveniently provide telephone numbers on all order forms :)

Also, it may be interesting to know how many "new" users actually make online purchases. IMHO, many people who are new to computers and/or the internet don't seem comfortable in giving out their credit card information online, anyway.

Opinions?


AlanBell
Deleted

Jan 21, 2000, 2:58 AM

Post #14 of 18 (4342 views)
Re: Being selective with email addresses.

Jasmine,
Thanks for your detailed explanation. I thought this question had died a natural death so was quite surprised to see the new discussions. I appreciate the data file on free email services. 4,600 names with most not even "sounding" like freebies. One question that springs to mind is why do you need to have the data in a separate file? Does the whole file need to be read to make a check or is an alphabetical search sufficient? If you check for "hotmail.com" is all the text file read or does it go straight to the "h"'s? I thought that if the data was included in the one cgi file, it may be quicker to execute rather than call another file. Is there a limit to the size of a cgi file?
Funny thing about this forum. The more answers given means more questions to be asked.
Once again, thanks for your help and I look forward to the March edition of the Learning Center.
Alan


Jasmine
Administrator / Moderator

Jan 21, 2000, 9:10 AM

Post #15 of 18 (4342 views)
Re: Being selective with email addresses.

The first question is easy... the reason I put the emails in a separate data file is because I have a database manager for it. Also, it's much easier to distribute a single data file than to redistribute an entire program, which people may have customized for their specific needs.

The freemail subroutine opens the data file and goes through it line by line using the while statement to try to find a match -- it doesn't take the whole file into memory. Once a line has been checked and the next line is being checked, the first line has already been removed from memory.

However, if you assigned the contents of the file to an array @blah = <FILE> (instead of using a while statement), that puts the entire contents of the file into memory -- it's stored in @blah.
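If the same program checks many addresses in one run, a middle road is to read the file once into a hash and look domains up from there. This trades memory for speed, much like the array example, but lookups no longer walk the list. The sketch below assumes one domain per line; load_domains is a hypothetical name, not part of the code posted earlier.

```perl
use strict;
use warnings;

# Read one domain per line from $path and return a hash reference
# whose keys are the lower-cased domains.
sub load_domains {
    my ($path) = @_;
    my %domains;

    open my $fh, '<', $path or die "Couldn't open database $path - $!\n";
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;   # trim stray whitespace, including a trailing \r
        next if $line eq '';       # skip blank lines
        $domains{ lc $line } = 1;
    }
    close $fh;

    return \%domains;
}

# Example use (the path is a placeholder):
#   my $free = load_domains('/path/to/freeemail.txt');
#   print "free email domain\n" if exists $free->{'hotmail.com'};
```

For a CGI script that starts fresh on every request and checks a single address, the line-by-line loop described above is probably just as good, since the hash would be rebuilt each time anyway.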

Without knowing the exact line number the h's start at (using your hotmail example), the program wouldn't know where to start looking for them without checking at least the first character of each line anyway.

I agree that there are most likely more efficient methods to check a file of this size. One way may be to split up the 4,600 line data file into separate files according to the first letter/number of the domain, such as a.txt, b.txt, c.txt, etc.

Then, when you pass a domain to be checked to the freemail subroutine, you could extract the first letter of the domain name and open the appropriate (smaller) data file.

code:

    $start = substr($domain, 0, 1);
    open(my $db, '<', "$start.txt") or die "Couldn't open database $start.txt - $!\n";
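Splitting the big file by hand would be tedious, so here is a rough sketch of how the per-letter files might be produced. It is untested against the real data file, and bucket_by_first_char and write_buckets are hypothetical names. Entries that do not start with a letter or digit are skipped so the first character is always safe to use in a file name.

```perl
use strict;
use warnings;

# Group domains by first character: { h => ['hotmail.com', ...], z => [...] }
sub bucket_by_first_char {
    my (@domains) = @_;
    my %bucket;
    for my $entry (@domains) {
        my $domain = lc $entry;
        $domain =~ s/^\s+|\s+$//g;             # trim whitespace and line endings
        next unless $domain =~ /^[a-z0-9]/;    # skip blanks and odd entries
        push @{ $bucket{ substr($domain, 0, 1) } }, $domain;
    }
    return \%bucket;
}

# Write each group to "<dir>/<first character>.txt", one domain per line.
sub write_buckets {
    my ($bucket, $dir) = @_;
    for my $char (sort keys %$bucket) {
        my $path = "$dir/$char.txt";
        open my $out, '>', $path or die "Couldn't write $path - $!\n";
        print {$out} "$_\n" for @{ $bucket->{$char} };
        close $out or die "Couldn't close $path - $!\n";
    }
    return;
}

# Example use (paths are placeholders):
#   open my $in, '<', '/path/to/freeemail.txt' or die "Couldn't open list - $!\n";
#   write_buckets(bucket_by_first_char(<$in>), '/path/to/split');
```

Whether this is actually faster than one 70k file would need the benchmarking mentioned in this post.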

I'll do some playing around with this when I have more time... maybe I'll bite the bullet, split the data file and do some benchmarking :)


AlanBell
Deleted

Feb 14, 2000, 1:57 AM

Post #16 of 18 (4342 views)
Re: Being selective with email addresses.

Jasmine,
Just thought I'd let you know that I've been using the list (freemail.txt) that you referred to earlier to filter out anonymous email addresses. Lo and behold, I just received one not on this list. Just in case someone is updating this list (I pity the person doing this task), it is "zandd.com".
Maybe someone should write a Perl program that allows automatic updates to this file from different sources.
I'm finding this site an invaluable learning experience. Keep up the good work.
Alan
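As a rough idea of what the merging half of such an update program could look like, the sketch below combines any number of domain lists into one sorted list with no duplicates. Fetching lists from other sources is left out entirely, and merge_domain_lists is a made-up name.

```perl
use strict;
use warnings;

# Merge any number of lists (passed as array references) into one
# sorted list of unique, lower-cased domains.
sub merge_domain_lists {
    my (@lists) = @_;
    my %seen;
    for my $list (@lists) {
        for my $entry (@$list) {
            my $domain = lc $entry;
            $domain =~ s/^\s+|\s+$//g;   # trim whitespace and line endings
            next if $domain eq '';       # ignore blank entries
            $seen{$domain} = 1;
        }
    }
    return sort keys %seen;
}

my @existing  = ('hotmail.com', 'eudoramail.com');
my @additions = ('zandd.com', 'Hotmail.com');
print "$_\n" for merge_domain_lists(\@existing, \@additions);
```

The merged list could then be written back over the data file, one domain per line.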


AlanBell
Deleted

Mar 3, 2000, 2:08 PM

Post #17 of 18 (4342 views)
Re: Being selective with email addresses.

Jasmine,
I was looking forward to your article in the March edition of TLC. Is it still coming?
Thanks,
Alan

Quote from 15th January

"If you'd like any clarification on the code above, please feel free to post your question. This topic will be addressed in painstaking detail in the March issue of the Learning Center with line-by-line explanations of the code."


Jasmine
Administrator / Moderator

Mar 4, 2000, 12:54 AM

Post #18 of 18 (4342 views)
Re: Being selective with email addresses.

Alan:

As you can imagine, it's rather busy here and I don't have as much time to write as I'd like to :( If you have any questions in particular about this topic, please feel free to post them here, where I can give a "quick" answer (as in not an article-length answer) and code, or send an email to admin@perlarchive.com.

Unfortunately, the promised article needs to be postponed for a little while.

 
 

