How to validate an email address?

Having worked on various web projects, I often run into a well-known problem: finding an effective regular expression (regexp) to check the validity of user-submitted email addresses.

In his blog, Fighting for a lost cause, Ian Dunn has compiled various regular expressions that try to address this problem. His idea is a great one: using a set of valid and invalid emails and a simple unit test, he provides a good comparison of some of the most widely used regexps.

His philosophy is simple: "It's better to accept a few invalid addresses than reject any valid ones, so I'm looking for 0 false-positives and as few false-negatives as possible."
However, I noticed two problems:

  1. His "best" regexp doesn't work in JavaScript (engines that predate ES2018 don't support advanced features such as negative lookbehind)
  2. The method used to validate IP addresses is not correct (it doesn't check that each octet falls within the 0-255 range)
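
To make the second point concrete, here is a minimal Python sketch. It compares a hypothetical naive octet pattern, \d{1,3} (an assumption for illustration, not necessarily what Ian Dunn's regexp uses), with the range-aware octet pattern taken from the regex presented in this article. The helper name ipv4_pattern is also made up for this example.

```python
import re

# Hypothetical naive octet pattern: any run of 1 to 3 digits (so 999 passes).
NAIVE_OCTET = r"\d{1,3}"
# Range-aware octet pattern from this article's regex: 0-199, 200-249, 250-255.
RANGED_OCTET = r"([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})"


def ipv4_pattern(octet):
    """Build a dotted-quad pattern (four octets) from a single-octet pattern."""
    return re.compile(r"(?:{0})(?:\.(?:{0})){{3}}".format(octet), re.ASCII)


naive = ipv4_pattern(NAIVE_OCTET)
ranged = ipv4_pattern(RANGED_OCTET)

for candidate in ("192.168.1.1", "255.255.255.255", "256.0.0.1", "999.1.1.1"):
    print(
        candidate,
        "naive:", bool(naive.fullmatch(candidate)),
        "ranged:", bool(ranged.fullmatch(candidate)),
    )
```

The naive pattern accepts every candidate in the loop, while the range-aware one rejects the two whose first octet is above 255.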

So I've decided to improve another existing regex, created by Warren Gaebel and already enhanced by Guillaume Arluison, by adding another test criterion: it also checks the "real" validity of the IP address.

Here is my solution:
/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z]{2})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i

This one works very well: it accepts 18/18 valid addresses (with a strict IP address range check) and rejects 19/20 invalid ones. The one remaining failure comes from the overall address length, which the regex does not check.
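
One possible way to cover the length problem is to check the length outside the regex. The Python sketch below wraps the regex above in a hypothetical helper, is_valid_email. The 254-character limit is an assumption (the figure usually derived from the SMTP path limit in RFC 5321), and the re.ASCII flag is an addition of mine so that \d and [a-z] stay ASCII-only in Python 3.

```python
import re

# The regex from this article, split across lines for readability only.
# re.ASCII is an addition (assumption): it keeps \d and [a-z] ASCII-only.
EMAIL_RE = re.compile(
    r"^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*"
    r"@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*"
    r"\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org"
    r"|pro|travel|mobi|[a-z]{2})"
    r"|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})"
    r"(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})"
    r"(:[0-9]{1,5})?$",
    re.IGNORECASE | re.ASCII,
)

# Assumed overall limit; adjust it if your mail system uses another value.
MAX_ADDRESS_LENGTH = 254


def is_valid_email(address, max_length=MAX_ADDRESS_LENGTH):
    """Reject over-long addresses first, then apply the regex."""
    if len(address) > max_length:
        return False
    # fullmatch also refuses a trailing newline, which "$" alone would allow.
    return EMAIL_RE.fullmatch(address) is not None


too_long = "a" * 250 + "@example.com"
for address in ("user@example.com", "user@192.168.1.1", "user@256.1.1.1", too_long):
    print(address[:30], is_valid_email(address))
```

On its own the regex accepts the over-long address, since nothing in it bounds the local part. The length test in front of it is what rejects that address.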

There's just one small problem: each time a new TLD longer than 2 characters is introduced, you need to append it to the list in the regex. If you want a more generic solution, you can use this variant (note that this version does not check whether the TLD really exists):

/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.([a-z]{2,6})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i
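
The trade-off of this variant can be seen in a short Python sketch. The sample addresses are made up for illustration, and re.ASCII is again an addition of mine. Because the TLD part is [a-z]{2,6}, any TLD of 2 to 6 letters passes, including the reserved .test, while longer TLDs such as .photography are rejected.

```python
import re

# Generic variant from this article, split across lines for readability only.
# re.ASCII is an addition (assumption): it keeps \d and [a-z] ASCII-only.
GENERIC_EMAIL_RE = re.compile(
    r"^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*"
    r"@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.([a-z]{2,6})"
    r"|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})"
    r"(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})"
    r"(:[0-9]{1,5})?$",
    re.IGNORECASE | re.ASCII,
)

# Made-up sample addresses; only the TLD differs between them.
samples = [
    "user@example.museum",       # 6 letters: fits {2,6}
    "user@example.test",         # reserved TLD, still accepted by the pattern
    "user@example.photography",  # 11 letters: too long for {2,6}
]

for address in samples:
    verdict = "accepted" if GENERIC_EMAIL_RE.fullmatch(address) else "rejected"
    print(f"{address}: {verdict}")
```

If longer TLDs matter for your project, widening the upper bound of {2,6} is one option, at the cost of accepting even more non-existent TLDs.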

These two solutions should be usable, on both the server and the client side, in any language that supports Perl-compatible regular expression (PCRE) syntax, such as JavaScript, PHP, Perl, Python, Ruby, etc.

About the author

Alexandre De Dommelin

I joined Blue Infinity in 2008 as part of the OpenSource Solutions Division, where I work on various projects for international firms and NGOs as a developer and sysadmin.