使用正则表达式验证电子邮件地址是否会造成损害? [英] Can it cause harm to validate email addresses with a regex?

查看:88
本文介绍了使用正则表达式验证电子邮件地址是否会造成损害?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我听说用正则表达式验证电子邮件地址是一件坏事,实际上会造成伤害。这是为什么?我认为验证数据永远不是一件坏事。如果您正确执行验证,则可能没有必要,但绝不是一件坏事。你能告诉我为什么这是对还是错?如果可能造成伤害,请举一个例子。

I've heard that it is a bad thing to validate email addresses with a regex, and that it actually can cause harm. Why is that? I thought it never could be a bad thing to validate data. Maybe unnecessary, but never a bad thing provided that you perform the validation correctly. Could you explain to me why this is right or wrong? If it can cause harm, please give an example.

推荐答案

通常,是的-使用正则表达式验证电子邮件地址是有害的。这是由于正则表达式的作者有错误的假设。

In general, yes - using regular expressions to validate email addresses is harmful. This is because of bad (incorrect) assumptions by the author of the regular expression.

如@klutt所示,电子邮件地址分为两部分,本地部分。值得一提的是,这些部分尚不立即显而易见:

As @klutt indicated, an email address has two parts, the local-part and the domain. It's worth noting some things about these parts that aren't immediately obvious:


  • 本地部分可以包含转义字符,甚至可以包含其他 @ 字符。

  • 本地部分可以区分大小写,但是要区分大小写取决于该特定域的邮件服务器。

  • 部分可以包含零个或多个以句点()分隔的标签,尽管实际上没有对应于根的MX记录(零标签)或

  • The local-part can contain escaped characters and even additional @ characters.
  • The local-part can be case sensitive, however it is up to the mail server at that specific domain how it wants to distinguish case.
  • The domain part can contain zero or more labels separated by a period (.), though in practice there are no MX records corresponding to the root (zero labels) or on the gTLDs (one label) themselves.

因此,您可以进行一些检查,而不会拒绝与之相对应的有效电子邮件地址以上内容:

So, there are some checks that you can do without rejecting valid email addresses that correspond with the above:


  • 地址至少包含一个 @

  • 本地部分(最右边的 @ 左侧的所有内容)是非空的

  • 部分(所有最右边的 @ 的右边包含至少一个句点(再次,严格来说这不是真的,但很实用)

  • Address contains at least one @
  • The local-part (everything to the left of the rightmost @) is non-empty
  • The domain part (everything to the right of the rightmost @) contains at least one period (again, this isn't strictly true but pragmatic)

就是这样。正如其他人指出的那样,最好的方法是测试该地址的可交付性。这将建立两件重要的事情:

That's it. As others have pointed out, it's best practice to test deliverability to that address. This will establish two important things:


  1. 电子邮件当前是否存在;

  2. 用户有权访问电子邮件地址(是合法用户或所有者)

如果您将电子邮件激活流程构建到业务流程中,则无需担心有问题的复杂正则表达式。

If you build email activation processes into your business process, you don't need to worry about complicated regular expressions that have issues.

一些进一步的参考资料:

Some further reading for reference:

RFC 5321:简单邮件传输协议

OWASP:输入验证备忘单

这篇关于使用正则表达式验证电子邮件地址是否会造成损害?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆