Practical user validation (sensitivity and specificity)?


Question

When I was first learning regular expressions, we were taught to parse things like phone numbers (obviously always 5 digits, an optional space and a further 6 digits) and email addresses (obviously always alphanumerics, then a single '@', then alphanumerics followed by a '.' and three letters), and told that we should always do this to validate the data the user enters.

Of course, as I've developed I've learned how silly the basic approach can be. But the more I look, the more I question the concept altogether. The most open, careful, correct validation of something like an email address through regexes ends up being hundreds if not thousands of characters long in order to accept all the legal cases and reject only the illegal ones. Even worse, all that effort does absolutely nothing for actual validity: the user may have accidentally added an 'a', may not use that email address at all, may be using someone else's address, or may use a '+' symbol that gets flagged inappropriately.

Yet at the same time, seemingly every site I come across still does this kind of technical checking: preventing me from putting more obscure characters in an email address or name, or objecting to the idea that someone could have more or fewer than a single title, a single first name and a single last name, all made purely from Latin characters, yet without any form of check that it's my real name.

Is there a benefit to this? Once injection attacks are handled (which should be through methods other than sanitizing the input), is there any other point to these checks?

Or, on the other hand, is there a sure-fire way to actually validate user details other than to 'use' them in whatever way makes sense contextually and see if it falls over?

Recommended answer

is there any other point to these checks?

Certainly. Knowing that your data is valid is very important. In the case of email addresses, for example, sending email to an address you haven't validated will, at the very least, lead to bounces. Enough bounces and your mail host might block you for spamming. Not validating a phone number could lead to unnecessary costs if your app tries to send SMS messages to it. The list goes on and on.

Or on the other hand, is there actually a sure fire way to actually validate user details other than to 'use' them in whatever way makes sense contextually and see if it falls over?

Yes, but regex is generally a bad way to validate data. If a phone number is supposed to be "5 digits, a space, then 6 digits", then your check is going to fail if I type "5 digits, two spaces, then 6 digits" or "5 digits, a dash, then 6 digits" or "11 digits". Use common sense, and expect any crazy format the user provides. Know what the absolute minimal requirement is. For example, if you need 11 digits total, then strip everything that's not a digit first. Then formatting doesn't matter.
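The "strip everything that's not a digit first" idea can be sketched in a few lines of Python. This is only an illustration: the function name `normalize_phone` and the constant `REQUIRED_DIGITS` are made up for the example, and the 11-digit requirement is simply the one used in the paragraph above, not a rule for any real numbering plan.

```python
from typing import Optional

# Assumption: the "11 digits total" requirement from the example above.
REQUIRED_DIGITS = 11


def normalize_phone(raw: str) -> Optional[str]:
    """Keep only ASCII digits; return them if the count matches, else None."""
    # Drop spaces, dashes, brackets and anything else the user typed.
    digits = "".join(ch for ch in raw if ch in "0123456789")
    return digits if len(digits) == REQUIRED_DIGITS else None


# All of these formats reduce to the same stored value.
for candidate in ["01234 567890", "01234  567890", "01234-567890", "01234567890"]:
    print(repr(candidate), "->", normalize_phone(candidate))
```

Storing the normalized digits rather than the raw input also means later code never has to care which format the user chose.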

Also, read the RFCs. I can't count the number of times my email address has been rejected because it has a plus sign in it. The number of those rejections that came from large tech-oriented companies, with programmers who should know better, was rather disappointing.
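As a rough illustration of erring on the permissive side, here is a deliberately minimal check that does not reject a plus sign. It is a sketch, not an RFC-compliant validator: the name `looks_like_email` is hypothetical, and the only assumption it encodes is that an address needs something before and after its last '@'. Whether the address actually works can only be found out by sending a message to it.

```python
def looks_like_email(address: str) -> bool:
    """Very loose sanity check: non-empty text on both sides of the last '@'."""
    local, at, domain = address.strip().rpartition("@")
    # rpartition returns empty strings for local and at when no '@' is present.
    return bool(local) and at == "@" and bool(domain)


print(looks_like_email("user+tag@example.com"))  # True: '+' is not rejected
print(looks_like_email("no-at-sign.example.com"))  # False: no '@' at all
```

A check this loose only catches obvious slips such as a missing '@'; it makes no claim about whether the mailbox exists or belongs to the user.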
