JavaScript validation issue with international characters


Question

We use the excellent validator plugin for jQuery here on Stack Overflow to do client-side validation of input before it is submitted to the server.

It generally works well; however, this one has us scratching our heads.

The following validator method is used on the ask/answer form for the user name field (note that you must be logged out to see this field on the live site; it's on every /question page and the /ask page)

$.validator.addMethod("validUserName",
  function(value, element) {
    // Accept word characters, hyphens, whitespace, digits,
    // and a hand-picked list of accented letters.
    return this.optional(element) ||
      /^[\w\-\s\dÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]+$/.test(value);
  },
  "Can only contain A-Z, 0-9, spaces, and hyphens.");

Now this regex looks weird but it's pretty simple:

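  • match the beginning of the string (^)
  • match any of these..
    • word characters (\w)
    • dash (-)
    • spaces (\s)
    • digits (\d)
    • crazy moon language characters (àèìòù etc.)
  • one or more times (+)
  • match the end of the string ($)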

Yes, we ran into the Internationalized Regular Expressions problem. JavaScript's definition of "word character" does not include international characters at all.
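The claim about "word character" can be checked directly. This is a minimal sketch, with variable names chosen here for illustration: without special flags, `\w` is equivalent to `[A-Za-z0-9_]`, so an accented letter fails the match. The test string uses a `\u` escape so the result does not depend on how this snippet itself is encoded.

```javascript
// \w covers only ASCII letters, digits, and underscore.
const wordOnly = /^\w+$/;

// Plain ASCII input matches.
const asciiMatches = wordOnly.test('Bill_42');

// "hÓra" contains U+00D3, which \w does not cover.
const accentedMatches = wordOnly.test('h\u00D3ra');

console.log(asciiMatches, accentedMatches);
```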

Here's the weird part: even though we've gone to the trouble of manually adding tons of the valid international characters to the regex, it doesn't work. You cannot enter these international characters in the input box for user name without getting the..

Can only contain A-Z, 0-9, spaces, and hyphens

...validation error returned!

Obviously the validation is working for the other parts of the regex.. so.. what gives?

The other strange part is that this validation works in the browser's JavaScript console but not when executed as a part of our standard *.js includes.
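One plausible cause of this console-versus-include difference is a character encoding mismatch: the .js file is saved in one encoding and decoded by the browser in another, so the literal accented characters in the regex turn into different characters. The question does not confirm this; it is an assumption. The sketch below simulates it in Node (`Buffer` is a Node API, not available in browsers), and the helper name `misdecodeUtf8AsLatin1` is hypothetical.

```javascript
// Simulate a script saved as UTF-8 but decoded as Latin-1 (ISO-8859-1).
function misdecodeUtf8AsLatin1(text) {
  return Buffer.from(text, 'utf8').toString('latin1');
}

// A short stand-in for the hand-written list: "ÀÓé".
const intendedClass = '\u00C0\u00D3\u00E9';
const garbledClass = misdecodeUtf8AsLatin1(intendedClass);

// Build the same character class from the intended and the garbled text.
const intendedRegex = new RegExp('^[\\w\\-\\s' + intendedClass + ']+$');
const garbledRegex = new RegExp('^[\\w\\-\\s' + garbledClass + ']+$');

// "ÓBill de hÓra", as typed correctly into the form field.
const userName = '\u00D3Bill de h\u00D3ra';
const matchesIntended = intendedRegex.test(userName);
const matchesGarbled = garbledRegex.test(userName);

console.log(matchesIntended, matchesGarbled);
```

Each accented character becomes two characters after the wrong decoding, so the class no longer contains Ó. The console is unaffected because text typed there never passes through the file's encoding.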

/^[\w\-\sÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]+$/.test('ÓBill de hÓra') === true

We've run into some really bizarre international character issues in JavaScript code before, resulting in some very, very nasty hacks. We'd like to understand what's going on here and why. Please enlighten us!

Recommended answer

I think the email and url validation methods are a good reference here, e.g. the email method:

email: function(value, element) {
  // Non-ASCII characters appear only as \u escapes, so the source stays pure ASCII.
  return this.optional(element) || /^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?$/i.test(value);
},

That regex is generated by a script.

In other words, replacing your arbitrary list of "crazy moon" characters with this could help:

[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]

Basically this avoids the character encoding issues you have elsewhere by replacing the needs-encoding characters with more general definitions. While not necessarily more readable, so far it's shorter than your full list.
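Applied to the user name check, the suggestion might look like the sketch below. `USER_NAME_PATTERN` and `isValidUserName` are hypothetical names, not part of the original post or of the validation plugin. `\d` is dropped because `\w` already covers digits. The commented wiring reuses the `$.validator.addMethod` call shown in the question.

```javascript
// \u escapes keep the source pure ASCII, so the pattern does not depend
// on the encoding the script file is saved or served with.
const USER_NAME_PATTERN = /^[\w\-\s\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]+$/;

// Hypothetical helper: true when every character is a word character,
// hyphen, whitespace, or falls in the broad non-ASCII ranges above.
function isValidUserName(value) {
  return USER_NAME_PATTERN.test(value);
}

// Possible wiring into the validator method from the question:
// $.validator.addMethod("validUserName", function (value, element) {
//   return this.optional(element) || isValidUserName(value);
// }, "Can only contain letters, 0-9, spaces, and hyphens.");

const accepted = isValidUserName('\u00D3Bill de h\u00D3ra'); // "ÓBill de hÓra"
const rejected = isValidUserName('Bill!');                   // "!" is not allowed

console.log(accepted, rejected);
```

The ranges are much broader than the original hand-written list: they also admit non-letters such as × (U+00D7) and most non-Latin scripts. Whether that is acceptable for user names is a design choice left open here.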
