JavaScript validation issue with international characters


Question

We use the excellent validator plugin for jQuery here on Stack Overflow to do client-side validation of input before it is submitted to the server.

It generally works well; however, this one has us scratching our heads.

The following validator method is used on the ask/answer form for the user name field (note that you must be logged out to see this field on the live site; it's on every /question page and the /ask page):

$.validator.addMethod("validUserName",
  function(value, element) {
  return this.optional(element) || 
  /^[\w\-\s\dÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]+$/.test(value); },
  "Can only contain A-Z, 0-9, spaces, and hyphens.");  

Now this regex looks weird but it's pretty simple:


  • match the beginning of the string (^)
  • match any of these..
    • word character (\w)
    • dash (-)
    • space (\s)
    • digit (\d)
    • crazy moon language characters (àèìòù etc)

Yes, we ran into the Internationalized Regular Expressions problem. JavaScript's definition of "word character" does not include international characters.. at all.
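To make the limitation concrete, here is a minimal sketch that can be pasted into Node.js or a browser console. The sample strings are illustrative only, and non-ASCII characters are written as `\u` escapes so the snippet does not depend on the file's encoding. The second pattern assumes an ES2018-or-later engine with Unicode property escapes (`\p{L}` plus the `u` flag), which was not an option when this question was asked.

```javascript
// Without the "u" flag tricks, \w is ASCII-only: equivalent to [A-Za-z0-9_].
const asciiWord = /^\w+$/;

console.log(asciiWord.test('Bill'));        // true
console.log(asciiWord.test('h\u00D3ra'));   // false: U+00D3 (Ó) is not in [A-Za-z0-9_]

// Assumption: ES2018+ engine. \p{L} matches any Unicode letter,
// \p{N} any Unicode number; both require the "u" flag.
const unicodeWord = /^[\p{L}\p{N}_]+$/u;

console.log(unicodeWord.test('h\u00D3ra')); // true
```

This is why the question's regex has to list accented characters by hand: on older engines there is no built-in class for "letter in any script".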

Here's the weird part: even though we've gone to the trouble of manually adding tons of the valid international characters to the regex, it doesn't work. You cannot enter these international characters in the input box for user name without getting the..

Can only contain A-Z, 0-9, spaces, and hyphens

..validation message back!

Obviously the validation is working for the other parts of the regex.. so.. what gives?

The other strange part is that this validation works in the browser's JavaScript console but not when executed as a part of our standard *.js includes.


      /^[\w-\sÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]+$/
        .test('ÓBill de hÓra') === true

We've run into some really bizarre international character issues in JavaScript code before, resulting in some very, very nasty hacks. We'd like to understand what's going on here and why. Please enlighten us!
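One possible explanation for the console-versus-include difference, which the question itself does not confirm, is that the browser decodes the .js file with a different charset than the one it was saved in. The sketch below simulates that in Node.js with a shortened, hypothetical version of the character class: the characters are encoded as UTF-8 and then decoded as Latin-1, the way a mislabeled script file might be.

```javascript
// Assumption: Node.js, used only to simulate a wrongly decoded script file.
// A shortened stand-in for the question's character list: À Ó é
const classChars = '\u00C0\u00D3\u00E9';

// Simulate "saved as UTF-8, decoded as Latin-1 (ISO-8859-1)".
// Each accented character becomes two unrelated characters.
const misdecoded = Buffer.from(classChars, 'utf8').toString('latin1');

const intended = new RegExp('^[' + classChars + ']+$');
const broken = new RegExp('^[' + misdecoded + ']+$');

console.log(misdecoded.length);       // 6: three characters turned into six
console.log(intended.test('\u00D3')); // true
console.log(broken.test('\u00D3'));   // false: the class no longer contains Ó
```

A regex typed directly into the console never goes through that file-decoding step, which would be consistent with the behavior described above.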

Recommended answer

I think the email and url validation methods are a good reference here, e.g. the email method:

      email: function(value, element) {
          return this.optional(element) || /^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?$/i.test(value);
      },
      

The script that compiles that regex

In other words, replacing your arbitrary list of "crazy moon" characters with this could help:

      [\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]
      

Basically this avoids the character encoding issues you have elsewhere by replacing the needs-encoding characters with more general definitions written as \u escapes, which are plain ASCII in the source file. While not necessarily more readable, it's shorter than your full list.
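As a rough sketch of how that suggestion could be applied to the validator from the question: the names `USER_NAME_PATTERN` and `isValidUserName` are hypothetical and not part of the plugin, and the error message is copied from the question. Note that the range is much broader than the original hand-written list, so it also accepts many non-Latin scripts and some symbols.

```javascript
// Hypothetical names, for illustration only.
// The pattern is pure ASCII in the source, so it does not depend on how
// the .js file is saved or which charset it is served with.
const USER_NAME_PATTERN = /^[\w\-\s\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]+$/;

function isValidUserName(value) {
  return USER_NAME_PATTERN.test(value);
}

console.log(isValidUserName('\u00D3Bill de h\u00D3ra')); // true
console.log(isValidUserName('Bill<script>'));            // false: "<" and ">" are not allowed

// Registration mirroring the question's code. This only runs on a page
// where jQuery and the validation plugin are already loaded.
if (typeof $ !== 'undefined' && $.validator) {
  $.validator.addMethod('validUserName', function (value, element) {
    return this.optional(element) || isValidUserName(value);
  }, 'Can only contain A-Z, 0-9, spaces, and hyphens.');
}
```

Keeping the check in a plain function makes it easy to test outside the browser, separately from the plugin wiring.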
