JavaScript regex to reject non-US-ASCII characters
Question
^[^\x00-\x1F\x7F-\xFF]+$
This regex correctly fails to match a string that contains non-printing characters (hex 00-1F) or extended ASCII characters (hex 80-FF) but, unlike in PHP, it lets non-ASCII UTF-8 characters pass (e.g. 日本واستقرارهहिन्दीދިވެހިބަސްગુજરાતી한).
Looking at the Wikipedia page on UTF-8, all of those should fall in the 80-FF range. Does anyone know what I'm missing?
Also, if you could explain how to ignore quoted text, you would be my hero forever.
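Recommended answer
Instead of rejecting byte ranges, try matching the actual Unicode characters you want to allow. JavaScript regexes operate on the string's UTF-16 code units, not on UTF-8 bytes, so \x7F-\xFF only covers U+007F through U+00FF and characters such as 日 (U+65E5) fall outside the rejected ranges. For example: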
^[\u0020-\u007e]+$
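As a rough sketch of the difference, the snippet below runs both patterns against a few sample strings. The names `original`, `printableAscii`, and `isPrintableAscii` are made up for illustration and are not part of the question or the answer.

```javascript
// Pattern from the question: rejects only U+0000-U+001F and U+007F-U+00FF.
// Anything above U+00FF (e.g. CJK, Arabic, Devanagari) is still accepted.
const original = /^[^\x00-\x1F\x7F-\xFF]+$/;

// Pattern from the answer: accepts only printable US-ASCII,
// i.e. space (U+0020) through tilde (U+007E).
const printableAscii = /^[\u0020-\u007e]+$/;

// Hypothetical helper wrapping the allow-list pattern.
function isPrintableAscii(text) {
  return printableAscii.test(text);
}

console.log(original.test("日本"));            // accepted by the original pattern
console.log(isPrintableAscii("日本"));         // rejected by the allow-list pattern
console.log(isPrintableAscii("hello world")); // plain ASCII passes
console.log(isPrintableAscii("tab\there"));   // control characters are rejected
```

An allow-list like this tends to be easier to reason about than a deny-list, because every character outside the stated range is rejected regardless of how it would be encoded in UTF-8.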