JavaScript regex pattern for any visible Unicode letter characters
Problem description
I'm working on a JavaScript application that requires me to identify the set of "any visible Unicode letter characters, digits (0-9), spaces, underscores, and periods". The suggested regex pattern is ^[0-9\p{L} _\.]+$, but that doesn't seem to work in JavaScript. The part that is giving me trouble is "any visible Unicode letter characters", because that includes non-English characters. Is there a JavaScript regex pattern that can identify the Unicode letter character set?
Recommended answer
Use the XRegExp library to parse your current regular expression:
// XRegExp adds \p{L} (any Unicode letter) on top of native regex syntax
var pattern = new XRegExp("^[0-9\\p{L} _.]+$");
var s = "123 Московская Street.";
if (XRegExp.test(s, pattern)) {
  console.log("Valid");
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.2.0/xregexp-all.min.js"></script>
Note that ^[0-9\p{L} _\.]+$ matches:
- ^ - start of string
- [0-9\p{L} _\.]+ - one or more chars that are:
  - 0-9 - ASCII digits
  - \p{L} - letters
  - a literal space
  - _ - an underscore
  - \. - a dot (inside a character class, . matches a literal dot, so there is no need to escape it)
- $ - end of string
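As a side note to the breakdown above, engines that implement ES2018 Unicode property escapes accept the same character class natively when the regex carries the `u` flag, so XRegExp is not strictly required there. The following is a minimal sketch under that assumption; the names `namePattern` and `isValidName` are made up for illustration and are not part of the original answer.

```javascript
// Assumes an ES2018+ engine: the `u` flag is what enables \p{L}.
const namePattern = /^[0-9\p{L} _.]+$/u;

// Hypothetical helper wrapping the test.
function isValidName(s) {
  return namePattern.test(s);
}

// Without the `u` flag, \p is just an escaped "p", so the class only
// contains digits, "p", "{", "L", "}", space, underscore and dot.
const withoutFlag = /^[0-9\p{L} _.]+$/;

console.log(isValidName("123 Московская Street.")); // true
console.log(isValidName("bad!name"));               // false
console.log(withoutFlag.test("Московская"));        // false
```

The last line suggests why the pattern from the question appears not to work when written as a plain JavaScript regex literal without the flag.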
If you also want to enforce the following conditions:
- The name must be at least 3 characters long and no more than 16 characters.
- No player name may contain the word Riot.
You may extend the pattern to the following:
var pattern = new XRegExp("^(?!.*\\bRiot\\b)[0-9\\p{L} _\\.]{3,16}$");
//                          ^^^^^^^^^^^^^^^^                ^^^^^^
where + (1 or more occurrences) is replaced with the {3,16} limiting quantifier (3 to 16 occurrences), and the (?!.*\bRiot\b) negative lookahead fails the match if the whole word Riot (whole because of the \b word boundaries) appears anywhere inside the string (or line, since . matches any char but line break chars).
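To see how the length limit and the lookahead behave on sample input, here is a small sketch. It swaps XRegExp for a native regex with the `u` flag (an assumption: an ES2018+ engine) so that it runs without loading the library; `playerNamePattern` and `isValidPlayerName` are hypothetical names chosen for this example.

```javascript
// Native `u`-flag regex used in place of XRegExp (assumes ES2018+ support).
// (?!.*\bRiot\b) rejects names containing the whole word "Riot";
// {3,16} limits the name to 3..16 characters from the allowed set.
const playerNamePattern = /^(?!.*\bRiot\b)[0-9\p{L} _.]{3,16}$/u;

// Hypothetical helper wrapping the test.
function isValidPlayerName(name) {
  return playerNamePattern.test(name);
}

console.log(isValidPlayerName("Московская 12")); // true
console.log(isValidPlayerName("ab"));            // false: shorter than 3
console.log(isValidPlayerName("a".repeat(17)));  // false: longer than 16
console.log(isValidPlayerName("Riot Games"));    // false: contains the word Riot
console.log(isValidPlayerName("Rioter_01"));     // true: "Riot" is not a whole word here
```

The lookahead runs once at the start of the string, before any characters are consumed, which is why it can veto the match regardless of where Riot appears.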