Why does `/\:/u` throw an “invalid escape” error?
Problem description
I have code like this:
url.match(/^https?\:\/\/([^\/:?#]+)(?:[\/:?#]|$)/ui)
ESLint says: Parsing error: Invalid regular expression: /^https?\:\/\/([^\/:?#]+)(?:[\/:?#]|$)/: Invalid escape
I don’t see why this regular expression is wrong. How should I fix it?
Recommended answer
Unnecessary escape sequences are invalid with the `u` flag
`\:` is an unnecessary escape sequence. Those are invalid when using the `u` flag. Just use `:` instead.
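Applied to the regex from the question, the fix amounts to dropping the backslash before the colon. The following is a minimal sketch; the `hostOf` helper name and the sample URLs are made up for illustration and are not part of the original question.

```javascript
// Hypothetical helper: extracts the host using the corrected pattern.
// The only change from the question's regex is `\:` -> `:`.
function hostOf(url) {
  const match = url.match(/^https?:\/\/([^\/:?#]+)(?:[\/:?#]|$)/ui);
  return match ? match[1] : null;
}

console.log(hostOf("https://example.com/path")); // "example.com"
console.log(hostOf("ftp://example.com"));        // null
```

The `\/` escapes can stay: `\/` remains a valid identity escape under the `u` flag, both inside and outside character classes.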
These are the valid and necessary escape sequences of special characters outside of character classes: `\$`, `\(`, `\)`, `\*`, `\+`, `\.`, `\?`, `\[`, `\\`, `\]`, `\^`, `\{`, `\|`, `\}` (all "syntax characters"), and `\/` (special case of an identity escape).
Other escape sequences like `\ ` (an escaped space), `\!`, `\"`, `\#`, `\%`, `\&`, `\'`, `\,`, `\-`, `\:`, `\;`, `\<`, `\=`, `\>`, `\@`, `\_`, `` \` ``, `\~` are unnecessary and thus invalid with the `u` flag.
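One way to check a given escape is to compile it with and without the `u` flag and see whether a `SyntaxError` is thrown. This is only a rough sketch; the `compiles` helper is a hypothetical name introduced here, and the escapes tested are a small sample of the lists above.

```javascript
// Hypothetical helper: returns true if `source` compiles with the given flags.
function compiles(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// String.raw keeps the backslashes exactly as written.
const samples = [String.raw`\:`, String.raw`\-`, String.raw`\@`, String.raw`\.`, String.raw`\/`];

for (const source of samples) {
  console.log(source, "without u:", compiles(source), "with u:", compiles(source, "u"));
}
// \: \- \@ compile only without the u flag; \. and \/ compile either way.
```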
See the specification for all the escaping rules in detail (see footnote 1).
Tools like RegEx101 report this, though the message is a bit cryptic. For `/\:/u` it says:
`\:`: "This token has no special meaning and has thus been rendered erroneous"
As for documentation, there is this note in the regular expression cheatsheet on MDN:
Note that some characters like `:`, `-`, `@`, etc. neither have a special meaning when escaped nor when unescaped. Escape sequences like `\:`, `\-`, `\@` will be equivalent to their literal, unescaped character equivalents in regular expressions. However, in regular expressions with the unicode flag, these will cause an invalid identity escape error.
Rationale
The note continues:
This is done to ensure backward compatibility with existing code that uses new escape sequences like `\p` or `\k`.
When the feature was proposed and introduced, this is what the proposal’s FAQ had to say:
What about backwards compatibility?
In regular expressions without the `u` flag, the pattern `\p` is an (unnecessary) escape sequence for `p`. Patterns of the form `\p{Letter}` might already be present in existing regular expressions without the `u` flag, and therefore we cannot assign new meaning to such patterns without breaking backwards compatibility.
For this reason, ECMAScript 2015 made unnecessary escape sequences like `\p` and `\P` throw an exception when the `u` flag is set. This enables us to change the meaning of `\p{…}` and `\P{…}` in regular expressions with the `u` flag without breaking backwards compatibility.
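The difference can be seen by running the same pattern source with and without the `u` flag. This is a small sketch that assumes an engine with Unicode property escape support (ES2018 or later); the variable names are illustrative only.

```javascript
// Same pattern source, two different meanings.
const legacy = /\p{Letter}/;   // \p is an identity escape for "p"
const unicode = /\p{Letter}/u; // \p{Letter} is a Unicode property escape

// Without u, the pattern matches the literal text "p{Letter}".
console.log(legacy.test("p{Letter}")); // true
console.log(legacy.test("é"));         // false

// With u, it matches any character in the Letter category.
console.log(unicode.test("é"));        // true
console.log(unicode.test("1"));        // false
```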
This page is also linked from this ES Discuss thread where this question has been raised:
Why is RegExp `/\-/u` a syntax error?
JSLint previously warned against unescaped literal `-` in RegExp. However, escaping `-` together with unicode flag `u` causes a syntax error in Chrome, Firefox, and Edge (and JSLint has since removed the warning). Just curious about the reason why the above edge-case is a syntax error.
(I adjusted the grammar slightly.)
The responses link to the above GitHub repo with the proposal, but also explain the rationale in a different way:
Think of the `u` flag as a strict mode for regular expressions.
So, whenever you use the `u` flag, keep this in mind. RegExps begin to behave a little differently as soon as you use `u`. Certain new things become valid, but certain other things become invalid, too. For example, also see "Why is `/[\w-+]/` a valid regex but `/[\w-+]/u` invalid?".
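The character class case mentioned above can be reproduced in the same way. In this rough sketch the variable names are illustrative; it builds both variants from one source string and records the error thrown for the `u` variant.

```javascript
// One pattern source, compiled with and without the u flag.
const source = String.raw`[\w-+]`;

// Without u, the "-" after \w is tolerated and treated as a literal hyphen.
const legacyClass = new RegExp(source);
console.log(legacyClass.test("-")); // true

// With u, a class escape such as \w cannot be the endpoint of a range.
let unicodeError = null;
try {
  new RegExp(source, "u");
} catch (err) {
  unicodeError = err;
}
console.log(unicodeError instanceof SyntaxError); // true
```

Moving the hyphen to the end of the class, as in `/[\w+-]/u`, avoids the ambiguity because it no longer sits between two class members.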
1: You’ll find certain production rules with `[U]`, which is a parameter that represents Unicode patterns. See the grammar notation reference for decoding these.