Chrome 75 regexp, 'S' matches strange unicode range


Problem description

We have a strange bug on the latest version of Chrome (75) that replaces S with &#83;

// Encode non-ASCII characters and <, >, & as numeric HTML entities.
console.log(
  'AZERTYUIOPQSDFGHJKLMWXCVBN'.replace(/[\u00A0-\u9999<>&]/gim, char => `&#${char.charCodeAt(0)};`)
)
// Chrome 75 prints: AZERTYUIOPQ&#83;DFGHJKLMWXCVBN

Does anyone know whether the problem is in the code or in Chrome?

Recommended answer

Fixed in 75.0.3770.142.
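
For browsers still on an affected build, one possible workaround is to drop the i flag from the pattern in the question. This is only a sketch: the function name encodeEntities is hypothetical, and it assumes case-insensitive matching is not actually needed for this character class (the m flag is also dropped, since the pattern has no ^ or $ anchors).

```javascript
// Hypothetical helper: same replacement as in the question, without the `i` flag.
// Assumption: case-insensitive matching is not required for this character class.
function encodeEntities(text) {
  return text.replace(/[\u00A0-\u9999<>&]/g, char => `&#${char.charCodeAt(0)};`);
}

// Plain ASCII letters should pass through untouched.
console.log(encodeEntities('AZERTYUIOPQSDFGHJKLMWXCVBN'));
// Non-ASCII characters and <, >, & are still encoded.
console.log(encodeEntities('\u00E9<&'));
```

Without the i flag no case canonicalization takes place, so the ſ/S interaction described in this answer should not come into play.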

You have found an interesting bug:

These two tests are true for some reason that hinges on the unrelated character range:

> /[\u0178-\u017F]/i.test('s')
true
> /[\u0178-\u017F]/i.test('S')
true
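
One plausible reading of why this range matters: it ends at U+017F (ſ, LATIN SMALL LETTER LONG S), whose toUpperCase() result is the ASCII letter S. The sketch below uses a hypothetical helper, asciiUppercaseInRange, to list the characters in a range whose uppercase form is a single ASCII character; it illustrates the case mapping only and is not a description of V8's internals.

```javascript
// Hypothetical helper: collect characters in [start, end] whose uppercase
// form is a single ASCII character.
function asciiUppercaseInRange(start, end) {
  const hits = [];
  for (let code = start; code <= end; code++) {
    const ch = String.fromCharCode(code);
    const upper = ch.toUpperCase();
    if (upper.length === 1 && upper.charCodeAt(0) < 128) {
      hits.push(ch);
    }
  }
  return hits;
}

const hits = asciiUppercaseInRange(0x0178, 0x017F);
// Print each hit as "U+XXXX -> uppercase form".
console.log(
  hits.map(ch => `U+${ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0')} -> ${ch.toUpperCase()}`)
);
```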

https://chromium-review.googlesource.com/c/v8/v8/+/1478710 (April).

The fix in https://chromium-review.googlesource.com/c/v8/v8/+/1648098 seems related, but Canary 77.0.3818.0 with v8 7.7.27 still exhibits this behavior. This is a separate bug: https://crbug.com/971636

The bug that introduced the issue (https://bugs.chromium.org/p/v8/issues/detail?id=8348) discusses how ECMAScript treats i and u differently:

  • i alone calls toUpperCase, which uses case mapping
  • iu invokes Unicode case folding

These are slightly different (this bug notwithstanding).
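
A small sketch of that difference, using ſ (U+017F) as the example. It assumes an engine that follows the specification (for instance a Chrome build with the fixes mentioned here, or a current Node.js); on an affected build the results may differ.

```javascript
const longS = '\u017F'; // ſ, LATIN SMALL LETTER LONG S

const results = {
  // `i` alone: canonicalization goes through toUpperCase, and a non-ASCII
  // character is not allowed to canonicalize to an ASCII one.
  iOnlyUpper: new RegExp(longS, 'i').test('S'),
  iOnlyLower: new RegExp(longS, 'i').test('s'),
  // `iu`: Unicode simple case folding maps ſ, s and S to the same character.
  iuUpper: new RegExp(longS, 'iu').test('S'),
  iuLower: new RegExp(longS, 'iu').test('s'),
};

console.log(results);
```

On a conforming engine the two i-only tests should be false and the two iu tests should be true.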

I also found what seems to be a different bug:

Here's a small test case, although the fix in v8 refers to Turkish case folding:

> text='ſ';
"ſ"
> new RegExp(text, 'i').test(text.toUpperCase())
true
> new RegExp(text, 'i').test('S')
false

It was introduced in the same revision, but it isn't quite the same bug: it's specific to the ſ character, whose uppercase version lies in the ASCII range and therefore triggers a different code path in V8's regexp compiler. Fixed separately at https://chromium-review.googlesource.com/c/v8/v8/+/1827683
