.NET Core 2.1 - Regex in loop 200x slower than 2.0 (3x in simple benchmark)
Question
I have the following regex:
var regex = new Regex(
    @"^ActiveMQ[\d.-]*$",
    RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
It runs over ~1000 strings (one IsMatch call each). In .NET Core 2.0 this takes around 10 ms. After migrating to .NET Core 2.1 it takes over 2 seconds on the same data.
Any idea what's going on? Are there any behavior changes in 2.1?
======================
Update: BenchmarkDotNet
Reproducible 3x drop (just run, change netcoreapp2.1 to netcoreapp2.0 in the csproj file, run again).
https://github.com/ptupitsyn/netcore2.1-regex-perf/tree/master/src
- Simplifying the actual application as much as possible has reduced the drop, but it is still very much visible.
- Flipping the nested loops in GetPackageInfos2 reduces the perf drop to only 25%, but it is still there. Changing this in real-world code is not trivial and I would like to avoid this kind of refactoring.
- There are multiple regexes executed in a loop, and I could not reproduce the drop with only one regex.
Update 2
Removing RegexOptions.Compiled solves the problem!
Recommended answer
RegexOptions.Compiled is not implemented in .NET Core 2.0, but is implemented in .NET Core 2.1.
Compilation involves initial overhead, and for certain usage patterns this overhead outweighs the gains of a compiled regex.
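The trade-off described above can be observed with a rough timing sketch like the one below. It is not the benchmark from the question. The class name RegexOverheadDemo, the Measure helper, and the input strings are made up for illustration, and the pattern assumes the digit class \d was intended. A single Stopwatch run is only indicative; a harness such as BenchmarkDotNet gives more reliable numbers.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexOverheadDemo
{
    // Assumed pattern: same shape as in the question, with \d as the digit class.
    private const string Pattern = @"^ActiveMQ[\d.-]*$";

    // Times Regex construction and the matching loop separately,
    // and returns the number of inputs that matched.
    public static (TimeSpan Construct, TimeSpan Match, int Matches) Measure(
        RegexOptions options, string[] inputs)
    {
        var sw = Stopwatch.StartNew();
        var regex = new Regex(Pattern, options);
        var construct = sw.Elapsed;

        sw.Restart();
        var matches = 0;
        foreach (var input in inputs)
        {
            if (regex.IsMatch(input))
            {
                matches++;
            }
        }

        return (construct, sw.Elapsed, matches);
    }

    public static void Main()
    {
        // Hypothetical inputs; the real data set is not part of the question.
        var inputs = new[] { "ActiveMQ", "ActiveMQ-5.15.4", "activemq.1", "RabbitMQ-3.7" };
        var baseOptions = RegexOptions.IgnoreCase | RegexOptions.CultureInvariant;

        var interpreted = Measure(baseOptions, inputs);
        var compiled = Measure(baseOptions | RegexOptions.Compiled, inputs);

        Console.WriteLine($"Interpreted: construct {interpreted.Construct}, match {interpreted.Match}");
        Console.WriteLine($"Compiled:    construct {compiled.Construct}, match {compiled.Match}");
    }
}
```

If the regex is constructed once and reused for many inputs, the one-time compilation cost is spread over all of those calls. If it is constructed often or used only a few times, the cost may dominate, which is the usage pattern the answer refers to.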
My case is somewhat complex, and it seems there might be a bug in .NET, because even with a proper benchmark (with warm-up), Compiled mode is slower. See details in the corefx issue: https://github.com/dotnet/corefx/issues/30131