.NET Core 2.1 - Regex in loop 200x slower than 2.0 (3x in simple benchmark)


Question

I have the following Regex:

    var regex = new Regex(
        @"^ActiveMQ[\d.-]*$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

It runs over ~1000 strings (one IsMatch call per string). In .NET Core 2.0 this takes around 10 ms. After migrating to .NET Core 2.1, it takes over 2 seconds on the same data.

Any idea what's going on? Are there any behavior changes in 2.1?
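
For reference, the measurement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the input strings are made up, the class and method names (RegexLoopTiming, CountMatches) are hypothetical, and the character class is assumed to be `[\d.-]`. It is not the code from the real application.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexLoopTiming
{
    // Same options as in the question.
    private const RegexOptions Options =
        RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant;

    // Counts how many inputs match, using one IsMatch call per input.
    public static int CountMatches(Regex regex, string[] inputs)
    {
        int count = 0;
        foreach (string input in inputs)
        {
            if (regex.IsMatch(input))
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical data: ~1000 made-up package names.
        string[] inputs = Enumerable.Range(0, 1000)
            .Select(i => i % 2 == 0 ? "ActiveMQ-5.15." + i : "Other.Package." + i)
            .ToArray();

        // The timed region includes constructing the Regex as well as matching.
        var stopwatch = Stopwatch.StartNew();
        var regex = new Regex(@"^ActiveMQ[\d.-]*$", Options);
        int matches = CountMatches(regex, inputs);
        stopwatch.Stop();

        Console.WriteLine($"{matches} matches in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Running the same program against netcoreapp2.0 and netcoreapp2.1 gives a rough comparison; a single Stopwatch run is noisy, so treat the numbers as indicative only.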

======================

Update: BenchmarkDotNet

Reproducible 3x drop (just run, change netcoreapp2.1 to netcoreapp2.0 in the csproj file, and run again): https://github.com/ptupitsyn/netcore2.1-regex-perf/tree/master/src

  • Simplifying the actual application as much as possible has reduced the drop, but it is still very much visible.
  • Flipping the nested loops in GetPackageInfos2 reduces the perf drop to only 25%, but it is still there. Changing this in real-world code is not trivial, and I would like to avoid this kind of refactoring.
  • Multiple Regex instances are executed in a loop, and I could not reproduce the drop with only one Regex.
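
To make "flipping the nested loops" concrete, here is a minimal sketch of the two possible loop orders when several Regex instances are applied to many strings. Everything in it is hypothetical (class name, method names, patterns, inputs); it is not the GetPackageInfos2 code from the repository, and it does not claim which order the original used.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LoopOrderSketch
{
    // Order A: for each input string, try every regex.
    public static int CountInputsOuter(IReadOnlyList<Regex> regexes, IReadOnlyList<string> inputs)
    {
        int count = 0;
        foreach (string input in inputs)
        {
            foreach (Regex regex in regexes)
            {
                if (regex.IsMatch(input)) count++;
            }
        }
        return count;
    }

    // Order B: for each regex, scan all input strings.
    public static int CountRegexesOuter(IReadOnlyList<Regex> regexes, IReadOnlyList<string> inputs)
    {
        int count = 0;
        foreach (Regex regex in regexes)
        {
            foreach (string input in inputs)
            {
                if (regex.IsMatch(input)) count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        const RegexOptions options =
            RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant;

        // Hypothetical patterns and inputs, for illustration only.
        var regexes = new[]
        {
            new Regex(@"^ActiveMQ[\d.-]*$", options),
            new Regex(@"^Apache\.NMS[\d.-]*$", options),
        };
        var inputs = new[] { "ActiveMQ-5.15", "Apache.NMS.1.7", "Other.Package" };

        // Both orders perform the same IsMatch calls and produce the same count;
        // only the order in which the regexes are exercised differs.
        Console.WriteLine(CountInputsOuter(regexes, inputs));
        Console.WriteLine(CountRegexesOuter(regexes, inputs));
    }
}
```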

Update 2

Removing RegexOptions.Compiled solves the problem!

Recommended answer

RegexOptions.Compiled is not implemented in .NET Core 2.0 (the flag is accepted but has no effect), but it is implemented in .NET Core 2.1.

Compilation involves an initial overhead, and for certain usage patterns this overhead outweighs the gains of a compiled regex.
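
One rough way to see this overhead is to time constructing a Regex plus its first IsMatch call, with and without RegexOptions.Compiled. The sketch below is an assumption-laden illustration, not a proper benchmark: the class and method names are hypothetical, the character class is assumed to be `[\d.-]`, and a single Stopwatch measurement is noisy.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class CompiledOverheadSketch
{
    // Times constructing a Regex and running its first IsMatch call.
    public static TimeSpan TimeConstructAndFirstMatch(
        string pattern, RegexOptions options, string input)
    {
        var stopwatch = Stopwatch.StartNew();
        var regex = new Regex(pattern, options);
        regex.IsMatch(input);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        const string pattern = @"^ActiveMQ[\d.-]*$";
        const string input = "ActiveMQ-5.15.0"; // made-up sample input
        const RegexOptions baseOptions =
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant;

        // Without Compiled: the pattern is interpreted.
        TimeSpan interpreted = TimeConstructAndFirstMatch(pattern, baseOptions, input);

        // With Compiled: on runtimes that implement it, the pattern is compiled to IL.
        TimeSpan compiled = TimeConstructAndFirstMatch(
            pattern, baseOptions | RegexOptions.Compiled, input);

        Console.WriteLine($"Interpreted: {interpreted.TotalMilliseconds} ms");
        Console.WriteLine($"Compiled:    {compiled.TotalMilliseconds} ms");
    }
}
```

If the Regex is created once and reused for a very large number of matches, the one-time cost can pay off; if it is created often or used only a few times, dropping RegexOptions.Compiled (as in Update 2 of the question) may be the faster choice. For trustworthy numbers, a BenchmarkDotNet run with warm-up is preferable to this kind of one-shot timing.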

My case is somewhat complex, and it seems there might be a bug in .NET, because even with a proper benchmark (with warm-up), Compiled mode is slower. See details in the corefx issue: https://github.com/dotnet/corefx/issues/30131
