.NET RegEx "Memory Leak" investigation


Question

I recently looked into some .NET "memory leaks" (i.e. unexpected, lingering GC-rooted objects) in a WinForms app. After loading and then closing a huge report, the memory usage did not drop as expected, even after a couple of gen2 collections. Assuming that the reporting control was being kept alive by a stray event handler, I cracked open WinDbg to see what was happening...

Using WinDbg, the !dumpheap -stat command reported that a large amount of memory was consumed by string instances. Narrowing this down with the !dumpheap -type System.String command, I found the culprit: a 90MB string used for the report, at address 03be7930. The last step was to invoke !gcroot 03be7930 to see which object(s) were keeping it alive.

My expectations were incorrect. It was not an unhooked event handler hanging onto the reporting control (and report string); instead, the string was held by a System.Text.RegularExpressions.RegexInterpreter instance, which itself is a descendant of a System.Text.RegularExpressions.CachedCodeEntry. Now, the caching of Regexes is (somewhat) common knowledge, as it helps to reduce the overhead of recompiling the Regex each time it is used. But what does this have to do with keeping my string alive?

Based on analysis using Reflector, it turns out that the input string is stored in the RegexInterpreter whenever a Regex method is called. The RegexInterpreter holds onto this string reference until a new string is fed into it by a subsequent Regex method invocation. I'd expect similar behaviour from hanging onto Regex.Match instances, and perhaps others. The chain is something like this:

  • Regex.Split, Regex.Match, Regex.Replace, etc.
    • Regex.Run
      • RegexScanner.Scan (RegexScanner is the base class, RegexInterpreter is the subclass described above).
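
One way to check for this retention without a debugger is to hold only a weak reference to the large input and see whether it survives a forced collection. The following is a minimal sketch under stated assumptions: the class and method names (RegexRetentionProbe, SplitLargeInput) are hypothetical, the input size is arbitrary, and the result will likely depend on the runtime version and build configuration, so no particular output is claimed here.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Text.RegularExpressions;

public static class RegexRetentionProbe
{
    // Hypothetical helper: runs a static Regex method over a large throwaway
    // string and returns only a weak reference to it, so that nothing in our
    // own code keeps the string alive afterwards.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference SplitLargeInput()
    {
        string report = new string('x', 10 * 1000 * 1000) + ",tail";

        // Static overload taking a pattern string: this is the path that
        // goes through the Regex cache described above.
        Regex.Split(report, ",");

        return new WeakReference(report);
    }

    public static void Main()
    {
        WeakReference weak = SplitLargeInput();

        // Force full collections before checking reachability.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine("Large input still reachable: " + weak.IsAlive);
    }
}
```

If the string is reported as still reachable, !gcroot on its address should show what is holding it.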

The offending Regex is used only for reporting, and rarely at that, so it is unlikely to be invoked again and clear out the existing report string. And even if the Regex were used at a later point, it would probably be processing another large report. This is a relatively significant problem and just plain feels dirty.

All that said, I found a few options on how to resolve, or at least work around, this scenario. I'll let the community respond first and if no takers come forward I will fill in any gaps in a day or two.

Recommended answer

Are you using instances of Regex, or the static Regex methods that take a string pattern? According to this post (http://blogs.msdn.com/bclteam/archive/2006/10/19/regex-class-caching-changes-between-net-framework-1-1-and-net-framework-2-0-josh-free.aspx), Regex instances do not participate in the caching.
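
To make the distinction concrete, here is a small sketch contrasting the two call styles. The class and method names (ReportSplitter, SplitWithInstance, SplitWithStatic) are hypothetical. It assumes the behaviour described in the linked post: a Regex created with new is not added to the static cache, so once the instance is unreachable, whatever it references can be collected along with it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReportSplitter
{
    // Instance style: the Regex is a local, so it becomes unreachable when
    // this method returns, together with any state it holds internally.
    public static string[] SplitWithInstance(string report, string pattern)
    {
        Regex regex = new Regex(pattern);
        return regex.Split(report);
    }

    // Static style: the overload that takes a pattern string, which is the
    // one that goes through the Regex cache.
    public static string[] SplitWithStatic(string report, string pattern)
    {
        return Regex.Split(report, pattern);
    }
}
```

Both methods return the same pieces for the same input; they differ only in how long the Regex machinery behind the call may stay alive.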
