How to debug corruption in the managed heap


Question

My program throws an error that a catch(Exception e) block cannot handle, and then it crashes:

Access Violation Corrupted State Exception.

This is weird because, as far as I know, corrupted state exceptions are thrown from unmanaged code, whereas here I get the exception while calling a StringBuilder method.

The code runs in a background thread and crashes from time to time in a way that cannot be easily reproduced. So I attached WinDbg to the process and got the following exception stack:

000000001dabd8c8 000007feea129a1d [HelperMethodFrame: 000000001dabd8c8]
000000001dabda00 000007fee90cfce8 System.Text.StringBuilder.ExpandByABlock(Int32)
000000001dabda40 000007fee90cfba4 System.Text.StringBuilder.Append(Char*, Int32)
000000001dabdaa0 000007fee9102955 System.Text.StringBuilder.Append(System.String, Int32, Int32)
000000001dabdaf0 000007ff00bf5ce3 MineUtils.Common.Strings.Strings.Replace(System.String, System.String, System.String, Boolean, Boolean)
000000001dabdb90 000007ff00bf5a59 MineUtils.Common.Strings.Strings.RemoveSubstrings(System.String, System.String, System.String, Boolean) [D:\Programs\Visual Studio 2005 Projects\MineUtils.Common\Strings\Strings.Common-Main.cs @ 1481]

WinDbg shows this exception occurred:

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000007feea129a1d (clr!WKS::gc_heap::find_first_object+0x0000000000000092)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000003d80
Attempt to read from address 0000000000003d80

I read that such exceptions can be handled with the method attribute [HandleProcessCorruptedStateExceptions], but why does this exception occur at all if I only use StringBuilder?

This is an earlier WinDbg analysis (here StringBuilder.ToString() triggers the exception):

*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

FAULTING_IP:
clr!WKS::gc_heap::find_first_object+92
000007fe`ea129a1d f70100000080    test    dword ptr [rcx],80000000h

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000007feea129a1d (clr!WKS::gc_heap::find_first_object+0x0000000000000092)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000001
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000001c98
Attempt to read from address 0000000000001c98

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000000000001c98

READ_ADDRESS:  0000000000001c98

FOLLOWUP_IP:
clr!WKS::gc_heap::find_first_object+92
000007fe`ea129a1d f70100000080    test    dword ptr [rcx],80000000h

MOD_LIST: <ANALYSIS/>

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

MANAGED_STACK:
(TransitionMU)
000000001AB7DFC0 000007FEE90CFE07 mscorlib_ni!System.Text.StringBuilder.ToString()+0x27
000000001AB7E010 000007FF00C750A9 SgmlReaderDll!Sgml.Entity.ScanToken(System.Text.StringBuilder, System.String, Boolean)+0x169
000000001AB7E080 000007FF00C760E6 SgmlReaderDll!Sgml.SgmlDtd.ParseParameterEntity(System.String)+0xc6
000000001AB7E0F0 000007FF00C76FD8 SgmlReaderDll!Sgml.SgmlDtd.ParseModel(Char, Sgml.ContentModel)+0x298
000000001AB7E160 000007FF00C7701C SgmlReaderDll!Sgml.SgmlDtd.ParseModel(Char, Sgml.ContentModel)+0x2dc
000000001AB7E1D0 000007FF00C7701C SgmlReaderDll!Sgml.SgmlDtd.ParseModel(Char, Sgml.ContentModel)+0x2dc
000000001AB7E240 000007FF00C76BA5 SgmlReaderDll!Sgml.SgmlDtd.ParseContentModel(Char)+0x65
000000001AB7E290 000007FF00C763D7 SgmlReaderDll!Sgml.SgmlDtd.ParseElementDecl()+0xe7
000000001AB7E320 000007FF00C747A1 SgmlReaderDll!Sgml.SgmlDtd.Parse()+0xc1
000000001AB7E370 000007FF00C73EF5 SgmlReaderDll!Sgml.SgmlDtd.Parse(System.Uri, System.String, System.IO.TextReader, System.String, System.String, System.Xml.XmlNameTable)+0x175
000000001AB7E410 000007FF00C73B33 SgmlReaderDll!Sgml.SgmlReader.LazyLoadDtd(System.Uri)+0x163
000000001AB7E480 000007FF00C737B9 SgmlReaderDll!Sgml.SgmlReader.OpenInput()+0x19
000000001AB7E4E0 000007FF00C7334C SgmlReaderDll!Sgml.SgmlReader.Read()+0x1c
000000001AB7E530 000007FEE5983C4C System_Xml_ni!System.Xml.XmlLoader.Load(System.Xml.XmlDocument, System.Xml.XmlReader, Boolean)+0xac
000000001AB7E590 000007FEE5983730 System_Xml_ni!System.Xml.XmlDocument.Load(System.Xml.XmlReader)+0x90
...
000000001AB7F0A0 000007FEE97ED792 mscorlib_ni!System.Threading.Tasks.Task.Execute()+0x82
000000001AB7F100 000007FEE90A181C mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+0xdc
000000001AB7F160 000007FEE97E7F95 mscorlib_ni!System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)+0x1b5
000000001AB7F1E0 000007FEE97E7D90 mscorlib_ni!System.Threading.Tasks.Task.ExecuteEntry(Boolean)+0xb0
000000001AB7F220 000007FEE90EBA83 mscorlib_ni!System.Threading.ThreadPoolWorkQueue.Dispatch()+0x193
000000001AB7F2C0 000007FEE90EB8D5 mscorlib_ni!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()+0x35
(TransitionUM)

EXCEPTION_OBJECT: !pe 2a61228
Exception object: 0000000002a61228
Exception type:   System.ExecutionEngineException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80131506

MANAGED_OBJECT_NAME:  System.ExecutionEngineException

MANAGED_STACK_COMMAND:  _EFN_StackTrace

LAST_CONTROL_TRANSFER:  from 000007feea12bce4 to 000007feea129a1d

ADDITIONAL_DEBUG_TEXT:  Followup set based on attribute [Is_ChosenCrashFollowupThread] from Frame:[0] on thread:[PSEUDO_THREAD]

FAULTING_THREAD:  ffffffffffffffff

DEFAULT_BUCKET_ID:  INVALID_POINTER_READ_CALL

PRIMARY_PROBLEM_CLASS:  INVALID_POINTER_READ_CALL

BUGCHECK_STR:  APPLICATION_FAULT_INVALID_POINTER_READ_WRONG_SYMBOLS_CALL__SYSTEM.EXECUTIONENGINEEXCEPTION

UPDATED AGAIN

Here is the WinDbg stack of the exception after I enabled page heap:

 (1480.e84): Access violation - code c0000005 (first chance)
ntdll!ZwTerminateProcess+0xa:
00000000`77c415da c3              ret
0:023> !clrstack
OS Thread Id: 0xe84 (23)
Child SP         IP               Call Site
0000000037ded848 0000000077c415da [HelperMethodFrame: 0000000037ded848]
0000000037dedab0 000007fee9effd17 System.Text.StringBuilder.ToString()*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\mscorlib\8f7f691aa155c11216387cf3420d9d1b\mscorlib.ni.dll

0000000037dedb00 000007ff00cceae9 Sgml.Entity.ScanToken(System.Text.StringBuilder, System.String, Boolean)

0000000037dedb70 000007ff00cd19b2 Sgml.SgmlDtd.ParseAttDefault(Char, Sgml.AttDef)
0000000037dedbc0 000007ff00cd120b Sgml.SgmlDtd.ParseAttDef(Char)
0000000037dedc00 000007ff00cd1057 Sgml.SgmlDtd.ParseAttList(System.Collections.Generic.Dictionary`2<System.String,Sgml.AttDef>, Char)
0000000037dedc50 000007ff00cd10cd Sgml.SgmlDtd.ParseAttList(System.Collections.Generic.Dictionary`2<System.String,Sgml.AttDef>, Char)
0000000037dedca0 000007ff00cd0e9a Sgml.SgmlDtd.ParseAttList()
0000000037dedd10 000007ff00cce1f1 Sgml.SgmlDtd.Parse()
0000000037dedd60 000007ff00ccd945 Sgml.SgmlDtd.Parse(System.Uri, System.String, System.IO.TextReader, System.String, System.String, System.Xml.XmlNameTable)
0000000037dede00 000007ff00ccd582 Sgml.SgmlReader.LazyLoadDtd(System.Uri)
0000000037dede70 000007ff00ccd1f9 Sgml.SgmlReader.OpenInput()
0000000037deded0 000007ff00cccd8c Sgml.SgmlReader.Read()
0000000037dedf20 000007fee67b3bfc System.Xml.XmlLoader.Load(System.Xml.XmlDocument, System.Xml.XmlReader, Boolean)*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\System.Xml\8e4323f5bfb90be4621456033d8b404b\System.Xml.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for C:\Windows\assembly\NativeImages_v4.0.30319_64\System.Xml\8e4323f5bfb90be4621456033d8b404b\System.Xml.ni.dll

0000000037dedf80 000007fee67b36e0 System.Xml.XmlDocument.Load(System.Xml.XmlReader)
[deleted]
0000000037deea90 000007feea61d432 System.Threading.Tasks.Task.Execute()
0000000037deeaf0 000007fee9ed17ec System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
0000000037deeb50 000007feea617c35 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
0000000037deebd0 000007feea617a30 System.Threading.Tasks.Task.ExecuteEntry(Boolean)
0000000037deec10 000007fee9f1b953 System.Threading.ThreadPoolWorkQueue.Dispatch()
0000000037deecb0 000007fee9f1b7a5 System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
0000000037def310 000007feeae4dc54 [DebuggerU2MCatchHandlerFrame: 0000000037def310]
0:023> !verifyheap
-verify will only produce output if there are errors in the heap
The garbage collector data structures are not in a valid state for traversal.
It is either in the "plan phase," where objects are being moved around, or
we are at the initialization or shutdown of the gc heap. Commands related to
displaying, finding or traversing objects as well as gc heap segments may not
work properly. !dumpheap and !verifyheap may incorrectly complain of heap
consistency errors.
object 000000000e34caf8: bad member 000000001024b9a0 at 000000000e34cb08
curr_object:      000000000e34caf8
Last good object: 000000000e34cab0
----------------
0:023> !analyze
Last event: 1480.e84: Exit process 0:1480, code 80131506
  debugger time: Sun Sep 18 14:22:42.592 2011 (UTC + 1:00)
0:023> !analyze -v
Last event: 1480.e84: Exit process 0:1480, code 80131506
  debugger time: Sun Sep 18 14:22:42.592 2011 (UTC + 1:00)
0:023> .do e34cab0
          ^ Syntax error in '.do e34cab0'
0:023> !do e34cab0
Name:        System.String
MethodTable: 000007feea026870
EEClass:     000007fee9baed58
Size:        72(0x48) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      appliedFiltersContainer
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007feea02c758  4000103        8         System.Int32  1 instance               23 m_stringLength
000007feea02b298  4000104        c          System.Char  1 instance               61 m_firstChar
000007feea026870  4000105       10        System.String  0   shared           static Empty
                                 >> Domain:Value  00000000021343a0:000000000db21420 <<
0:023> !do e34caf8
<Note: this object has an invalid CLASS field>
Name:        System.Reflection.RuntimeAssembly
MethodTable: 000007feea02a128
EEClass:     000007fee9baf968
Size:        48(0x30) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007feea9ef7f0  4000e14        8 ...solveEventHandler  0 instance 0000000000000000 _ModuleResolve
000007feea036338  4000e15       10 ...che.InternalCache  0 instance 000000001024b9a0 m_cachedData
000007feea0259c8  4000e16       18        System.Object  0 instance 000000000e3abd18 m_syncRoot
000007feea033450  4000e17       20        System.IntPtr  1 instance         37a95f10 m_assembly

What can it be?

Solution

Recently I ran into a managed heap corruption, which was something new to me. I was very frustrated with it and had to learn many things to be able to debug it. I want to thank Seva Titov, who pointed me in the right direction to start. His answer was concise and very helpful. I want to log the actions I took to debug the problem for my own reference. This will probably be helpful for others who are new to this.

Debug Heap Corruption in .NET 4:

How to recognize possible heap corruption?

Briefly:

  1. The application crashes randomly regardless of the exception handling in place; the failure even passes through blanket handlers like catch(Exception), which are supposed to catch all exceptions.

  2. Examining the CLR stack in the application crash dumps shows the garbage collector on the top of the stack:

    000000001dabd8c8 000007feea129a1d [HelperMethodFrame: 000000001dabd8c8]
    000000001dabda00 000007fee90cfce8 System.Text.StringBuilder.ExpandByABlock(Int32)
    000000001dabda40 000007fee90cfba4 System.Text.StringBuilder.Append(Char*, Int32)
    ...
    
    EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
    ExceptionAddress: 000007feea129a1d (clr!WKS::gc_heap::find_first_object+0x0000000000000092)
       ExceptionCode: c0000005 (Access violation)
      ExceptionFlags: 00000000
    NumberParameters: 2
       Parameter[0]: 0000000000000000
       Parameter[1]: 0000000000003d80
    ...
    

  3. The CLR stack shows a different crash location each time, and the code it points to is clearly unrelated to the fault, like the StringBuilder method that is shown as causing the exception here.

For more details refer to .NET Crash: Managed Heap Corruption calling unmanaged code.

Work through the steps below in order. Move on to the next step only if the previous one doesn't help.

Step 1. Check the code.

Check the code for unsafe or native code usages:

  1. Review the code for unsafe blocks and DllImport declarations.
  2. Download .NET Reflector and use it to analyze the application assemblies for PInvoke. In the same way, analyze the third-party assemblies which are used by the application.

If unsafe or native code usage is found, pay extra attention to it. The most common causes of heap corruption in such cases are a buffer overflow or an argument type mismatch. Ensure that any buffer supplied to native code to fill is big enough and that all arguments passed to native code are of the expected type.
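
To make the buffer-size point concrete, here is a minimal sketch of a P/Invoke call where native code fills a managed buffer. It uses the Win32 GetCurrentDirectory function purely as an illustration; the NativeMethods and CurrentDirectoryReader names are made up for this example and do not come from the application discussed above. The idea is to pass the real buffer capacity to the native side and to retry with a larger buffer when the API reports that more room is needed.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper names, used only for this illustration.
static class NativeMethods
{
    // Win32: DWORD GetCurrentDirectoryW(DWORD nBufferLength, LPWSTR lpBuffer);
    // CharSet.Unicode selects the W entry point so the character width on both
    // sides matches; an ANSI/Unicode mismatch is a typical argument type mismatch.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    internal static extern uint GetCurrentDirectory(uint nBufferLength, StringBuilder lpBuffer);
}

static class CurrentDirectoryReader
{
    public static string Read()
    {
        var buffer = new StringBuilder(260);

        // Tell the native side exactly how many characters the buffer can hold.
        uint result = NativeMethods.GetCurrentDirectory((uint)buffer.Capacity, buffer);

        // When the buffer is too small, the API returns the required size
        // (including the terminating null) instead of writing past the end.
        if (result > (uint)buffer.Capacity)
        {
            buffer.EnsureCapacity((int)result);
            result = NativeMethods.GetCurrentDirectory((uint)buffer.Capacity, buffer);
        }

        if (result == 0)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        if (result > (uint)buffer.Capacity)
            throw new InvalidOperationException("Buffer is still too small.");

        return buffer.ToString();
    }
}
```

A declaration that passes a hard-coded length larger than the actual buffer, or that declares the wrong parameter types, compiles just as well, which is why such code deserves a manual review.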

Step 2. Check if this corrupted state exception can be caught.

To handle such exceptions, one needs to decorate the method that contains the catch(Exception) statement with the [HandleProcessCorruptedStateExceptions] attribute, or apply the following in the app.config file:

<configuration>
    <runtime>
        <legacyCorruptedStateExceptionsPolicy enabled="true" />
    </runtime>
</configuration>

If the exception is caught successfully, you can log and examine it. In that case this is most likely not a corrupted heap issue.

Exceptions caused by a corrupted heap cannot be handled at all: HandleProcessCorruptedStateExceptions doesn't seem to work for them.

For more information on corrupted state exceptions, see All about Corrupted State Exceptions in .NET4.
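
As a rough sketch of the attribute-based variant, the method below wraps a piece of work in a catch(Exception) block and is decorated with [HandleProcessCorruptedStateExceptions]. The GuardedRunner name and its Run method are hypothetical and only serve this example. On .NET Framework 4.x the attribute allows the catch block to see corrupted state exceptions such as AccessViolationException; as far as I know, newer .NET (Core) runtimes ignore the attribute, so treat this as specific to .NET Framework.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Security;

// Hypothetical helper, used only for this illustration.
static class GuardedRunner
{
    // Without the attribute, .NET Framework 4 does not deliver corrupted state
    // exceptions to managed catch blocks, not even to catch (Exception).
    [HandleProcessCorruptedStateExceptions]
    [SecurityCritical]
    public static Exception Run(Action work)
    {
        try
        {
            work();
            return null; // nothing was thrown
        }
        catch (Exception ex)
        {
            // Log only; after a corrupted state exception the process
            // should not be trusted to continue normal work.
            Console.Error.WriteLine("Caught " + ex.GetType().FullName + ": " + ex.Message);
            return ex;
        }
    }
}
```

If the crash still bypasses a method decorated like this, that matches the observation above that exceptions caused by a corrupted heap cannot be handled.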

Step 3. Live debugging.

In this step, we debug the crashing application live in the production environment (or where we can reproduce the crash).

Download Debugging Tools for Windows from Microsoft Windows SDK for Windows 7 and .NET Framework 4 (a web installer will be downloaded which lets you select the components to install - mark all components). It will install both the 32-bit and 64-bit (if your system is x64) versions of the required debugging tools.

For this step you need to know how to attach WinDbg to a live process, how to take and examine crash dumps, and how to load the SOS extension in WinDbg (search the web for details).

Enable debugging helpers:

  1. Launch Application Verifier (C:\Program Files\Application Verifier - use the required edition, either x86 or x64, depending on your executable compilation mode), add your executable in the left pane, and in the right pane check the single node "Basics / Heaps". Save the changes.

  2. Launch the Global Flags helper (C:\Program Files\Debugging Tools for Windows\gflags.exe - again select the correct edition, x86 or x64). Once Global Flags is started, go to the "Image File" tab and, in the text box at the top, enter the name of your executable file without any path (for example, "MyProgram.exe"). Then press the Tab key and check the following boxes:

    • Enable heap tail checking
    • Enable heap free checking
    • Enable heap parameter checking
    • Enable heap validation on call
    • Disable heap coalesce on free
    • Enable page heap
    • Enable heap tagging
    • Enable application verifier
    • Debugger (type the path to the installed WinDbg in the text box to the right, for example, C:\Program Files\Debugging Tools for Windows (x64)\windbg.exe -g).

    For more details, refer to Heap Corruption, Part 2.

  3. Go to "Control Panel/System and Security/System" (or right-click "Computer" in the Start menu and select "Properties"). There click "Advanced system settings"; in the displayed dialog, go to the "Advanced" tab and click the "Environment Variables" button. In the next dialog, add a new System variable if you are a system administrator, or a User variable otherwise (in that case you need to log out and log in again). The required variable is "COMPLUS_HeapVerify" with a value of "1". More details can be found in the Stack Overflow question .NET/C#: How to set debugging environment variable COMPLUS_HeapVerify?.

Now we are ready to start debugging. Start the application. WinDbg should start automatically for it. Leave the application running until it crashes into WinDbg and then examine the dump.
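
As an alternative to a machine-wide variable, the variable can also be set only for the process under test. The sketch below is an assumption on my part rather than something from the linked question: a small launcher (the HeapVerifyLauncher name is hypothetical) that prepares a ProcessStartInfo with COMPLUS_HeapVerify=1 in the child's environment, relying on the fact that a child process receives the environment it is started with.

```csharp
using System;
using System.Diagnostics;

// Hypothetical launcher, used only for this illustration.
static class HeapVerifyLauncher
{
    public static ProcessStartInfo Prepare(string exePath)
    {
        var info = new ProcessStartInfo(exePath);

        // Custom environment variables are only applied when the process
        // is started directly rather than through the shell.
        info.UseShellExecute = false;

        // Same variable and value as in the manual steps above.
        info.EnvironmentVariables["COMPLUS_HeapVerify"] = "1";
        return info;
    }
}
```

Usage would look like Process.Start(HeapVerifyLauncher.Prepare("MyProgram.exe")), where MyProgram.exe stands for your own executable. This keeps the setting away from every other .NET process on the machine.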

TIP: To quickly remove the Global Flags, Application Verifier and debugger attachment settings, delete the following key in the registry: x64 - HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\*YourAppName*

Step 4. Enable MDAs.

Try to use the Managed Debugging Assistants. Details are in Stack Overflow question What MDAs are useful to track a heap corruption?.

MDAs must be used along with WinDbg. I used them even along with Global Flags and Application Verifier.

Step 5. Enable GCStress.

Using GCStress is an extreme option, because the application becomes almost unusable, but it is still an option worth trying. More details are in GCStress: How to turn on in Windows 7?.

Step 6. Compile for x86.

If your application is currently compiled for the "Any CPU" or "x64" platform, try compiling it for "x86", provided it makes no difference to you which platform is used. I saw this reported as having solved the problem for somebody.
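
After changing the platform target it is easy to end up running an old build, so a quick runtime check can help. The following is a small sketch (the ProcessBitness name is hypothetical) that reports whether the current process is running as 32-bit or 64-bit, using Environment.Is64BitProcess and Environment.Is64BitOperatingSystem, which are available since .NET 4.

```csharp
using System;

// Hypothetical helper, used only for this illustration.
static class ProcessBitness
{
    public static string Describe()
    {
        string process = Environment.Is64BitProcess ? "64-bit" : "32-bit";
        string os = Environment.Is64BitOperatingSystem ? "64-bit" : "32-bit";
        return process + " process on a " + os + " OS";
    }
}
```

Logging ProcessBitness.Describe() at startup shows whether the x86 build is the one that is actually running.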

Step 7. Disable concurrent GC - this is what worked for me

There is a known issue in .NET 4, reported in the thread Access Violation in .NET 4 Runtime in gc_heap::garbage_collect with no unmanaged modules. The problem can be solved by disabling the concurrent GC in the app.config file:

<?xml version="1.0"?>
<configuration>
    <runtime>
        <gcConcurrent enabled="false" />
    </runtime>
</configuration>
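
To confirm that the setting was picked up, the GC mode can be inspected at runtime. This is a minimal sketch with a hypothetical GcModeReport helper; it relies on System.Runtime.GCSettings, where a LatencyMode of Batch corresponds to concurrency being disabled and Interactive to concurrency being enabled. I would expect Batch once the gcConcurrent element above is in effect, but treat that as something to verify in your environment.

```csharp
using System;
using System.Runtime;

// Hypothetical helper, used only for this illustration.
static class GcModeReport
{
    public static string Describe()
    {
        // Batch means GC concurrency is off; Interactive means it is on.
        return "LatencyMode=" + GCSettings.LatencyMode
             + ", IsServerGC=" + GCSettings.IsServerGC;
    }
}
```

Writing GcModeReport.Describe() to the application log at startup makes it easy to tell whether a given crash happened with or without the concurrent GC.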
