Help catching StackOverflowException with WinDbg and ADPlus

Question

Short version

I want an ADPlus script that will do a full memory dump on the first-chance StackOverflowException, before anything is cleaned up, and ignore all other exception types.

Long version

After a release of new ASP.NET code, we started getting intermittent StackOverflowExceptions. We've looked for infinite recursions and all the usual suspects in the revisions added since the last known good install, and can't find anything. The website will run for up to an hour, and then crash down.

We've used WinDbg and SOS and attempted to get crash logs using ADPlus, using this command:

adplus -crash -o D:\Crash -NoDumpOnFirst -iis

The reason for -NoDumpOnFirst is that we can only reproduce this error in production on busy servers. In order to do a minidump on each first-chance exception (hey, it happens) the debugger has to pause the IIS worker process long enough to write out a 16 meg file, so requests queue up and the application becomes unstable. Because the error can take up to an hour to rear its ugly head, this is problematic.

So with -NoDumpOnFirst, I get a dump file that WinDbg outputs these threads for:

PDB symbol for mscorwks.dll not loaded
ThreadCount: 69
UnstartedThread: 0
BackgroundThread: 69
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
XXXX    1  c6c 000fa758  11808221 Disabled 3b49ee4c:3b49efe8 00120888     1 Ukn (Threadpool Worker)
XXXX    2 1294 000fd258      b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Finalizer)
XXXX    3 1eb0 0011cdd0    80a220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Completion Port)
XXXX    4 1b3c 00120198      1220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX    5 1280 00138118   880a220 Enabled  2633de9c:2633ee08 000df4e0     0 Ukn (Threadpool Completion Port)
XXXX    6 1db8 00158a48  1180a221 Disabled 4b5a7e2c:4b5a82e8 00120888     1 Ukn (Threadpool Worker)
XXXX    9 141c 00162008   180a220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX    7 1574 00174008   180a220 Enabled  4d46b6a8:4d46c158 00120888     2 Ukn (Threadpool Worker)
XXXX    c 16c8 0016b7a8   180a220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX    8 1384 00162878   180a220 Enabled  284e26a4:284e45d8 000df4e0     0 Ukn (Threadpool Worker)
XXXX    b 1c10 0016b3d8   180a220 Enabled  3ed2dae0:3ed2dfe8 00120888     2 Ukn (Threadpool Worker)
XXXX    a 1814 0016b008   180a220 Disabled 28816384:28816638 00120888     1 Ukn (Threadpool Worker)
XXXX    d  1fc 1b4d1ff0       220 Enabled  319f89a4:319fa41c 000df4e0     0 Ukn
XXXX    e 1864 1b4e3d20   180b220 Enabled  4b2c5be0:4b2c6150 000df4e0     0 Ukn (Threadpool Worker)
XXXX    f 13bc 1b57caf8   200b220 Enabled  4cc71584:4cc73414 00120888     1 Ukn
XXXX   10  72c 1f5124a8   180b220 Enabled  3b4b3414:3b4b4fe8 00120888     2 Ukn (Threadpool Worker)
XXXX   11 1fd0 1f526398   180b220 Disabled 4d46f41c:4d470158 00120888     1 Ukn (Threadpool Worker)
XXXX   12 1f10 1f52f1c8   180b220 Enabled  28812c14:28814638 00120888     2 Ukn (Threadpool Worker)
XXXX   13 1b84 1f53a420       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   14 18a4 1f570978   180b220 Enabled  263e18b4:263e2e28 000df4e0     0 Ukn (Threadpool Worker)
XXXX   15 1a98 1f57f0a0   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   16  1b4 1f583628   180b220 Enabled  495781ec:4957914c 00120888     2 Ukn (Threadpool Worker)
XXXX   17  b90 1f585dc8   180b220 Enabled  265cbe48:265ccba4 000df4e0     0 Ukn (Threadpool Worker)
XXXX   18 1590 1f613c60       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   19 1850 1f5fad90       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   1a  c78 1f60d3f0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   1c 1bd8 2121f1b0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   1d  494 1b4a8c10       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   1e  898 2120f120       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   1f 1820 21355ff8       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   20 15b0 3570e120       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   21 18b0 359ca008       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   22  75c 35a58948       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   25 1a18 213ac8f8   880b220 Disabled 3219a830:3219b450 00120888     1 Ukn (Threadpool Completion Port) System.StackOverflowException (0e3200a4)
XXXX   29 1b74 3598e620   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   2a  9b8 3598dbe0   180b220 Enabled  2880ef2c:28810638 000df4e0     0 Ukn (Threadpool Worker)
XXXX   2b 1eac 1f6f6288   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   2d  2f4 211759e8   180b220 Disabled 2634eacc:2634ee08 00120888     1 Ukn (Threadpool Worker)
XXXX   2e 1e3c 35c2eb60   880b220 Enabled  4b5a5758:4b5a62e8 000df4e0     0 Ukn (Threadpool Completion Port)
XXXX   30  394 35c394f8   180b220 Enabled  4cef7930:4cef90d4 000df4e0     0 Ukn (Threadpool Worker)
XXXX   31 1e64 35c39128   180b220 Disabled 288110b0:28812638 00120888     1 Ukn (Threadpool Worker)
XXXX   32 1af8 35a58578   180b220 Enabled  3b48e7cc:3b48efe8 000df4e0     0 Ukn (Threadpool Worker)
XXXX   34 1d44 1f6a6c88   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   35 197c 212088e0   180b220 Enabled  49389ba8:4938af40 000df4e0     0 Ukn (Threadpool Worker)
XXXX   36 1e2c 35c1d980       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   38 1ddc 212d03d8       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   39  288 212d0008       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   3a 1694 212bf958       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   3b  be4 212ccc40       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   37  ccc 35c4d6d0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   3c 14ec 35c55af0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   41 1d94 35c38c08   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   24  130 35746a50   180b220 Enabled  2670ae48:2670cc00 000df4e0     0 Ukn (Threadpool Worker)
XXXX   2f 1404 35c1d350   180b220 Enabled  00000000:00000000 000df4e0     0 Ukn (Threadpool Worker)
XXXX   43 1ae8 35c25cb8   180b220 Disabled 3b4c28e0:3b4c2fe8 00120888     1 Ukn (Threadpool Worker)
XXXX   44 18ac 212cc870   180b220 Disabled 4957e728:4957f14c 00120888     1 Ukn (Threadpool Worker)
XXXX   45 18b4 212bf588   180b220 Disabled 3b4c05dc:3b4c0fe8 00120888     1 Ukn (Threadpool Worker)
XXXX   46 1c0c 21239858       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   47  4fc 21188b68       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   48 1198 35caa2a8       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   49 1f9c 21147af8       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   4a 1adc 35cc6908       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   4b  ce8 35c60e30       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   4d  6f0 35d05aa0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   4e 1ee8 35c1b6b0       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   42 1d7c 35d9a230       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   3d  7d8 212e1b28       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   23  c0c 503ea010       220 Enabled  00000000:00000000 000df4e0     0 Ukn
XXXX   27 1f44 503cdf08       220 Enabled  00000000:00000000 000df4e0     0 Ukn

Trying to print the exception shows that there is no stack trace, and other methods complain that it's unmanaged code. My guess is that as the dump is created at process death, all the threads have been garbage collected and there's no information left to get.

I would really like to have the debugger perform a full dump on the first chance of the StackOverflowException and ignore all other exception types. I know that ADPlus can use a config file - http://msdn.microsoft.com/en-us/library/cc409304.aspx - but the format is all Greek to me. Can anyone show me how to make an ADPlus script that will do this?

...of course if you look at the thread list above and you know exactly what's wrong, or could figure it out if I gave you any more information, you could just tell me that too.

Solution attempt 1

Thank you deemok for the answer below, it wasn't quite right but that pushed me in the right direction. The exception code for Stack Overflow was incorrect (it's sbo not sov), (or so I thought at the time, see deemok's edits below) so I tried debugging with the following configuration:

<ADPlus>
    <!-- Add log entry, log faulting thread stack and dump full on first chance StackOverflow -->
    <Exceptions>
        <Config>
            <!-- This is for the StackOverflow exception -->
            <Code> sbo </Code>
            <Actions1> Log;Stack;FullDump </Actions1>
            <!-- Depending on what you intend - either stop the debugger (Q or QQ) or continue unhandled (GN) -->
            <ReturnAction1> GN </ReturnAction1>
        </Config>
    </Exceptions>
</ADPlus>

and used the following command:

adplus -crash -o D:\Crash -NoDumpOnFirst -c D:\Crash\stackoverflow.cfg -iis

I verified that the output log files indicated the right configuration. The trick is that the command line params of adplus get executed in order, so if you start with a config that traps first-chance exceptions and then apply -NoDumpOnFirst, the configuration settings will be overwritten. If you apply the config with -c last, then its settings will win out.

In the end, however, the stack overflow proved uncatchable. The stack overflow happened, no memory dump could be received, and then a dump occurred on the second-chance process ending event, and again everything had been garbage collected and I couldn't get any useful information.

I attempted to short-circuit the process ending exception, in case that was engaging and overriding the stack overflow, but then the exception occurred and I just got no memory dump.

Luckily, I stumbled upon the answer by examining code. It was a case of circular method calling, of course.

Actual resolution

The problem had been solved long ago, but I quickly made an ASP.NET page that would cause a stack overflow (it's not hard to do, after all) and tried Axl's response below.

The XML was slightly off - Axl just forgot to close the </ADPlus> tag (or probably lost it in a copy-paste), but that was easy enough to fix and adplus was kind enough to tell me exactly what was wrong.

I set that script off against my test stack overflow thrower, loaded up the result in windbg, and when I called !clrstack I got a very clear (and long) listing of the methods that were calling each other circularly. This would have found the problem in an instant! I'll be keeping this page bookmarked for the next time a stack overflow comes knocking at my door.
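For anyone repeating that analysis step, a rough sketch of the WinDbg session might look like the following. It assumes the dump comes from a CLR 2.0 process (hence mscorwks) and that the SOS extension shipped with that runtime can be loaded; the `!threads` and `!pe` steps are optional extras rather than something described above.

```text
$$ Load the SOS extension from the same directory as mscorwks.dll
.loadby sos mscorwks

$$ List managed threads; the Exception column shows which thread holds the StackOverflowException
!threads

$$ Print the exception on the current thread (may show no stack trace for a stack overflow)
!pe

$$ Dump the managed call stack; circular calls show up as a long repeating pattern
!clrstack
```

If the output of `!clrstack` is a long run of the same few methods repeating, that cycle is the place to start looking in the code.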

Recommended answer

Just in case this might help someone else, below is the ADPlus config file I came up with. Looking at it now, I'm not sure that !runaway has any effect. When attached while an ASP.NET app that throws a StackOverflowException is running, this will generate "1st chance StackOverflow full" and "1st chance Process Shut Down full" .dmp files in the specified OutputDir. Open the first file with WinDbg, and run ".loadby sos mscorwks" followed by "!clrstack" to see what might be causing the stack overflow.

<ADPlus>
<Settings>
    <RunMode>CRASH</RunMode>
    <OutputDir>C:\Dumps</OutputDir>
    <ProcessName>w3wp.exe</ProcessName> 
</Settings>
<Exceptions>
    <Option>FullDumpOnFirstChance</Option>
    <Option>MiniDumpOnSecondChance</Option>
    <Option>NoDumpOnFirstChance</Option>
    <Option>NoDumpOnSecondChance</Option>
    <Config>
        <Code>AllExceptions</Code>
        <Actions1>Void</Actions1>
        <Actions2>Void</Actions2>
        <ReturnAction1>GN</ReturnAction1>
        <ReturnAction2>GN</ReturnAction2>
    </Config>       
    <Config>
        <!--
        av = AccessViolation
        ch = InvalidHandle
        ii = IllegalInstruction
        dz =  IntegerDivide
        c000008e = FloatingDivide
        iov = IntegerOverflow
        lsq = InvalidLockSequence
        sov = StackOverflowException
        eh = CPlusPlusEH
        * = UnknownException
        clr = NET_CLR
        bpe = CONTRL_C_OR_Debug_Break
        ld = DLL_Load
        ud = DLL_UnLoad
        epr = Process_Shut_Down
        sbo = Stack_buffer_overflow
        -->
        <Code>sov;sbo</Code>
        <Actions1>Log;Time;Stack;FullDump;EventLog</Actions1>
        <CustomActions1>!runaway</CustomActions1>
        <Actions2>Log;Time;Stack;FullDump;EventLog</Actions2>
        <CustomActions2>!runaway</CustomActions2>
        <!--
        G = go
        GN = go unhandled exception
        GH = go handled exception
        Q = quit
        QD = quit and detach
        -->
        <ReturnAction1>GN</ReturnAction1>
        <ReturnAction2>GN</ReturnAction2>
    </Config>
    <Config>
        <Code>clr</Code>
        <Actions1>Void</Actions1>
        <Actions2>Log;Time;Stack;FullDump;EventLog</Actions2>
        <ReturnAction1>GN</ReturnAction1>
        <ReturnAction2>GN</ReturnAction2>
    </Config>
    <Config>
        <Code>epr</Code>
        <Actions1>Log;Time;Stack;FullDump;EventLog</Actions1>
        <Actions2>Void</Actions2>
        <ReturnAction1>GN</ReturnAction1>
        <ReturnAction2>GN</ReturnAction2>
    </Config>
</Exceptions>
</ADPlus>
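
If only the stack overflow dump is of interest, the config above could probably be trimmed. The following is an untested sketch that reuses only elements from the file above; it drops the Option lines, the !runaway custom actions, and the clr and epr blocks. Whether ADPlus's defaults then stay quiet for every other exception is an assumption, not something verified here.

```xml
<ADPlus>
    <Settings>
        <RunMode>CRASH</RunMode>
        <OutputDir>C:\Dumps</OutputDir>
        <ProcessName>w3wp.exe</ProcessName>
    </Settings>
    <Exceptions>
        <!-- As in the full config: take no action on any exception by default -->
        <Config>
            <Code>AllExceptions</Code>
            <Actions1>Void</Actions1>
            <Actions2>Void</Actions2>
            <ReturnAction1>GN</ReturnAction1>
            <ReturnAction2>GN</ReturnAction2>
        </Config>
        <!-- Then override for stack overflow: full dump on first chance -->
        <Config>
            <Code>sov;sbo</Code>
            <Actions1>Log;Time;Stack;FullDump</Actions1>
            <ReturnAction1>GN</ReturnAction1>
        </Config>
    </Exceptions>
</ADPlus>
```

Keeping both sov and sbo in the Code element follows the original file, since the question shows some uncertainty about which code applies to a managed StackOverflowException.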
