How do I debug Segfaults occurring in the JVM when it runs my code?

Question

My Java application has started to crash regularly with a SIGSEGV and a dump of stack data and a load of information in a text file.

I have debugged C programs in gdb and I have debugged Java code from my IDE. I'm not sure how to approach C-like crashes in a running Java program.

I'm assuming I'm not looking at a JVM bug here. Other Java programs run just fine, and the JVM from Sun is probably more stable than my code. However, I have no idea how I could even cause segfaults with Java code. There definitely is enough memory available, and when I last checked in the profiler, heap usage was around 50% with occasional spikes around 80%. Are there any startup parameters I could investigate? What is a good checklist when approaching a bug like this?

Though I have not yet been able to reproduce the event reliably, it does not seem to occur entirely at random either, so testing is not completely impossible.

ETA: Some of the gory details

(I'm looking for a general approach, since the actual problem might be very specific. Still, there's some info I already collected and that may be of some value.)

A while ago, I had similar-looking trouble after upgrading my CI server (see here for more details), but that fix (setting -XX:MaxPermSize) did not help this time.

Further investigation revealed that in the crash log files the thread marked as "current thread" is never one of mine, but either one called "VMThread" or one called "GCTaskThread". If it's the latter, it is additionally marked with the comment "(exited)"; if it's the former, the GCTaskThread is not in the list. This makes me suppose that the problem might occur around the end of a GC operation.

Solution

"I'm assuming I'm not looking at a JVM bug here. Other Java programs run just fine, and the JVM from Sun is probably more stable than my code."

I don't think you should make that assumption. Without using JNI, you should not be able to write Java code that causes a SIGSEGV (although we know it happens). My point is, when it happens, it is either a bug in the JVM (not unheard of) or a bug in some JNI code. Even if you don't have any JNI in your own code, you may be using a library that does, so look for that. When I have seen this kind of problem before, it was in an image manipulation library. If the culprit isn't in your own JNI code, you probably won't be able to 'fix' the bug, but you may still be able to work around it.
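
One rough way to look for libraries that bring their own native code is to scan the jars on the classpath for bundled native binaries. The sketch below is only an illustration: the class name `NativeLibScanner` and the list of file suffixes are assumptions, and it will miss native libraries that are loaded from outside a jar (for example via `java.library.path`) or that carry a version suffix such as `.so.1`.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: reports jar entries that look like bundled native libraries.
final class NativeLibScanner {

    // Suffixes commonly used for native libraries (an assumed, non-exhaustive list).
    private static final List<String> NATIVE_SUFFIXES =
            List.of(".so", ".dll", ".dylib", ".jnilib");

    // True if the entry name ends with one of the native suffixes (case-insensitive).
    static boolean looksNative(String entryName) {
        String lower = entryName.toLowerCase(Locale.ROOT);
        return NATIVE_SUFFIXES.stream().anyMatch(lower::endsWith);
    }

    // Returns the names of all entries in one jar that look like native libraries.
    static List<String> nativeEntries(Path jar) {
        List<String> hits = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (looksNative(name)) {
                    hits.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Walk every jar listed on the classpath of this JVM.
        String classpath = System.getProperty("java.class.path", "");
        for (String element : classpath.split(File.pathSeparator)) {
            Path path = Paths.get(element);
            if (Files.isRegularFile(path) && element.endsWith(".jar")) {
                for (String hit : nativeEntries(path)) {
                    System.out.println(path + " -> " + hit);
                }
            }
        }
    }
}
```

Running it with the same classpath as the crashing application gives a first list of candidate libraries; searching their sources for `System.loadLibrary` calls or `native` methods is a reasonable complement.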

First, you should get an alternate JVM on the same platform and try to reproduce it. You can try one of these alternatives.
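
When comparing runs across JVMs, it helps to record exactly which JVM each run used. A minimal sketch, assuming you can call it at application startup (the class name `JvmIdentity` is made up for illustration; the system properties it reads are standard ones):

```java
// Hypothetical helper: builds a one-line description of the running JVM.
final class JvmIdentity {

    // Joins the standard identifying system properties into a single log line.
    static String describe() {
        return String.join(" | ",
                "vm.name=" + System.getProperty("java.vm.name"),
                "vm.vendor=" + System.getProperty("java.vm.vendor"),
                "vm.version=" + System.getProperty("java.vm.version"),
                "java.version=" + System.getProperty("java.version"),
                "os=" + System.getProperty("os.name") + "/" + System.getProperty("os.arch"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this line next to each crash report makes it easier to tell later whether a crash happened on the original JVM or on the alternate one.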

If you cannot reproduce it, it likely is a JVM bug. From there, you can either mandate a particular JVM or search the bug database, using what you know about how to reproduce it, and maybe get suggested workarounds. (Even if you can reproduce it, many JVM implementations are just tweaks on Oracle's HotSpot implementation, so it might still be a JVM bug.)
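
For a bug database search, the signal line, the problematic frame, and the current thread from the crash log are usually the most useful search terms. The sketch below pulls those out of a log file. It assumes a HotSpot-style crash log in which the frame follows a `# Problematic frame:` header and the thread is reported on a line starting with `Current thread`; other JVMs or versions may lay the file out differently, and the class name `CrashLogSummary` is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts a few key lines from a JVM crash log.
final class CrashLogSummary {

    // Collects the SIGSEGV line, the line after "# Problematic frame:",
    // and any "Current thread" line, in the order they appear.
    static List<String> summarize(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.startsWith("#") && line.contains("SIGSEGV")) {
                out.add(stripHash(line));
            } else if (line.startsWith("# Problematic frame:") && i + 1 < lines.size()) {
                out.add("frame: " + stripHash(lines.get(i + 1)));
            } else if (line.startsWith("Current thread")) {
                out.add(line.trim());
            }
        }
        return out;
    }

    // Removes the leading '#' comment marker and surrounding whitespace.
    private static String stripHash(String line) {
        return line.replaceFirst("^#\\s*", "").trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CrashLogSummary <crash-log-file>");
            return;
        }
        summarize(Files.readAllLines(Paths.get(args[0]))).forEach(System.out::println);
    }
}
```

If the frame points into the JVM's own library rather than into a third-party native library, that is a hint (not proof) that the JVM itself is involved.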

If you can reproduce it with an alternative JVM, the fault might be a JNI bug. Look at what libraries you are using and what native calls they might be making. Sometimes there are alternative "pure Java" configurations or jar files for the same library, or alternative libraries that do almost the same thing.
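
If a JNI bug is suspected, one option is to rerun the application with the JVM's `-Xcheck:jni` option, which asks the JVM to perform extra validation of JNI calls and report problems it notices. The launcher below is a minimal sketch: the class name `JniCheckLauncher` is hypothetical, and the classpath and main class are placeholders you would replace with your application's own values.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher: starts a child JVM with extra JNI checking enabled.
final class JniCheckLauncher {

    // Builds the command line for the child JVM, using the same Java
    // installation that runs this launcher.
    static List<String> buildCommand(String classpath, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(Paths.get(System.getProperty("java.home"), "bin", "java").toString());
        cmd.add("-Xcheck:jni"); // extra validation of JNI usage
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 2) {
            System.err.println("usage: JniCheckLauncher <classpath> <main-class>");
            return;
        }
        Process child = new ProcessBuilder(buildCommand(args[0], args[1]))
                .inheritIO() // show the child's output, including JNI warnings
                .start();
        System.exit(child.waitFor());
    }
}
```

The checks slow the application down, so this is better suited to a reproduction environment than to production. Passing the option directly on your usual command line works just as well.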

Good luck!
