JVM flags -XX:+UseDynamicNumberOfGCThreads -XX:+TraceDynamicGCThreads enabled to see no. of GC threads during GC. Please explain the output logs?


Problem description


We have an application running the Wildfly app server in clustered mode (6 nodes). We sometimes see a 16-second JVM freeze when a GC is triggered. The application is time-sensitive, and the other nodes in the cluster consider a node dead (the one in which the JVM pauses) if no heartbeat response is received within 15 seconds, so the JVM freezes cause application instability. To understand what is going on during GC, we enabled hotspot and safepoint logs and see the following traces when there is a GC pause.


Can anybody explain what the following parameters mean?

1.) active_workers(): 13  
2.) new_acitve_workers: 13  
3.) prev_active_workers: 13
4.) active_workers_by_JT: 3556  
5.) active_workers_by_heap_size: 146
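These counters come from the parallel collector's dynamic worker-count heuristic (`GCTaskManager::calc_default_active_workers` in the log above). A simplified sketch of the arithmetic, assuming the JDK 8 default `-XX:HeapSizePerGCThread` of 87241520 bytes; the real HotSpot code also smooths against the previous value and applies extra clamps:

```python
# Simplified sketch of HotSpot 8's dynamic GC worker-count heuristic
# (calc_default_active_workers). Not the exact implementation.

HEAP_SIZE_PER_GC_THREAD = 87241520  # bytes; JDK 8 default -XX:HeapSizePerGCThread

def calc_default_active_workers(total_workers, application_threads, heap_bytes):
    # "active_workers_by_JT": 2 GC workers per live application (Java) thread
    active_workers_by_JT = 2 * application_threads
    # "active_workers_by_heap_size": 1 GC worker per HeapSizePerGCThread of heap
    active_workers_by_heap_size = max(1, heap_bytes // HEAP_SIZE_PER_GC_THREAD)
    # Take the larger demand, but never exceed ParallelGCThreads (total_workers)
    new_active_workers = min(total_workers,
                             max(active_workers_by_JT, active_workers_by_heap_size))
    return active_workers_by_JT, active_workers_by_heap_size, new_active_workers

# Values from the log: JT = 1778 application threads, ~12 GB committed heap,
# ParallelGCThreads = 13
by_jt, by_heap, new_workers = calc_default_active_workers(
    total_workers=13, application_threads=1778, heap_bytes=12470784 * 1024)
print(by_jt, by_heap, new_workers)  # 3556 146 13
```

With both demand estimates (3556 by thread count, 146 by heap size) far above the 13 configured `ParallelGCThreads`, the heuristic simply uses all 13 workers, which is what `active_workers(): 13` and `prev_active_workers: 13` report.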


Environment details: Linux 64-bit RHEL 7, OpenJDK 1.8, heap size 12GB (Young: 4GB, Tenured: 8GB), 16 CPU cores, VMware ESX 5.1.

JVM parameters:

-XX:ThreadStackSize=512 
-Xmx12288m 
-XX:+UseParallelGC 
-XX:+UseParallelOldGC 
-XX:MaxPermSize=1024m 
-XX:+DisableExplicitGC 
-XX:NewSize=4096m 
-XX:MaxNewSize=4096m 
-XX:ReservedCodeCacheSize=256m 
-XX:+UseCodeCacheFlushing
-XX:+UseDynamicNumberOfGCThreads


Any suggestions on tuning these JVM parameters to reduce the GC pause time?

GC logs:

GCTaskManager::calc_default_active_workers() : active_workers(): 13  new_acitve_workers: 13  prev_active_workers: 13
 active_workers_by_JT: 3556  active_workers_by_heap_size: 146
GCTaskManager::set_active_gang(): all_workers_active()  1  workers 13  active  13  ParallelGCThreads 13
JT: 1778  workers 13  active  13  idle 0  more 0
2016-10-06T07:38:47.281+0530: 48313.522: [Full GC (Ergonomics) DrainStacksCompactionTask::do_it which = 3 which_stack_index = 3/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 7 which_stack_index = 7/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 2 which_stack_index = 2/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 0 which_stack_index = 0/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 11 which_stack_index = 11/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 6 which_stack_index = 6/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 1 which_stack_index = 1/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 12 which_stack_index = 12/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 4 which_stack_index = 4/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 5 which_stack_index = 5/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 9 which_stack_index = 9/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 8 which_stack_index = 8/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 10 which_stack_index = 10/empty(0) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 3 region_stack = 0x780be610  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 5 region_stack = 0x780be730  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 7 region_stack = 0x780be850  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 11 region_stack = 0x780bea90  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 1 region_stack = 0x780be4f0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 10 region_stack = 0x780bea00  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 8 region_stack = 0x780be8e0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 4 region_stack = 0x780be6a0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 0 region_stack = 0x780be460  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 2 region_stack = 0x780be580  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 6 region_stack = 0x780be7c0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 12 region_stack = 0x780beb20  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 9 region_stack = 0x780be970  empty (1) use all workers 1
[PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs] 
[Times: user=180.57 sys=2.46, real=16.09 secs]
2016-10-06T07:39:03.373+0530: 48329.615: Total time for which application threads were stopped: 16.2510644 seconds, Stopping threads took: 0.0036805 seconds

Safepoint logs:

48313.363: ParallelGCFailedAllocation       [    2384          0              2    ]      [     0     0     3    35 16210    ]  0
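The bracketed columns in that safepoint line follow the `-XX:+PrintSafepointStatistics` layout: `[threads: total / initially running / waiting to block]` and `[time in ms: spin / block / sync / cleanup / vmop]`. A small parse of the line above (column layout as in JDK 8; treat this as a sketch):

```python
import re

# -XX:+PrintSafepointStatistics line layout (JDK 8):
#   timestamp: vm_operation  [ threads: total initially_running wait_to_block ]
#   [ time in ms: spin block sync cleanup vmop ]  page_trap_count
line = ("48313.363: ParallelGCFailedAllocation       "
        "[    2384          0              2    ]      "
        "[     0     0     3    35 16210    ]  0")

m = re.match(r"([\d.]+):\s+(\w+)\s+"
             r"\[\s*(\d+)\s+(\d+)\s+(\d+)\s*\]\s+"
             r"\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]\s+(\d+)", line)
ts, op, total, running, blocking, spin, block, sync, cleanup, vmop_ms, traps = m.groups()

print(f"{op}: {total} threads, sync={sync} ms, vm operation={vmop_ms} ms")
# sync (time to bring all threads to the safepoint) is only 3 ms; virtually
# the whole 16.2 s pause is the vm operation, i.e. the Full GC itself.
```

So the long pause is not safepoint synchronization overhead: the 2384 threads stopped within a few milliseconds, and essentially all of the 16.2 s is the `ParallelGCFailedAllocation` VM operation (the Full GC), matching the 16.09 s real time in the GC log.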

Thanks in advance for your help.

Answer


Judging by the ParallelGCFailedAllocation safepoint entry and the [PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs line, we have the following conditions:

  1. The YoungGen is almost empty (only 63M used of 4G)
  2. The OldGen is almost full (only 42M left of 8.3G)
  3. The JVM tried to move surviving objects out of the YoungGen, failed to allocate them in the Survivor space, and decided to move them to the OldGen
  4. The OldGen had no room either (only 42M, as noted above), so a Full GC was triggered
  5. The Full GC collected about 5G from the OldGen (8346270K->3657870K)
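The figures in the list can be sanity-checked by parsing the Full GC line itself (a minimal sketch; the regex only handles the `gen: before->after(capacity)` segments):

```python
import re

# Reading the numbers straight off the Full GC log line above
gc_line = ("[PSYoungGen: 63998K->0K(4082176K)] "
           "[ParOldGen: 8346270K->3657870K(8388608K)] "
           "8410268K->3657870K(12470784K)")

gens = {gen: (int(b), int(a), int(cap)) for gen, b, a, cap in
        re.findall(r"\[(\w+): (\d+)K->(\d+)K\((\d+)K\)\]", gc_line)}

for gen, (before, after, cap) in gens.items():
    print(f"{gen}: {before}K used of {cap}K before GC, "
          f"freed {(before - after) / 1024 / 1024:.2f} GB")
```

The 63998K young-gen occupancy, the 8388608K - 8346270K = 42338K (~42M) of free old gen, and the ~4.5G reclaimed line up with points 1, 2, and 5.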


Even 13 GC threads running in parallel take 16 seconds to collect those 5G. Since you have only 16 cores, there is not much room for speed improvement from adding more threads.
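That headroom claim can be read off the `[Times: ...]` line of the GC log: CPU time divided by wall-clock time gives the effective parallelism the workers achieved.

```python
# Effective parallelism during the pause, from the GC log's
# [Times: user=180.57 sys=2.46, real=16.09 secs] line.
user, sys_time, real = 180.57, 2.46, 16.09
effective = (user + sys_time) / real
print(f"effective parallel threads: {effective:.1f} of 13 workers")
```

Roughly 11.4 of the 13 workers were kept busy for the whole pause on a 16-core box, so more threads would mostly compete for the remaining cores rather than shorten the pause.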

One of the following may be happening:

  • Your objects live too long for the YoungGen, so you would need to switch to CMS/G1 so that it collects the OldGen more frequently and takes less time in total. You would need to tune InitiatingHeapOccupancyPercent to your needs; judging by the current output, you should initiate somewhere around 4G. Though this puts into question whether you really need those 12G of heap, since it would be subject to heap fragmentation issues.
  • Your objects are short-lived but too big to be accommodated in the Survivor space, so you would need to tune the SurvivorRatio parameter to make the survivor spaces bigger. Something like SurvivorRatio=4 (each survivor space would then be about 680MB instead of the default ~410MB).
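As a rough check of the SurvivorRatio arithmetic (a sketch assuming HotSpot's classic layout of eden plus two survivor spaces, and that adaptive sizing is not resizing the spaces at runtime):

```python
# HotSpot splits the young generation into eden and two survivor spaces;
# -XX:SurvivorRatio is the eden-to-one-survivor ratio, so:
#   survivor_size = NewSize / (SurvivorRatio + 2)
NEW_SIZE_MB = 4096  # matches -XX:NewSize=4096m above

for ratio in (8, 4):  # 8 is the usual default; 4 is the suggested value
    survivor = NEW_SIZE_MB / (ratio + 2)
    eden = NEW_SIZE_MB - 2 * survivor
    print(f"SurvivorRatio={ratio}: eden ≈ {eden:.0f} MB, "
          f"each survivor ≈ {survivor:.0f} MB")
```

Halving the ratio from 8 to 4 roughly trades 1 GB of eden for two ~680 MB survivor spaces, which only helps if the promoted objects would actually fit there.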


So it really depends on your object allocation pattern. The best approach would be to try it somewhere else before applying it in production.
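For the first option, a hypothetical starting point for the flag changes might look like the fragment below. These values are illustrative only: InitiatingHeapOccupancyPercent=35 corresponds to the ~4G initiation point mentioned above (35% of 12G ≈ 4.2G), and every value needs validation against your workload before production.

```shell
# Hypothetical G1 starting point for a 12G heap -- not a drop-in replacement.
# Remove -XX:+UseParallelGC, -XX:+UseParallelOldGC and the fixed
# NewSize/MaxNewSize settings first; G1 manages young-gen sizing itself.
JAVA_OPTS="-Xmx12288m \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=2000 \
  -XX:InitiatingHeapOccupancyPercent=35"
```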

The JVM GC parameters can be found here.
