G1 GC单一,很长的年轻GC发生与ParallelGCThreads = 1 [英] G1 GC single, very long young GC occured with ParallelGCThreads=1

查看:1765
本文介绍了G1 GC单一,很长的年轻GC发生与ParallelGCThreads = 1的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我设置 ParallelGCThreads = 1 ,并使用 G1 GC ,所有其他 JVM 设置为默认。我在火花1.5.1运行的PageRank 有两个EC2节点100 GB每堆

我的堆使用图表低于(红色区域:年轻一代,黑色区域:老一代)。所有年轻的选区小,突然有来自收集60 GB一位年轻的GC,那么年轻的选区再次变小。我的GC日志没有显示出混合选区,没有完整的选区,一个并发标记和数万年轻的GC。我很奇怪,为什么那庞大的年轻GC发生什么呢?

在这里输入的形象描述

我下面的GC日志的一部分。巨大的年轻GC是一个与堆:84.1G

  2015-12-30T06:59:02.488 + 0000:245.088:[GC暂停(年轻)245.089:[G1Ergonomics(的CSet建筑)开始选择的CSet,_pending_cards:1727年,predicted基地时间:24.64毫秒,剩余时间:175.36毫秒,目标暂停时间:200.00毫秒]
 245.089:[G1Ergonomics(的CSet建筑)增加青年地区的CSet,伊甸园:206地区,幸存者:3个地区,predicted年轻的地区,时间:148.87毫秒]
 245.089:[G1Ergonomics(的CSet建筑)选好的CSet,伊甸园:206地区,幸存者:3个区域,老:0的地区,predicted暂停时间:173.51毫秒,目标暂停时间:200.00毫秒]
2015-12-30T06:59:02.531 + 0000:245.131:[SoftReference的,裁判0,0.0000520秒] 2015-12-30T06:59:02.531 + 0000:245.131:[WeakReference的,21裁判,0.0000160秒] 2015-12- 30T06:59:02.531 + 0000:245.131:[FinalReference,9759裁判,0.0084720秒] 2015-12-30T06:59:02.539 + 0000:245.140:[的PhantomReference,0裁判,裁判14,0.0000190秒] 2015-12-30T06 :59:02.539 + 0000:245.140:[JNI弱引用,0.0000130秒] 245.142:[G1Ergonomics(堆大小调整)试图堆扩展,原因是:最近GC开销比GC,最近GC开销后阈值:12.51%,阈值:10.00 %,未提交:0字节,计算出的膨胀量:0字节(20.00%)]
,0.0534140秒]
   [并行时间:42.3毫秒,GC工人:1]
      [GC工人开始(毫秒):245088.6]
      [外部根扫描(MS):14.4]
      [更新RS(MS):1.9]
         [处理的缓冲区:34]
      [扫描RS(毫秒):0.4]
      [code盘根扫描(毫秒):0.0]
      [对象复制(毫秒):25.5]
      [终止(毫秒):0.0]
      [GC员以外(毫秒):0.0]
      [GC工人总数(毫秒):42.3]
      [GC工人结束(毫秒):245130.9]
   [code盘根Fixup时:0.0毫秒]
   [code盘根迁移:0.0毫秒]
   [清除CT:1.6毫秒]
   [其他:9.5毫秒]
      [选择的CSet:0.0毫秒]
      [参考PROC:8.6毫秒]
      [参考ENQ:0.2毫秒]
      [免费的CSet:0.4毫秒]
   [伊甸园:6592.0M(6592.0M) - GT; 00亿(58.8G)幸存者:96.0M-> 128.0M堆:30.6G(100.0) - GT; 24.2克(100.0克)
 [时间:用户= 0.05 SYS = 0.00,实际= 0.06秒]
2015-12-30T06:59:43.451 + 0000:286.051:[GC暂停(年轻)286.054:[G1Ergonomics(的CSet建筑)开始选择的CSet,_pending_cards:392599,predicted基地时间:367.03毫秒,剩余时间:0.00 MS,目标暂停时间:200.00毫秒]
 286.054:[G1Ergonomics(的CSet建筑)增加青年地区的CSet,伊甸园:1884年地区,幸存者:4个地区,predicted年轻的地区,时间:150.18毫秒]
 286.054:[G1Ergonomics(的CSet建筑)选好的CSet,伊甸园:1884年地区,幸存者:4个地区,老:0的地区,predicted暂停时间:517.21毫秒,目标暂停时间:200.00毫秒]
2015-12-30T06:59:47.767 + 0000:290.368:[SoftReference的,裁判0,0.0000570秒] 2015-12-30T06:59:47.768 + 0000:290.368:[WeakReference的,350裁判,0.0000640秒] 2015-12- 30T06:59:47.768 + 0000:290.368:[FinalReference,99336裁判,0.3781120秒] 2015-12-30T06:59:48.146 + 0000:290.746:[的PhantomReference,0裁判,裁判1,0.0000290秒] 2015-12-30T06 :59:48.146 + 0000:290.746:[JNI弱引用,0.0000140秒] 290.767:[G1Ergonomics(堆大小调整)试图堆扩展,原因是:最近GC开销比GC,最近GC开销后阈值:11.74%,阈值:10.00 %,未提交:0字节,计算出的膨胀量:0字节(20.00%)]
,4.7153740秒]
   [并行时间:4313.9毫秒,GC工人:1]
      [GC工人开始(毫秒):286053.9]
      [外部根扫描(MS):15.2]
      [更新RS(毫秒):86.3]
         [加工缓冲区:1557]
      [扫描RS(MS):4.1]
      [code盘根扫描(毫秒):0.2]
      [对象复制(毫秒):4208.1]
      [终止(毫秒):0.0]
      [GC员以外(毫秒):0.0]
      [GC工人总数(毫秒):4313.9]
      [GC工人结束(毫秒):290367.8]
   [code盘根Fixup时:0.0毫秒]
   [code盘根迁移:0.3毫秒]
   [清除CT:15.1毫秒]
   [其他:386.0毫秒]
      [选择的CSet:0.0毫秒]
      [参考PROC:378.4毫秒]
      [参考ENQ:1.7毫秒]
      [免费的CSet:3.3毫秒]
   [伊甸园:58.9G(58.8G) - GT; 00亿(3456.0M)幸存者:128.0M-> 1664.0M堆:84.1G(100.0) - GT; 26.7G(100.0克)
 [时间:用户= 0.78 = SYS 3.94,实际= 4.71秒]


解决方案

  

尝试堆扩展,原因是:最近GC开销比GC,GC最近开销后阈值:11.74%,阈值:10.00%


我的猜测是,这是推动G1的决定。您可以通过设置 -XX放松吧:GCTimeRatio = 4 ,这将允许它占用相对于应用程序时的CPU周期的20%GCing,而不是10%<。 / p>

如果这是太多了就应该把


  • 允许它使用更多的CPU内核 - 这将使它更容易满足其暂停时间目标,这反过来又意味着它可以推迟更长时间收藏,使其更容易满足吞吐量目标结果是,这是否意味着使用更多的内核实际上可以整体使用更少的CPU周期。

  • 放松暂停时间目标,因此有经常收集少

I set ParallelGCThreads=1 and use G1 GC, all other JVM settings are default. I run PageRank on Spark-1.5.1 with two EC2 nodes 100 GB heap each.

My heap usage graph is below (Red area: young generation, black area: old generation). All young GCs are small, all of a sudden there comes one young GC that collects 60 GB, then young GCs become small again. My GC logs shows no mixed GCs, no full GCs, one concurrent marking, and tens of young GCs. I am wondering why that huge young GC happens?

A part of my GC log below. The huge young GC is the one with "Heap: 84.1G"

2015-12-30T06:59:02.488+0000: 245.088: [GC pause (young) 245.089: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 1727, predicted base time: 24.64 ms, remaining time: 175.36 ms, target pause time: 200.00 ms]
 245.089: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 206 regions, survivors: 3 regions, predicted young region time: 148.87 ms]
 245.089: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 206 regions, survivors: 3 regions, old: 0 regions, predicted pause time: 173.51 ms, target pause time: 200.00 ms]
2015-12-30T06:59:02.531+0000: 245.131: [SoftReference, 0 refs, 0.0000520 secs]2015-12-30T06:59:02.531+0000: 245.131: [WeakReference, 21 refs, 0.0000160 secs]2015-12-30T06:59:02.531+0000: 245.131: [FinalReference, 9759 refs, 0.0084720 secs]2015-12-30T06:59:02.539+0000: 245.140: [PhantomReference, 0 refs, 14 refs, 0.0000190 secs]2015-12-30T06:59:02.539+0000: 245.140: [JNI Weak Reference, 0.0000130 secs] 245.142: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 12.51 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 0.0534140 secs]
   [Parallel Time: 42.3 ms, GC Workers: 1]
      [GC Worker Start (ms):  245088.6]
      [Ext Root Scanning (ms):  14.4]
      [Update RS (ms):  1.9]
         [Processed Buffers:  34]
      [Scan RS (ms):  0.4]
      [Code Root Scanning (ms):  0.0]
      [Object Copy (ms):  25.5]
      [Termination (ms):  0.0]
      [GC Worker Other (ms):  0.0]
      [GC Worker Total (ms):  42.3]
      [GC Worker End (ms):  245130.9]
   [Code Root Fixup: 0.0 ms]
   [Code Root Migration: 0.0 ms]
   [Clear CT: 1.6 ms]
   [Other: 9.5 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 8.6 ms]
      [Ref Enq: 0.2 ms]
      [Free CSet: 0.4 ms]
   [Eden: 6592.0M(6592.0M)->0.0B(58.8G) Survivors: 96.0M->128.0M Heap: 30.6G(100.0G)->24.2G(100.0G)]
 [Times: user=0.05 sys=0.00, real=0.06 secs] 
2015-12-30T06:59:43.451+0000: 286.051: [GC pause (young) 286.054: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 392599, predicted base time: 367.03 ms, remaining time: 0.00 ms, target pause time: 200.00 ms]
 286.054: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1884 regions, survivors: 4 regions, predicted young region time: 150.18 ms]
 286.054: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1884 regions, survivors: 4 regions, old: 0 regions, predicted pause time: 517.21 ms, target pause time: 200.00 ms]
2015-12-30T06:59:47.767+0000: 290.368: [SoftReference, 0 refs, 0.0000570 secs]2015-12-30T06:59:47.768+0000: 290.368: [WeakReference, 350 refs, 0.0000640 secs]2015-12-30T06:59:47.768+0000: 290.368: [FinalReference, 99336 refs, 0.3781120 secs]2015-12-30T06:59:48.146+0000: 290.746: [PhantomReference, 0 refs, 1 refs, 0.0000290 secs]2015-12-30T06:59:48.146+0000: 290.746: [JNI Weak Reference, 0.0000140 secs] 290.767: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.74 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 4.7153740 secs]
   [Parallel Time: 4313.9 ms, GC Workers: 1]
      [GC Worker Start (ms):  286053.9]
      [Ext Root Scanning (ms):  15.2]
      [Update RS (ms):  86.3]
         [Processed Buffers:  1557]
      [Scan RS (ms):  4.1]
      [Code Root Scanning (ms):  0.2]
      [Object Copy (ms):  4208.1]
      [Termination (ms):  0.0]
      [GC Worker Other (ms):  0.0]
      [GC Worker Total (ms):  4313.9]
      [GC Worker End (ms):  290367.8]
   [Code Root Fixup: 0.0 ms]
   [Code Root Migration: 0.3 ms]
   [Clear CT: 15.1 ms]
   [Other: 386.0 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 378.4 ms]
      [Ref Enq: 1.7 ms]
      [Free CSet: 3.3 ms]
   [Eden: 58.9G(58.8G)->0.0B(3456.0M) Survivors: 128.0M->1664.0M Heap: 84.1G(100.0G)->26.7G(100.0G)]
 [Times: user=0.78 sys=3.94, real=4.71 secs] 

解决方案

attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.74 %, threshold: 10.00 %

My guess is that this is driving G1's decisions. You can relax it by setting -XX:GCTimeRatio=4, which would allow it to take up 20% of CPU cycles relative to application time for GCing instead of 10%.

If that's too much you should either

  • allow it to use more CPU cores - that would make it easier to meet its pause time goals which in turn means it can defer collections for longer, making it easier to meet the throughput goal.
    yes, this does mean using more cores can actually use fewer CPU cycles overall.
  • relax pause time goals so it has to collect less often

这篇关于G1 GC单一,很长的年轻GC发生与ParallelGCThreads = 1的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆