G1 GC - 极长的终止时间 [英] G1 GC -- extremely long termination time

查看:802
本文介绍了G1 GC - 极长的终止时间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个看似直截了当的问题:为什么G1 GC在'终止'阶段花了这么多时间?一个非完整的GC需要大约2秒钟的时间,从那以后1.5秒就是终止(盗取工作)阶段。

上下文:
- 我们的应用程序有apprx。 10MB /秒的分配率
- 其中大多数是短期对象
- 终止时间随应用程序正常运行时间线性增加
- 我们有一个大堆(24G)
- 满GC永远不会发生,如果我手动强制它,GC时间会恢复正常
- 应用程序本身执行大量类加载(JSP)
- 我们使用G1 GC和以下参数: p>

  -XX:+ AggressiveOpts -XX:CICompilerCount = 4 -XX:ConcGCThreads = 2 -XX:G1HeapRegionSize = 8388608 -XX:G1NewSizePercent = 35 -XX:InitialHeapSize = 17179869184 -XX:+ ManagementServer -XX:MarkStackSize = 4194304 -XX:MaxGCPauseMillis = 1500 -XX:MaxHeapSize = 23622320128 -XX:MaxNewSize = 14168358912 -XX:MinHeapDeltaBytes = 8388608 -XX:+ ParallelRefProcEnabled -XX: + PrintAdaptiveSizePolicy -XX:+ PrintGC -XX:+ PrintGCDetails -XX:+ PrintGCTimeStamps -XX:+ PrintStringDeduplicationStatistics -XX:+ PrintTenuringDistribution -XX:+ UnlockExperimentalVMOptions -XX:+ UseCompressedClassPointers -XX:+ UseCompressedOops -XX:+ UseG1GC -XX: + UseStringDe复制

122618.508:[GC暂停(G1撤退暂停)(年轻)
希望的幸存者大小645922816字节,新阈值15(最大15)
- 年龄1:11770408字节, 11770408 total
- age 2:4473200 bytes,16243608 total
- age 3:7342504 bytes,23586112 total
- age 4:2872344 bytes,26458456 total
- age 5:3291816字节,29750272 total
- age 6:3174480 bytes,32924752 total
- age 7:3538032 bytes,36462784 total
- age 8:3236984 bytes,39699768 total
- age 9 :3369896 bytes,43069664 total
- age 10:4028056 bytes,47097720 total
- age 11:3598376 bytes,50696096 total
- age 12:3829808 bytes,54525904 total
- 年龄13:3966344字节,58492248总计
- 年龄14:3347280字节,61839528合计
- 年龄15:3887528字节,65727056合计
122618.508:[G1Egonomics(CSet Construction)开始选择CSet, _pending_cards:7 510人,预测基准时间:297.79毫秒,剩余时间:1202.21毫秒,目标暂停时间:1500.00毫秒]
122618.508:[G1人机工程学(CSet施工)增加年轻区域至CSet,伊甸园:1208个地区,幸存者:20个地区,预测的年轻区域时间:157.92 ms]
122618.508:[G1人体工程学(CSet Construction)完成CSet,eden:1208个地区,幸存者:20个地区,旧地区:0个地区,预计暂停时间:455.71 ms,目标暂停时间: [平行时间:2094.7毫秒,GC工作人员:8]
[GC工作人员起始时间(毫秒):最小值:122618508.2,平均值:122618508.3,最大值:122618508.4] [
,2.1182594秒] ,Diff:0.3]
[Ext Root Scanning(ms):Min:3.0,Avg:283.6,Max:2094.1,Diff:2091.1,Sum:2269.0]
[更新RS(ms) 0.0,Avg:2.2,Max:4.3,Diff:4.3,Sum:17.2]
[Processed Buffers:Min:0,Avg:7.4,Max:24,Diff:24,Sum:59]
[扫描RS(ms):最小值:0.1,平均值:0.3,最大值:0.5,差值:0.4,总和值:2.4]
[码根扫描(毫秒):M in:0.0,Avg:0.0,Max:0.2,Diff:0.2,Sum:0.3]
[Object Copy(ms):Min:0.0,Avg:283.7,Max:341.0,Diff:340.9,Sum:2269.5 ]
[终止(毫秒):最小值:0.0,平均值:1524.5,最大值:1754.8,差值:1754.8,总和值:12196.2]
[终止尝试次数:最小值:1,平均值:10989.2,最大值:18096 [GC Worker Other(ms):Min:0.0,Avg:0.1,Max:0.1,Diff:0.1,Sum:0.6]
[GC Worker Total( ms):Min:2094.3,Avg:2094.4,Max:2094.6,Diff:0.3,Sum:16755.2]
[GC Worker End(ms):Min:122620602.7,Avg:122620602.7,Max:122620602.8,Diff:0.1 ]
[代码根修正:0.2 ms]
[代码根净化:0.0 ms]
[String Dedup Fixup:4.5 ms,GC Workers:8]
[Queue Fixup( ms):Min:0.0,Avg:0.0,Max:0.0,Diff:0.0,Sum:0.0]
[Table Fixup(ms):Min:0.0,Avg:3.7,Max:4.3,Diff:总和:29.7]
[清除CT:1.7毫秒]
[其他:17.1毫秒]
[选择CSet:0.0 ms]
[Ref Proc: 14.6毫秒]
[Ref Enq:0.4 ms]
[累加卡:0.2 ms]
[累加寄存器:0.1 ms]
[累加回收:0.0 ms]
[免费CSet:1.5ms]
[Eden:9664.0M(9664.0M) - > 0.0B(9688.0M)幸存者:160.0M-> 136.0M堆:10.5G(16.0G) - > 1101.1M(16.0G)]
[Times:user = 14.38 sys = 0.07,real = 2.12 sec]
122620.627:[GC并发字符串重复数据删除,142.3K-> 26.4K(115.9K ),平均60.3%,0.0010556秒]
[最后执行:0.0010556秒,空闲:450.5150579秒,阻止:0 / 0.0000000秒]
[检查:2027]
[跳过:0 0.0%)]
[散列:1304(64.3%)]
[已知:602(29.7%)]
[新增:1425(70.3%)142.3K]
[重复数据:1240(87.0%)115.9万(81.4%)]
[年轻:1240(100.0%)115.9K(100.0%)]
[旧:0(0.0%)0.0B(0.0% )]
[Tota $执行:277 / 0.4346528秒,空闲:277 / 122619.8157354秒,阻止:62 / 0.0449930秒]
[检查:801984]
[跳过:0(0.0%)]
[ :411526(51.3%)]
[已知:245334(30.6%)]
[新增:556650(69.4%)53.2M]
[重复数:403870(72.6%)32.1M 60.3%)]
[年轻:392678(97.2%)31.1M(96.8%)]
[旧:11192(2.8%)1066.4K(3.2%)]
[表] b $ b [Memory Usage:8812.9K]
[Size:262144,Min:1024,Max:16777216]
[条目:288637,Load:110.1%,Cached:0,Added:358165,Removed :69528]
[Resize Count:8,Shrink Threshold:174762(66.7%),Grow Threshold:524288(200.0%)]
[Rehash Count:0,Rehash Threshold:120,Hash Seed:0x0 ]
[年龄阈值:3]
[队列]
[丢弃:0]


解决在再次查看日志之后,问题不是终止时间 - 它是 Ext Root Scanning - 终止时间很长,因为那些线程正在等待其他工作人员完成RS。这是因为使用代码缓存越来越大。

A seemingly straightforward question: why does G1 GC spend so much time in the 'termination' phase? A non-full GC takes ~2 seconds, and from that, 1.5 sec is the termination (work stealing) phase.

Context: - our app has apprx. 10MB/sec allocation rate - most of those are short-lived objects - termination time is increasing linearly with app uptime - we have a large heap (24G) - full GC never occurs, if I force it manually then GC times are back to normal - the app itself does a lot of classloading (JSP) - we're using G1 GC with the following parameters:

-XX:+AggressiveOpts -XX:CICompilerCount=4 -XX:ConcGCThreads=2 -XX:G1HeapRegionSize=8388608 -XX:G1NewSizePercent=35 -XX:InitialHeapSize=17179869184 -XX:+ManagementServer -XX:MarkStackSize=4194304 -XX:MaxGCPauseMillis=1500 -XX:MaxHeapSize=23622320128 -XX:MaxNewSize=14168358912 -XX:MinHeapDeltaBytes=8388608 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintStringDeduplicationStatistics -XX:+PrintTenuringDistribution -XX:+UnlockExperimentalVMOptions -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseStringDeduplication

122618.508: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 645922816 bytes, new threshold 15 (max 15)
- age   1:   11770408 bytes,   11770408 total
- age   2:    4473200 bytes,   16243608 total
- age   3:    7342504 bytes,   23586112 total
- age   4:    2872344 bytes,   26458456 total
- age   5:    3291816 bytes,   29750272 total
- age   6:    3174480 bytes,   32924752 total
- age   7:    3538032 bytes,   36462784 total
- age   8:    3236984 bytes,   39699768 total
- age   9:    3369896 bytes,   43069664 total
- age  10:    4028056 bytes,   47097720 total
- age  11:    3598376 bytes,   50696096 total
- age  12:    3829808 bytes,   54525904 total
- age  13:    3966344 bytes,   58492248 total
- age  14:    3347280 bytes,   61839528 total
- age  15:    3887528 bytes,   65727056 total
 122618.508: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 7510, predicted base time: 297.79 ms, remaining time: 1202.21 ms, target pause time: 1500.00 ms]
 122618.508: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1208 regions, survivors: 20 regions, predicted young region time: 157.92 ms]
 122618.508: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1208 regions, survivors: 20 regions, old: 0 regions, predicted pause time: 455.71 ms, target pause time: 1500.00 ms]
, 2.1182594 secs]
   [Parallel Time: 2094.7 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 122618508.2, Avg: 122618508.3, Max: 122618508.4, Diff: 0.3]
      [Ext Root Scanning (ms): Min: 3.0, Avg: 283.6, Max: 2094.1, Diff: 2091.1, Sum: 2269.0]
      [Update RS (ms): Min: 0.0, Avg: 2.2, Max: 4.3, Diff: 4.3, Sum: 17.2]
         [Processed Buffers: Min: 0, Avg: 7.4, Max: 24, Diff: 24, Sum: 59]
      [Scan RS (ms): Min: 0.1, Avg: 0.3, Max: 0.5, Diff: 0.4, Sum: 2.4]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
      [Object Copy (ms): Min: 0.0, Avg: 283.7, Max: 341.0, Diff: 340.9, Sum: 2269.5]
      [Termination (ms): Min: 0.0, Avg: 1524.5, Max: 1754.8, Diff: 1754.8, Sum: 12196.2]
         [Termination Attempts: Min: 1, Avg: 10989.2, Max: 18096, Diff: 18095, Sum: 87914]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
      [GC Worker Total (ms): Min: 2094.3, Avg: 2094.4, Max: 2094.6, Diff: 0.3, Sum: 16755.2]
      [GC Worker End (ms): Min: 122620602.7, Avg: 122620602.7, Max: 122620602.8, Diff: 0.1]
   [Code Root Fixup: 0.2 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 4.5 ms, GC Workers: 8]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 0.0, Avg: 3.7, Max: 4.3, Diff: 4.3, Sum: 29.7]
   [Clear CT: 1.7 ms]
   [Other: 17.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 14.6 ms]
      [Ref Enq: 0.4 ms]
      [Redirty Cards: 0.2 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 1.5 ms]
   [Eden: 9664.0M(9664.0M)->0.0B(9688.0M) Survivors: 160.0M->136.0M Heap: 10.5G(16.0G)->1101.1M(16.0G)]
 [Times: user=14.38 sys=0.07, real=2.12 secs] 
122620.627: [GC concurrent-string-deduplication, 142.3K->26.4K(115.9K), avg 60.3%, 0.0010556 secs]
   [Last Exec: 0.0010556 secs, Idle: 450.5150579 secs, Blocked: 0/0.0000000 secs]
      [Inspected:            2027]
         [Skipped:              0(  0.0%)]
         [Hashed:            1304( 64.3%)]
         [Known:              602( 29.7%)]
         [New:               1425( 70.3%)    142.3K]
      [Deduplicated:         1240( 87.0%)    115.9K( 81.4%)]
         [Young:             1240(100.0%)    115.9K(100.0%)]
         [Old:                  0(  0.0%)      0.0B(  0.0%)]
   [Total Exec: 277/0.4346528 secs, Idle: 277/122619.8157354 secs, Blocked: 62/0.0449930 secs]
      [Inspected:          801984]
         [Skipped:              0(  0.0%)]
         [Hashed:          411526( 51.3%)]
         [Known:           245334( 30.6%)]
         [New:             556650( 69.4%)     53.2M]
      [Deduplicated:       403870( 72.6%)     32.1M( 60.3%)]
         [Young:           392678( 97.2%)     31.1M( 96.8%)]
         [Old:              11192(  2.8%)   1066.4K(  3.2%)]
   [Table]
      [Memory Usage: 8812.9K]
      [Size: 262144, Min: 1024, Max: 16777216]
      [Entries: 288637, Load: 110.1%, Cached: 0, Added: 358165, Removed: 69528]
      [Resize Count: 8, Shrink Threshold: 174762(66.7%), Grow Threshold: 524288(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]

解决方案

After looking at the logs again, the problem is not the termination time -- it's the Ext Root Scanning -- the termination time is long because those threads are waiting for the other workers to finish RS. This is because use the code cache is getting progressively larger.

这篇关于G1 GC - 极长的终止时间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆