GC调整 - 阻止完整的GC [英] GC Tuning - preventing a Full GC

查看:283
本文介绍了GC调整 - 阻止完整的GC的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图避免在生产环境中运行Tomcat中的Grails应用程序的Full GC(从下面的gc.log示例)

有关如何更好地配置GC的任何建议?



14359.317:[Full GC 14359.317:[CMS:3453285K-> 3099828K(4194304K), [13.1778420秒] 4506618K-> 3099828K(6081792K),[CMS Perm:261951K-> 181304K(264372K)] icms_dc = 0,13.1786310秒] [时间:用户= 13.15 sys = 0.04,实际= 13.18秒]



我的虚拟参数如下:

-Xms = 6G

-Xmx = 6G

-XX:MaxPermSize = 1G

-XX:NewSize = 2G

-XX:MaxTenuringThreshold = 8

-XX:SurvivorRatio = 7

-XX:+ UseConcMarkSweepGC

-XX:+ CMSClassUnloadingEnabled

-XX:+ CMSPermGenSweepingEnabled

-XX:+ CMSIncrementalMode

-XX:CMSInitiatingOccupancyFraction = 60

-XX:+ UseCMSInitiatingOccupancyOnly

-XX:+ HeapDumpOnOutOfMemoryError

-XX:+ PrintGCDetails <
-XX:+ PrintGCTimeStamps

-XX:+ PrintTenuringDistribution

-Dsun.reflect.inflationThreshold = 0


 
14169.764:[GC 14169.764:[ParNew
希望的幸存者大小107347968字节,新阈值8(最大8)
- 年龄1:15584312字节,15584312总计
- 年龄2:20053704字节,35638016总计
- 年龄3:13624872字节,49262888总计
- 年龄4:14469608字节,63732496总计
- 年龄5:10553288字节,74285784合计
- 年龄6:11797648字节,86083432 total
- age 7:12591328 bytes,98674760 total
:1826161K-> 130133K(1887488K),0.1726640 secs] 5216326K-> 3537160K(6081792K)icms_dc = 0,0.1733010 secs] [Times:user = 0.66 sys = 0.03,real = 0.17秒]
14218.712:[GC 14218.712:[ParNew
希望的幸存者大小107347968字节,新的阈值8(最大8)
- 年龄1:25898512字节,25898512 total
- age 2:10308160 bytes,36206672 total
- age 3:16927792 bytes,53134464 total
- age 4:13493608 bytes,66628072 total
- 年龄5:14301832字节,80929904总计
- 年龄6:10448408字节,91378312总计
- 年龄7:11724056字节,103102368合计
- 年龄8:12299528字节,115401896合计
:1807957K-> 147911K(1887488K),0.1664510秒] 5214984K-> 3554938K(6081792K)icms_dc = 0,0.16771290秒] [时间:用户= 0.61 sys = 0.00,实际= 0.17秒]
14251.429:[GC 14251.430:[ParNew
希望的幸存者大小107347968字节,新阈值7(最大8)
- 年龄1:25749296字节,25749296总计
- 年龄2:20111888字节,45861184合计
- age 3:7580776 bytes,53441960 total
- age 4:16819072 bytes,70261032 total
- age 5:13209968 bytes,83471000 total
- age 6:14088856 bytes,97559856 total
- age 7:10371160字节,107931016 total
- age 8:11426712 bytes,119357728 total
:1825735K-> 155304K(1887488K),0.1888880 secs] 5232762K-> 3574222K(608 1792K)icms_dc = 0,0.1895340秒] [时间:用户= 0.74 sys = 0.06,实际= 0.19秒]
14291.342:[GC 14291.343:[ParNew
希望的幸存者大小107347968字节,新的阈值7 max 8)
- age 1:25786480字节,25786480 total
- age 2:21991848 bytes,47778328 total
- age 3:16650000 bytes,64428328 total
- age 4: 7387368 bytes,71815696 total
- age 5:16777584 bytes,88593280 total
- age 6:13098856 bytes,101692136 total
- age 7:14029704 bytes,115721840 total
:1833128K - > 151603K(1887488K),0.1941170秒] 5252046K-> 3591384K(6081792K)icms_dc = 0,0.1974390秒] [时间:用户= 0.82 sys = 0.04,实际= 0.20秒]
14334.142:[GC 14334.143: ParNew
希望的幸存者大小107347968字节,新的阈值6(最大8)
- 年龄1:31541800字节,31541800总计
- 年龄2:20826888字节,52368688合计
- 年龄3:19155264字节, 71523952 total
- age 4:16422240 bytes,87946192 total
- age 5:7235616 bytes,95181808 total
- age 6:16549000 bytes,111730808 total
- age 7:13026064字节,124756872总计
:1829427K-> 167467K(1887488K),0.1890190秒] 5269208K-> 3620753K(6081792K)icms_dc = 0,0.1896630秒] [Times:user = 0.80 sys = 0.03,real = 0.19 secs] $ (完全GC 14359.317:[CMS:3453285K-> 3099828K(4194304K),13.1778420秒] 4506618K-> ] [时间:用户= 13.15 sys = 0.04,实际= 13.18秒]
14373.287:[GC [1 CMS-初始标记:3099828K(4194304K)] 3100094K(6081792K),0.0107380秒] [Times:user = 0.01 sys = 0.00,real = 0.00 secs]
14373.298:[CMS-concurrent-mark-start]
14472.579:[GC 14472.579:[ParNew
希望的生存者大小107347968字节,新的阈值8 (最大8)
- 年龄1:42849392字节,42849392共
:1677824K-> 86719K(1887488K),0.1056680秒] 4777652K-> 3186547K(6081792K)icms_dc = 0,0.1063280秒] [时间:用户= 0.61 sys = 0.00,实际= 0.11秒]
14506.980:[GC 14506.980:[ParNew
希望的幸存者大小107347968字节,新阈值8(最大8)
- 年龄1:42002904字节,42002904总计
- 年龄2:35733928字节,77736832合计
:1764543K-> 96136K(1887488K),0.0982790秒] 4864371K-> 3195964K(6081792K)icms_dc = 0,0.0988960秒] [时间:用户= 0.53 sys = 0.01,实际= 0.10秒]
14544.285:[GC 14544.286:[ParNew
希望的幸存者大小107347968字节,新阈值8(最大8)
- 年龄1:26159736字节,26159736总计
- 年龄2:37842840字节,64002576总计
= age 3:33192784 bytes,97195360 total
:1773960K-> 130799K(1887488K),0.1208590 secs] 4873788K-> 3230628K(6081792K)icms_dc = 0,0.1215900 secs] [Times:user = 0.59 sys = 0.02,实际= 0.13秒]
14589.266:[GC 14 589.266:[ParNew
希望的幸存者大小107347968字节,新阈值4(最大8)
- 年龄1:28010360字节,28010360总计
- 年龄2:21136704字节,49147064合计
- 年龄3:35081376字节,84228440总计
- 年龄4:32468056字节,116696496总计
:1808623K-> 148284K(1887488K),0.1423150秒] 4908452K-> 3248112K(6081792K)icms_dc = 0, 0.1429440秒] [时间:用户= 0.70 sys = 0.02,实际= 0.14秒]
14630.947:[GC 14630.947:[ParNew
希望的幸存者大小107347968字节,新的阈值8(最大8)
- 年龄1:28248240字节,28248240总计
- 年龄2:20712320字节,48960560总计
- 年龄3:18217168字节,67177728总计
- 年龄4:34834832字节,102012560总计
:1826108K-> 140347K(1887488K),0.1784680秒] 4925936K-> 3275469K(6081792K)icms_dc = 0,0.1790920秒] [时间:用户= 0.98 sys = 0.03,实际= 0.18秒]
14664.779: [GC 14664.779:[ParN ew
希望的幸存者大小107347968字节,新阈值5(最大8)
- 年龄1:25841000字节,25841000总计
- 年龄2:22264960字节,48105960总计
- 年龄3:17730104 bytes,65836064 total
- age 4:17988048 bytes,83824112 total
- age 5:34739384 bytes,118563496 total
:1818171K-> 147603K(1887488K),0.1714160 secs] 4953293K - > 3282725K(6081792K)icms_dc = 0,0.1720530秒] [时间:用户= 0.82 sys = 0.11,实际= 0.17秒]
14702.488:[GC 14702.489:[ParNew
希望的幸存者大小107347968字节,新阈值8(最多8)
- 年龄1:26887368字节,26887368总计
- 年龄2:21403352字节,48290720总计
- 年龄3:18732224字节,67022944合计
- 年龄4:17640576字节,84663520总计
- 年龄5:17942952字节,102606472总计
:1825427K-> 142695K(1887488K),0.2118320秒] 4960549K-> 3312168K(6081792K)icms_dc = 0,0.212 4630秒] [时间:用户= 1.13 sys = 0.14,实际= 0.21秒]

我瞄准的策略是:
我想限制终极服务的最低要求,我服务的请求,并期望除了一定数量的共享对象之外,其他每个对象都只对当前的请求有用。因此,通过使用一个大的NewSize和一个增加的TenuringThreshold,希望没有任何这些单一的服务对象。



以下是支持我的策略: >
-Xms = 6G

-Xmx = 6G

-XX:NewSize = 2G //大空间让ParNew不会经常出现,让时间为对象过期

-XX:MaxTenuringThreshold = 8 //限制持续时间

-XX:SurvivorRatio = 7 //基于示例
-XX:CMSInitiatingOccupancyFraction = 60
//防止由促销分配导致的完整GC失败

-XX:+ UseCMSInitiatingOccupancyOnly
//以上例为例



MaxPermSize = 1G和-Dsun.reflect.inflationThreshold = 0与另一个我宁愿分开的问题有关。



-XX:+ CMSClassUnloadingEnabled和-XX:+ CMSPermGenSweepingEnabled是否存在,是因为Grails依赖大量额外的类来封闭和反射

-XX:+ CMSIncrementalMode是一个没有取得多大成功的实验

解决方案

日志片段发布显示您拥有大量活动时间超过320秒的物品(每个年轻收藏约40多个物品,在促销前通过8个收藏品存活)。其余的物体然后流血到终身,并最终你击中了一个显然是意想不到的完整的GC,实际上并没有收集太多。

3453285K-> 3099828K(4194304K)



即你有一个4G终身服务,当它被触发时,它的满座率是82%(3453285/4194304),并且在13秒的时间后是满座的74%。这意味着它花了13s收集~350M的总数,这在6G堆的情况下并不多。

这基本上意味着你的堆不够大,或者更可能的是,你有内存泄漏。像这样的泄漏对于CMS来说是一件可怕的事情,因为并发终身收集是一个非压缩事件,这意味着终身制是一个免费列表的集合,这意味着碎片化可能成为CMS的一个大问题,这意味着您对终身制的使用变得越来越低效,意味着促销失败事件的可能性增加(尽管如果这是一个这样的事件,那么我期望看到一条日志消息说明这一点),因为它想要宣传(或者认为它需要推广)XMB到终身但它没有(连续)空闲列表> = X MB可用。这会触发意外的终身收集,这是一个非远程并发STW事件。如果你真的有点收藏(就像你一样),那么你坐在你的大拇指上就不会感到惊讶。

一些通用的指针,在很大程度上重申Vladimir Sitnitov说过的...


  • 在多核盒子上使用iCMS是没有意义的(除非你有 JVM或其他进程在JVM上运行,以至于JVM确实缺少CPU),因此删除此开关

  • 由于复制相对大量的数据在每个集合中的幸存者空间之间的内存,150-200ms是一个非常巨大的 ParNew 集合

    • 正确答案到年轻的发电问题取决于配置行为究竟是什么(例如,也许你会提前得到更好的服务,并减少分散对终身收藏的影响,或者你最好拥有一个更大的新世代,减少年轻族群收集的频率,以减少对象被推销的次数,从而减少终身流血。)
    • b

      有些问题...


      • 它最终走向OoM还是恢复?

      • 在此日志片段中应用程序处于稳定状态(受到一致性负载的影响,远远超出启动范围),还是处于压力之下?

      • ul>

        I'm trying to avoid the Full GC (from gc.log sample below) running a Grails application in Tomcat in production. Any suggestions on how to better configure the GC?

        14359.317: [Full GC 14359.317: [CMS: 3453285K->3099828K(4194304K), 13.1778420 secs] 4506618K->3099828K(6081792K), [CMS Perm : 261951K->181304K(264372K)] icms_dc=0 , 13.1786310 secs] [Times: user=13.15 sys=0.04, real=13.18 secs]

        My VM params are as follow:
        -Xms=6G
        -Xmx=6G
        -XX:MaxPermSize=1G
        -XX:NewSize=2G
        -XX:MaxTenuringThreshold=8
        -XX:SurvivorRatio=7
        -XX:+UseConcMarkSweepGC
        -XX:+CMSClassUnloadingEnabled
        -XX:+CMSPermGenSweepingEnabled
        -XX:+CMSIncrementalMode
        -XX:CMSInitiatingOccupancyFraction=60
        -XX:+UseCMSInitiatingOccupancyOnly
        -XX:+HeapDumpOnOutOfMemoryError
        -XX:+PrintGCDetails
        -XX:+PrintGCTimeStamps
        -XX:+PrintTenuringDistribution
        -Dsun.reflect.inflationThreshold=0

            14169.764: [GC 14169.764: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   15584312 bytes,   15584312 total
            - age   2:   20053704 bytes,   35638016 total
            - age   3:   13624872 bytes,   49262888 total
            - age   4:   14469608 bytes,   63732496 total
            - age   5:   10553288 bytes,   74285784 total
            - age   6:   11797648 bytes,   86083432 total
            - age   7:   12591328 bytes,   98674760 total
            : 1826161K->130133K(1887488K), 0.1726640 secs] 5216326K->3537160K(6081792K) icms_dc=0 , 0.1733010 secs] [Times: user=0.66 sys=0.03, real=0.17 secs] 
            14218.712: [GC 14218.712: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   25898512 bytes,   25898512 total
            - age   2:   10308160 bytes,   36206672 total
            - age   3:   16927792 bytes,   53134464 total
            - age   4:   13493608 bytes,   66628072 total
            - age   5:   14301832 bytes,   80929904 total
            - age   6:   10448408 bytes,   91378312 total
            - age   7:   11724056 bytes,  103102368 total
            - age   8:   12299528 bytes,  115401896 total
            : 1807957K->147911K(1887488K), 0.1664510 secs] 5214984K->3554938K(6081792K) icms_dc=0 , 0.1671290 secs] [Times: user=0.61 sys=0.00, real=0.17 secs] 
            14251.429: [GC 14251.430: [ParNew
            Desired survivor size 107347968 bytes, new threshold 7 (max 8)
            - age   1:   25749296 bytes,   25749296 total
            - age   2:   20111888 bytes,   45861184 total
            - age   3:    7580776 bytes,   53441960 total
            - age   4:   16819072 bytes,   70261032 total
            - age   5:   13209968 bytes,   83471000 total
            - age   6:   14088856 bytes,   97559856 total
            - age   7:   10371160 bytes,  107931016 total
            - age   8:   11426712 bytes,  119357728 total
            : 1825735K->155304K(1887488K), 0.1888880 secs] 5232762K->3574222K(6081792K) icms_dc=0 , 0.1895340 secs] [Times: user=0.74 sys=0.06, real=0.19 secs] 
            14291.342: [GC 14291.343: [ParNew
            Desired survivor size 107347968 bytes, new threshold 7 (max 8)
            - age   1:   25786480 bytes,   25786480 total
            - age   2:   21991848 bytes,   47778328 total
            - age   3:   16650000 bytes,   64428328 total
            - age   4:    7387368 bytes,   71815696 total
            - age   5:   16777584 bytes,   88593280 total
            - age   6:   13098856 bytes,  101692136 total
            - age   7:   14029704 bytes,  115721840 total
            : 1833128K->151603K(1887488K), 0.1941170 secs] 5252046K->3591384K(6081792K) icms_dc=0 , 0.1947390 secs] [Times: user=0.82 sys=0.04, real=0.20 secs] 
            14334.142: [GC 14334.143: [ParNew
            Desired survivor size 107347968 bytes, new threshold 6 (max 8)
            - age   1:   31541800 bytes,   31541800 total
            - age   2:   20826888 bytes,   52368688 total
            - age   3:   19155264 bytes,   71523952 total
            - age   4:   16422240 bytes,   87946192 total
            - age   5:    7235616 bytes,   95181808 total
            - age   6:   16549000 bytes,  111730808 total
            - age   7:   13026064 bytes,  124756872 total
            : 1829427K->167467K(1887488K), 0.1890190 secs] 5269208K->3620753K(6081792K) icms_dc=0 , 0.1896630 secs] [Times: user=0.80 sys=0.03, real=0.19 secs] 
            14359.317: [Full GC 14359.317: [CMS: 3453285K->3099828K(4194304K), 13.1778420 secs] 4506618K->3099828K(6081792K), [CMS Perm : 261951K->181304K(264372K)] icms_dc=0 , 13.1786310 secs] [Times: user=13.15 sys=0.04, real=13.18 secs]
            14373.287: [GC [1 CMS-initial-mark: 3099828K(4194304K)] 3100094K(6081792K), 0.0107380 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
            14373.298: [CMS-concurrent-mark-start]
            14472.579: [GC 14472.579: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   42849392 bytes,   42849392 total
            : 1677824K->86719K(1887488K), 0.1056680 secs] 4777652K->3186547K(6081792K) icms_dc=0 , 0.1063280 secs] [Times: user=0.61 sys=0.00, real=0.11 secs] 
            14506.980: [GC 14506.980: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   42002904 bytes,   42002904 total
            - age   2:   35733928 bytes,   77736832 total
            : 1764543K->96136K(1887488K), 0.0982790 secs] 4864371K->3195964K(6081792K) icms_dc=0 , 0.0988960 secs] [Times: user=0.53 sys=0.01, real=0.10 secs] 
            14544.285: [GC 14544.286: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   26159736 bytes,   26159736 total
            - age   2:   37842840 bytes,   64002576 total
            - age   3:   33192784 bytes,   97195360 total
            : 1773960K->130799K(1887488K), 0.1208590 secs] 4873788K->3230628K(6081792K) icms_dc=0 , 0.1215900 secs] [Times: user=0.59 sys=0.02, real=0.13 secs] 
            14589.266: [GC 14589.266: [ParNew
            Desired survivor size 107347968 bytes, new threshold 4 (max 8)
            - age   1:   28010360 bytes,   28010360 total
            - age   2:   21136704 bytes,   49147064 total
            - age   3:   35081376 bytes,   84228440 total
            - age   4:   32468056 bytes,  116696496 total
            : 1808623K->148284K(1887488K), 0.1423150 secs] 4908452K->3248112K(6081792K) icms_dc=0 , 0.1429440 secs] [Times: user=0.70 sys=0.02, real=0.14 secs] 
            14630.947: [GC 14630.947: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   28248240 bytes,   28248240 total
            - age   2:   20712320 bytes,   48960560 total
            - age   3:   18217168 bytes,   67177728 total
            - age   4:   34834832 bytes,  102012560 total
            : 1826108K->140347K(1887488K), 0.1784680 secs] 4925936K->3275469K(6081792K) icms_dc=0 , 0.1790920 secs] [Times: user=0.98 sys=0.03, real=0.18 secs] 
            14664.779: [GC 14664.779: [ParNew
            Desired survivor size 107347968 bytes, new threshold 5 (max 8)
            - age   1:   25841000 bytes,   25841000 total
            - age   2:   22264960 bytes,   48105960 total
            - age   3:   17730104 bytes,   65836064 total
            - age   4:   17988048 bytes,   83824112 total
            - age   5:   34739384 bytes,  118563496 total
            : 1818171K->147603K(1887488K), 0.1714160 secs] 4953293K->3282725K(6081792K) icms_dc=0 , 0.1720530 secs] [Times: user=0.82 sys=0.11, real=0.17 secs] 
            14702.488: [GC 14702.489: [ParNew
            Desired survivor size 107347968 bytes, new threshold 8 (max 8)
            - age   1:   26887368 bytes,   26887368 total
            - age   2:   21403352 bytes,   48290720 total
            - age   3:   18732224 bytes,   67022944 total
            - age   4:   17640576 bytes,   84663520 total
            - age   5:   17942952 bytes,  102606472 total
            : 1825427K->142695K(1887488K), 0.2118320 secs] 4960549K->3312168K(6081792K) icms_dc=0 , 0.2124630 secs] [Times: user=1.13 sys=0.14, real=0.21 secs] 
        

        The strategy I was aiming at: I want to limit to the minimum what gets Tenured, I'm serving requests and expect that beyond a certain amount of shared objects, every other objects are useful only to the request at hand. Therefore by using a big NewSize and an increased TenuringThreshold and was hoping to have none of these single serving objects stick around.

        The following are there to support my strategy:
        -Xms=6G
        -Xmx=6G
        -XX:NewSize=2G // big space so that ParNew doesn't occur to often and let time for objects to expire
        -XX:MaxTenuringThreshold=8 // to limit the tenuring some more
        -XX:SurvivorRatio=7 // based on examples -XX:CMSInitiatingOccupancyFraction=60
        // to prevent a Full GC caused by promotion allocation failed
        -XX:+UseCMSInitiatingOccupancyOnly
        // to go with the one above based on example

        MaxPermSize=1G and "-Dsun.reflect.inflationThreshold=0" are related to another issue I'd rather keep separated.

        "-XX:+CMSClassUnloadingEnabled" and "-XX:+CMSPermGenSweepingEnabled" are there because of grails which rely heavily and extra classes for closures and reflexion

        -XX:+CMSIncrementalMode is an experiment which hasn't yield much success

        解决方案

        The log snippet posted shows you have a substantial number of objects that are live for >320s (approx 40s per young collection and objects survive through 8 collections before promotion). The remaining objects then bleed into tenured and eventually you hit an apparently unexpected full gc which doesn't actually collect very much.

        3453285K->3099828K(4194304K)

        i.e. you have a 4G tenured which is ~82% full (3453285/4194304) when it is triggered and is ~74% full after 13 long seconds.

        This means it took 13s to collect the grand total of ~350M which, in the context of a 6G heap is not v much.

        This basically means your heap is not big enough or, perhaps more likely, you have a memory leak. A leak like this is a terrible thing for CMS because a concurrent tenured collection is a non compacting event which means tenured is a collection of free lists which means fragmentation can be a big problem for CMS which means that your utilisation of tenured becomes increasingly inefficient which means that there is an increased probability of promotion failure events (though if this were such an event then I'd expect to see a log message saying that) because it wants to promote (or thinks it will need to promote) X MB into tenured but it does not have a (contiguous) free list >= X MB available. This triggers an unexpected tenured collection which is a not remotely concurrent STW event. If you actually have v little to collect (as you do) then there is no surprise you're sitting twiddling your thumbs.

        Some general pointers, to a large extent reiterating what Vladimir Sitnitov has said...

        • using iCMS on a multicore box makes no sense (unless you have lots of JVMs or other processes running on that box such that the JVM really is short of CPU) therefore remove this switch
        • your young collections are unnecessarily long because of the impact of copying relatively substantial quantities of memory between the survivor spaces on every collection, 150-200ms is a really quite massive ParNew collection
          • the right answer to the young gen issue depends on what the allocation behaviour really is (e.g. perhaps you'd be better off tenuring early and reducing the impact of fragmentation on tenured collections OR perhaps you'd be better off having a much more massive new gen and reducing the frequency of young gen collections such that fewer objects are promoted so that there is minimal bleed into tenured).

        Some questions...

        • does it eventually go OoM or does it recover?
        • is the application in a steady state (subject to consistent load at some point well beyond startup) during this log snippet or is it under stress?

        这篇关于GC调整 - 阻止完整的GC的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆