JDK-6999988

CMS: Increased fragmentation leading to promotion failure after CR#6631166 got implemented

    Details

    • Subcomponent:
      gc
    • Resolved In Build:
      b130
    • CPU:
      sparc
    • OS:
      solaris_10
    • Verification:
      Verified


        Description

        On a customer test system we see increased fragmentation with 6u21 and later,
        leading to promotion failures that could not be observed with 6u20 and earlier.

        I have attached two log files that show the issue. Here is the interesting
        section:

        Statistics for IndexedFreeLists:
        --------------------------------
        Total Free Space: 475074
        Max Chunk Size: 254
        Number of Blocks: 8584
        Av. Block Size: 55
         free=105894392 frag=1,0000 <<<<<<<<<<<
        Before GC:
        Statistics for BinaryTreeDictionary:
        ------------------------------------
        Total Free Space: 0
        Max Chunk Size: 0
        Number of Blocks: 0
        Tree Height: 0
        Statistics for IndexedFreeLists:
        --------------------------------
        Total Free Space: 0
        Max Chunk Size: 0
        Number of Blocks: 0
         free=0 frag=0,0000
        9421,892: [ParNew (promotion failed) <<<<<<<<<<<<<<<<<<
        Desired survivor size 163840 bytes, new threshold 0 (max 0)
        : 48512K->48512K(48832K), 0,2599337 secs]9422,152: [CMS (concurrent mode failure)size[2]
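        For context on the frag value printed by -XX:PrintFLSStatistics=1: it is a
        fragmentation quotient in [0,1], where values near 1 mean the free space is
        split into many small blocks. A minimal sketch of one such metric,
        1 - sum(s_i^2)/(sum(s_i))^2 over free chunk sizes s_i (an illustration of
        the idea, not necessarily HotSpot's exact formula), shows why the statistics
        above (8584 blocks, average size 55) yield frag close to 1:

        ```java
        public class FragSketch {
            // Fragmentation quotient: 1 - sum(s_i^2) / (sum(s_i))^2.
            // One big chunk -> 0; many small chunks -> close to 1.
            public static double frag(long[] chunkSizes) {
                double total = 0, sumSq = 0;
                for (long s : chunkSizes) {
                    total += s;
                    sumSq += (double) s * s;
                }
                if (total == 0) return 0.0;
                return 1.0 - sumSq / (total * total);
            }

            public static void main(String[] args) {
                long[] oneBig = {475074};          // all free space in one chunk
                long[] manySmall = new long[8584]; // block count from the log above
                java.util.Arrays.fill(manySmall, 55); // avg block size from the log
                System.out.printf("one big:    %.4f%n", frag(oneBig));    // 0.0000
                System.out.printf("many small: %.4f%n", frag(manySmall)); // ~0.9999
            }
        }
        ```

        The point: total free space can be substantial while still being useless for
        promoting objects larger than the biggest chunk (Max Chunk Size: 254 above).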
        A second customer reports an issue with 6u22, using these JVM options:

        -XX:+CMSConcurrentMTEnabled
        -XX:CMSInitiatingOccupancyFraction=60
        -XX:CMSMaxAbortablePrecleanTime=3600000
        -XX:+CMSParallelRemarkEnabled
        -XX:+CMSParallelSurvivorRemarkEnabled
        -XX:+CMSScavengeBeforeRemark
        -XX:+HeapDumpOnOutOfMemoryError
        -XX:InitialHeapSize=32212254720
        -XX:MaxHeapSize=32212254720
        -XX:MaxNewSize=2147483648
        -XX:MaxTenuringThreshold=4
        -XX:NewSize=2147483648
        -XX:ParallelCMSThreads=8
        -XX:PermSize=67108864
        -XX:+PrintCommandLineFlags
        -XX:PrintFLSStatistics=1
        -XX:+PrintGCDateStamps
        -XX:+PrintGCDetails
        -XX:+PrintTenuringDistribution
        -XX:+UseBiasedLocking
        -XX:+UseCompressedOops
        -XX:+UseConcMarkSweepGC
        -XX:+UseLargePages
        -XX:+UseMembar
        -XX:+UseNUMA
        -XX:+UseParNewGC

        After 14 CMS cycles with 6u22, the customer reports a promotion failure that occurred when the tenured generation had filled to only 15G. This is strange, as the heap usually fills to 20G before a CMS cycle is triggered.
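        Rough arithmetic from the flags above (a sketch only; since
        -XX:+UseCMSInitiatingOccupancyOnly is not set, HotSpot's own heuristics may
        also decide when to start a cycle, so the observed ~20G trigger point is not
        necessarily inconsistent with the configured fraction):

        ```java
        public class CmsThresholds {
            public static void main(String[] args) {
                long maxHeap = 32212254720L;       // -XX:MaxHeapSize (30 GiB)
                long newSize = 2147483648L;        // -XX:NewSize (2 GiB)
                long tenured = maxHeap - newSize;  // ~28 GiB old generation
                long trigger = tenured * 60 / 100; // CMSInitiatingOccupancyFraction=60
                System.out.printf("old gen: %.1f GiB, configured CMS trigger: %.1f GiB%n",
                        tenured / Math.pow(2, 30), trigger / Math.pow(2, 30));
                // old gen: 28.0 GiB, configured CMS trigger: 16.8 GiB
            }
        }
        ```

        Either way, a promotion failure at 15G occupancy is well below both the
        configured fraction and the customer's usual trigger point.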

        Here is the pertinent section of the GC log.

        2010-11-18T19:30:38.703-0600: [GC Before GC:
        Statistics for BinaryTreeDictionary:
        ------------------------------------
        Total Free Space: 2004318869
        Max Chunk Size: 57061
        Number of Blocks: 711866
        Av. Block Size: 2815
        Tree Height: 90
        Before GC:
        Statistics for BinaryTreeDictionary:
        ------------------------------------
        Total Free Space: 1786368
        Max Chunk Size: 1786368
        Number of Blocks: 1
        Av. Block Size: 1786368
        Tree Height: 1
        [ParNew (promotion failed)
        Desired survivor size 107347968 bytes, new threshold 1 (max 4)
        - age 1: 152613256 bytes, 152613256 total
        : 1861587K->1887488K(1887488K), 1.7214606 secs][CMSCMS: Large block 0xfffffd797ec6d6a0
        : 13801830K->2261237K(29360128K), 46.1976857 secs] 15542483K->2261237K(31247616K), [CMS Perm : 47619K->47461K(79492K)]After GC:
        Statistics for BinaryTreeDictionary:
        ------------------------------------
        Total Free Space: -826325716
        Max Chunk Size: -826325716
        Number of Blocks: 1
        Av. Block Size: -826325716
        Tree Height: 1
        After GC:
        Statistics for BinaryTreeDictionary:
        ------------------------------------
        Total Free Space: 0
        Max Chunk Size: 0
        Number of Blocks: 0
        Tree Height: 0
        , 47.9259930 secs] [Times: user=50.62 sys=0.02, real=47.93 secs]
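        The negative statistics above (Total Free Space: -826325716) look like a
        signed 32-bit overflow in the size accounting: -826325716 read back as an
        unsigned 32-bit value is 3,468,641,580, which at 8 bytes per heap word is on
        the order of the ~28 GiB old generation. That is an inference from the
        numbers, not a confirmed diagnosis of this HotSpot build; as a sketch of the
        wraparound (the word count is hypothetical):

        ```java
        public class OverflowSketch {
            public static void main(String[] args) {
                // Hypothetical free-space word count exceeding Integer.MAX_VALUE
                // (2,147,483,647); chosen to reproduce the value in the log.
                long words = 3_468_641_580L;
                int truncated = (int) words;      // narrows and wraps negative
                System.out.println(truncated);    // -826325716, as in the log
                // Recover the original value from the wrapped one:
                System.out.println(truncated & 0xFFFFFFFFL); // 3468641580
            }
        }
        ```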

        Engineering reported that this looks almost like a bug in CMS allocation, because there is plenty of free space (and comparatively little promotion) when the promotion failure occurs, although full GC logs would be needed before one could be confident of that diagnosis. It would be worthwhile to investigate closely why this is happening; it does not appear to be a tuning issue, but something else.

        Also, in 6u21 we integrated 6631166, which is aimed at reducing fragmentation.
        If at all possible, could you down-rev to 6u20 (i.e. pre-6u21) and see
        whether the onset of fragmentation changes in any manner?

        The customer repeated the test with 6u20. There were no promotion failures, even after 7000 cycles, and CMS cycles were always triggered after the tenured generation filled to 20G.

                People

                • Assignee:
                  ysr Y. Ramakrishna
                • Reporter:
                  tviessma Thomas Viessmann (Inactive)
                • Votes:
                  0
                • Watchers:
                  8
