Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8214356

Verification of class metadata unloading takes a long time



    • Type: Bug
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: 11, 12
    • Fix Version/s: 12
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • Resolved In Build:


      During investigation of JDK-8211211 it has been found that the cause for the long test duration is that class unloading verification takes too long time.

      In particular, with CMS where class unloading is delayed due to CMS precleaning, the test accumulates so many classes to unload that it easily exceeds the timeouts.

      The relevant result is https://bugs.openjdk.java.net/browse/JDK-8211211?focusedCommentId=14225624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14225624 (copied here for convenience):

      The test spends 95%(!) of its runtime in ClassLoaderData::free_deallocate_list in some verification routine I think (this is a fastdebug build)

      Perf trace (mangled, but readable):

            - 95.33% _ZN8VMThread4loopEv
               - 95.31% _ZN20SafepointSynchronize5beginEv
                  - _ZN15ClassLoaderData20free_deallocate_listEv
                     - 95.31% _ZN15ClassLoaderData20free_deallocate_listEv
                        - 91.08% _ZN13InstanceKlass19deallocate_contentsEP15ClassLoaderData
                           - 84.24% _ZN13InstanceKlass18deallocate_methodsEP15ClassLoaderDataP5ArrayIP6MethodE
                              - _ZN13InstanceKlass18deallocate_methodsEP15ClassLoaderDataP5ArrayIP6MethodE
                                 - 84.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb
                                    - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm
                                       - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm
                                       - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm
                                          - 84.21% _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_
                                             - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E
                                                  84.19% _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E
                           - 6.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb
                              - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm
                                 - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm
                                 - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm
                                    - 6.22% _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_
                                       - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E
                           - 0.58% _ZN15ClassLoaderData13remove_handleE9OopHandle
                                0.58% _ZN15ClassLoaderData13remove_handleE9OopHandle
                        - 4.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb
                           - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm
                           - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm
                              - _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_
                                 - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E

      Ultimately I think the problem is the loop in FreeList<Chunk>::verify_chunk_in_free_list(), because nop'ing that one out improves the test significantly (it does not "hang" at the first unloading phase any more and within the 30s actually manages a few concurrent cycles), but I do not know how important this check is for integrity purposes (I do not think it is that useful to verify *all* chunks every time you delete a single one by default).

      The specific problem with CMS seems to be related to precleaning - this takes very long time (~5s until it is aborted) in this case, so its concurrent cycle takes very long, as the test accumulates more and more classloader data to unload. Since the algorithm with verification is O(n^2) at least, and n is much larger than with other cases with CMS, the test takes a very long time.

      There seems to be another long verification phase at the end of the run that seems independent of gc.

      JDK-8211211 hackfixed this by adding a VM option so that CMS avoids the precleaning. It would be nice to get these verification times down to reasonable levels as this also slows down other programs run in fastdebug mode that do a lot of class unloading.

      One item to investigate could be looking whether it is required to verify all metaspace chunks every time there is a single class unloading.


          Issue Links



              coleenp Coleen Phillimore
              tschatzl Thomas Schatzl
              0 Vote for this issue
              2 Start watching this issue