JDK-8170358

[REDO] 8k class metaspace chunks misallocated from 4k chunk freelist

    Details

    • Subcomponent:
      gc
    • Resolved In Build:
      b150
    • CPU:
      x86_64
    • OS:
      linux

        Description

        FULL PRODUCT VERSION :
        java version "1.8.0_102"
        Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
        Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)

        Reproduced with a default clone of OpenJDK jdk8u

        FULL OS VERSION :
        Linux pm-cluster-rhel7-1b 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
        (can be reproduced on any 64-bit Linux flavour)


        A DESCRIPTION OF THE PROBLEM :
        We have an application server that generates code when applications are deployed. Recently it started failing during deployment of a large application with "java.lang.OutOfMemoryError: Metaspace" errors. Experimenting with the command-line metaspace configuration flags made no difference. The only thing that did work was to disable CMS entirely, but this is not a practical long-term solution.

        To determine the root cause, we investigated the issue using a clone of the OpenJDK jdk8u Mercurial repository and local builds of the JDK with extra debug logging. Eventually the bug was tracked down to an implementation error in hotspot/src/share/vm/memory/metaspace.cpp. The ChunkManager::list_index() method returns the wrong answer for humongous class metadata chunks if the chunk size happens to be the same as the size of a non-class metadata medium chunk (8K words).

        Chunk sizes are specified as follows (from metaspace.cpp):

         enum ChunkSizes { // in words.
           ClassSpecializedChunk = 128,
           SpecializedChunk = 128,
           ClassSmallChunk = 256,
           SmallChunk = 512,
           ClassMediumChunk = 4 * K,
           MediumChunk = 8 * K
         };

        list_index() is a static method that returns the index of an appropriate freelist:

         ChunkIndex ChunkManager::list_index(size_t size) {
           switch (size) {
             case SpecializedChunk:
               assert(SpecializedChunk == ClassSpecializedChunk,
                      "Need branch for ClassSpecializedChunk");
               return SpecializedIndex;
             case SmallChunk:
             case ClassSmallChunk:
               return SmallIndex;
             case MediumChunk:
             case ClassMediumChunk:
               return MediumIndex;
             default:
               assert(size > MediumChunk || size > ClassMediumChunk,
                      "Not a humongous chunk");
               return HumongousIndex;
           }
         }

        The code shows that when an 8K-word class metadata chunk is requested, this method erroneously classifies it as a medium chunk rather than a humongous chunk. As a result, a 4K chunk is taken from the medium chunk freelist, if any are available there, and it is not big enough to hold the 8K of data needed. Consequently, the allocation fails and is retried a couple of times, a GC is initiated, and the allocation is tried again but fails for the same reason, eventually causing the java.lang.OutOfMemoryError.
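
        The collision can be illustrated outside the VM. The following is a minimal standalone sketch, not HotSpot code: it copies the enum values and the switch from this report into a free function, and adds a hypothetical list_index_by_space() helper (the name and the is_class parameter are assumptions made for illustration, and are not necessarily how the attached patch works) that compares a request only against the fixed sizes of its own space.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model of the constants and switch quoted in this report.
// It is not HotSpot code; list_index_by_space() is a hypothetical helper.
const size_t K = 1024;

enum ChunkSizes { // in words
  ClassSpecializedChunk = 128,
  SpecializedChunk = 128,
  ClassSmallChunk = 256,
  SmallChunk = 512,
  ClassMediumChunk = 4 * K,
  MediumChunk = 8 * K
};

enum ChunkIndex { SpecializedIndex, SmallIndex, MediumIndex, HumongousIndex };

// Mirrors the reported switch: the size alone decides the freelist.
ChunkIndex list_index_reported(size_t size) {
  switch (size) {
    case SpecializedChunk:
      return SpecializedIndex;
    case SmallChunk:
    case ClassSmallChunk:
      return SmallIndex;
    case MediumChunk:
    case ClassMediumChunk:
      return MediumIndex;
    default:
      return HumongousIndex;
  }
}

// Hypothetical variant: the caller states which space the request is for,
// so only that space's fixed sizes are compared.
ChunkIndex list_index_by_space(size_t size, bool is_class) {
  const size_t specialized_words = is_class ? ClassSpecializedChunk : SpecializedChunk;
  const size_t small_words = is_class ? ClassSmallChunk : SmallChunk;
  const size_t medium_words = is_class ? ClassMediumChunk : MediumChunk;
  if (size == specialized_words) return SpecializedIndex;
  if (size == small_words) return SmallIndex;
  if (size == medium_words) return MediumIndex;
  return HumongousIndex;
}

// Walks through the collision: an 8K-word request is "medium" by size alone,
// but humongous once the class space is taken into account.
void show_collision() {
  assert(list_index_reported(8 * K) == MediumIndex);
  assert(list_index_by_space(8 * K, true) == HumongousIndex);
  assert(list_index_by_space(8 * K, false) == MediumIndex);
}
```

        Calling show_collision() passes all three assertions: the size-only lookup cannot tell an 8K-word class request apart from a non-class medium chunk, while the space-aware variant can.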

        The error *only* occurs when there are free chunks available on the medium chunk freelist. If there aren't any there, new chunks *of the correct size* are allocated from virtual memory space and all is well.
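
        Why the failure depends on the state of the freelist can be shown with a toy model. This is a sketch under stated assumptions, not the real ChunkManager: ToyChunkManager, medium_freelist and get_chunk() are hypothetical names, and the freelist is reduced to a vector of chunk sizes in words.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of one freelist, for illustration only (hypothetical names).
struct ToyChunkManager {
  std::vector<size_t> medium_freelist;  // sizes of free chunks, in words

  // Returns the size of the chunk handed out for a request that has
  // been classified as "medium".
  size_t get_chunk(size_t wanted_words) {
    assert(wanted_words > 0);
    if (!medium_freelist.empty()) {
      // A free chunk is reused as-is, whatever size was actually wanted.
      size_t chunk_words = medium_freelist.back();
      medium_freelist.pop_back();
      return chunk_words;
    }
    // Freelist empty: a fresh chunk of the requested size is carved out
    // of virtual space, so the request is satisfied correctly.
    return wanted_words;
  }
};
```

        With one 4096-word chunk on the list, get_chunk(8192) hands back 4096 words, which is too small. Once the list is empty, the same call returns 8192 words.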

        THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

        THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

        REGRESSION. Last worked in version 7u80

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Load a class requiring an 8K class metadata chunk when there are 4K chunks available on the medium chunk freelist.


        EXPECTED VERSUS ACTUAL BEHAVIOR :
        Expected: The class should load successfully

        Actual: A java.lang.OutOfMemoryError: Metaspace error occurs

        ERROR MESSAGES/STACK TRACES THAT OCCUR :
        Metaspace debug log showing a (failed) request for 8061 words being "satisfied" using a 4096 word chunk:

        SpaceManager::grow_and_allocate for 8061 words 2627 words used 1469 words left
        Metadata humongous allocation:
          word_size 0x0000000000001f7d
          chunk_word_size 0x0000000000002000
            chunk overhead 0x0000000000000005
        ChunkManager::free_chunks_get: free_list 0x00007f57c00a3fc0 head 0x0000000104729c00 size 4096
        ChunkManager::chunk_freelist_allocate: 0x00007f57c00a3f80 chunk 0x0000000104729c00 size 4096 count 292 Free chunk total 1285504 count 609
        SpaceManager::add_chunk: 8) Metachunk: bottom 0x0000000104729c00 top 0x0000000104729c28 end 0x0000000104731c00 size 4096
            used 5 free 4091
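
        The numbers in the log can be cross-checked with a little arithmetic. The sketch below uses only values copied from the log lines above. The helper names are hypothetical, and the 8-byte word size is an assumption based on the 64-bit VM named in this report.

```cpp
#include <cassert>
#include <cstddef>

// Values copied from the debug log in this report. Sizes are in words.
const size_t requested_words = 0x1f7d;  // word_size
const size_t humongous_words = 0x2000;  // chunk_word_size
const size_t overhead_words = 0x5;      // chunk overhead
const size_t returned_words = 4096;     // size of the chunk taken from the freelist

// Assumption: 8-byte words, as on the 64-bit VM named in the report.
const size_t bytes_per_word = 8;

// Words left for data once the chunk overhead is accounted for.
size_t free_words(size_t chunk_words) {
  return chunk_words - overhead_words;
}

// True if a chunk of the given size can hold the requested allocation.
bool request_fits(size_t chunk_words) {
  return requested_words <= free_words(chunk_words);
}

// Chunk size in words derived from the "bottom" and "end" addresses.
size_t words_between(unsigned long long bottom, unsigned long long end) {
  return static_cast<size_t>((end - bottom) / bytes_per_word);
}

void check_log_arithmetic() {
  assert(requested_words == 8061);              // matches "for 8061 words"
  assert(humongous_words == 8192);              // the size that was wanted
  assert(free_words(returned_words) == 4091);   // matches "used 5 free 4091"
  assert(!request_fits(returned_words));        // 8061 > 4091
  assert(request_fits(humongous_words));        // 8061 <= 8187
  assert(words_between(0x0000000104729c00ULL, 0x0000000104731c00ULL) == 4096);
}
```

        The chunk returned from the freelist leaves 4091 free words for an 8061-word request, whereas the 8192-word chunk that was computed as chunk_word_size would have left 8187.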


        REPRODUCIBILITY :
        This bug can be reproduced often.

        ---------- BEGIN SOURCE ----------
        Once the issue was understood an attempt was made to create a standalone test case that could reproduce it, but that effort has so far failed.
        ---------- END SOURCE ----------

        CUSTOMER SUBMITTED WORKAROUND :
        Disabling CMS GC is the only known effective workaround.

        A patch against OpenJDK that fixes the issue has been written, but it's too big to fit here.

                People

                • Assignee:
                  stefank Stefan Karlsson
                • Reporter:
                  jwilhelm Jesper Wilhelmsson