JDK-8158871

Long response times with G1 and StringDeduplication

    Details

    • Subcomponent:
      gc
    • Resolved In Build:
      b127

      Backports

        Description


        Using JDK 8u92 with -XX:+UseG1GC -XX:+UseStringDeduplication -Xms13g -Xmx13g,
        several observations were made:

        1) During load testing application response times are significantly impacted
        (increase to 14 sec response times; normal response times are 1 sec) during
        the next time string deduplication runs AFTER a concurrent g1 marking cycle;
        further that particular string deduplication takes 14 sec to run.

        2) All other string deduplications appear normal with durations in the <0.2
        sec range.

        3) The impact can be lessened by increasing parallel gc threads however it is
        still quite significant (14 sec impact w/16 threads, 3 sec impact w/83
        threads).

        4) The impact occurs 100% of the time just after a concurrent g1 cycle when
        the next string deduplication runs.

        When String Deduplication was removed, application response times are normal
        throughout the load test even after g1 concurrent cycles.

          Activity

          Per Liden added a comment -
          Had a look at this. My current theory is that the trimming of the table entry cache is causing the long dedup time and potentially blocking safepoints and String.intern() calls, which in turn would cause application stalls.

          From the log, you can see that after the first concurrent GC phase about 2GB of old space is declared dead. The following dedup cycle then finds about 5.5 million dead strings, which are removed from the dedup table. This causes the table entry cache to fill up, and it later needs to be trimmed by the dedup thread. When trimming the cache, the dedup thread grabs the StringDedupTable_lock, which effectively blocks String.intern() calls. Further, the dedup thread is still inside the SuspendibleThreadSet while trimming, which effectively blocks safepoints from happening. This would normally not be a problem if it weren't for the fact that we're trimming a very large number of entries (5.5 million), so trimming might take a long time.

          I will try to write a reproducer for this to see if I can prove my theory.
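          A minimal sketch of what such a reproducer could look like (the class name, counts, and helper method are illustrative, not taken from this issue): allocate many String objects with identical contents, let a few collections age them past the default -XX:StringDeduplicationAgeThreshold=3, then drop all references at once so the next dedup cycle finds millions of dead table entries. Run with -XX:+UseG1GC -XX:+UseStringDeduplication to observe the effect.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: reproduce the pattern where a large number of deduplicated
// strings die all at once, forcing a huge dedup-table trim afterwards.
public class DedupTrimRepro {

    // Allocate n distinct String objects with identical contents, then
    // drop them all at once; returns how many strings were released.
    static int allocateAndRelease(int n) {
        List<String> strings = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            // new String(...) creates distinct objects sharing equal
            // contents -- exactly what string deduplication targets.
            strings.add(new String("duplicated-content"));
        }
        // A few collections age the strings past the default
        // -XX:StringDeduplicationAgeThreshold=3 so dedup picks them up.
        for (int i = 0; i < 5; i++) {
            System.gc();
        }
        int count = strings.size();
        strings.clear(); // all dedup table entries for these strings die at once
        return count;
    }

    public static void main(String[] args) {
        System.out.println("released " + allocateAndRelease(1_000_000) + " strings");
    }
}
```

          The issue reports roughly 5.5 million dead strings; the count above is kept smaller so the sketch runs in a default-sized heap.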
          Per Liden added a comment -
          I wrote a test to reproduce the pattern where lots of deduplicated String objects die all at once. On Solaris/Sparc I was able to provoke a 3.7 second long cache trimming (freeing several million cache entries). The default malloc()/free() used on Solaris is known for performing poorly in multi-threaded environments, so if other threads in the VM (or JNI threads) were doing malloc()/free() concurrently with the dedup thread, one can imagine that this could severely prolong the time to trim the cache.
          Per Liden added a comment -
          Currently testing a patch, which does the cache trimming without the need to hold the StringDedupTable_lock and only temporarily joins the SuspendibleThreadSet to bulk fetch entries to delete.
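          The fix described here follows a common lock-splitting pattern: hold the lock only long enough to unlink a batch of entries, then do the expensive freeing outside it. A generic Java sketch of that pattern (class and method names are illustrative; the actual fix lives in HotSpot's C++ code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of "bulk fetch under lock, free outside the lock": the critical
// section only unlinks a batch, so other threads (the analogue of
// String.intern() callers) are not blocked while entries are freed.
public class BulkTrimCache<E> {
    private final Object lock = new Object();
    private final Deque<E> cache = new ArrayDeque<>();

    public void add(E entry) {
        synchronized (lock) {
            cache.addLast(entry);
        }
    }

    // Trims the whole cache in batches; returns the number of entries freed.
    public int trim(int batchSize) {
        int freed = 0;
        while (true) {
            Deque<E> batch = new ArrayDeque<>();
            // Short critical section: just unlink up to batchSize entries.
            synchronized (lock) {
                for (int i = 0; i < batchSize && !cache.isEmpty(); i++) {
                    batch.addLast(cache.pollFirst());
                }
            }
            if (batch.isEmpty()) {
                return freed;
            }
            // Expensive work (freeing the entries) happens outside the lock.
            freed += batch.size();
            batch.clear();
        }
    }
}
```

          In the actual patch the analogous batching also lets the dedup thread leave the SuspendibleThreadSet between batches, so safepoints can proceed during a long trim.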
          HG Updates added a comment -
          URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/ba08710f3b6c
          User: pliden
          Date: 2016-06-27 08:30:10 +0000
          HG Updates added a comment -
          URL: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/ba08710f3b6c
          User: amurillo
          Date: 2016-07-08 20:16:41 +0000

            People

            • Assignee:
              pliden Per Liden
              Reporter:
              shadowbug Shadow Bug
            • Votes:
              0
              Watchers:
              4
