  JDK-8065402

G1 does not expand marking stack when mark stack overflow happens during concurrent marking

    Details

    • Subcomponent:
      gc
    • Resolved In Build:
      b21
    • CPU:
      generic
    • OS:
      generic

      Description

      The attached spreadsheet summarizes Intel's experiment: manually increased MarkStackSize values versus the count of
      '[GC concurrent-mark-reset-for-overflow]' log messages.

      The observation is that G1 does not expand MarkStackSize toward MarkStackSizeMax when a concurrent-mark overflow happens; the size has to be increased manually.

      The expand flag is set when concurrent-mark-reset-for-overflow happens.
      The issue is that we only try to expand the mark stack in void ConcurrentMark::checkpointRootsFinal(bool clear_all_soft_refs).
      If there is no overflow there, we call set_non_marking_state() at the end and then try to expand the mark stack.

      set_non_marking_state() calls reset_marking_state(), which resets the expand flag based on _cm->has_overflown() (the _cm overflow flag is cleared during marking). So when we check if (_markStack.should_expand()), it is always false.
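
The ordering problem described above can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions, not HotSpot code: ModelMarkStack, ModelConcurrentMark, capacity_after_cycle and the doubling in expand() are hypothetical names and simplifications. Only the flag behavior follows the description: the expand request is set on a concurrent overflow, the overflow flag is cleared when marking restarts, and the expand request is recomputed from has_overflown before it is checked.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the global mark stack; not the HotSpot class.
struct ModelMarkStack {
  std::size_t capacity;
  std::size_t max_capacity;
  bool should_expand;

  ModelMarkStack(std::size_t initial, std::size_t max)
      : capacity(initial), max_capacity(max), should_expand(false) {
    assert(initial > 0 && initial <= max);
  }

  // Assumption of this model: grow by doubling, capped at max_capacity.
  void expand() {
    capacity = (capacity > max_capacity / 2) ? max_capacity : capacity * 2;
    should_expand = false;
  }
};

// Hypothetical stand-in for the concurrent marker.
struct ModelConcurrentMark {
  ModelMarkStack stack;
  bool has_overflown;

  ModelConcurrentMark(std::size_t initial, std::size_t max)
      : stack(initial, max), has_overflown(false) {}

  // Overflow during concurrent marking: request an expansion.
  void on_concurrent_overflow() {
    has_overflown = true;
    stack.should_expand = true;
  }

  // Marking restarts after the overflow. With expand_now == true the
  // pending request is honored here, before the overflow flag is cleared.
  void restart_marking(bool expand_now) {
    if (expand_now && stack.should_expand) stack.expand();
    has_overflown = false;
  }

  // As described in the report: the expand request is recomputed from
  // has_overflown, which was already cleared during marking.
  void set_non_marking_state() { stack.should_expand = has_overflown; }

  // End of remark without overflow: reset first, then check the request.
  void finish_remark() {
    set_non_marking_state();
    if (stack.should_expand) stack.expand();  // not taken once the flag was cleared
  }
};

// Runs one marking cycle with a single concurrent overflow and returns
// the resulting mark stack capacity (initial 4, cap 16, arbitrary units).
std::size_t capacity_after_cycle(bool expand_on_concurrent_overflow) {
  ModelConcurrentMark cm(4, 16);
  cm.on_concurrent_overflow();
  cm.restart_marking(expand_on_concurrent_overflow);
  cm.finish_remark();
  return cm.stack.capacity;
}
```

In this model, capacity_after_cycle(false) leaves the capacity at 4 because the request is wiped before it is checked, while capacity_after_cycle(true) grows it to 8 by acting on the request at the time of the concurrent overflow.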
      1. b8065402.java
        1 kB
        Alexander Harlap
      2. gclogs.tar.gz
        1.90 MB
        Alexander Harlap
      3. nweight_stackoverflow_summary.xlsx
        64 kB
        Jenny Zhang

          Activity

          aharlap Alexander Harlap added a comment -
          I simulated this issue with the following debugging change, made in order to decrease the size of G1CMTaskQueue:


          --- a/src/share/vm/gc/g1/g1ConcurrentMark.hpp Tue Mar 14 22:14:33 2017 -0700
          +++ b/src/share/vm/gc/g1/g1ConcurrentMark.hpp Wed Mar 22 13:30:14 2017 -0400
          @@ -93,7 +93,7 @@
           #pragma warning(pop)
           #endif
           
          -typedef GenericTaskQueue<G1TaskQueueEntry, mtGC> G1CMTaskQueue;
          +typedef GenericTaskQueue<G1TaskQueueEntry, mtGC, 1024> G1CMTaskQueue;
           typedef GenericTaskQueueSet<G1CMTaskQueue, mtGC> G1CMTaskQueueSet;
           
           // Closure used by CM during concurrent reference discovery
          @@ -221,7 +221,7 @@
           class G1CMMarkStack VALUE_OBJ_CLASS_SPEC {
           public:
             // Number of TaskQueueEntries that can fit in a single chunk.
          -  static const size_t EntriesPerChunk = 1024 - 1 /* One reference for the next pointer */;
          +  static const size_t EntriesPerChunk = 64 - 1 /* One reference for the next pointer */;
           private:
             struct TaskQueueEntryChunk {
               TaskQueueEntryChunk* next;

          After that, GCBasher was used as a test with the flag "-XX:MarkStackSize=1K".
          I ran the test on a machine with 68 cores.

          The attached file gclogs.tar.gz contains two log files: base.log, produced with the current way of expanding the mark stack (only if the overflow happened in remark), and fix.log, produced with the proposed fix (expand the mark stack if the overflow happened during concurrent mark).

          base.log has no mark stack expansion, two "Concurrent Mark Abort" messages and two matching Full GCs.

          fix.log has a few "Expanded mark stack" messages, no "Concurrent Mark Abort" and no Full GC.


           
          aharlap Alexander Harlap added a comment -
          Added a test for reproducing the issue.
          aharlap Alexander Harlap added a comment - - edited
          Constructed a test, b8065402.java, that reproduces the issue without requiring any VM modifications.

          Testing with this file on the base build:
          jdk-base/bin/java -Xmx32g -server -XX:+UseG1GC -Xlog:gc*=debug b8065402 150
          Duration: 182.831 sec (with 5 Full GCs)

          With the proposed fix:
          jdk-fix/bin/java -Xmx32g -server -XX:+UseG1GC -Xlog:gc*=debug b8065402 150
          Duration: 68.267 sec (with 0 Full GCs). The mark stack was expanded twice: 4M -> 8M -> 16M
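
The reported growth 4M -> 8M -> 16M is consistent with the mark stack doubling on each expansion. A minimal sketch of such a policy, assuming doubling capped at a maximum size; the function name expansion_steps and the cap handling are hypothetical and not taken from the HotSpot sources:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sequence of mark stack sizes, starting with `initial`,
// after up to `overflows` expansions. Doubling is an assumption inferred
// from the observed 4M -> 8M -> 16M growth; sizes never exceed `max`.
std::vector<std::size_t> expansion_steps(std::size_t initial,
                                         std::size_t max,
                                         int overflows) {
  assert(initial > 0 && initial <= max);
  std::vector<std::size_t> steps;
  steps.push_back(initial);
  std::size_t current = initial;
  for (int i = 0; i < overflows && current < max; ++i) {
    // Compare against max / 2 so the doubling cannot overshoot the cap.
    current = (current > max / 2) ? max : current * 2;
    steps.push_back(current);
  }
  return steps;
}
```

With sizes counted in units of M and a hypothetical cap of 16, expansion_steps(4, 16, 2) yields 4, 8, 16, which matches the two expansions seen in the run with the fix. Further overflows would not grow the stack past the cap in this sketch.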
          hgupdate HG Updates added a comment -
          URL: http://hg.openjdk.java.net/jdk10/hs/hotspot/rev/24afa1eef92f
          User: kbarrett
          Date: 2017-05-09 23:24:46 +0000
          hgupdate HG Updates added a comment -
          URL: http://hg.openjdk.java.net/jdk10/jdk10/hotspot/rev/24afa1eef92f
          User: jwilhelm
          Date: 2017-08-18 18:01:35 +0000

            People

            • Assignee:
              aharlap Alexander Harlap
              Reporter:
              yuzhang Jenny Zhang (Inactive)