Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6902692

G1: g1CollectorPolicy.cpp:1533, assert(false,"Debugging data for CR 6898948 has been dumped above")

    Details

    • Type: Bug
    • Status: Open
    • Priority: P4
    • Resolution: Unresolved
    • Affects Version/s: hs17
    • Fix Version/s: tbd_major
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
      gc
    • Understanding:
      Fix Understood
    • Introduced In Build:
      b01
    • Introduced In Version:
    • CPU:
      generic
    • OS:
      generic

      Description

      http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-16/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_nsk.stack.testlist/ResultDir/stack017/stack017.log

      [2009-11-17T06:08:24.35] Maximal recursion depth: 10435
      [2009-11-17T06:08:26.30] recent_avg_pause_time_ratio() out of bounds
      [2009-11-17T06:08:26.30] -------------------------------------------
      [2009-11-17T06:08:26.30] Recent GC Times (ms):
      [2009-11-17T06:08:26.30] _num = 3, _sum = 1954.384, _sum_of_squares = 1910809.870
      [2009-11-17T06:08:26.30] _davg = 776.233, _dvariance = 112378.745, _alpha = 0.700
      [2009-11-17T06:08:26.30] _length = 10, _next = 3
      [2009-11-17T06:08:26.30]
      [2009-11-17T06:08:26.30] [0]=999.569 [1]= 0.000 [2]=954.815 [3]= 0.000 [4]= 0.000
      [2009-11-17T06:08:26.30] [5]= 0.000 [6]= 0.000 [7]= 0.000 [8]= 0.000 [9]= 0.000
      [2009-11-17T06:08:26.30] (End Time=1.954) Recent GC End Times (s):
      [2009-11-17T06:08:26.30] _num = 4, _sum = 3.954, _sum_of_squares = 5.818
      [2009-11-17T06:08:26.30] _davg = 0.943, _dvariance = 0.429, _alpha = 0.700
      [2009-11-17T06:08:26.30] _length = 10, _next = 4
      [2009-11-17T06:08:26.30]
      [2009-11-17T06:08:26.30] [0]= 0.000 [1]= 1.000 [2]= 1.000 [3]= 1.954 [4]= 0.000
      [2009-11-17T06:08:26.30] [5]= 0.000 [6]= 0.000 [7]= 0.000 [8]= 0.000 [9]= 0.000
      [2009-11-17T06:08:26.30] GC = 1954.384, Interval = 1954.384, Ratio = 1.000
      [2009-11-17T06:08:26.30] # To suppress the following error report, specify this argument
      [2009-11-17T06:08:26.30] # after -XX: or in .hotspotrc: SuppressErrorAt=/g1CollectorPolicy.cpp:1533
      [2009-11-17T06:08:26.30] #
      [2009-11-17T06:08:26.30] # A fatal error has been detected by the Java Runtime Environment:
      [2009-11-17T06:08:26.30] #
      [2009-11-17T06:08:26.30] # Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=15217, tid=1093196096
      [2009-11-17T06:08:26.35] # Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
      [2009-11-17T06:08:26.35] #
      [2009-11-17T06:08:26.35] # JRE version: 7.0-b76
      [2009-11-17T06:08:26.35] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 compressed oops)
      [2009-11-17T06:08:26.35] # An error report file with more information is saved as:
      [2009-11-17T06:08:26.35] # /export/local/25677.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_nsk.stack.testlist/results/ResultDir/stack017/hs_err_pid15217.log
      [2009-11-17T06:08:26.35] #
      [2009-11-17T06:08:26.35] # If you would like to submit a bug report, please visit:
      [2009-11-17T06:08:26.35] # http://java.sun.com/webapps/bugreport/crash.jsp
      [2009-11-17T06:08:26.35] #
      The computation of recent_avg_pause_time() is susceptible to unbounded error
      as the JVM ages because of the way it's currently computed. (See comments
      section for a description of the problem and a solution.) The susceptibility to
      the error depends on the occurrence of specific patterns of very short GC's
      alternating with long periods of no GC. It turns out, though, that in the
      cases where this test fails in the nightlies, the timestamps generated by
      the JVM (on Linux/amd64) themselves are "bursty" (time presumably does not progress
      smoothly -- it jumps forward then stalls for periods of time with no
      forward progress; this defect should be independently and separately
      investigated to see if the JVM time-service or the undelying os/platform/hardware
      might be defective here) -- this turns out to simulate precisely the condition
      of a JVM that has lived for a very long time and then sees a burst of very
      short GC pauses. Computing the rcent_avg_pause_time_ratio() in a slightly
      different way should protect us from this increasing coarseness/error
      as the JVM ages. See the comments section for more details.

      http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-17/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_vm.gc.testlist/ResultDir/InterruptGC//InterruptGC.log

      [2009-11-18T11:16:28.40] recent_avg_pause_time_ratio() out of bounds
      [2009-11-18T11:16:28.40] -------------------------------------------
      [2009-11-18T11:16:28.40] Recent GC Times (ms):
      [2009-11-18T11:16:28.40] _num = 7, _sum = 1957.514, _sum_of_squares = 1916791.489
      [2009-11-18T11:16:28.40] _davg = 497.299, _dvariance = 166474.335, _alpha = 0.700
      [2009-11-18T11:16:28.40] _length = 10, _next = 7
      [2009-11-18T11:16:28.40]
      [2009-11-18T11:16:28.40] [0]= 0.000 [1]= 0.000 [2]= 0.000 [3]= 0.000 [4]= 0.000
      [2009-11-18T11:16:28.40] [5]=999.505 [6]=958.009 [7]= 0.000 [8]= 0.000 [9]= 0.000
      [2009-11-18T11:16:28.40] (End Time=1.958) Recent GC End Times (s):
      [2009-11-18T11:16:28.40] _num = 8, _sum = 2.957, _sum_of_squares = 4.831
      [2009-11-18T11:16:28.40] _davg = 0.797, _dvariance = 0.507, _alpha = 0.700
      [2009-11-18T11:16:28.40] _length = 10, _next = 8
      [2009-11-18T11:16:28.40]
      [2009-11-18T11:16:28.40] [0]= 0.000 [1]= 0.000 [2]= 0.000 [3]= 0.000 [4]= 0.000
      [2009-11-18T11:16:28.40] [5]= 0.000 [6]= 1.000 [7]= 1.958 [8]= 0.000 [9]= 0.000
      [2009-11-18T11:16:28.40] GC = 1957.514, Interval = 1957.514, Ratio = 1.000
      [2009-11-18T11:16:28.40] # To suppress the following error report, specify this argument
      [2009-11-18T11:16:28.40] # after -XX: or in .hotspotrc: SuppressErrorAt=/g1CollectorPolicy.cpp:1533
      [2009-11-18T11:16:28.40] #
      [2009-11-18T11:16:28.40] # A fatal error has been detected by the Java Runtime Environment:
      [2009-11-18T11:16:28.40] #
      [2009-11-18T11:16:28.40] # Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=25021, tid=1089169728
      [2009-11-18T11:16:28.40] # Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
      [2009-11-18T11:16:28.40] #
      [2009-11-18T11:16:28.40] # JRE version: 7.0-b76
      [2009-11-18T11:16:28.40] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 )
      [2009-11-18T11:16:28.40] # An error report file with more information is saved as:
      [2009-11-18T11:16:28.40] # /export/local/25758.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_vm.gc.testlist/results/ResultDir/InterruptGC/hs_err_pid25021.log
      [2009-11-18T11:16:28.40] #
      [2009-11-18T11:16:28.40] # If you would like to submit a bug report, please visit:
      [2009-11-18T11:16:28.44] # http://java.sun.com/webapps/bugreport/crash.jsp
      [2009-11-18T11:16:28.44] #
      http://sqeweb.sfbay.sun.com/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-11-16/G1_GC_Baseline/vm/linux-amd64/server/mixed/linux-amd64_server_mixed_nsk.stack.testlist/ResultDir/stack018/stack018.log

      [2009-11-17T06:08:28.30] Maximal recursion depth: 1800
      [2009-11-17T06:11:24.27] recent_avg_pause_time_ratio() out of bounds
      [2009-11-17T06:11:24.27] -------------------------------------------
      [2009-11-17T06:11:24.27] Recent GC Times (ms):
      [2009-11-17T06:11:24.27] _num = 10, _sum = 98957.411, _sum_of_squares = 1093972265.883
      [2009-11-17T06:11:24.27] _davg = 7471.264, _dvariance = 17696958.204, _alpha = 0.700
      [2009-11-17T06:11:24.27] _length = 10, _next = 1
      [2009-11-17T06:11:24.27]
      [2009-11-17T06:11:24.27] [0]= 0.000 [1]=10995.279 [2]=10995.260 [3]=10995.262 [4]=12994.433
      [2009-11-17T06:11:24.27] [5]=10950.218 [6]=10040.721 [7]=10995.265 [8]=9995.682 [9]=10995.291
      [2009-11-17T06:11:24.27] (End Time=175.969) Recent GC End Times (s):
      [2009-11-17T06:11:24.27] _num = 10, _sum = 1369.813, _sum_of_squares = 196609.242
      [2009-11-17T06:11:24.27] _davg = 158.450, _dvariance = 524.830, _alpha = 0.700
      [2009-11-17T06:11:24.27] _length = 10, _next = 2
      [2009-11-17T06:11:24.27]
      [2009-11-17T06:11:24.27] [0]=175.969 [1]=175.969 [2]= 88.007 [3]= 99.002 [4]=109.997
      [2009-11-17T06:11:24.27] [5]=122.992 [6]=133.942 [7]=143.983 [8]=154.978 [9]=164.974
      [2009-11-17T06:11:24.27] GC = 98957.411, Interval = 98957.411, Ratio = 1.000
      [2009-11-17T06:11:24.27] # To suppress the following error report, specify this argument
      [2009-11-17T06:11:24.27] # after -XX: or in .hotspotrc: SuppressErrorAt=/g1CollectorPolicy.cpp:1533
      [2009-11-17T06:11:24.27] #
      [2009-11-17T06:11:24.27] # A fatal error has been detected by the Java Runtime Environment:
      [2009-11-17T06:11:24.27] #
      [2009-11-17T06:11:24.27] # Internal Error (/tmp/jprt/P1/B/195911.ysr/source/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp:1533), pid=15253, tid=1082833216
      [2009-11-17T06:11:25.27] # Error: assert(false,"Debugging data for CR 6898948 has been dumped above")
      [2009-11-17T06:11:25.27] #
      [2009-11-17T06:11:25.27] # JRE version: 7.0-b76
      [2009-11-17T06:11:25.27] # Java VM: OpenJDK 64-Bit Server VM (17.0-b04-2009-11-13-195911.ysr.fpdump-fastdebug mixed mode linux-amd64 compressed oops)
      [2009-11-17T06:11:25.27] # An error report file with more information is saved as:
      [2009-11-17T06:11:25.27] # /export/local/25677.JDK7.NIGHTLY.VM+linux-amd64_server_mixed_nsk.stack.testlist/results/ResultDir/stack018/hs_err_pid15253.log
      [2009-11-17T06:11:25.27] #
      [2009-11-17T06:11:25.27] # If you would like to submit a bug report, please visit:
      [2009-11-17T06:11:25.27] # http://java.sun.com/webapps/bugreport/crash.jsp
      [2009-11-17T06:11:25.27] #

        Issue Links

          Activity

          Hide
          ysr Y. Ramakrishna added a comment -
          BT2:EVALUATION

          I am marking this a P4 because the effect has not been seen in practice except on
          Linux/x64 machines that display the "jerky" time progress behaviour described
          in the "Description" section. (It's a day-zero bug.)
          Show
          ysr Y. Ramakrishna added a comment - BT2:EVALUATION I am marking this a P4 because the effect has not been seen in practice except on Linux/x64 machines that display the "jerky" time progress behaviour described in the "Description" section. (It's a day-zero bug.)
          Hide
          brutisso Bengt Rutisson (Inactive) added a comment -
          targeting this for hs25 for now. I think we should do some more investigation before we defer it.
          Show
          brutisso Bengt Rutisson (Inactive) added a comment - targeting this for hs25 for now. I think we should do some more investigation before we defer it.
          Hide
          brutisso Bengt Rutisson (Inactive) added a comment -
          Here is the patch with the instrumentation that JDK-6898948 added and JDK-8141394 removed.

          diff --git a/src/share/vm/gc/g1/g1CollectorPolicy.cpp b/src/share/vm/gc/g1/g1CollectorPolicy.cpp
          --- a/src/share/vm/gc/g1/g1CollectorPolicy.cpp
          +++ b/src/share/vm/gc/g1/g1CollectorPolicy.cpp
          @@ -976,6 +976,19 @@
               _recent_avg_pause_time_ratio = _recent_gc_times_ms->sum()/interval_ms;
               if (recent_avg_pause_time_ratio() < 0.0 ||
                   (recent_avg_pause_time_ratio() - 1.0 > 0.0)) {
          +#ifndef PRODUCT
          + // Dump info to allow post-facto debugging
          + gclog_or_tty->print_cr("recent_avg_pause_time_ratio() out of bounds");
          + gclog_or_tty->print_cr("-------------------------------------------");
          + gclog_or_tty->print_cr("Recent GC Times (ms):");
          + _recent_gc_times_ms->dump();
          + gclog_or_tty->print_cr("(End Time=%3.3f) Recent GC End Times (s):", end_time_sec);
          + _recent_prev_end_times_for_all_gcs_sec->dump();
          + gclog_or_tty->print_cr("GC = %3.3f, Interval = %3.3f, Ratio = %3.3f",
          + _recent_gc_times_ms->sum(), interval_ms, recent_avg_pause_time_ratio());
          + // In debug mode, terminate the JVM if the user wants to debug at this point.
          + assert(!G1FailOnFPError, "Debugging data for CR 6898948 has been dumped above");
          +#endif // !PRODUCT
                 // Clip ratio between 0.0 and 1.0, and continue. This will be fixed in
                 // CR 6902692 by redoing the manner in which the ratio is incrementally computed.
                 if (_recent_avg_pause_time_ratio < 0.0) {
          diff --git a/src/share/vm/gc/g1/g1_globals.hpp b/src/share/vm/gc/g1/g1_globals.hpp
          --- a/src/share/vm/gc/g1/g1_globals.hpp
          +++ b/src/share/vm/gc/g1/g1_globals.hpp
          @@ -208,6 +208,10 @@
             develop(bool, G1HRRSFlushLogBuffersOnVerify, false, \
                     "Forces flushing of log buffers before verification.") \
                                                                                       \
          + develop(bool, G1FailOnFPError, false, \
          + "When set, G1 will fail when it encounters an FP 'error', " \
          + "so as to allow debugging") \
          + \
             product(size_t, G1HeapRegionSize, 0, \
                     "Size of the G1 regions.") \
                     range(0, 32*M) \
          Show
          brutisso Bengt Rutisson (Inactive) added a comment - Here is the patch with the instrumentation that JDK-6898948 added and JDK-8141394 removed. diff --git a/src/share/vm/gc/g1/g1CollectorPolicy.cpp b/src/share/vm/gc/g1/g1CollectorPolicy.cpp --- a/src/share/vm/gc/g1/g1CollectorPolicy.cpp +++ b/src/share/vm/gc/g1/g1CollectorPolicy.cpp @@ -976,6 +976,19 @@      _recent_avg_pause_time_ratio = _recent_gc_times_ms->sum()/interval_ms;      if (recent_avg_pause_time_ratio() < 0.0 ||          (recent_avg_pause_time_ratio() - 1.0 > 0.0)) { +#ifndef PRODUCT + // Dump info to allow post-facto debugging + gclog_or_tty->print_cr("recent_avg_pause_time_ratio() out of bounds"); + gclog_or_tty->print_cr("-------------------------------------------"); + gclog_or_tty->print_cr("Recent GC Times (ms):"); + _recent_gc_times_ms->dump(); + gclog_or_tty->print_cr("(End Time=%3.3f) Recent GC End Times (s):", end_time_sec); + _recent_prev_end_times_for_all_gcs_sec->dump(); + gclog_or_tty->print_cr("GC = %3.3f, Interval = %3.3f, Ratio = %3.3f", + _recent_gc_times_ms->sum(), interval_ms, recent_avg_pause_time_ratio()); + // In debug mode, terminate the JVM if the user wants to debug at this point. + assert(!G1FailOnFPError, "Debugging data for CR 6898948 has been dumped above"); +#endif // !PRODUCT        // Clip ratio between 0.0 and 1.0, and continue. This will be fixed in        // CR 6902692 by redoing the manner in which the ratio is incrementally computed.        if (_recent_avg_pause_time_ratio < 0.0) { diff --git a/src/share/vm/gc/g1/g1_globals.hpp b/src/share/vm/gc/g1/g1_globals.hpp --- a/src/share/vm/gc/g1/g1_globals.hpp +++ b/src/share/vm/gc/g1/g1_globals.hpp @@ -208,6 +208,10 @@    develop(bool, G1HRRSFlushLogBuffersOnVerify, false, \            "Forces flushing of log buffers before verification.") \                                                                              \ + develop(bool, G1FailOnFPError, false, \ + "When set, G1 will fail when it encounters an FP 'error', " \ + "so as to allow debugging") \ + \    product(size_t, G1HeapRegionSize, 0, \            "Size of the G1 regions.") \            range(0, 32*M) \

            People

            • Assignee:
              Unassigned
              Reporter:
              ysr Y. Ramakrishna
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Imported:
                Indexed: