Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8043070

nmethod::verify_interrupt_point() shouldn't enter safepoint

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 9, 10
    • Fix Version/s: 10
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • Resolved In Build:
      team

      Description

      (lldb) bt
      * thread #16: tid = 0x6cd5ba, 0x00007fff8a7d0866 libsystem_kernel.dylib`__pthread_kill + 10, name = 'Java: C2 CompilerThread0', stop reason = signal SIGABRT
        * frame #0: 0x00007fff8a7d0866 libsystem_kernel.dylib`__pthread_kill + 10
          frame #1: 0x00007fff895f135c libsystem_pthread.dylib`pthread_kill + 92
          frame #2: 0x00007fff8e715b1a libsystem_c.dylib`abort + 125
          frame #3: 0x0000000103a7bcf6 libjvm.dylib`os::abort(dump_core=true) + 150 at os_bsd.cpp:1161
          frame #4: 0x0000000103c84fbf libjvm.dylib`VMError::report_and_die(this=0x0000000113500100) + 2431 at vmError.cpp:1091
          frame #5: 0x00000001034e239c libjvm.dylib`report_vm_error(file=0x0000000103d2c154, line=48, error_msg=0x0000000103d2c19c, detail_msg=0x0000000103d2c1b6) + 124 at debug.cpp:226
          frame #6: 0x000000010335ce3c libjvm.dylib`frame(this=0x0000000113500378, sp=0x0000000113503f88, fp=0x0000000000000000, pc=0x0000000000000000) + 108 at frame_x86.inline.hpp:48
          frame #7: 0x000000010335cccd libjvm.dylib`frame(this=0x0000000113500378, sp=0x0000000113503f88, fp=0x0000000000000000, pc=0x0000000000000000) + 45 at frame_x86.inline.hpp:59
          frame #8: 0x0000000103a84591 libjvm.dylib`os::get_sender_for_C_frame(fr=0x00000001135009a8) + 81 at os_bsd_x86.cpp:365
          frame #9: 0x0000000103c8398a libjvm.dylib`VMError::report(this=0x0000000113500d80, st=0x0000000113500c68) + 4474 at vmError.cpp:607
          frame #10: 0x0000000103c84aa9 libjvm.dylib`VMError::report_and_die(this=0x0000000113500d80) + 1129 at vmError.cpp:1004
          frame #11: 0x00000001034e239c libjvm.dylib`report_vm_error(file=0x0000000103e26c1f, line=897, error_msg=0x0000000103d61fd6, detail_msg=0x0000000103e27181) + 124 at debug.cpp:226
          frame #12: 0x00000001034e23dd libjvm.dylib`report_fatal(file=0x0000000103e26c1f, line=897, message=0x0000000103e27181) + 45 at debug.cpp:231
          frame #13: 0x0000000103c1004a libjvm.dylib`Thread::check_for_valid_safepoint_state(this=0x0000000102821800, potential_vm_operation=false) + 90 at thread.cpp:897
          frame #14: 0x0000000103a2de56 libjvm.dylib`Monitor::check_prelock_state(this=0x0000000100502ba8, thread=0x0000000102821800) + 342 at mutex.cpp:1363
          frame #15: 0x0000000103a2da9f libjvm.dylib`Monitor::lock(this=0x0000000100502ba8, Self=0x0000000102821800) + 47 at mutex.cpp:898
          frame #16: 0x0000000103a2dfc5 libjvm.dylib`Monitor::lock(this=0x0000000100502ba8) + 37 at mutex.cpp:939
          frame #17: 0x0000000103001857 libjvm.dylib`MutexLocker(this=0x0000000113501200, mutex=0x0000000100502ba8) + 135 at mutexLocker.hpp:182
          frame #18: 0x000000010300174d libjvm.dylib`MutexLocker(this=0x0000000113501200, mutex=0x0000000100502ba8) + 29 at mutexLocker.hpp:183
          frame #19: 0x0000000103a4162f libjvm.dylib`nmethod::verify_interrupt_point(this=0x0000000105d1fe90, call_site=0x0000000105d21b9f) + 239 at nmethod.cpp:2409
          frame #20: 0x0000000103a41436 libjvm.dylib`nmethod::verify_scopes(this=0x0000000105d1fe90) + 246 at nmethod.cpp:2434
          frame #21: 0x0000000103a412e9 libjvm.dylib`nmethod::verify(this=0x0000000105d1fe90) + 713 at nmethod.cpp:2393
          frame #22: 0x0000000103a3a273 libjvm.dylib`nmethod::new_nmethod(method=<unavailable>, compile_id=9480, entry_bci=-1, offsets=0x0000000113503330, orig_pc_offset=0, debug_info=0x0000000102822158, dependencies=0x0000000102bd4f78, code_buffer=0x00000001135033a8, frame_size=4, oop_maps=0x000000011660cf98, handler_table=0x0000000113503568, nul_chk_table=0x0000000113503588, compiler=0x0000000100520b98, comp_level=4) + 1219 at nmethod.cpp:614
          frame #23: 0x00000001033afdea libjvm.dylib`ciEnv::register_method(this=0x0000000113503880, target=0x0000000112833958, entry_bci=-1, offsets=0x0000000113503330, orig_pc_offset=0, code_buffer=0x00000001135033a8, frame_words=4, oop_map_set=0x000000011660cf98, handler_table=0x0000000113503568, inc_table=0x0000000113503588, compiler=0x0000000100520b98, comp_level=4, has_unsafe_access=false, has_wide_vectors=false, rtm_state=NoRTM) + 1370 at ciEnv.cpp:1033
          frame #24: 0x000000010345a3d6 libjvm.dylib`Compile(this=0x0000000113502b18, ci_env=0x0000000113503880, compiler=0x0000000100520b98, target=0x0000000112833958, osr_bci=-1, subsume_loads=true, do_escape_analysis=true, eliminate_boxing=true) + 5654 at compile.cpp:912
          frame #25: 0x0000000103458dae libjvm.dylib`Compile(this=0x0000000113502b18, ci_env=0x0000000113503880, compiler=0x0000000100520b98, target=0x0000000112833958, osr_bci=-1, subsume_loads=true, do_escape_analysis=true, eliminate_boxing=true) + 126 at compile.cpp:928
          frame #26: 0x00000001033708b8 libjvm.dylib`C2Compiler::compile_method(this=0x0000000100520b98, env=0x0000000113503880, target=0x0000000112833958, entry_bci=-1) + 312 at c2compiler.cpp:121
          frame #27: 0x0000000103475f92 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x0000000100564048) + 1586 at compileBroker.cpp:1923
          frame #28: 0x0000000103475341 libjvm.dylib`CompileBroker::compiler_thread_loop() + 945 at compileBroker.cpp:1746
          frame #29: 0x0000000103c11e06 libjvm.dylib`compiler_thread_entry(thread=0x0000000102821800, __the_thread__=0x0000000102821800) + 86 at thread.cpp:3218
          frame #30: 0x0000000103c126f3 libjvm.dylib`JavaThread::thread_main_inner(this=0x0000000102821800) + 339 at thread.cpp:1689
          frame #31: 0x0000000103c12432 libjvm.dylib`JavaThread::run(this=0x0000000102821800) + 642 at thread.cpp:1669
          frame #32: 0x0000000103a7b280 libjvm.dylib`java_start(thread=0x0000000102821800) + 448 at os_bsd.cpp:752
          frame #33: 0x00007fff895f0899 libsystem_pthread.dylib`_pthread_body + 138
          frame #34: 0x00007fff895f072a libsystem_pthread.dylib`_pthread_start + 137

        Issue Links

          Activity

          Hide
          iveresov Igor Veresov added a comment - - edited
          Flags to reproduce: -XX:+TieredCompilation -XX:CompileThreshold=100 -XX:+DeoptimizeALot -XX:+PrintCompilation -XX:+StressCodeAging -XX:CompilationPolicyChoice=2 -XX:HotMethodDetectionLimit=25 -XX:MinPassesBeforeFlush=0 -XX:NmethodSweepActivity=100 -XX:ReservedCodeCacheSize=5M -XX:+TraceDeoptimization -XX:+LogCompilation

          I used test/closed/compiler/6440479 to cause the failure.

          So the culprit seems to be the -XX:MinPassesBeforeFlush=0 flag. The reason of the failure is a race between the code installation and the sweeper. The sweeper make the method not entrant after nmethod::new_method() releases the CodeCache_lock and before nmethod::verify() is called.

          The solution has to make sure that the method is not touched by the sweeper and it's state is not changed until the method is fully published.
          Show
          iveresov Igor Veresov added a comment - - edited Flags to reproduce: -XX:+TieredCompilation -XX:CompileThreshold=100 -XX:+DeoptimizeALot -XX:+PrintCompilation -XX:+StressCodeAging -XX:CompilationPolicyChoice=2 -XX:HotMethodDetectionLimit=25 -XX:MinPassesBeforeFlush=0 -XX:NmethodSweepActivity=100 -XX:ReservedCodeCacheSize=5M -XX:+TraceDeoptimization -XX:+LogCompilation I used test/closed/compiler/6440479 to cause the failure. So the culprit seems to be the -XX:MinPassesBeforeFlush=0 flag. The reason of the failure is a race between the code installation and the sweeper. The sweeper make the method not entrant after nmethod::new_method() releases the CodeCache_lock and before nmethod::verify() is called. The solution has to make sure that the method is not touched by the sweeper and it's state is not changed until the method is fully published.
          Hide
          thartmann Tobias Hartmann added a comment -
          Changing type to Bug. Any particular reason this failure was filed as enhancement?
          Show
          thartmann Tobias Hartmann added a comment - Changing type to Bug. Any particular reason this failure was filed as enhancement?
          Hide
          thartmann Tobias Hartmann added a comment -
          ILW = Failure due to race between code installation and sweeper, rare but reproducible, no workaround = HLH = P2
          Show
          thartmann Tobias Hartmann added a comment - ILW = Failure due to race between code installation and sweeper, rare but reproducible, no workaround = HLH = P2
          Hide
          iveresov Igor Veresov added a comment -
          The problem is still reproducible, however I don't think I can quite follow my initial evaluation. It seems rather that the problem is that the verification code in nmethod::verify() allows safepoints (by locking mutexes that check for safepoints). The error is basically about the fact that a safepoint happened.
          Show
          iveresov Igor Veresov added a comment - The problem is still reproducible, however I don't think I can quite follow my initial evaluation. It seems rather that the problem is that the verification code in nmethod::verify() allows safepoints (by locking mutexes that check for safepoints). The error is basically about the fact that a safepoint happened.
          Hide
          iveresov Igor Veresov added a comment -
          Ah, I see what I meant before. So, the sweeper runs possibly_flush() for all is_use nmethods, which also includes nmethods that a where just created but not installed. Making such an nmethod not entrant breaks the logic in nmethod::verify_interrupt_point():

            bool is_installed = (method()->code() == this) // nmethod is in state 'in_use' and installed
                                || !this->is_in_use(); // nmethod is installed, but not in 'in_use' state

          is_installed becomes true. It should not.
          Show
          iveresov Igor Veresov added a comment - Ah, I see what I meant before. So, the sweeper runs possibly_flush() for all is_use nmethods, which also includes nmethods that a where just created but not installed. Making such an nmethod not entrant breaks the logic in nmethod::verify_interrupt_point():   bool is_installed = (method()->code() == this) // nmethod is in state 'in_use' and installed                       || !this->is_in_use(); // nmethod is installed, but not in 'in_use' state is_installed becomes true. It should not.
          Hide
          iveresov Igor Veresov added a comment - - edited
          Interestingly the is_installed guard was added in JDK-8023037 that battled a similar problem.
          Show
          iveresov Igor Veresov added a comment - - edited Interestingly the is_installed guard was added in JDK-8023037 that battled a similar problem.
          Hide
          iveresov Igor Veresov added a comment -
          Wow, and I actually wrote a lot in JDK-8028001 about this problem.
          Show
          iveresov Igor Veresov added a comment - Wow, and I actually wrote a lot in JDK-8028001 about this problem.
          Hide
          iveresov Igor Veresov added a comment -
          Tentative fix that introduces an additional nmethod state: http://cr.openjdk.java.net/~iveresov/8043070/webrev.00/
          Show
          iveresov Igor Veresov added a comment - Tentative fix that introduces an additional nmethod state: http://cr.openjdk.java.net/~iveresov/8043070/webrev.00/
          Hide
          hgupdate HG Updates added a comment -
          URL: http://hg.openjdk.java.net/jdk/hs/rev/ab3959df2115
          User: iveresov
          Date: 2017-11-21 03:04:47 +0000
          Show
          hgupdate HG Updates added a comment - URL: http://hg.openjdk.java.net/jdk/hs/rev/ab3959df2115 User: iveresov Date: 2017-11-21 03:04:47 +0000

            People

            • Assignee:
              iveresov Igor Veresov
              Reporter:
              iveresov Igor Veresov
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: