Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8196626

[Graal] runtime/appcds/UseAppCDS.java crashes with "VM thread using lock Heap_lock (not allowed to block on)"

    Details

      Description

      runtime/appcds/UseAppCDS.java crashes with
       # Internal Error (/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S299/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/4b20d9bc-4b5d-4aa0-88ff-bb2fa5d8da44/runs/44472215-b4ca-4138-95b6-acf54af1ff08/workspace/open/src/hotspot/share/runtime/mutex.cpp:1390), pid=21637, tid=21657
      # fatal error: VM thread using lock Heap_lock (not allowed to block on)
      #
      # JRE version: Java(TM) SE Runtime Environment (10.0) (fastdebug build 10-internal+0-2018-01-31-2202159.ekaterina.pavlova.jdk.hs)
      # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 10-internal+0-2018-01-31-2202159.ekaterina.pavlova.jdk.hs, interpreted mode, tiered, jvmci, jvmci compiler, compressed oops, g1 gc, linux-amd64)
      #

      when running in Graal JIT mode with -Xcomp.

      Steps to run the test:
      > jtreg -vt -jdk:JDK_HS_fastdebug -dir:test/hotspot/jtreg
        -javaoptions:-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp
        runtime/appcds/UseAppCDS.java

        Activity

        Hide
        thartmann Tobias Hartmann added a comment -
        I'm able to reproduce this by modifying the metaspace allocation code to fail (which happens in the reported case):

        diff -r 11920d5d14a8 src/hotspot/share/gc/shared/collectorPolicy.cpp
        --- a/src/hotspot/share/gc/shared/collectorPolicy.cpp Wed Jan 31 17:43:46 2018 -0800
        +++ b/src/hotspot/share/gc/shared/collectorPolicy.cpp Fri Feb 02 15:31:33 2018 +0100
        @@ -815,7 +815,7 @@
         
           do {
             MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype);
        - if (result != NULL) {
        + if (false) {
               return result;
             }
         
        diff -r 11920d5d14a8 src/hotspot/share/memory/metaspace.cpp
        --- a/src/hotspot/share/memory/metaspace.cpp Wed Jan 31 17:43:46 2018 -0800
        +++ b/src/hotspot/share/memory/metaspace.cpp Fri Feb 02 15:31:33 2018 +0100
        @@ -3944,7 +3944,7 @@
           // Try to allocate metadata.
           MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype);
         
        - if (result == NULL) {
        + if (true) {
             tracer()->report_metaspace_allocation_failure(loader_data, word_size, type, mdtype);
         
             // Allocation failed.

        The reason this only reproduces with Graal as JIT is that Graal loads more classes and during dumping I see:

        Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalJVMCIServiceLocator
        Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalJVMCIServiceLocator$Shared
        Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalCompilerFactory
        Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalRuntimeProvider
        Preload Warning: Cannot find org/graalvm/compiler/api/runtime/GraalRuntime
        Preload Warning: Cannot find org/graalvm/compiler/runtime/RuntimeProvider
        Preload Warning: Cannot find org/graalvm/compiler/replacements/SnippetCounter$Group$Factory
        Preload Warning: Cannot find org/graalvm/compiler/hotspot/JVMCIVersionCheck
        Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalOptionValues
        Preload Warning: Cannot find org/graalvm/compiler/options/OptionValuesAccess
        Preload Warning: Cannot find org/graalvm/util/UnmodifiableEconomicMap

        Constraints are then added for those classes:

        V [libjvm.so+0x181ec27] SystemDictionaryShared::add_verification_constraint(Klass*, Symbol*, Symbol*, bool, bool, bool)+0x77
        V [libjvm.so+0x18d1050] VerificationType::is_reference_assignable_from(VerificationType const&, ClassVerifier*, bool, Thread*) const+0x3a0
        V [libjvm.so+0x18e298f] StackMapFrame::pop_stack(VerificationType, Thread*)+0x4f
        V [libjvm.so+0x18ddf18] ClassVerifier::verify_method(methodHandle const&, Thread*)+0xf58
        V [libjvm.so+0x18e146e] ClassVerifier::verify_class(Thread*)+0x1ae
        V [libjvm.so+0x18e1b91] Verifier::verify(InstanceKlass*, Verifier::Mode, bool, Thread*)+0x401
        V [libjvm.so+0xf1e7db] InstanceKlass::link_class_impl(bool, Thread*)+0x4fb
        V [libjvm.so+0xf1e982] InstanceKlass::link_class(Thread*)+0x72
        V [libjvm.so+0x1459154] MetaspaceShared::try_link_class(InstanceKlass*, Thread*)+0x94
        V [libjvm.so+0x145b9a3] MetaspaceShared::preload_classes(char const*, Thread*)+0x1f3
        V [libjvm.so+0x145bb0d] MetaspaceShared::preload_and_dump(Thread*)+0x12d
        V [libjvm.so+0x186052d] Threads::create_vm(JavaVMInitArgs*, bool*)+0x8bd
        V [libjvm.so+0x1092fa7] JNI_CreateJavaVM+0xa7
        C [libjli.so+0x3cc0] JavaMain+0x90

        And in SharedDictionaryEntry::finalize_verification_constraints() we allocate arrays in metaspace (which may fail and the crash the VM because we acquire the Heap_lock):

        V [libjvm.so+0x1425b1b] Monitor::check_prelock_state(Thread*)+0x15b
        V [libjvm.so+0x1425dfe] Monitor::lock(Thread*)+0x4e
        V [libjvm.so+0xa612f9] CollectorPolicy::satisfy_failed_metadata_allocation(ClassLoaderData*, unsigned long, Metaspace::MetadataType)+0x99
        V [libjvm.so+0x13a41d4] Metaspace::allocate(ClassLoaderData*, unsigned long, MetaspaceObj::Type, Thread*)+0x284
        V [libjvm.so+0x17453c7] SharedDictionaryEntry::finalize_verification_constraints()+0x247
        V [libjvm.so+0x1745e90] SystemDictionaryShared::finalize_verification_constraints()+0x60
        V [libjvm.so+0x13ae5a7] VM_PopulateDumpSharedSpace::doit()+0xe37
        V [libjvm.so+0x186f7f8] VM_Operation::evaluate()+0x208
        V [libjvm.so+0x186b72e] VMThread::evaluate_operation(VM_Operation*)+0x16e
        V [libjvm.so+0x186c255] VMThread::loop()+0x2a5
        V [libjvm.so+0x186c9d0] VMThread::run()+0xc0
        V [libjvm.so+0x14ba82a] thread_native_entry(Thread*)+0xfa

        I'm not too familiar with the CDS code. Assigning to runtime for further investigation.
        Show
        thartmann Tobias Hartmann added a comment - I'm able to reproduce this by modifying the metaspace allocation code to fail (which happens in the reported case): diff -r 11920d5d14a8 src/hotspot/share/gc/shared/collectorPolicy.cpp --- a/src/hotspot/share/gc/shared/collectorPolicy.cpp Wed Jan 31 17:43:46 2018 -0800 +++ b/src/hotspot/share/gc/shared/collectorPolicy.cpp Fri Feb 02 15:31:33 2018 +0100 @@ -815,7 +815,7 @@      do {      MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype); - if (result != NULL) { + if (false) {        return result;      }   diff -r 11920d5d14a8 src/hotspot/share/memory/metaspace.cpp --- a/src/hotspot/share/memory/metaspace.cpp Wed Jan 31 17:43:46 2018 -0800 +++ b/src/hotspot/share/memory/metaspace.cpp Fri Feb 02 15:31:33 2018 +0100 @@ -3944,7 +3944,7 @@    // Try to allocate metadata.    MetaWord* result = loader_data->metaspace_non_null()->allocate(word_size, mdtype);   - if (result == NULL) { + if (true) {      tracer()->report_metaspace_allocation_failure(loader_data, word_size, type, mdtype);        // Allocation failed. The reason this only reproduces with Graal as JIT is that Graal loads more classes and during dumping I see: Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalJVMCIServiceLocator Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalJVMCIServiceLocator$Shared Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalCompilerFactory Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalRuntimeProvider Preload Warning: Cannot find org/graalvm/compiler/api/runtime/GraalRuntime Preload Warning: Cannot find org/graalvm/compiler/runtime/RuntimeProvider Preload Warning: Cannot find org/graalvm/compiler/replacements/SnippetCounter$Group$Factory Preload Warning: Cannot find org/graalvm/compiler/hotspot/JVMCIVersionCheck Preload Warning: Cannot find org/graalvm/compiler/hotspot/HotSpotGraalOptionValues Preload Warning: Cannot find org/graalvm/compiler/options/OptionValuesAccess Preload Warning: Cannot find org/graalvm/util/UnmodifiableEconomicMap Constraints are then added for those classes: V [libjvm.so+0x181ec27] SystemDictionaryShared::add_verification_constraint(Klass*, Symbol*, Symbol*, bool, bool, bool)+0x77 V [libjvm.so+0x18d1050] VerificationType::is_reference_assignable_from(VerificationType const&, ClassVerifier*, bool, Thread*) const+0x3a0 V [libjvm.so+0x18e298f] StackMapFrame::pop_stack(VerificationType, Thread*)+0x4f V [libjvm.so+0x18ddf18] ClassVerifier::verify_method(methodHandle const&, Thread*)+0xf58 V [libjvm.so+0x18e146e] ClassVerifier::verify_class(Thread*)+0x1ae V [libjvm.so+0x18e1b91] Verifier::verify(InstanceKlass*, Verifier::Mode, bool, Thread*)+0x401 V [libjvm.so+0xf1e7db] InstanceKlass::link_class_impl(bool, Thread*)+0x4fb V [libjvm.so+0xf1e982] InstanceKlass::link_class(Thread*)+0x72 V [libjvm.so+0x1459154] MetaspaceShared::try_link_class(InstanceKlass*, Thread*)+0x94 V [libjvm.so+0x145b9a3] MetaspaceShared::preload_classes(char const*, Thread*)+0x1f3 V [libjvm.so+0x145bb0d] MetaspaceShared::preload_and_dump(Thread*)+0x12d V [libjvm.so+0x186052d] Threads::create_vm(JavaVMInitArgs*, bool*)+0x8bd V [libjvm.so+0x1092fa7] JNI_CreateJavaVM+0xa7 C [libjli.so+0x3cc0] JavaMain+0x90 And in SharedDictionaryEntry::finalize_verification_constraints() we allocate arrays in metaspace (which may fail and the crash the VM because we acquire the Heap_lock): V [libjvm.so+0x1425b1b] Monitor::check_prelock_state(Thread*)+0x15b V [libjvm.so+0x1425dfe] Monitor::lock(Thread*)+0x4e V [libjvm.so+0xa612f9] CollectorPolicy::satisfy_failed_metadata_allocation(ClassLoaderData*, unsigned long, Metaspace::MetadataType)+0x99 V [libjvm.so+0x13a41d4] Metaspace::allocate(ClassLoaderData*, unsigned long, MetaspaceObj::Type, Thread*)+0x284 V [libjvm.so+0x17453c7] SharedDictionaryEntry::finalize_verification_constraints()+0x247 V [libjvm.so+0x1745e90] SystemDictionaryShared::finalize_verification_constraints()+0x60 V [libjvm.so+0x13ae5a7] VM_PopulateDumpSharedSpace::doit()+0xe37 V [libjvm.so+0x186f7f8] VM_Operation::evaluate()+0x208 V [libjvm.so+0x186b72e] VMThread::evaluate_operation(VM_Operation*)+0x16e V [libjvm.so+0x186c255] VMThread::loop()+0x2a5 V [libjvm.so+0x186c9d0] VMThread::run()+0xc0 V [libjvm.so+0x14ba82a] thread_native_entry(Thread*)+0xfa I'm not too familiar with the CDS code. Assigning to runtime for further investigation.
        Hide
        iklam Ioi Lam added a comment -
        Looks like we should change this in Metaspace::allocate

          if (result == NULL) {
            tracer()->report_metaspace_allocation_failure(loader_data, word_size, type, mdtype);

            // Allocation failed.
            if (is_init_completed()) {
              // Only start a GC if the bootstrapping has completed.

              // Try to clean out some memory and retry.
              result = Universe::heap()->collector_policy()->satisfy_failed_metadata_allocation(
                  loader_data, word_size, mdtype);
            }
          }


        to something like

             if (is_init_completed() && !(DumpSharedSpaces && .... current thread is VM thread)) {
             ...}

        Also, we should force a full GC before running this

            VM_PopulateDumpSharedSpace op;
            VMThread::execute(&op);


        Show
        iklam Ioi Lam added a comment - Looks like we should change this in Metaspace::allocate   if (result == NULL) {     tracer()->report_metaspace_allocation_failure(loader_data, word_size, type, mdtype);     // Allocation failed.     if (is_init_completed()) {       // Only start a GC if the bootstrapping has completed.       // Try to clean out some memory and retry.       result = Universe::heap()->collector_policy()->satisfy_failed_metadata_allocation(           loader_data, word_size, mdtype);     }   } to something like      if (is_init_completed() && !(DumpSharedSpaces && .... current thread is VM thread)) {      ...} Also, we should force a full GC before running this     VM_PopulateDumpSharedSpace op;     VMThread::execute(&op);
        Hide
        ccheung Calvin Cheung added a comment -
        I haven't been able to reproduce it so far on my linux machine.

        I ran the following test 100 times using a script like:
        n=0; time while (( n++ < 100)); do $RUN_TEST_NO_MAX_RAM; done

        where RUN_TEST_NO_MAX_RAM is defined as:
        java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:SharedClassListFile=./UseAppCDS.classlist -XX:SharedArchiveFile=./shared.jsa -Xshare:dump

        I also used the -XX:+PrintSystemDictionaryAtExit option to check that those org.graal.vm.compiler.* classes are in the archive.
        For my run, there are 2249 of those classes.
        Show
        ccheung Calvin Cheung added a comment - I haven't been able to reproduce it so far on my linux machine. I ran the following test 100 times using a script like: n=0; time while (( n++ < 100)); do $RUN_TEST_NO_MAX_RAM; done where RUN_TEST_NO_MAX_RAM is defined as: java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:SharedClassListFile=./UseAppCDS.classlist -XX:SharedArchiveFile=./shared.jsa -Xshare:dump I also used the -XX:+PrintSystemDictionaryAtExit option to check that those org.graal.vm.compiler.* classes are in the archive. For my run, there are 2249 of those classes.
        Hide
        thartmann Tobias Hartmann added a comment -
        Did you modify the hotspot code as I've described above? With these modifications, I can easily reproduce the problem on my machine (Linux x86_64) with this command line:
        jtreg -jdk:/oracle/jdk_hs/build/linux-x86_64-normal-server-fastdebug/images/jdk/ -timeout:5 -va -k:\!ignore -javaoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp" -J-Djavatest.maxOutputSize=10000000 test/hotspot/jtreg/runtime/appcds/UseAppCDS.java
        Show
        thartmann Tobias Hartmann added a comment - Did you modify the hotspot code as I've described above? With these modifications, I can easily reproduce the problem on my machine (Linux x86_64) with this command line: jtreg -jdk:/oracle/jdk_hs/build/linux-x86_64-normal-server-fastdebug/images/jdk/ -timeout:5 -va -k:\!ignore -javaoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp" -J-Djavatest.maxOutputSize=10000000 test/hotspot/jtreg/runtime/appcds/UseAppCDS.java
        Hide
        ccheung Calvin Cheung added a comment -
        No, I haven't modified the hotspot source code.
        I was just trying to see how hard to reproduce the original problem.
        Show
        ccheung Calvin Cheung added a comment - No, I haven't modified the hotspot source code. I was just trying to see how hard to reproduce the original problem.
        Hide
        iklam Ioi Lam added a comment -
        With the latest jdk/hs repo from last night, I can cause the crash without any modification, with fastdebug build:

        $ java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:DumpLoadedClassList=./UseAppCDS.classlist -cp ~/tmp HelloWorld
        [...]
        Hello World

        $ java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:SharedClassListFile=./UseAppCDS.classlist -XX:SharedArchiveFile=./shared.jsa -Xshare:dump -XX:MetaspaceSize=20M -XX:MaxMetaspaceSize=20M
        [...]
        Preload Error: Failed to load java/lang/Shutdown$Lock
        Preload Error: Failed to load java/lang/Shutdown$Lock
        Loading classes to share: done.
        Rewriting and linking classes ...
        Preload Warning: Removed error class: sun.net.www.protocol.file.FileURLConnection
        Preload Warning: Removed error class: sun.net.www.URLConnection
        Rewriting and linking classes: done
        Number of classes 3476
            instance classes = 3411
            obj array classes = 57
            type array classes = 8
        Updating ConstMethods ... done.
        Removing unshareable information ... done.
        # To suppress the following error report, specify this argument
        # after -XX: or in .hotspotrc: SuppressErrorAt=/mutex.cpp:1390
        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # Internal Error (/home/iklam/jdk/abe/open/src/hotspot/share/runtime/mutex.cpp:1390), pid=27777, tid=27833
        # fatal error: VM thread using lock Heap_lock (not allowed to block on)
        Show
        iklam Ioi Lam added a comment - With the latest jdk/hs repo from last night, I can cause the crash without any modification, with fastdebug build: $ java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -Xcomp -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:DumpLoadedClassList=./UseAppCDS.classlist -cp ~/tmp HelloWorld [...] Hello World $ java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal -XX:+UnlockCommercialFeatures -XX:-UnlockDiagnosticVMOptions -cp ./test.jar -XX:+UseAppCDS -XX:SharedClassListFile=./UseAppCDS.classlist -XX:SharedArchiveFile=./shared.jsa -Xshare:dump -XX:MetaspaceSize=20M -XX:MaxMetaspaceSize=20M [...] Preload Error: Failed to load java/lang/Shutdown$Lock Preload Error: Failed to load java/lang/Shutdown$Lock Loading classes to share: done. Rewriting and linking classes ... Preload Warning: Removed error class: sun.net. www.protocol.file.FileURLConnection Preload Warning: Removed error class: sun.net. www.URLConnection Rewriting and linking classes: done Number of classes 3476     instance classes = 3411     obj array classes = 57     type array classes = 8 Updating ConstMethods ... done. Removing unshareable information ... done. # To suppress the following error report, specify this argument # after -XX: or in .hotspotrc: SuppressErrorAt=/mutex.cpp:1390 # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/home/iklam/jdk/abe/open/src/hotspot/share/runtime/mutex.cpp:1390), pid=27777, tid=27833 # fatal error: VM thread using lock Heap_lock (not allowed to block on)
        Hide
        ccheung Calvin Cheung added a comment -
        I could reproduce the crash as well even with last Friday's build.

        The following options make it reproducible:
        -XX:MetaspaceSize=20M -XX:MaxMetaspaceSize=20M
        Show
        ccheung Calvin Cheung added a comment - I could reproduce the crash as well even with last Friday's build. The following options make it reproducible: -XX:MetaspaceSize=20M -XX:MaxMetaspaceSize=20M

          People

          • Assignee:
            ccheung Calvin Cheung
            Reporter:
            epavlova Ekaterina Pavlova
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated: