JDK-8270423

AbortVMOnSafepointTimeout can hang the VM due to signal use

    Details

      Description

      The test runtime/Safepoint/TestAbortVMOnSafepointTimeout.java timed out in CI testing.

      The native stacks show the problem.

      Thread 14 (Thread 0xfffcdcfff1e0 (LWP 3806994)):
      #0 0x0000fffd272d3a60 in __lll_lock_wait_private () from /lib64/libc.so.6
      #1 0x0000fffd272d8b50 in malloc () from /lib64/libc.so.6
      #2 0x0000fffd266a4664 in os::malloc(unsigned long, MEMFLAGS, NativeCallStack const&) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #3 0x0000fffd25879f54 in AllocateHeap(unsigned long, MEMFLAGS, AllocFailStrategy::AllocFailEnum) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #4 0x0000fffd25c93e08 in Decoder::decode(unsigned char*, char*, int, int*, char const*, bool) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #5 0x0000fffd266aa1c4 in os::dll_address_to_function_name(unsigned char*, char*, int, int*, bool) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #6 0x0000fffd25de0af0 in frame::print_C_frame(outputStream*, char*, int, unsigned char*) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #7 0x0000fffd26a5b700 in VMError::report(outputStream*, bool) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #8 0x0000fffd26a5f3e8 in VMError::report_and_die(int, char const*, char const*, std::__va_list, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #9 0x0000fffd26a600f8 in VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #10 0x0000fffd26840594 in JVM_handle_linux_signal () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #11 <signal handler called>
      #12 0x0000fffd272d6820 in sysmalloc () from /lib64/libc.so.6
      #13 0x0000fffd272d7b18 in _int_malloc () from /lib64/libc.so.6
      #14 0x0000fffd272d8b5c in malloc () from /lib64/libc.so.6
      #15 0x0000fffd266a4664 in os::malloc(unsigned long, MEMFLAGS, NativeCallStack const&) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #16 0x0000fffd25879f54 in AllocateHeap(unsigned long, MEMFLAGS, AllocFailStrategy::AllocFailEnum) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #17 0x0000fffd25bcf0a0 in CodeStrings::copy(CodeStrings&) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #18 0x0000fffd25bcf54c in CodeBuffer::copy_code_to(CodeBlob*) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #19 0x0000fffd25bc7bc8 in RuntimeStub::new_runtime_stub(char const*, CodeBuffer*, int, int, OopMapSet*, bool) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so

      A compiler thread received the SIGILL that AbortVMOnSafepointTimeout sends, but the signal arrived while the thread was inside malloc (frames #12 to #14). The signal handler then tries to generate an error report, which calls malloc again through Decoder::decode (frames #0 to #4), so the thread deadlocks on the libc malloc lock that the interrupted malloc call still holds. (We know that using malloc within the Decoder is risky, but it is worth the risk to get symbolic stack traces.)

      Meanwhile, the VMThread is also trying to abort the VM, but it reaches error reporting second and so parks in os::infinite_sleep():

      Thread 8 (Thread 0xfffcec41f1e0 (LWP 3806984)):
      #0 0x0000fffd2730439c in nanosleep () from /lib64/libc.so.6
      #1 0x0000fffd27304240 in sleep () from /lib64/libc.so.6
      #2 0x0000fffd266ae9a0 in os::infinite_sleep() () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #3 0x0000fffd26a5f3b4 in VMError::report_and_die(int, char const*, char const*, std::__va_list, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #4 0x0000fffd25c84aa0 in report_fatal(VMErrorType, char const*, int, char const*, ...) () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so
      #5 0x0000fffd26801bc4 in SafepointSynchronize::print_safepoint_timeout() () from /opt/mach5/mesos/work_dir/jib-master/install/jdk-18+6-226/linux-aarch64-debug.jdk/jdk-18/fastdebug/lib/server/libjvm.so

      The mechanism is inherently unsafe, so the test can fail in this way. It is not clear that anything can really be done about it, but occurrences of this test failure can be associated with this bug report.

            People

            Assignee:
            Unassigned
            Reporter:
            dholmes David Holmes