JDK-8130425

libjvm crash due to stack overflow in executables with 32k tbss/tdata

    Details

    • Subcomponent:
    • Resolved In Build:
      b110


        Description

        Problem summary: When threads are given a large TLS (thread-local storage) size, the JVM fails with a stack overflow.

        Problem Identified:
        After investigation and discussion, we concluded that the issue lies not in the JVM but in the way glibc is implemented. When TLS is declared, glibc takes the space for it out of the thread's stack. So if a thread is created with a small stack size and the TLS size is large, the result is a stack overflow. That is exactly the case in this bug: the reaper thread is allocated a very small stack size of 32768.
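
The arithmetic behind this failure can be sketched as follows. This is a deliberately simplified model, assuming the static TLS block is taken out of the requested stack byte for byte; real glibc accounting also involves alignment, guard pages, and other per-thread overhead, all ignored here. The class and method names are hypothetical and exist only for illustration.

```java
/** Simplified model of how static TLS reduces a thread's usable stack. */
final class TlsStackModel {

    /** Stack size requested for the reaper thread, as stated in this report. */
    static final long REAPER_STACK_BYTES = 32768;

    /**
     * Returns the bytes left for stack frames, assuming (a simplification)
     * that the whole static TLS block is carved out of the requested stack.
     * A result of zero or less means the thread has no room to run.
     */
    static long usableStackBytes(long requestedStackBytes, long staticTlsBytes) {
        return requestedStackBytes - staticTlsBytes;
    }

    public static void main(String[] args) {
        long tlsBytes = 32 * 1024; // 32k of tbss/tdata, as in the bug title
        System.out.println("usable bytes: "
                + usableStackBytes(REAPER_STACK_BYTES, tlsBytes));
    }
}
```

Under this model, a 32k TLS block consumes the entire 32768-byte request, which matches the overflow described above.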

        Discussion thread:
        http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-December/037558.html

        Solution proposed:
        This is expected to be fixed in glibc at some point, but for now I propose a workaround: a boolean system property, "processReaperUseDefaultStackSize",
        which sets the stack size of the reaper thread to the default instead of the fixed 32768. The user can set this property with "-D" or "System.setProperty()".
        I have tested this fix; it works well with TLS sizes between 32k and 128k.
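
As a minimal sketch of how the proposed property would be enabled and read, the example below sets it programmatically and checks it with `Boolean.getBoolean`, the same call the patch uses. The class and helper names are hypothetical; only the property name comes from this report.

```java
/** Shows how the proposed workaround property can be set and read. */
final class ReaperPropertyDemo {

    /** Property name as proposed in this report. */
    static final String PROP = "processReaperUseDefaultStackSize";

    /**
     * Mirrors the check in the proposed patch. Boolean.getBoolean returns
     * true only if the system property exists and equals "true",
     * ignoring case.
     */
    static boolean useDefaultStackSize() {
        return Boolean.getBoolean(PROP);
    }

    public static void main(String[] args) {
        // Equivalent to launching with:
        //   java -DprocessReaperUseDefaultStackSize=true ...
        System.setProperty(PROP, "true");
        System.out.println(PROP + " = " + useDefaultStackSize());
    }
}
```

When using `System.setProperty()`, the property presumably has to be set before the reaper thread is created, since the patch reads it at thread-creation time.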

        Fix:
        diff -r 5c4530bb9ae6 src/java.base/share/classes/java/lang/ProcessHandleImpl.java
        --- a/src/java.base/share/classes/java/lang/ProcessHandleImpl.java Fri Jan 08 13:06:29 2016 +0800
        +++ b/src/java.base/share/classes/java/lang/ProcessHandleImpl.java Tue Jan 12 15:55:50 2016 +0530
        @@ -83,9 +83,13 @@
                          ThreadGroup systemThreadGroup = tg;

                          ThreadFactory threadFactory = grimReaper -> {
        -                     // Our thread stack requirement is quite modest.
        -                     Thread t = new Thread(systemThreadGroup, grimReaper,
        -                             "process reaper", 32768);
        +                     Thread t = null;
        +                     if (Boolean.getBoolean("processReaperUseDefaultStackSize")) {
        +                         t = new Thread(systemThreadGroup, grimReaper, "process reaper");
        +                     } else {
        +                         // Our thread stack requirement is quite modest.
        +                         t = new Thread(systemThreadGroup, grimReaper, "process reaper", 32768);
        +                     }
                              t.setDaemon(true);
                              // A small attempt (probably futile) to avoid priority inversion
                              t.setPriority(Thread.MAX_PRIORITY);



        For a test case, please see the attached file.
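
For readers who want to try the thread-creation logic outside the JDK sources, here is a standalone sketch of it. It is not the committed change: the class name and the `stackSizeFor` and `newFactory` helpers are hypothetical. It also relies on the documented behavior that a stack size of zero passed to the four-argument `Thread` constructor behaves like the three-argument constructor, so both branches of the patch collapse into one call.

```java
import java.util.concurrent.ThreadFactory;

/** Standalone sketch of the reaper thread-creation logic in the patch. */
final class ReaperThreadFactorySketch {

    /** Property name as proposed in this report. */
    static final String PROP = "processReaperUseDefaultStackSize";

    /** The fixed stack size used for the reaper thread before the patch. */
    static final long SMALL_STACK_BYTES = 32768;

    /** Returns the stack size to request; 0 selects the default size. */
    static long stackSizeFor(boolean useDefault) {
        return useDefault ? 0L : SMALL_STACK_BYTES;
    }

    /** Builds a factory that reads the property each time a thread is made. */
    static ThreadFactory newFactory(ThreadGroup group) {
        return task -> {
            long stackSize = stackSizeFor(Boolean.getBoolean(PROP));
            Thread t = new Thread(group, task, "process reaper", stackSize);
            t.setDaemon(true);
            // Same priority hint as in the original code.
            t.setPriority(Thread.MAX_PRIORITY);
            return t;
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadGroup group = Thread.currentThread().getThreadGroup();
        Thread t = newFactory(group).newThread(
                () -> System.out.println("running in " + Thread.currentThread().getName()));
        t.start();
        t.join();
    }
}
```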

            Activity

            martin Martin Buchholz added a comment -
            I don't think we have consensus yet that we can't or shouldn't fix this in hotspot. Any user-specified stack size should be in addition to any OS overhead, including native thread local storage.

            This problem has been known for years; not sure that glibc folks will do anything on their side to fix things.
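
The idea of treating the requested stack size as being in addition to OS overhead could look roughly like the sketch below. It is purely illustrative: the names are hypothetical, the TLS overhead is taken as an input because this thread offers no portable way to obtain it, and rounding up to a page size is an added assumption, not something proposed in the comment.

```java
/** Illustrative padding of a requested stack size by a known TLS overhead. */
final class PaddedStackSize {

    /**
     * Returns the requested size plus the TLS overhead, rounded up to a
     * whole number of pages. All three inputs are supplied by the caller;
     * nothing here discovers the real overhead.
     */
    static long paddedStackBytes(long requestedBytes, long tlsOverheadBytes, long pageBytes) {
        if (requestedBytes < 0 || tlsOverheadBytes < 0 || pageBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and page size positive");
        }
        long raw = Math.addExact(requestedBytes, tlsOverheadBytes);
        long pages = (raw + pageBytes - 1) / pageBytes; // ceiling division
        return Math.multiplyExact(pages, pageBytes);
    }

    public static void main(String[] args) {
        // Reaper request of 32768 with an assumed 32k of TLS and 4096-byte pages.
        System.out.println(paddedStackBytes(32768, 32 * 1024, 4096));
    }
}
```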
            dholmes David Holmes added a comment -
            AFAIK there is no direct way for the VM to know how much stack might be stolen by glibc before creating a thread with a given stack size. Further, if native code later creates its own TLS data structures, that would need to be added to any "default" stack sizes calculated during VM startup.

            Open to practical suggestions.
            martin Martin Buchholz added a comment -
            The glibc bug report now mentions our difficulties in Java
            https://sourceware.org/bugzilla/show_bug.cgi?id=11787#c44
            martin Martin Buchholz added a comment -
            No guarantees, but we carry a local patch to compute tls size using glibc internals. Find it here:
            http://cr.openjdk.java.net/~martin/webrevs/openjdk9/tls-size-guarantee/
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/hs-rt/jdk/rev/460323d4a285
            User: kevinw
            Date: 2016-02-29 12:16:28 +0000
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/460323d4a285
            User: lana
            Date: 2016-03-14 15:55:07 +0000

              People

              • Assignee: Cheleswer Sahu (csahu, Inactive)
              • Reporter: Shadow Bug (shadowbug)
              • Votes: 0
              • Watchers: 9
