JDK-6906488

one pkcs11 operation is done by multiple different threads which causes trouble

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P2
    • Resolution: Duplicate
    • Affects Version/s: solaris_10u9, 6u13, 6u15, 6u16, 6u17
    • Fix Version/s: 6-pool
    • Component/s: security-libs
    • Labels:

        Description

        The originally reported problem was a core dump of Java in the PKCS#11 native libraries:

        (dbx) where
        current thread: t@1685
        =>[1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffbffeff, 0x0), at 0xff2cc674
          [2] raise(0x6, 0x0, 0xff334f18, 0xff2abf30, 0xffffffff, 0x6), at 0xff265a74
          [3] abort(0x2cfc8, 0x1, 0xfeb00ab0, 0xeeb60, 0xff3333d8, 0x0), at 0xff24194c
          [4] os::abort(0x1, 0xfedca58c, 0x1, 0xfedb2000, 0x1858c, 0x18400), at 0xfeaf67b4
          [5] VMError::report_and_die(0xfeded4a8, 0x0, 0x1, 0xfed6095b, 0xfed67006, 0xfedf2ce8), at 0xfec088d8
          [6] JVM_handle_solaris_signal(0xb, 0xb17fcc98, 0xb17fc9e0, 0xafc00, 0xa5c800, 0x2f3a5), at 0xfe5b91e8
          [7] __sighndlr(0xb, 0xb17fcc98, 0xb17fc9e0, 0xfe5b8724, 0x0, 0x1), at 0xff2c8a94
          ---- called from signal handler with signal 11 (SIGSEGV) ------
          [8] arcfour_crypt(0x0, 0x12, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea8), at 0xfbd58404
          [9] soft_arcfour_crypt(0x2cdb50, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea4, 0x116), at 0xfbd3f288
          [10] C_EncryptUpdate(0x3, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea4, 0x7), at 0xfbd364f0
          [11] Java_sun_security_pkcs11_wrapper_PKCS11_C_1EncryptUpdate(0xa5c910, 0x12, 0x2cdae8, 0xfbff6a80, 0x0, 0x0), at 0xfe025abc
          [12] 0xfc00d4a0(0x9ff1, 0xb17ff064, 0xb17fefc0, 0xffffff68, 0x4b067f88, 0x0), at 0xfc00d4a0
          [13] 0xfc00d44c(0xbb816510, 0xe6ec09a0, 0x0, 0x34, 0xe6ec09a0, 0xb17ff028), at 0xfc00d44c
          [14] 0xfc372194(0xc14dc2b0, 0xe6ec09a0, 0x5, 0x12, 0xe6ec09a0, 0x5), at 0xfc372194
          [15] 0xfc0059d0(0xc14dc2b0, 0xe6ec09a0, 0xb82f58d0, 0xfc017228, 0xe6ec09a0, 0xb17ff160), at 0xfc0059d0
          [16] 0xfc3abd3c(0xb81f3208, 0xc14dc278, 0xb81f2c50, 0xfc016470, 0xe6ecd850, 0xb17ff1d0), at 0xfc3abd3c
          [17] 0xfc005868(0xc14dba58, 0xb7, 0x0, 0xfc019da0, 0xfc274980, 0xb17ff268), at 0xfc005868
          [18] 0xfc005868(0xc14dba58, 0xb17ff3e8, 0x0, 0xfc017228, 0x1, 0xb17ff300), at 0xfc005868
          [19] 0xfc005868(0xc14dba58, 0x18, 0x0, 0xfc019da0, 0xb79cefb0, 0xb17ff390), at 0xfc005868
          [20] 0xfc005868(0xc14dba58, 0xb7, 0x0, 0xfc019da0, 0x58c00, 0xb17ff420), at 0xfc005868
          [21] 0xfc005868(0xc14dba58, 0xe6ec0868, 0x0, 0xfc017228, 0x2, 0xb17ff4c8), at 0xfc005868
          [22] 0xfc275f58(0xffffffff, 0xc14eed28, 0x0, 0x2000, 0x13, 0xb820ee20), at 0xfc275f58
          [23] 0xfc33d00c(0xc14dbb28, 0xfedec370, 0x3a36c, 0x0, 0x0, 0xfede13a5), at 0xfc33d00c
          [24] 0xfc2e54cc(0xc14dbb28, 0xc3a59658, 0x0, 0x1, 0x4, 0x0), at 0xfc2e54cc
          [25] 0xfc1b1e84(0x0, 0xc3a59658, 0x0, 0x1, 0xf069e9d0, 0x3a400), at 0xfc1b1e84
          [26] 0xfc26c590(0xc3a59658, 0xb8142320, 0xc14dbb48, 0x8, 0xff1e8000, 0x0), at 0xfc26c590
          [27] 0xfc005d88(0xb17fffa0, 0x3a370, 0x0, 0xfc0174a0, 0xe64b8b50, 0xb17ff760), at 0xfc005d88
          [28] 0xfc00021c(0xb17ff84c, 0xb17ffaf8, 0xa, 0xb782ec30, 0xfc00b3c0, 0xb17ff9e0), at 0xfc00021c
          [29] JavaCalls::call_helper(0xfc0001c0, 0xa5c800, 0x1, 0x58c3d8, 0xb782ec30, 0xb17ffaf8), at 0xfe551b94
          [30] JavaCalls::call_virtual(0xb17ffaf0, 0x58c3dc, 0x58c3e8, 0x860400, 0xb17ff9d8, 0xff79f98c), at 0xfe8ec220
          [31] JavaCalls::call_virtual(0xb17ffaf0, 0xb17ffaec, 0xb17ffae8, 0xb17ffae4, 0xb17ffae0, 0x58c3dc), at 0xfe5e5704
          [32] thread_entry(0xb7830cf8, 0xa5c800, 0x4d400, 0xfedffa48, 0xfedff7d4, 0xfedff524), at 0xfe5f8784
          [33] JavaThread::thread_main_inner(0xa5c800, 0x1d6ca8, 0x695, 0xb, 0xfedb2000, 0x0), at 0xfebb5318
          [34] java_start(0xa5c800, 0xb7d, 0xfedb2000, 0xfed00079, 0x5131e8, 0xfedfb3f4), at 0xfeaf5910

        At first glance this looks like a problem in the native PKCS#11 libraries outside Java, but that is only one part of the problem. The libraries are called with an invalid session, and this causes the core dump. Bug 6905996 was filed to address this issue by detecting the bogus argument and so avoiding the core dump.

        The root cause, however, is different and was captured using this DTrace script:

        -----------------------------------------------------------------------------
        #!/usr/sbin/dtrace -s

        BEGIN {
          printf("Target pid: %d\n", $target);
        }

        /* Maps each session handle to the ID of the thread that called C_EncryptInit on it. */
        long active_session[long];

        pid$target::C_EncryptInit:entry
        {
          printf("session = %li (tid=%li)", arg0, (long)tid);
          active_session[arg0] = tid;
        }

        pid$target::C_EncryptInit:return
        { }

        pid$target::C_Encrypt:entry
        {
          self->my_session = arg0;
        }

        pid$target::C_Encrypt:return
        {
          active_session[self->my_session] = 0;
          self->my_session = 0;
        }

        /* Report every C_EncryptUpdate call made by a thread that does not own the session. */
        pid$target::C_EncryptUpdate:entry
        / active_session[arg0] != tid /
        {
          printf("\nError:\n");
          printf("session = %li (owner = %li / caller = %li)\n", arg0, (long)active_session[arg0], (long)tid);
        }

        /*
        pid$target::C_EncryptUpdate:return
        {
        }
        */

        pid$target::C_EncryptFinal:entry
        {
          self->my_session = arg0;
        }

        pid$target::C_EncryptFinal:return
        {
          active_session[self->my_session] = 0;
          self->my_session = 0;
        }
        -----------------------------------------------------------------------------

        Output of a test with an affected application (Tomcat server using JRE 6.0_17-b04):

        -----------------------------------------------------------------------------
        CPU ID FUNCTION:NAME
         17 1 :BEGIN Target pid: 27662

          2 111400 C_EncryptInit:entry session = 19233912 (tid=48)
          2 111402 C_EncryptInit:return
          2 111401 C_EncryptInit:entry session = 19233912 (tid=48)
          2 111403 C_EncryptInit:return
          0 111408 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 42)

          0 111409 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 42)

          0 111408 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 33)

          0 111409 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 33)

          0 111408 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 33)

          0 111409 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 22 / caller = 33)

          0 111408 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 0 / caller = 1364)

          0 111409 C_EncryptUpdate:entry
        Error:
        session = 6791712 (owner = 0 / caller = 1364)

         16 111400 C_EncryptInit:entry session = 6791712 (tid=22)
         16 111402 C_EncryptInit:return
         16 111401 C_EncryptInit:entry session = 6791712 (tid=22)
         16 111403 C_EncryptInit:return
        dtrace: pid 27662 has exited
        -----------------------------------------------------------------------------

        The problem is that one thread starts the crypto operation but another thread continues the encryption. This violates the PKCS#11 standard (e.g. see section 6.7.6 of ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-11/v2-30/pkcs-11v2-30b-d6.pdf), under which a single operation must not be used simultaneously by different threads. Everything between C_EncryptInit() and C_EncryptFinal() is one operation (even if it is performed in multiple parts) and must be run on the same thread.
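
        The ownership rule above can be modelled in a few lines of Java. This is only a sketch under stated assumptions: SessionOwnerTracker and its methods are hypothetical names invented for illustration, not part of the JDK or of any PKCS#11 library, and the session handles and thread IDs are plain numbers supplied by the caller. It mirrors the check in the DTrace script: the thread that starts an operation owns the session until the operation finishes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical helper (not a JDK or PKCS#11 API) that models the ownership
 * check performed by the DTrace script.
 */
final class SessionOwnerTracker {
    // session handle -> ID of the thread that started the operation
    private final ConcurrentMap<Long, Long> owners = new ConcurrentHashMap<>();

    /** Models C_EncryptInit: the calling thread becomes the session owner. */
    void init(long session, long threadId) {
        owners.put(session, threadId);
    }

    /** Models C_EncryptUpdate: returns true only if the owner is the caller. */
    boolean update(long session, long threadId) {
        Long owner = owners.get(session);
        return owner != null && owner == threadId;
    }

    /** Models the return of C_Encrypt or C_EncryptFinal: the operation ends. */
    void finish(long session) {
        owners.remove(session);
    }
}
```

        With the values from the trace, init(6791712, 22) followed by update(6791712, 42) returns false, which is the condition the script reports as an error.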

        If multiple threads are used in parallel, then each thread must have its own session to be safe.
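
        One way to follow this rule from Java code is to give every thread its own Cipher object. The sketch below is illustrative only: PerThreadCipher is a hypothetical helper, AES/CBC is chosen merely as a commonly available transformation (the crash above involved RC4), and the assumption that one Cipher instance per thread also means one PKCS#11 session per thread is not confirmed by this report.

```java
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: every thread uses its own Cipher, never a shared one. */
final class PerThreadCipher {
    // Lazily creates one Cipher per thread on first use.
    private static final ThreadLocal<Cipher> CIPHER = new ThreadLocal<Cipher>() {
        @Override
        protected Cipher initialValue() {
            try {
                return Cipher.getInstance("AES/CBC/PKCS5Padding");
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        }
    };

    /** Runs init, update and doFinal on the calling thread's own Cipher. */
    static byte[] encrypt(byte[] key, byte[] iv, byte[] part1, byte[] part2) {
        try {
            Cipher c = CIPHER.get();
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] head = c.update(part1);   // may be null or empty if no full block is ready
            if (head != null) {
                out.write(head, 0, head.length);
            }
            byte[] tail = c.doFinal(part2);  // completes the multi-part operation
            out.write(tail, 0, tail.length);
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

        Because the whole init/update/doFinal sequence runs inside one call on the calling thread, no other thread can continue an operation that this thread started.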

        In the example crash shown above, the PKCS#11 functions were called with the following argument, which shows the broken session (context is the NULL pointer):

        > 0x002cdb50::print -t crypto_active_op_t
        {
            CK_MECHANISM mech = {
                CK_MECHANISM_TYPE mechanism = 0x111 (CKM_RC4)
                CK_VOID_PTR pParameter = 0
                CK_ULONG ulParameterLen = 0
            }
            void *context = 0
            uint32_t flags = 0
        }

        CRs 7025227 and 6932403 may be a big factor in the issues seen here. Improper (early) disposal of the Ciperlocks at SSL fatal call times and at closeSocket calls could lead to consequences in the underlying Solaris native library calls. Improvements have been made to the SSLSocketImpl class as part of the 7024697 and 7001094 fixes, and initial (early) testing has shown no exceptions or crashes.

        Updated the bug to reflect the correct bug IDs: the root cause is possibly linked to CRs 7025227 and 6932403 (NOT 7024697 and 7001094).

                People

                Assignee:
                coffeys Sean Coffey
                Reporter:
                wley Wolfgang Ley (Inactive)