Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-5088398

java.lang.instrument TCK test deadlock (test11)

    XMLWordPrintable

    Details

    • Subcomponent:
    • Resolved In Build:
      b26
    • CPU:
      generic
    • OS:
      generic
    • Verification:
      Verified

      Description

      Cheryl wrote:

      >One of my tests is intermittently (once every 10 or 20 times, maybe)
      >deadlocking. The thread dump is below. Can one of you investigate
      >and/or is this not interesting until I get a more final build
      >and/or is this most likely a problem in the test itself?
      >What this test does is register 4 trivial (no-op) transformers.
      >I'm not seeing deadlock in any of my other tests.
      >
      >thanks,
      >cheryl
      >
      >===================================================
      >***JPLIS ASSERTION FAILED***: "test == expected" at Reentrancy.c line: 80
      >***JPLIS ASSERTION FAILED***: "test == expected" at Reentrancy.c line: 80
      >Full thread dump Java HotSpot(TM) Client VM (20030619211432.rfield.hotspot
      >mixed mode):
      >
      >"Low Memory Detector" daemon prio=5 tid=0x009bad70 nid=0x3f0 waiting on
      >condition [0..0]
      >
      >"CompilerThread0" daemon prio=5 tid=0x009b9a10 nid=0x7f0 waiting on
      >condition [0..1822fa90]
      >
      >"Signal Dispatcher" daemon prio=10 tid=0x009b4008 nid=0x774 waiting on
      >condition [0..0]
      >
      >"Finalizer" daemon prio=9 tid=0x0093e848 nid=0x7fc in Object.wait()
      >[1816f000..1816fd94]
      > at java.lang.Object.wait(Native Method)
      > - waiting on <0x100a0198> (a java.lang.ref.ReferenceQueue$Lock)
      > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:111)
      > - locked <0x100a0198> (a java.lang.ref.ReferenceQueue$Lock)
      > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:127)
      > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
      >
      >"Reference Handler" daemon prio=10 tid=0x0093d3c0 nid=0x810 waiting for
      >monitor entry [1812f000..1812fd94]
      > at
      >javasoft.sqe.tests.api.java.lang.instrument.ClassFileTransformer.Test11Agent
      >.transform(Test11Agent.java:60)
      > at
      >com.sun.lang.instrument.TransformerManager.transform(TransformerManager.java
      >:127)
      > at
      >com.sun.lang.instrument.InstrumentationImpl.transform(InstrumentationImpl.ja
      >va:158)
      > at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:122)
      >
      >"main" prio=5 tid=0x00235928 nid=0x6e4 waiting for monitor entry
      >[6f000..6fbfc]
      > at java.lang.ClassLoader.findBootstrapClass(Native Method)
      > at java.lang.ClassLoader.findBootstrapClass0(ClassLoader.java:709)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:284)
      > - locked <0x100a0450> (a sun.misc.Launcher$ExtClassLoader)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
      > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      >
      >"VM Thread" prio=5 tid=0x0097de88 nid=0x730 runnable
      >
      >"VM Periodic Task Thread" prio=10 tid=0x009af818 nid=0x7e4 waiting on
      >condition
      >
      >"Suspend Checker Thread" prio=10 tid=0x009a6618 nid=0x7f8 runnable
      >
      >
      >Found one Java-level deadlock:
      >=============================
      >"Reference Handler":
      > waiting to lock monitor 0x0093deac (object 0x100a0220, a
      >sun.misc.Launcher$AppClassLoader),
      > which is held by "main"
      >"main":
      > waiting to lock monitor 0x0093de8c (object 0x14013f18, a [[I),
      > which is held by "Reference Handler"
      >
      >Java stack information for the threads listed above:
      >===================================================
      >"Reference Handler":
      > at
      >javasoft.sqe.tests.api.java.lang.instrument.ClassFileTransformer.Test11Agent
      >.transform(Test11Agent.java:60)
      > at
      >com.sun.lang.instrument.TransformerManager.transform(TransformerManager.java
      >:127)
      > at
      >com.sun.lang.instrument.InstrumentationImpl.transform(InstrumentationImpl.ja
      >va:158)
      > at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:122)
      >"main":
      > at java.lang.ClassLoader.findBootstrapClass(Native Method)
      > at java.lang.ClassLoader.findBootstrapClass0(ClassLoader.java:709)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:284)
      > - locked <0x100a0450> (a sun.misc.Launcher$ExtClassLoader)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      > at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
      > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
      > - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
      >
      >Found 1 deadlock.
      >
      >
      >



      Assuming the line numbers haven't changed, the "Reference Handler"
      thread starts in
      ReferenceHandler.run and at this point holds no locks, it references a
      class ("Cleaner")
      which kicks off the JPLIS code. Unless I'm missing something, the only
      lock being
      grabbed by the JPLIS code (native and Java) is on the TransformerManager
      which is
      not at play here.

      On the "main" thread, regular class loading has locked the AppClassLoader.
      It is not clear to me why the native code of class loader is waiting to
      lock an instance of "int[][]"
      an examination of the code doesn't show anything at the top level.
      However that doesn't seem strange, what does seem strange to me is that
      the TCK code is
      locking the same array.

      Cheryl, what locks do you grab in this test?
      In particular, do you grab any locks on a two dimensional array of int,
      or on the class loader?

      -Robert



      ----------------------------------------------------------


      "Reference Handler":
       waiting to lock monitor 0x0093deac (object 0x100a0220, a
                                                     
      sun.misc.Launcher$AppClassLoader),
       which is held by "main"

          "Reference Handler":
           [3] at
          javasoft.sqe.tests.api.java.lang.instrument.ClassFileTransformer
                              .Test11Agent.transform(Test11Agent.java:60)
           [2] at com.sun.lang.instrument.TransformerManager
                              .transform(TransformerManager.java :127)
           [1] at com.sun.lang.instrument.InstrumentationImpl
                              .transform(InstrumentationImpl.java:158)
           [0] at
          java.lang.ref.Reference$ReferenceHandler.run(Reference.java:122)

      > Assuming the line numbers haven't changed, the "Reference Handler"
      > thread starts in
      > ReferenceHandler.run and at this point holds no locks, it references a
      > class ("Cleaner")
      > which kicks off the JPLIS code. Unless I'm missing something, the
      > only lock being
      > grabbed by the JPLIS code (native and Java) is on the
      > TransformerManager which is
      > not at play here.

      Cheryl wrote that the test doesn't explicitly grab any locks. So what
      Jeff postulated must
      be true:

          The deadlock detector says that this thread holds the [[I lock, which is
          why I am concluding that the "invisible to the java thread dumper"
          native code frames must be doing the acquisition of this lock.

      It seems to me that this could be occurring between [0] and [1] or after
      [3]. But I believe
      the below assumes you think this is happenning between [0] and [1]?
      That makes sense
      to me.

          The java transformer then tries to run, which causes an implicit
          load of a class
          that invokes the AppClassLoader. This tries to get AppClassLoader lock.

      Cheryl, is line 60 referencing a class? This would means we are doing a
      nested load.
      Which I guess is normally working as long as another thread isn't in the
      middle of a class
      load at the same time.

      "main":
       waiting to lock monitor 0x0093de8c (object 0x14013f18, a [[I),
       which is held by "Reference Handler"

          "main":
                 at java.lang.ClassLoader.findBootstrapClass(Native Method)
                 at
          java.lang.ClassLoader.findBootstrapClass0(ClassLoader.java:709)
                 at java.lang.ClassLoader.loadClass(ClassLoader.java:284)
                 - locked <0x100a0450> (a sun.misc.Launcher$ExtClassLoader)
                 at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
                 - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
                 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
                 - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)
                 at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
                 at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
                 - locked <0x100a0220> (a sun.misc.Launcher$AppClassLoader)

      > On the "main" thread, regular class loading has locked the
      > AppClassLoader.

      I'm pretty sure what is happenning is that the main class is being
      loaded, and that is
      setting off a cascade of other loads. The only object lock grabs I saw
      in the VM
      (the SDK native layer is thin and I didn't see any lock grabs at all) is
      of the class
      loader, the constant pool, and the instance being created. So one
      possibility is that
      somewhere in that sequence a int[][] is being created -- but I can't
      figure out why
      the other thread would be locking that object. The other possibility
      is that this is
      some internal lock that I missed.

      I guess we should continue to try to figure out what is actually
      happenning, but it does
      raise a spec question -- should we say that an agent should assure that
      it's transformer
      code should (directly and indirectly) use only loaded classes?
      Otherwise we are putting
      the class loading mechanism in a situation it would normally never
      encounter. Designing
      the JPLIS code so it does not reenter (as it is designed) does nothing
      for this case.

      -Robert


      ----------------------------------------------------------


      INTERVENING INFO IN ATTACHMENTS

      ----------------------------------------------------------


      Hey, our latest testing with 5.0 betas (not sure exactly which, I can
      find out) is surfacing this deadlock again-- interleaved acquisition =
      of
      an AppClassLoader and an [[I. Any ideas? Did this ever get run down l=
      ast
      fall?
      =20
      jeff

      ----------------------------------------------------------



      We?re testing this with the JDK 5.0 Beta 3 Build 60 snapshot build. This definitely does appear to be a timing issue when one thread loads a class from within the transform method at the same time as any other thread is loading a class from the same classloader. It appears that the native code is holding a lock (on [[I) when it calls the transformers or else this issue would show up with any classloading from multiple threads and not just when the transformer hook is used. I?ve attached a test case that reproduces the problem (more details below). It shouldn?t require any changes to the specification, just a bug fix in the JVM so it doesn?t hold this lock when calling the transformers. This is a very important bug to us because it means that using the BCI hook can deadlock arbitrary threads in our customers applications and that is not something we are willing to do so we will not be able to use this feature until this bug is fixed. You stated that it is too late to fix this in Tiger. Did you mean it is too late to make specification changes or too late to do any bug fixing? If it is too late to do any bug fixing, would it be a candidate for a 5.0_01 patch release? Would it help if I file this through the standard developer connection website as well? Please let me know if there is any additional information I can provide or testing I can do to help get this resolved faster. Thanks for any help or information you can provide!

       

                  Daryl

       

      Test case details: The test case is in the attached zip file. The test is a very simple scenario of loading classes from multiple threads (one from within a transformer) and doing that repeatedly to try and hit the timing hole. The problem doesn?t happen every single time I run the test but I was able to reproduce it at least one out of every two or three runs on my Windows 2000 machine. I?ve included the source files, the jar files and a couple of batch scripts to build and run it respectively. The batch scripts are very simple, mainly wrappers around the javac, jar and java tools so they should be easy to replicate if you?re using a different platform. For output it prints out a period each iteration so you can see it making progress. If it completes 1000 iterations it prints out a success message and stops but if it deadlocks before then you can tell because it stops printing the periods at which point you can get a thread dump that reports the deadlock ? I?ve also attached output and a thread dump from one run where it deadlocked.

      ----------------------------------------------------------

      ZIP in attachments

      ----------------------------------------------------------

      Yes, I?m very confident that the [[I lock is not in the agent for a couple of reasons:

      1) The source code of the transformer involved does not acquire any locks (and definitely not on any int[][] objects)

      I?ve included the source code and you can verify this yourself

      2) The thread that is waiting to acquire the [[I lock has not yet reached the transformer (and is in a native method)

      In the sample output I sent you, the ?TestThread? is waiting to acquire that lock and the top of its stack trace is:

              at java.lang.ClassLoader.findBootstrapClass(Native Method)

              at java.lang.ClassLoader.findBootstrapClass0(ClassLoader.java:891)

              at java.lang.ClassLoader.loadClass(ClassLoader.java:301)

       

      Does this help? Will the bug be visible via the developer connection? If so, can you send me the bug number once you file it so we can keep track of it? Thanks!

       

                  Daryl

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              dcubed Daniel Daugherty
              Reporter:
              rfield Robert Field
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Imported:
                Indexed: