JDK-8046339

sun.rmi.transport.DGCAckHandler leaks memory

    Details

    • Subcomponent:
    • Resolved In Build:
      b106
    • Verification:
      Verified

        Description

        Instances of class sun.rmi.transport.DGCAckHandler accumulate and eventually
        cause OutOfMemoryError. Memory leak is suspected.
         

            Activity

            smarks Stuart Marks added a comment - - edited
            Possible duplicate of JDK-7116204. Not sure which issue to close out as a duplicate of which; maybe close JDK-7116204 as a duplicate of this one, since this one is a shadow bug. On the other hand, this bug is confidential and JDK-7116204 is open.

            This bug probably applies to all current JDK releases, including 6u, 7u, 8u, and 9.
            smarks Stuart Marks added a comment - - edited
            Previously reported over two years ago by a CAP member against 6u29 (see JDK-7116204). We should probably fix this one. See that bug for further information in the Description and Comments. I've closed the other bug as a duplicate of this one.
            smarks Stuart Marks added a comment -
            Comment in email from Darryl Mocek, 2011-12-30:

            A DGCAckHandler (attached) gets created by a ConnectionOutputStream with a UID and placed into DGCAckHandler's static idTable HashMap. When the ConnectionOutputStream's done method is called, the DGCAckHandler's startTimer method is called. This starts a 5 minute timer for the DGCAck to be received and schedules a Runnable for releasing the reference to the DGCAckHandler if the DGCAck isn't received. In TCPTransport's handleMessages method (also attached), if the transport operation is a DGCAck, then DGCAckHandler.received is called and the handler is removed from idTable. However, if DGCAckHandler.received isn't called and the timer expires, the task is cancelled but the reference in the idTable is never removed. This causes idTable to continue to grow.

            The fix, which is in the attached DGCAckHandler, is to add idTable.remove(id) to the Runnable created in startTimer. I have verified this works by commenting out the implementation of received, simulating that it is never called and causing the timer to expire.

            What I need to do is to create a test which will cause the received method to never be called, causing the timer to expire and the reference to be removed via the Runnable. I haven't been able to get this to happen yet. A customer has reported this issue and I have requested more information on the configuration of their systems in the hopes of duplicating their problem; however, I haven't received anything yet. Any suggestions on how to prevent the DGCAck from being received would help.
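
The leak described in this comment can be illustrated with a small stand-alone sketch. It is not the RMI code: the class name AckTableLeakSketch, the Integer keys (standing in for UIDs), and the 20 ms timeout are all assumptions made for illustration. It only shows why a timeout task that never removes its entry from a static table leaves the table growing, and how removing the entry inside the task avoids that.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AckTableLeakSketch {
    /** Hypothetical stand-in for the static idTable; Integer keys stand in for UIDs. */
    static final Map<Integer, Object> idTable = new ConcurrentHashMap<>();

    /**
     * Registers `count` entries, lets every timeout task fire without any
     * ack arriving, and returns how many entries remain in the table.
     */
    static int entriesLeftAfterTimeout(int count, boolean removeOnTimeout) {
        idTable.clear();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        for (int i = 0; i < count; i++) {
            final Integer id = i;
            idTable.put(id, new Object());
            scheduler.schedule(() -> {
                if (removeOnTimeout) {
                    idTable.remove(id); // the step the proposed fix adds
                }
                // In both variants the real task would release held references here.
            }, 20, TimeUnit.MILLISECONDS);
        }
        // shutdown() stops new submissions; already scheduled delayed tasks
        // still run under the executor's default policy.
        scheduler.shutdown();
        try {
            if (!scheduler.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timeout tasks did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return idTable.size();
    }

    public static void main(String[] args) {
        System.out.println("left without remove on timeout: " + entriesLeftAfterTimeout(3, false));
        System.out.println("left with remove on timeout:    " + entriesLeftAfterTimeout(3, true));
    }
}
```

With removeOnTimeout set to false every registered entry survives its own timeout, which mirrors the unbounded growth of idTable reported here.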
            smarks Stuart Marks added a comment -
            Here's a patch of the modified DGCAckHandler.java file that Darryl Mocek mentioned (see previous comment). The previous changeset was 00cd9dc3c2b5. The diagnosis and fix seem sensible but I haven't verified it. Also, this needs a regression test.

            --- DGCAckHandler.00cd9dc3c2b5.java 2014-06-09 15:37:50.000000000 -0700
            +++ DGCAckHandler.dmocek.2011-12-30.java 2014-06-09 15:30:44.000000000 -0700
            @@ -118,6 +118,9 @@
                     if (objList != null && task == null) {
                         task = scheduler.schedule(new Runnable() {
                             public void run() {
            +                    if (id != null) {
            +                        idTable.remove(id);
            +                    }
                                 release();
                             }
                         }, dgcAckTimeout, TimeUnit.MILLISECONDS);
            @@ -140,6 +143,9 @@
                  * release its references.
                  **/
                 public static void received(UID id) {
            +        System.out.println("DGCAckHandler.received, about to print call stack.");
            +        new Throwable().printStackTrace();
            +        System.out.println("DGCAckHandler.received, after printing call stack.");
                     DGCAckHandler h = idTable.remove(id);
                     if (h != null) {
                         h.release();
            smarks Stuart Marks added a comment -
            The fix above needs to be verified and a regression test written.
            smarks Stuart Marks added a comment -
            To elaborate on my comment above, what I think this bug needs is 1) an evaluation of the DGC system and an analysis of this fix, and 2) a regression test.

            The purpose of 1) is to show that the fix is correct. Note that a regression test is not evidence of this; it's merely evidence that the change does what it's intended to do, not that the change is correct. Point 1) shouldn't be onerous, but it does require that someone spend some time to get up to speed with how DGC works.

            Item 2) can probably be written as a whitebox test that sets up some short timeouts, waits a bit, and then reflects on the idTable to ensure that the reference has been removed. It might be somewhat tricky to force the code into the timeout path, since the DGC acks are handled automatically at a fairly low level of the system. One way is to forcibly close the connection (again, using reflection to get at RMI internals). An alternative approach might be to fork a JVM in a subprocess and simply forcibly exit it at the right time.
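
The shape of such a whitebox check can be sketched as follows. This is only an illustration under stated assumptions: FakeAckHandler is a hypothetical stand-in, not sun.rmi.transport.DGCAckHandler, and the String ids, 20 ms timeout, and polling interval are invented for the example. A real test would reflect on the actual class, which on JDK 9 and later would likely need the package opened (for example with --add-opens java.rmi/sun.rmi.transport=ALL-UNNAMED), and it would still have to force the timeout path as discussed above.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WhiteboxIdTableSketch {
    /** Hypothetical stand-in for the class under test; not the real RMI class. */
    static class FakeAckHandler {
        private static final Map<String, FakeAckHandler> idTable = new ConcurrentHashMap<>();
        private static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true); // do not keep the JVM alive for the timer thread
                return t;
            });

        /** Registers a handler and schedules removal of its entry on timeout. */
        static void register(String id, long timeoutMillis) {
            idTable.put(id, new FakeAckHandler());
            scheduler.schedule(() -> { idTable.remove(id); }, timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Reads the private static table reflectively, as a whitebox test would. */
    static int tableSize() {
        try {
            Field f = FakeAckHandler.class.getDeclaredField("idTable");
            f.setAccessible(true);
            return ((Map<?, ?>) f.get(null)).size();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Polls until the table is empty or the deadline passes. */
    static boolean tableEmptiesWithin(long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (System.nanoTime() < deadline) {
            if (tableSize() == 0) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return tableSize() == 0;
    }

    /** Registers one entry with a short timeout and checks that it is cleaned up. */
    static boolean runCheck() {
        FakeAckHandler.register("example-id", 20);
        return tableEmptiesWithin(5000);
    }

    public static void main(String[] args) {
        System.out.println("entry removed after timeout: " + runCheck());
    }
}
```

Polling with a deadline rather than a single fixed sleep keeps the check from being sensitive to scheduling delays on a loaded test machine.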
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/dev/jdk/rev/18751144d0fc
            User: igerasim
            Date: 2016-02-10 13:18:34 +0000
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/18751144d0fc
            User: lana
            Date: 2016-02-17 20:42:45 +0000

              People

              • Assignee:
                igerasim Ivan Gerasimov
                Reporter:
                shadowbug Shadow Bug
              • Votes:
                0
                Watchers:
                7
