JDK-4073195

(process) Process.destroy() isn't guaranteed to kill child process on Solaris

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P3
    • Resolution: Duplicate
    • Affects Version/s: 1.1.2, 1.1.5, 5.0, 6
    • Fix Version/s: None
    • Component/s: core-libs
    • Subcomponent:
    • CPU:
      x86, sparc
    • OS:
      solaris, solaris_2.5.1, solaris_2.6

      Description

      Name: joT67522 Date: 08/20/97


      This is a copy of the code fragment that creates the process
      and executes the printit shell script...
      ================================================================================
            try {
              Runtime rt = Runtime.getRuntime();
              Process pro = rt.exec("/export/home/usr/msgoldst/printit test.dat");
              // read the child's standard output
              DataInputStream in =
                new DataInputStream ( new BufferedInputStream(pro.getInputStream()));
              while ((s=in.readLine()) != null) {
                 System.out.println("out - > " + s);
              }
              // read the child's standard error
              DataInputStream ine =
                new DataInputStream ( new BufferedInputStream(pro.getErrorStream()));
              while ((s=ine.readLine()) != null) {
                  System.out.println("err - > " + s);
              }
            }
            //.
            //. catch blocks follow for IOException and Exception ...
            //.
      ================================================================================
      This is a copy of the shell script that is called printit
      ================================================================================
      #!/bin/ksh
      lp -dit_printer $1
      exit
      ================================================================================
      Here is what the process table looks like after running the program ...
      Note process pid=24775. Every time the program runs I get a new
      <defunct> process.
      ================================================================================
      msgoldst 24775 24769 0 0:00 <defunct>
      msgoldst 24349 24347 0 08:10:12 pts/1 0:02 -ksh
      msgoldst 24781 24349 1 09:41:02 pts/1 0:00 grep gold
      msgoldst 24769 24753 14 09:40:47 pts/2 0:05 /export/home/usr/msgoldst/jdk1.1/bin/../bin/sparc/green_threads/java -Dorbixweb
      msgoldst 24753 24544 1 09:40:28 pts/2 0:00 orbixd
      msgoldst 24544 24542 0 08:36:08 pts/2 0:01 -ksh

      ================================================================================
      What can I do to clean up the process so that there are no <defunct>
      processes?
      Also note that the program occasionally hangs while handling the
      reading of standard input from the child process.
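
For readers hitting the same symptoms on a current JDK, here is a minimal sketch of one common pattern: merge the child's stderr into stdout so there is a single pipe to drain, read it to end of file, and then call waitFor() to collect the exit status. It relies on ProcessBuilder, which does not exist in the JDK 1.1 release this report was filed against. The class name RunAndReap and the sh command line are illustrative assumptions, and whether this clears the <defunct> entries on the reporter's system is not confirmed anywhere in this report.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK or of this bug report.
final class RunAndReap {

    /** Runs a command, collects its merged stdout/stderr lines, then waits for it to exit. */
    static List<String> run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // single pipe: stderr cannot fill up unread
        Process p = pb.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        int status = p.waitFor(); // blocks until the child has exited
        lines.add("exit=" + status);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Assumes a POSIX sh on the PATH.
        for (String line : run("sh", "-c", "echo out; echo err 1>&2")) {
            System.out.println(line);
        }
    }
}
```

Reading the two streams one after the other, as in the fragment above, can stall if the child fills the pipe that is not currently being read. Merging them avoids that at the cost of losing the distinction between output and error text.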

      I found the following among the important known bugs in the virtual
      machine listed with your fixed bugs for version 1.1.2:

      Bug id   Summary
      -------  ----------------------------------------------------------------
      1237893  On Solaris platforms only, a blocking read of System.in blocks
               all threads. As a crude workaround, call the following readin()
               routine instead of System.in.read():
                 static int readin() throws IOException, InterruptedException
               It polls every 50 milliseconds, and it can't detect EOF, so its
               behavior is substandard, but it works

      I wonder if this is what is causing my problem on the process hanging on reading
      standard in ? I will try the suggested work-around for this problem but I don't
      know what to do about the <defunct> process problem ?
      ================================================================================

      Thanks in advance,

      Best Regards,

      Mark Goldston
      ###@###.###
      Rockwell Automotive
       
      company - Rockwell Automotive , email - ###@###.###
      ======================================================================


      1) create dummy long-lived app

      public class Bar {

        public static void main(String args[])
        {
          // Just sit here.
          Object foo = new Object();
          synchronized (foo) {
            try {
              foo.wait();
            } catch (InterruptedException e) {}
          }
        }
      }

      2) create a script that runs the long-lived app.

      -rwxrwxr-x 1 abartle 19 Dec 26 12:34 runjava*
      #!/bin/sh
      java Bar

      3) create a Foo class that demonstrates the bug:
      Process.destroy won't kill a shell script

      import java.io.IOException;
      /**
       * Blah
       */
      public class Foo {

        public static void main(String args[])
        {
          try
            {
              Process foo = Runtime.getRuntime().exec("runjava");
              System.out.println("Started runjava");

              // Wait 10 seconds
              Thread.sleep(10000);

              // try to kill the process
              System.out.println("Killing runjava");
              foo.destroy(); // doesn't work

              // get exit value -- will throw exception if process not killed!
              foo.exitValue();
            }
          catch (InterruptedException e) {System.out.println(e.getMessage());}
          catch (IOException e) {System.out.println(e.getMessage());}
          catch (IllegalThreadStateException e) {System.out.println(e.getMessage());}
        }
      }


      4) Look at the output of running Foo
      foomachine:% java Foo
      Started runjava
      Killing runjava
      process hasn't exited

      (it threw the IllegalThreadStateException)

      5) Verify that the Solaris processes are all still running
       19073 pts/16 S 0:00 /usr/local/java/bin/../bin/sparc/green_threads/java Foo
       19078 pts/16 S 0:00 /bin/sh ./runjava
       19079 pts/16 S 0:00 /usr/local/java/bin/../bin/sparc/green_threads/java Bar

      Note BOTH the "runjava" and java Bar are still there!

      Thanks for your help,

      Aron

        Issue Links

          Activity

          never Tom Rodriguez added a comment -
          BT2:EVALUATION

          Well, regrettably the two reports that have been combined into this one have different causes. The first problem, with defunct processes, is probably caused by reading from System.in, as the submitter suggests. I've sent him email in hopes of getting a reply.

          The second report points at a few problems. I can explain what I think is going on. The Java VM uses SIGTERM to implement Process.destroy(). Your runjava script uses sh, and it hangs around. Apparently sh ignores SIGTERM while it's running, so Process.destroy() has no effect. We could use SIGKILL, but all that would do is kill runjava; "java Bar" would continue running. Probably when we create a new process using Runtime.exec() it should be in its own process group, and we should kill the process group using SIGTERM, wait a bit to see if it's still running, and then kill it using SIGKILL. I guess that would work, but it's fairly complicated.
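
The "ask politely, wait a bit, then force" sequence described above can be approximated from application code on later JDKs. The sketch below uses Process.waitFor(long, TimeUnit) and Process.destroyForcibly(), both added in Java 8. It is not how the VM implements destroy(); the class name GracefulKill and the grace period are illustrative assumptions, the signals actually sent are implementation dependent, and it acts only on the direct child, so it does not address the process group question raised above.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK or of this bug report.
final class GracefulKill {

    /**
     * Requests normal termination, waits up to the grace period, and
     * forces termination if the process is still running.
     *
     * @return true if forcible termination was needed
     */
    static boolean terminate(Process p, long grace, TimeUnit unit)
            throws InterruptedException {
        p.destroy();                      // normal termination request
        if (p.waitFor(grace, unit)) {
            return false;                 // exited within the grace period
        }
        p.destroyForcibly().waitFor();    // forcible termination, then wait for exit
        return true;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumes a POSIX sleep command on the PATH.
        Process p = new ProcessBuilder("sleep", "30").start();
        boolean forced = terminate(p, 2, TimeUnit.SECONDS);
        System.out.println("forced=" + forced + " alive=" + p.isAlive());
    }
}
```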

          If you don't use the intermediate shell script then the program works fine. You may also be able to use a different shell or write a shell trap handler which would make your shell script work.

          The other problem this points at is that since the subprogram hasn't exited the VM hangs around because the reader threads it uses aren't daemons. This is fixed in 1.2, but I'm not sure of the bug id.
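
The same daemon-thread idea applies to reader threads that an application creates itself. The sketch below is an application-level illustration, not the VM's internal reader threads mentioned above: it drains a stream on a thread marked as a daemon, so the thread cannot keep the VM alive if the child never exits. The class name DaemonDrain and the thread names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the JDK or of this bug report.
final class DaemonDrain {

    /** Starts a daemon thread that copies everything from in to out. */
    static Thread drain(InputStream in, OutputStream out, String name) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                out.flush();
            } catch (IOException e) {
                // Stream closed or child gone; nothing more to copy.
            }
        }, name);
        t.setDaemon(true); // must be set before start(); daemon threads do not block VM exit
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        // Assumes a POSIX sh on the PATH.
        Process p = new ProcessBuilder("sh", "-c", "echo out; echo err 1>&2").start();
        Thread out = drain(p.getInputStream(), System.out, "child-stdout");
        Thread err = drain(p.getErrorStream(), System.err, "child-stderr");
        p.waitFor();
        out.join();
        err.join();
    }
}
```

Using one thread per stream also means neither pipe can fill up while the other is being read.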

          If you have opinions about how this should behave, I'd like to hear them.
          tom.rodriguez@Eng 1998-03-04

          I think the code should probably be changed to use SIGINT. Potentially we could send SIGINT first, wait a moment and if it's still alive send SIGKILL but given the complexity of UNIX signals and process groups I'm not sure we can really get great semantics out of this.
          tom.rodriguez@Eng 1998-05-05
          defectconv Defect Conversion BT2 (Inactive) added a comment -
          BT2:WORK AROUND

          Name: joT67522 Date: 08/20/97



          ======================================================================
          martin Martin Buchholz added a comment -
          BT2:WORK AROUND

          Implement a (non-portable) getPid for the Java VM
          for example, by running

          /bin/sh -c 'echo $PPID'

          Then you can (non-portably) collect process tree information,
          including all the children of this JVM using something like

          /usr/bin/ps -e -o 'pid,ppid' ...

          Then you can explicitly kill any descendant processes gathered from
          the previous analysis using

          /usr/xpg4/bin/kill

          using any desired death row policy.
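
On Java 9 and later, the pid lookup and the process-tree listing in this workaround can be sketched without external commands by using the ProcessHandle API. The example below prints the VM's own pid and one "pid ppid" row per descendant, roughly what the ps invocation above collects. The class name ProcessTree and the row format are illustrative assumptions, and the set of visible processes depends on the platform and on permissions.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the JDK or of this bug report.
final class ProcessTree {

    /** Returns one "pid ppid" row per descendant of the given process. */
    static List<String> describeDescendants(ProcessHandle root) {
        return root.descendants()
                .map(h -> h.pid() + " "
                        + h.parent().map(ProcessHandle::pid).orElse(-1L)) // -1 if parent unknown
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ProcessHandle self = ProcessHandle.current();
        System.out.println("jvm pid " + self.pid());
        describeDescendants(self).forEach(System.out::println);
    }
}
```

The listing is a snapshot: processes can start or exit between the call and any later action taken on the handles.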
          martin Martin Buchholz added a comment -
          BT2:EVALUATION

          I think the Java platform is too mature today for us to
          actually change the way that Process.destroy() works.
          We could enable solutions for users by providing access to
          the child pid, so that users could more easily run
            kill -9 childPid
          themselves.

          See
          4244896: (process) Provide System.getPid(), System.killProcess(String pid)

          Or we could add a method
            Process.destroyWithoutMercy
          that would have the desired guaranteed kill semantics.

          But users are never satisfied. They probably also want a way
          to kill all descendants of the child process.

          Not at all easy. Changing to Cause Known, RFE.
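
For comparison, later JDKs did add API in this direction: Process.destroyForcibly() (Java 8) and Process.toHandle() with ProcessHandle.descendants() (Java 9). A minimal sketch of killing a child together with its descendants follows. The class name KillTree, the five second wait, and the sh command line are illustrative assumptions. The approach is inherently racy, since a descendant started after the snapshot is missed.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the JDK or of this bug report.
final class KillTree {

    /**
     * Forcibly terminates the child and every descendant visible at call time.
     *
     * @return the number of descendants that were found
     */
    static int destroyTree(Process child) throws InterruptedException {
        // Snapshot the descendants first: once the child dies they are
        // reparented and can no longer be reached through this handle.
        List<ProcessHandle> tree =
                child.toHandle().descendants().collect(Collectors.toList());
        child.destroyForcibly();
        tree.forEach(ProcessHandle::destroyForcibly);
        child.waitFor(5, TimeUnit.SECONDS); // bounded wait for the direct child to exit
        return tree.size();
    }

    public static void main(String[] args) throws Exception {
        // Assumes a POSIX sh and sleep; the shell stays around as the parent of sleep.
        Process p = new ProcessBuilder("sh", "-c", "sleep 30; true").start();
        Thread.sleep(500); // give the shell time to start its own child
        int found = destroyTree(p);
        System.out.println("descendants=" + found + " alive=" + p.isAlive());
    }
}
```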

            People

            • Assignee:
              robm Robert Mckenna
              Reporter:
              johsunw Joon Oh (Inactive)