Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8075484

SocketInputStream.socketRead0 can hang even with soTimeout set

    Details

    • Subcomponent:
    • Resolved In Build:
      b137
    • CPU:
      x86_64
    • OS:
      linux

      Backports

        Description

        FULL PRODUCT VERSION :


        A DESCRIPTION OF THE PROBLEM :
        As noted at JDK-8049846, the implementation of Java_java_net_SocketInputStream_socketRead0 assumes that read() won't block after poll() reports that a read is possible. This assumption does not hold, as noted on the man page for select (referenced by the man page for poll): Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.

        In production, we hit this about once a day. For testing and reproduction purposes, we can use fault injection to get spurious poll() results on demand.

        [This report is probably more appropriate as a comment on JDK-8049846, but commenting requires an account, and obtaining an account does not appear to be an easy task]

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        1. Create file called poll.c with contents:
        -----
        #define _GNU_SOURCE
        #include <poll.h>
        #include <dlfcn.h>
        #include <stdio.h>

        int poll(struct pollfd *fds, nfds_t nfds, int timeout) {
          static int n;
          if ((++n & 0x3) == 0) {
            // Fault injection: perhaps we should report a spurious readiness notification
            int i;
            for (i = 0; i < nfds; ++i) {
              if (fds[i].events & POLLIN) {
                fds[i].revents |= POLLIN;
        return 1;
              }
            }
          }
          return ((int(*)(struct pollfd*,nfds_t,int))dlsym(RTLD_NEXT, "poll"))(fds, nfds, timeout);
        }
        -----

        2. Create file called OneReaderThread.java with contents:
        -----
        import java.io.IOException;
        import java.io.InputStream;
        import java.io.OutputStream;
        import java.net.ServerSocket;
        import java.net.Socket;

        public class OneReaderThread {
        public static void main(String[] args) throws IOException {
        @SuppressWarnings("resource")
        ServerSocket serverSocket = new ServerSocket(17291);
        new ReadingThreadA().start();
        Socket client = serverSocket.accept();
        OutputStream os = client.getOutputStream();
        byte[] writeData = new byte[2];
        for (;;) {
        waitingForA = true;
        long start = System.currentTimeMillis();
        while (waitingForA) {
        long now = System.currentTimeMillis();
        if (now > start + 500) {
        // 500ms have passed, which is 10x the read timeout
        System.out.println("Should never happen: A is unresponsive");
        os.write(writeData);
        break;
        }
        }
        }
        }

        private static volatile boolean waitingForA;

        private static final class ReadingThreadA extends Thread {
        @Override
        public void run() {
        try {
        @SuppressWarnings("resource")
        Socket s = new Socket("localhost", 17291);
        s.setSoTimeout(50); // SO_TIMEOUT is set, meaning that reads should not block for more than 50ms
        final InputStream is = s.getInputStream();
        byte[] readDataA = new byte[2];
        for (;;) {
        int n = 0;
        try {
        n = is.read(readDataA);
        } catch (IOException e) {
        // Ignore
        }
        System.out.println("A tick (" + n + ")");
        waitingForA = false; // This assignment should happen at least once every 50ms
        }
        } catch (Exception e) {
        e.printStackTrace();
        }
        }
        }
        }
        -----

        3. Compile the C code: gcc -o poll.so -shared poll.c -ldl -fPIC
        4. Compile the Java code: javac OneReaderThread.java
        5. Run the Java code with the C library preloaded: LD_PRELOAD=./poll.so java -cp . OneReaderThread

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        Expect an output stream consisting solely of:
        A tick (0)
        A tick (0)
        A tick (0)
        A tick (0)
        A tick (0)
        A tick (0)
        A tick (0)
        A tick (0)
        ACTUAL -
        Actual output stream is repetitions of:
        Should never happen: A is unresponsive
        A tick (2)
        A tick (0)
        A tick (0)
        A tick (0)
        Should never happen: A is unresponsive
        A tick (2)
        A tick (0)
        A tick (0)
        A tick (0)

        REPRODUCIBILITY :
        This bug can be reproduced occasionally.

        1. OneReaderThread.java
          1 kB
          Pardeep Sharma
        2. poll.c
          0.5 kB
          Pardeep Sharma

          Issue Links

            Activity

            Hide
            pardesha Pardeep Sharma added a comment -
            1) Attaching the testcase (poll.c, OneReaderThread.java).

            2) On a Linux 32-bit system, checked this for JDK 7u80, 8u31, 8u40, 8u60 b06, 9 ea b54 and could reproduce this issue.
            ****************************************************************************************
            7u80: FAIL
            8: FAIL
            8u31 : FAIL
            8u40: FAIL
            8u60 ea b05: FAIL
            9 ea b54: FAIL
            ******************************************************************************************
            Output of the testcase in 8u40:
            $ LD_PRELOAD=./poll.so /net/jre.us.oracle.com/onestop/j2sdk/1.8.0_40/latest/binaries/linux-i586/bin/java -cp . OneReaderThread
            Should never happen: A is unresponsive
            A tick (0)
            A tick (2)
            Should never happen: A is unresponsive
            A tick (2)
            A tick (0)
            A tick (0)
            A tick (0)
            Should never happen: A is unresponsive
            A tick (2)
            A tick (0)
            A tick (0)
            ...................


            Output of the testcase in 7u80:
            $ LD_PRELOAD=./poll.so /net/scanas412.us.oracle.com/export/java_re/jdk/7u80/latest/binaries/linux-i586/bin/java -cp . OneReaderThread
            Should never happen: A is unresponsive
            A tick (0)
            A tick (2)
            Should never happen: A is unresponsive
            A tick (2)
            A tick (0)
            A tick (0)
            A tick (0)
            Should never happen: A is unresponsive
            A tick (2)
            A tick (0)
            A tick (0)
            A tick (0)
            Should never happen: A is unresponsive
            ..............................
            Show
            pardesha Pardeep Sharma added a comment - 1) Attaching the testcase (poll.c, OneReaderThread.java). 2) On a Linux 32-bit system, checked this for JDK 7u80, 8u31, 8u40, 8u60 b06, 9 ea b54 and could reproduce this issue. **************************************************************************************** 7u80: FAIL 8: FAIL 8u31 : FAIL 8u40: FAIL 8u60 ea b05: FAIL 9 ea b54: FAIL ****************************************************************************************** Output of the testcase in 8u40: $ LD_PRELOAD=./poll.so /net/jre.us.oracle.com/onestop/j2sdk/1.8.0_40/latest/binaries/linux-i586/bin/java -cp . OneReaderThread Should never happen: A is unresponsive A tick (0) A tick (2) Should never happen: A is unresponsive A tick (2) A tick (0) A tick (0) A tick (0) Should never happen: A is unresponsive A tick (2) A tick (0) A tick (0) ................... Output of the testcase in 7u80: $ LD_PRELOAD=./poll.so /net/scanas412.us.oracle.com/export/java_re/jdk/7u80/latest/binaries/linux-i586/bin/java -cp . OneReaderThread Should never happen: A is unresponsive A tick (0) A tick (2) Should never happen: A is unresponsive A tick (2) A tick (0) A tick (0) A tick (0) Should never happen: A is unresponsive A tick (2) A tick (0) A tick (0) A tick (0) Should never happen: A is unresponsive ..............................
            Hide
            chegar Chris Hegarty added a comment -
            The connection between 'poll' and 'select' is tenuous, but I accept that there may be an issue, albeit rare. Some testing should be done to see if the issue can be reproduced in-house.

            It seems reasonable to prototype a version of socketRead0 that flips the non-blocking bit before and after, to see what other implications arise from doing that.
            Show
            chegar Chris Hegarty added a comment - The connection between 'poll' and 'select' is tenuous, but I accept that there may be an issue, albeit rare. Some testing should be done to see if the issue can be reproduced in-house. It seems reasonable to prototype a version of socketRead0 that flips the non-blocking bit before and after, to see what other implications arise from doing that.
            Hide
            vtewari Vyom Tewari added a comment -
            I will do the prototype, as I am suspecting that JDK-8134601 if failing intermittently because of the same root cause.
            Show
            vtewari Vyom Tewari added a comment - I will do the prototype, as I am suspecting that JDK-8134601 if failing intermittently because of the same root cause.
            Hide
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/dev/jdk/rev/af17b6bc08dd
            User: vtewari
            Date: 2016-09-13 11:30:45 +0000
            Show
            hgupdate HG Updates added a comment - URL: http://hg.openjdk.java.net/jdk9/dev/jdk/rev/af17b6bc08dd User: vtewari Date: 2016-09-13 11:30:45 +0000
            Hide
            hgupdate HG Updates added a comment -
            URL: http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/af17b6bc08dd
            User: lana
            Date: 2016-09-21 20:18:03 +0000
            Show
            hgupdate HG Updates added a comment - URL: http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/af17b6bc08dd User: lana Date: 2016-09-21 20:18:03 +0000
            Hide
            robm Robert Mckenna added a comment -
            Critical request as a) this was already in 17_04 and appears to have been pushed out for some reason and b) we have already committed to providing this fix in Oct in a number of escalations.
            Show
            robm Robert Mckenna added a comment - Critical request as a) this was already in 17_04 and appears to have been pushed out for some reason and b) we have already committed to providing this fix in Oct in a number of escalations.

              People

              • Assignee:
                vtewari Vyom Tewari
                Reporter:
                webbuggrp Webbug Group
              • Votes:
                0 Vote for this issue
                Watchers:
                12 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: