Uploaded image for project: 'Code Tools'
  1. Code Tools
  2. CODETOOLS-7902057

When calling ServerSocket.accept, AgentVM should consider timeoutFactor

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: jtreg4.2
    • Fix Version/s: jtreg4.2
    • Component/s: tools
    • Labels:
    • Subcomponent:
    • Resolved In Build:
      b10

      Description

      When running on slow machines with a large number of CPUs with jtreg4.2 b09 using agentvm mode and -Xcomp, we get frequent socket errors. For example, the following is an extract from the jtreg console log during an internal test run:

      [2017-10-18 02:33:37,303] Agent[6]: stderr: java.net.ConnectException: Connection refused (Connection refused)
      [2017-10-18 02:33:37,721] Agent[6]: stderr: at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
      [2017-10-18 02:33:38,478] Agent[6]: stderr: at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:400)
      [2017-10-18 02:33:38,479] Agent[6]: stderr: at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:243)
      [2017-10-18 02:33:38,480] Agent[6]: stderr: at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:225)
      [2017-10-18 02:33:38,480] Agent[6]: stderr: at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:402)
      [2017-10-18 02:33:38,481] Agent[6]: stderr: at java.base/java.net.Socket.connect(Socket.java:591)
      [2017-10-18 02:33:38,482] Agent[6]: stderr: at java.base/java.net.Socket.connect(Socket.java:540)
      [2017-10-18 02:33:38,483] Agent[6]: stderr: at java.base/java.net.Socket.<init>(Socket.java:436)
      [2017-10-18 02:33:38,483] Agent[6]: stderr: at java.base/java.net.Socket.<init>(Socket.java:246)
      [2017-10-18 02:33:38,484] Agent[6]: stderr: at com.sun.javatest.regtest.agent.AgentServer.<init>(AgentServer.java:154)
      [2017-10-18 02:33:38,485] Agent[6]: stderr: at com.sun.javatest.regtest.agent.AgentServer.main(AgentServer.java:60)

      In this particular run, we pass the following arguments to JTREG:

      -concurrency:6 ... -agentvm ... -timeoutFactor:10 -J-Xmx512m ... \
          -javaoptions:-Xcomp ... -javaoptions:-XX:CompileThreshold=100 ... \
          -javaoptions:-server -javaoptions:-XX:-TieredCompilation ... \

      Note the large timeoutFactor is chosen because the tests are expected to run very slowly when -Xcomp is enabled.

      The problem is when jtreg spawns agent JVM processes, it uses a fixed 60 second timeout when calling ServerSocket.accept(). See http://hg.openjdk.java.net/code-tools/jtreg/file/684f12eef4a8/src/share/classes/com/sun/javatest/regtest/exec/Agent.java#l124

           final int ACCEPT_TIMEOUT = 60*1000; // 1 minute, for server to start and "phone home"
            ss.setSoTimeout(ACCEPT_TIMEOUT);
            Socket s = ss.accept();

      Due to -Xcomp, the agent JVM runs very slowly (printing a single stack trace for Agent[6] takes over 1 second!). So it's very likely that it would take over 60 seconds before executing the following code to connect to the above ServerSocket:

      http://hg.openjdk.java.net/code-tools/jtreg/file/684f12eef4a8/src/share/classes/com/sun/javatest/regtest/agent/AgentServer.java#l154

                if (port > 0) {
      154: Socket s = new Socket(host, port);

      The suggested fix is to increase ACCEPT_TIMEOUT w.r.t. timeoutFactor.

        Attachments

          Activity

            People

            • Assignee:
              jjg Jonathan Gibbons
              Reporter:
              iklam Ioi Lam
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: