Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8075621

JEP 279: Improve Test-Failure Troubleshooting

    Details

    • Author:
      Igor Ignatyev
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Scope:
      Implementation
    • Discussion:
      hotspot dash dev at openjdk dot java dot net, core dash libs dash dev at openjdk dot java dot net
    • Effort:
      XS
    • Duration:
      XS
    • Alert Status:
       Green
    • JEP Number:
      279

      Description

      Summary

      Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

      Goals

      Gather the following information to help diagnose test failures and timeouts:

      • For Java processes which are still running on a host after test failure or timeout:
        • C and Java stacks
        • Core dumps (minidumps on Windows)
        • Heap statistics
      • Environment information:
        • Running processes
        • CPU and I/O loads
        • Open files and sockets
        • Free disk space and memory
        • Most recent system messages and events

      We will develop a library that provides this functionality and co-locate the library sources with the product code.

      Motivation

      It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment. Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

      Description

      Currently, there are two extension points in the jtreg test harness. The first one is the timeout handler, which jtreg runs when a test times out. The second one is the observer, which implements the observer design pattern to track different events in a test run. We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for jtreg.

      Information about environment and non-Java processes will be collected by running platform-specific commands. Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by JEP 228, e.g., the print_vm_state command which collects information similar to hs_err files. The information gathered will be stored for later inspection together with test results. The observer will collect the information on finishedTest events when tests fail.

      Since tests may create other processes, information about test processes and their child processes will be collected. To find such processes, the library will create a process tree with the original test process at the root.

      Library sources will be placed in the test directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

      Testing

      We will schedule regular testing which uses this library. When the results and test execution become stable, we will extend the use of the library to other components.

      Risks and Assumptions

      • Risk that execution of some commands can hang: To minimize this risk a command will be executed only for an allotted time and interrupted after that.
      • Running out of disk space on a host: The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
      • Tools unavailable on a platform or host: If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file. Another possible solution is to download required tools from a known tools repository.
      • System resource exhaustion: Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc.) or be caused by a lock of resources. Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
      • Getting process trees in Java: Getting the process tree in Java requires the new process API described in JEP 102. Using the JDK under test as the stable JDK (i.e., the JDK which runs the jtreg test harness) may interfere with test results. To mitigate this, we will develop an alternative process-tree implementation. That implementation will simplify backporting this project into JDK 8.

        Issue Links

          Activity

          iignatyev Igor Ignatyev created issue -
          iignatyev Igor Ignatyev made changes -
          Field Original Value New Value
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
          On test failures and timeouts get information about

           * free disk space


          Motivation
          ----------

          Description
          -----------

          Risks and Assumptions
          ---------------------
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes

          Motivation
          ----------
          It's relative hard to


          Description
          -----------

          Risks and Assumptions
          ---------------------
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes

          Motivation
          ----------
          It's relative hard to


          Description
          -----------

          Risks and Assumptions
          ---------------------
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTHanress provides a possibility to execute a custom code on test timeout (timeout handler) and on different events from test execution(observer)

          Develop a tool which executes platform-specific commands to gather all required information.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTHanress provides a possibility to execute a custom code on test timeout (timeout handler) and on different events from test execution(observer)

          Develop a tool which executes platform-specific commands to gather all required information.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTHanress provides a possibility to execute a custom code on test timeout (timeout handler) and on different events from test execution(http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73 "observer")

          Develop a tool which executes platform-specific commands to gather all required information.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTHanress provides a possibility to execute a custom code on test timeout (timeout handler) and on different events from test execution(http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73 "observer")

          Develop a tool which executes platform-specific commands to gather all required information.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout ([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifices available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution some commands can work too long or timeout.
          To minimize that risk all commands will be executed only allotted time, after that time it should be interrupted.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - get next information
            - free disk space
            - other process
            - CPU load
            - I/O system load
            - open ports, connections
            - native and java stack of test processes if they are alive
            - core dump of alive test processes
            - last system message
            - java heap for alive test java processes
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relative hard especially without information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout ([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifices available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution some commands can work too long or timeout.
          To minimize that risk all commands will be executed only allotted time, after that time it should be interrupted.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout ([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution of some commands can work too long or timeout.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          iignatyev Igor Ignatyev made changes -
          Fix Version/s 9 [ 14949 ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout ([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution of some commands can work too long or timeout.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution of some commands can work too long or timeout.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          iignatyev Igor Ignatyev made changes -
          Labels a360_na
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8075647 [ JDK-8075647 ]
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8075648 [ JDK-8075648 ]
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8075649 [ JDK-8075649 ]
          iignatyev Igor Ignatyev made changes -
          Discussion jdk_confidential_ww_grp at oracle dot com
          lmesnik Leonid Mesnik made changes -
          Assignee Leonid Mesnik [ lmesnik ] Kirill Shirokov [ kshiroko ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures,

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.


          Risks and Assumptions
          ---------------------
          There is risk that execution of some commands can work too long or timeout.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Assignee Kirill Shirokov [ kshiroko ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in top-level closed repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Exposure Closed [ 19105 ] Open [ 19104 ]
          iignatyev Igor Ignatyev made changes -
          Discussion jdk_confidential_ww_grp at oracle dot com hotspot-dev at openjdk dot java dot net
          iignatyev Igor Ignatyev made changes -
          Discussion hotspot-dev at openjdk dot java dot net hotspot-dev at openjdk dot java dot net, core dash libs dash dev at openjdk dot java dot net
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes.
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by INTJDK-7614996 [ INTJDK-7614996 ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any of binary bundles should into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any of binary bundles should into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles should into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles should into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          shurailine Aleksandre Iline made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests execute concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests executed concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          shurailine Aleksandre Iline made changes -
          Reviewed By [shurailine]
          iignatyev Igor Ignatyev made changes -
          Summary Improve jtreg test failures troubleshooting Improve test failures troubleshooting for JTReg test suites
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - co-locate a tool which gather that information with product
           - obtain following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, the tests executed concurrently.
          That makes it's hard to reproduce these failures.

          Description
          -----------
          JTReg provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.

          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.

          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          Makefiles should be updated to build them and bundle as a part of test bundle.

          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, writing process tree formation without JEP 102 is an acceptable trade-off.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, writing process tree formation without JEP 102 is an acceptable trade-off.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - native and java stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. 'print_vm_state' to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Link This issue relates to JDK-8043764 [ JDK-8043764 ]
          iignatyev Igor Ignatyev made changes -
          Link This issue relates to JDK-8046092 [ JDK-8046092 ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. 'print_vm_state' to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. 'print_vm_state' to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. 'print_vm_state' to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host if there are many cores dumped.
          The plan is to archive dumps, restrict the number of saved cores or check how many free disk space we have before dumping the core.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how many free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Status Draft [ 10001 ] Submitted [ 10002 ]
          briangoetz Brian Goetz made changes -
          Reviewed By [shurailine] [shurailine, briangoetz]
          iignatyev Igor Ignatyev made changes -
          Assignee Igor Ignatyev [ iignatyev ] Mark Reinhold [ mr ]
          iignatyev Igor Ignatyev made changes -
          Assignee Mark Reinhold [ mr ] Igor Ignatyev [ iignatyev ]
          agarciar Aurelio Garcia-Ribeyro made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how many free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          skonchad Sandeep Konchady made changes -
          Summary Improve test failures troubleshooting for JTReg test suites Improve test failure troubleshooting for JTReg test suites
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8081071 [ JDK-8081071 ]
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-08-02
          Integration Due 2015-07-12
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-07-12 2015-07-28
          iignatyev Igor Ignatyev made changes -
          Alert Status Yellow [ 2 ]
          Alert Reason Waiting for candidate review. Marking this as yellow and following the importance guideline suggested at release team for review and approve to next state.
          iignatyev Igor Ignatyev made changes -
          Alert Reason Waiting for candidate review. Marking this as yellow and following the importance guideline suggested at release team for review and approve to next state.
          iignatyev Igor Ignatyev made changes -
          Assignee Igor Ignatyev [ iignatyev ] Mark Reinhold [ mr ]
          iignatyev Igor Ignatyev made changes -
          Alert Reason Marking this as yellow and following the importance guideline suggested at release team for review and approve to next state
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8132030 [ JDK-8132030 ]
          iignatyev Igor Ignatyev made changes -
          Link This issue relates to CODETOOLS-7901480 [ CODETOOLS-7901480 ]
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-08-02 2015-09-28
          Integration Due 2015-07-28 2015-09-14
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Link This issue is blocked by CODETOOLS-7901452 [ CODETOOLS-7901452 ]
          iklam Ioi Lam made changes -
          Link This issue relates to JDK-8079586 [ JDK-8079586 ]
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-09-28 2015-11-23
          Integration Due 2015-09-14 2015-11-09
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-11-23 2015-09-28
          Integration Due 2015-11-09 2015-09-14
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-09-28 2015-11-23
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-09-14 2015-11-09
          iignatyev Igor Ignatyev made changes -
          Alert Status Yellow [ 2 ] Red [ 3 ]
          iignatyev Igor Ignatyev made changes -
          Alert Reason Marking this as yellow and following the importance guideline suggested at release team for review and approve to next state Waiting for candidate review
          mr Mark Reinhold made changes -
          Status Submitted [ 10002 ] Draft [ 10001 ]
          iignatyev Igor Ignatyev made changes -
          Assignee Mark Reinhold [ mr ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Alert Reason Waiting for candidate review proofreading/correction
          iignatyev Igor Ignatyev made changes -
          Alert Status Red [ 3 ] Yellow [ 2 ]
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a tool which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the tool near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of environment during the failures.
          Such failures can depend on test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is the bunch of failures causes by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Recent JTReg version provides a possibility to execute a custom code on test timeout([timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java)) and on different events from test execution([observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73))

          Develop a tool which executes platform-specific commands to gather all required information.
          To collect information about JVM states, available diagnostic commands should be used.
          [JEP 228](JDK-8043764) adds several new commands which can be useful, e.g. `print_vm_state` to gather information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          A timeout handler and observer which execute that tool also should be developed.
          The observer should execute the tool only if test status is failed.
          The tool, observer and timeout handler will be placed in test directory in top-level repository.
          To distinguish test processes, the tool should create a process tree with an original test process as a root.
          Makefiles should be updated to build them and bundle as a part of test bundle.


          Test execution scripts should be updated to register the timeout handler and observer.

          Implementation mustn't integrate any binary bundles into repositories.

          Testing
          -------
          In the purpose of stabilizing, the tool will be run in one of nightlies for a couple of weeks.

          Risks and Assumptions
          ---------------------

          - There is risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and the warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent future system degradation.
          - Getting process tree in java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this improvement into JDK 8.
          Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a library which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the library near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          There are two entry points to extend JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.


          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for future investigation in case of test failures, timeouts.

          Goals
          -----
           - update test execution to collect information on test failures and timeouts
           - develop a library which obtains following information
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the library near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          There are two entry points to extend JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.


          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Collect information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which obtains the following information in case of test failure, timeout
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the library near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which obtains the following information in case of test failure, timeout
            - C- and java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - java heap statistic for alive test java processes
            - last system/kernel messages
            - other processes
            - CPU load
            - I/O system load
            - free disk space
            - open sockets
            - open file descriptors
            - established network connections
            - list of logged in users
           - locate the library near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Collect information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - running processes
             - CPU, I/O loads
             - open files, sockets
             - free disk space, memory
             - last system messages, events
           - locate the library sources near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Collect information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - running processes
             - CPU, I/O loads
             - open files, sockets
             - free disk space, memory
             - last system messages, events
           - locate the library sources near the product code

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular Open JDK developers workflow and to make it possible to use across all the components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - running processes
             - CPU, I/O loads
             - open files, sockets
             - free disk space, memory
             - last system messages, events
           - locate the library sources near the product code
           - make the usage of the library easy for OpenJDK developers

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - running processes
             - CPU, I/O loads
             - open files, sockets
             - free disk space, memory
             - last system messages, events
           - locate the library sources near the product code
           - make the usage of the library easy for OpenJDK developers

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - Develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - Core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - Running processes
             - CPU, I/O loads
             - Open files, sockets
             - Free disk space, memory
             - Last system messages, events
           - Locate the library sources near the product code
           - Make the usage of the library easy for OpenJDK developers

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
           - Develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - Core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
             - Running processes
             - CPU, I/O loads
             - Open files, sockets
             - Free disk space, memory
             - Last system messages, events
           - Locate the library sources near the product code
           - Make the usage of the library easy for OpenJDK developers

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
          - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events
          - Locate the library sources near the product code.
          - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          stsmirno Stanislav Smirnov made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
          - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events
          - Locate the library sources near the product code.
          - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
          asd
          asdasd
          asdas

           - List item

           - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
           - Running processes
           - CPU, I/O loads
           - Open files, sockets
           - Free disk space, memory
           - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          stsmirno Stanislav Smirnov made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----
          asd
          asdasd
          asdas

           - List item

           - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
           - Running processes
           - CPU, I/O loads
           - Open files, sockets
           - Free disk space, memory
           - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----

           - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
           - Running processes
           - CPU, I/O loads
           - Open files, sockets
           - Free disk space, memory
           - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          stsmirno Stanislav Smirnov made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----

           - Develop a library which automatically obtains the following information in case of test failure, timeout
           - C- and Java-stack of test processes if they are alive
           - Core dumps (minidumps on windows) of alive test processes
           - Java heap statistic for alive test Java processes
           - Environment information
           - Running processes
           - CPU, I/O loads
           - Open files, sockets
           - Free disk space, memory
           - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----

           - Develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - Core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Summary Improve test failure troubleshooting for JTReg test suites Improved JTReg Test Failure Troubleshooting
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be useful for further investigation in case of test failures, timeouts.

          Goals
          -----

           - Develop a library which automatically obtains the following information in case of test failure, timeout
            - C- and Java-stack of test processes if they are alive
            - Core dumps (minidumps on windows) of alive test processes
            - Java heap statistic for alive test Java processes
            - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events
           - Locate the library sources near the product code.
           - Make the usage of the library easy for OpenJDK developers.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          Troubleshooting of intermittent failures is relatively hard especially when there is no information about the state of the environment during the failures.
          Such failures can depend on a test execution order, concurrence.
          That makes it's hard to reproduce these failures.
          For instance, there is a bunch of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather required information.
          To accomplish this, we need to develop custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be gathered by leveraging platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information should be stored as artifacts available for later inspection.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since there are cases when a test creates other processes (including java), we need to gather process related information not only about the test process, but also about all its child processes.
          To find such processes, the library creates a process tree with an original test process as a root.

          In order to integrate usage of the library into the regular workflow of Open JDK developers and to make it possible to use across all the JDK components, the library sources should be placed in `test` directory in the top-level repository and makefiles should be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will create a dedicated regular testing which uses this library.
          When the results and test execution will become stable, we will extend the usage of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize that risk a command will be executed only allotted time, after that time it should be interrupted.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information or check how much free disk space we have before gathering.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missed tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from binary artifatory.
          - Resource exhaustion.
          Some failures can cause exhaust different type of resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In those situations, it won't be possible to run commands to gather information, so commands execution should be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using this API implies using the tested JDK as a stable JDK -- JDK which runs JTreg test harness.
          Given the facts that there are bugs in JVM, core-libs, new process API, this can lead to discredit of all test results.
          Another aspect of problems with using tested JDK as a stable one is modularity.
          Though JTreg is able to run tests in a modular environment, JTreg itself cannot be run in such an environment.
          In addition, the gathering tool can also be incompatible with Jigsaw, e.g. it can use an internal API.
          Taking into account all these risks, having alternative implementation for process tree formation without JEP 102 is an acceptable trade-off.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          Using this implementation if JEP 102 is unavailable will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will also simplify backporting this project into JDK 8.
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will also simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures, timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and place the library sources with the product code.

          Motivation
          ----------
          It is hard to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it hard to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We are going to use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to hs_err files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will setup a dedicated regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - There is a risk that execution of some commands can work too long or hang.
          To minimize this risk a command will be executed only for allotted time and interrupted after that.
          - Running out of disk space on a host.
          The plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - Tools unavailability on a platform, host.
          In that case, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a binary artifatory.
          - System resource exhaustion.
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by such an exhaustion.
          In these situations, it won't be possible to run commands to gather information, so command execution will be skipped to prevent further system degradation.
          - Getting process tree in Java requires new process API -- [JEP 102](JDK-8046092).
          Using JDK under test as stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will simplify backporting this project into JDK 8.
          Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------
          It is difficult to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We will use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to `hs_err` files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will schedule regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          Plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - *Tools unavailability on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process tree in Java:*
          Getting process tree in Java requires a new process API described in [JEP 102](JDK-8046092).
          Using the JDK under test as the stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will simplify backporting this project into JDK 8.
          iignatyev Igor Ignatyev made changes -
          Alert Status Yellow [ 2 ] Red [ 3 ]
          Alert Reason proofreading/correction Waiting for candidate review
          Assignee Igor Ignatyev [ iignatyev ] Mark Reinhold [ mr ]
          iignatyev Igor Ignatyev made changes -
          Assignee Mark Reinhold [ mr ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Status Draft [ 10001 ] Submitted [ 10002 ]
          iignatyev Igor Ignatyev made changes -
          Assignee Igor Ignatyev [ iignatyev ] Mark Reinhold [ mr ]
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8132961 [ JDK-8132961 ]
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-11-09 2015-11-16
          Due Date 2015-11-23 2015-11-30
          mr Mark Reinhold made changes -
          Description Summary
          -------
          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts

           - For Java processes which are still running on a host after test failure or timeout
            - C- and Java-stack
            - Core dumps (minidumps on windows)
            - Heap statistics
           - Environment information
            - Running processes
            - CPU, I/O loads
            - Open files, sockets
            - Free disk space, memory
            - Last system messages, events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------
          It is difficult to troubleshoot intermittent test failures when there is no information about testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.
          For example, there is a set of failures caused by suspicious kill signal: [JDK-8064522](JDK-8064522), [JDK-8067653](JDK-8067653), [JDK-8072031](JDK-8072031).

          Description
          -----------
          Currently, there are two extensions points in JTReg.
          The first one is [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which JTReg executes when a test timeouts.
          The second one is [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73) which implements observer design pattern to track different events from a test execution.
          We will use this extension mechanism to gather diagnostic information and develop a custom observer and timeout handler for JTReg.

          Information about environment and non-Java processes will be collected by executing platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](JDK-8043764), e.g. `print_vm_state` command which collects information similar to `hs_err` files.
          Gathered information will be stored as artifacts available for later inspection together with test results.
          The observer will collect the information on `finishedTest` event and only if test status is failed.

          Since tests may create other processes (including Java), information about test processes and their child processes will be collected.
          To find such processes, the library creates a process tree with an original test process as a root.

          Library sources will be placed in `test` directory in the top-level repository and makefiles will be updated to build them and bundle as a part of a test bundle.

          Testing
          -------
          In the purpose of stabilizing, we will schedule regular testing which uses this library.
          When the results and test execution becomes stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------
          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          Plan is to archive information, restrict the amount of saved information and check free disk space before information collection.
          - *Tools unavailability on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added into the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process tree in Java:*
          Getting process tree in Java requires a new process API described in [JEP 102](JDK-8046092).
          Using the JDK under test as the stable JDK (JDK which runs JTReg test harness) may discredit test results.
          To mitigate this, we are going to develop alternative implementation of process tree formation.
          This implementation will simplify backporting this project into JDK 8.
          Summary
          -------

          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts:

           - For Java processes which are still running on a host after test failure or timeout:
            - C and Java stacks
            - Core dumps (minidumps on Windows)
            - Heap statistics
           - Environment information:
            - Running processes
            - CPU and I/O loads
            - Open files and sockets
            - Free disk space and memory
            - Most recent system messages and events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------

          It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

          Description
          -----------

          Currently, there are two extensions points in the `jtreg` test harness.
          The first one is the [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which `jtreg` runs when a test times out.
          The second one is the [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73), which implements the observer design pattern to track different events in a test run.
          We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for `jtreg`.

          Information about environment and non-Java processes will be collected by running platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](http://openjdk.java.net/jeps/228), e.g., the `print_vm_state` command which collects information similar to `hs_err` files.
          The information gathered will be stored for later inspection together with test results.
          The observer will collect the information on `finishedTest` events when tests fail.

          Since tests may create other processes, information about test processes and their child processes will be collected.
          To find such processes, the library will create a process tree with the original test process at the root.

          Library sources will be placed in the `test` directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

          Testing
          -------

          We will schedule regular testing which uses this library.
          When the results and test execution become stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------

          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
          - *Tools unavailability on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process trees in Java:*
          Getting the process tree in Java requires the new process API described in [JEP 102](http://openjdk.java.net/jeps/102).
          Using the JDK under test as the stable JDK (i.e., the JDK which runs the `jtreg` test harness) may interfere with test results.
          To mitigate this, we will develop an alternative process-tree implementation.
          This implementation will simplify backporting this project into JDK 8.
          mr Mark Reinhold made changes -
          Status Submitted [ 10002 ] Candidate [ 10003 ]
          JEP Number 279
          mr Mark Reinhold made changes -
          Security Confidential [ 10000 ]
          JEP Type Infrastructure [ 19100 ] Feature [ 19103 ]
          Summary Improved JTReg Test Failure Troubleshooting JEP 279: Improve Test-Failure Troubleshooting
          Alert Status Red [ 3 ] Green [ 1 ]
          Alert Reason Waiting for candidate review
          Assignee Mark Reinhold [ mr ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Assignee Igor Ignatyev [ iignatyev ] Mikael Vidstedt [ mikael ]
          mikael Mikael Vidstedt made changes -
          Endorsed By Mikael Vidstedt [ mikael ]
          mikael Mikael Vidstedt made changes -
          Assignee Mikael Vidstedt [ mikael ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Status Candidate [ 10003 ] Proposed to Target [ 10004 ]
          iignatyev Igor Ignatyev made changes -
          Assignee Igor Ignatyev [ iignatyev ] Mark Reinhold [ mr ]
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-11-16 2015-11-30
          iignatyev Igor Ignatyev made changes -
          Due Date 2015-11-30 2015-12-07
          apikalev Andrey Pikalev made changes -
          Labels a360_na a360_na no-tck
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-11-30 2015-12-03
          ecaspole Eric Caspole made changes -
          Link This issue is blocked by JDK-8068210 [ JDK-8068210 ]
          iignatyev Igor Ignatyev made changes -
          Alert Status Green [ 1 ] Yellow [ 2 ]
          Alert Reason pending target review; won't meet due date in case the review's done after Nov 25th
          iignatyev Igor Ignatyev made changes -
          Alert Reason pending target review; won't meet due date in case the review's done after Nov 25th target review is taking more than expected. new dates set assuming the JEP will be targeted by Dec 3.
          Due Date 2015-12-07 2015-12-24
          Integration Due 2015-12-03 2015-12-17
          Alert Status Yellow [ 2 ] Red [ 3 ]
          mr Mark Reinhold made changes -
          Description Summary
          -------

          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts:

           - For Java processes which are still running on a host after test failure or timeout:
            - C and Java stacks
            - Core dumps (minidumps on Windows)
            - Heap statistics
           - Environment information:
            - Running processes
            - CPU and I/O loads
            - Open files and sockets
            - Free disk space and memory
            - Most recent system messages and events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------

          It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

          Description
          -----------

          Currently, there are two extensions points in the `jtreg` test harness.
          The first one is the [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which `jtreg` runs when a test times out.
          The second one is the [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73), which implements the observer design pattern to track different events in a test run.
          We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for `jtreg`.

          Information about environment and non-Java processes will be collected by running platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](http://openjdk.java.net/jeps/228), e.g., the `print_vm_state` command which collects information similar to `hs_err` files.
          The information gathered will be stored for later inspection together with test results.
          The observer will collect the information on `finishedTest` events when tests fail.

          Since tests may create other processes, information about test processes and their child processes will be collected.
          To find such processes, the library will create a process tree with the original test process at the root.

          Library sources will be placed in the `test` directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

          Testing
          -------

          We will schedule regular testing which uses this library.
          When the results and test execution become stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------

          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
          - *Tools unavailability on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process trees in Java:*
          Getting the process tree in Java requires the new process API described in [JEP 102](http://openjdk.java.net/jeps/102).
          Using the JDK under test as the stable JDK (i.e., the JDK which runs the `jtreg` test harness) may interfere with test results.
          To mitigate this, we will develop an alternative process-tree implementation.
          This implementation will simplify backporting this project into JDK 8.
          Summary
          -------

          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts:

           - For Java processes which are still running on a host after test failure or timeout:
            - C and Java stacks
            - Core dumps (minidumps on Windows)
            - Heap statistics
           - Environment information:
            - Running processes
            - CPU and I/O loads
            - Open files and sockets
            - Free disk space and memory
            - Most recent system messages and events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------

          It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

          Description
          -----------

          Currently, there are two extension points in the `jtreg` test harness.
          The first one is the [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which `jtreg` runs when a test times out.
          The second one is the [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73), which implements the observer design pattern to track different events in a test run.
          We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for `jtreg`.

          Information about environment and non-Java processes will be collected by running platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](http://openjdk.java.net/jeps/228), e.g., the `print_vm_state` command which collects information similar to `hs_err` files.
          The information gathered will be stored for later inspection together with test results.
          The observer will collect the information on `finishedTest` events when tests fail.

          Since tests may create other processes, information about test processes and their child processes will be collected.
          To find such processes, the library will create a process tree with the original test process at the root.

          Library sources will be placed in the `test` directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

          Testing
          -------

          We will schedule regular testing which uses this library.
          When the results and test execution become stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------

          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
          - *Tools unavailable on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc.) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process trees in Java:*
          Getting the process tree in Java requires the new process API described in [JEP 102](http://openjdk.java.net/jeps/102).
          Using the JDK under test as the stable JDK (i.e., the JDK which runs the `jtreg` test harness) may interfere with test results.
          To mitigate this, we will develop an alternative process-tree implementation.
          That implementation will simplify backporting this project into JDK 8.
          mr Mark Reinhold made changes -
          Labels a360_na no-tck a360_na jdk9-ptt-2015-12-03 no-tck
          mr Mark Reinhold made changes -
          Status Proposed to Target [ 10004 ] Targeted [ 10005 ]
          iignatyev Igor Ignatyev made changes -
          Integration Due 2015-12-17 2015-12-31
          Alert Status Red [ 3 ] Green [ 1 ]
          Due Date 2015-12-24 2016-01-14
          iignatyev Igor Ignatyev made changes -
          Alert Reason target review is taking more than expected. new dates set assuming the JEP will be targeted by Dec 3.
          iignatyev Igor Ignatyev made changes -
          Assignee Mark Reinhold [ mr ] Igor Ignatyev [ iignatyev ]
          iignatyev Igor Ignatyev made changes -
          Alert Reason integrated into jdk9/hs-comp 2015-12-17, expected to be in jdk9/jdk9 in 2 weeks
          rpallath Rajendrakumar Pallath made changes -
          Labels a360_na jdk9-ptt-2015-12-03 no-tck a360_na jdk9-ptt-2015-12-03 no-tck toi=no
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Due Date 2016-01-14 2016-01-28
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Integration Due 2015-12-31 2016-01-21
          iignatyev Igor Ignatyev made changes -
          Status Targeted [ 10005 ] Integrated [ 10007 ]
          iignatyev Igor Ignatyev made changes -
          Alert Reason integrated into jdk9/hs-comp 2015-12-17, expected to be in jdk9/jdk9 in 2 weeks integrated into jdk9/jdk9 at 2016-01-27
          complication depends on jtreg 4.1 b13 promotion
          iignatyev Igor Ignatyev made changes -
          Due Date 2016-01-28 2016-02-29
          iignatyev Igor Ignatyev made changes -
          Alert Reason integrated into jdk9/jdk9 at 2016-01-27
          complication depends on jtreg 4.1 b13 promotion
          integrated into jdk9/jdk9 at 2016-01-27
          completion depends on jtreg 4.1 b13 promotion
          thartmann Tobias Hartmann made changes -
          Link This issue relates to JDK-8149465 [ JDK-8149465 ]
          iignatyev Igor Ignatyev made changes -
          Link This issue is blocked by JDK-8132965 [ JDK-8132965 ]
          mrkam Alexander Kuznetcov (Inactive) made changes -
          Link This issue relates to JDK-8151671 [ JDK-8151671 ]
          iignatyev Igor Ignatyev made changes -
          Due Date 2016-02-29 2016-04-16
          jgodinez Jennifer Godinez (Inactive) made changes -
          Workflow JEP Workflow [ 4777535 ] JEP Workflow INFRA-2743 [ 4887869 ]
          jgodinez Jennifer Godinez (Inactive) made changes -
          Workflow JEP Workflow INFRA-2743 [ 4887869 ] JEP Workflow [ 4888618 ]
          iignatyev Igor Ignatyev made changes -
          Due Date 2016-04-16 2016-05-10
          iignatyev Igor Ignatyev made changes -
          Status Integrated [ 10007 ] Completed [ 10008 ]
          avoytilo Aleksey Voytilov (Inactive) made changes -
          Alert Reason integrated into jdk9/jdk9 at 2016-01-27
          completion depends on jtreg 4.1 b13 promotion
          mr Mark Reinhold made changes -
          Description Summary
          -------

          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts:

           - For Java processes which are still running on a host after test failure or timeout:
            - C and Java stacks
            - Core dumps (minidumps on Windows)
            - Heap statistics
           - Environment information:
            - Running processes
            - CPU and I/O loads
            - Open files and sockets
            - Free disk space and memory
            - Most recent system messages and events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------

          It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

          Description
          -----------

          Currently, there are two extension points in the `jtreg` test harness.
          The first one is the [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which `jtreg` runs when a test times out.
          The second one is the [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73), which implements the observer design pattern to track different events in a test run.
          We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for `jtreg`.

          Information about environment and non-Java processes will be collected by running platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](http://openjdk.java.net/jeps/228), e.g., the `print_vm_state` command which collects information similar to `hs_err` files.
          The information gathered will be stored for later inspection together with test results.
          The observer will collect the information on `finishedTest` events when tests fail.

          Since tests may create other processes, information about test processes and their child processes will be collected.
          To find such processes, the library will create a process tree with the original test process at the root.

          Library sources will be placed in the `test` directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

          Testing
          -------

          We will schedule regular testing which uses this library.
          When the results and test execution become stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------

          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
          - *Tools unavailable on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc.) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process trees in Java:*
          Getting the process tree in Java requires the new process API described in [JEP 102](http://openjdk.java.net/jeps/102).
          Using the JDK under test as the stable JDK (i.e., the JDK which runs the `jtreg` test harness) may interfere with test results.
          To mitigate this, we will develop an alternative process-tree implementation.
          That implementation will simplify backporting this project into JDK 8.
          Summary
          -------

          Automatically collect diagnostic information which can be used for further troubleshooting in case of test failures and timeouts.

          Goals
          -----

          Gather the following information to help diagnose test failures and timeouts:

            - For Java processes which are still running on a host after test failure or timeout:
              - C and Java stacks
              - Core dumps (minidumps on Windows)
              - Heap statistics
            - Environment information:
              - Running processes
              - CPU and I/O loads
              - Open files and sockets
              - Free disk space and memory
              - Most recent system messages and events

          We will develop a library that provides this functionality and co-locate the library sources with the product code.

          Motivation
          ----------

          It is difficult to troubleshoot intermittent test failures when there is no information about the testing environment.
          Such test failures often depend on test execution order and concurrence, which makes it extremely difficult to reproduce them.

          Description
          -----------

          Currently, there are two extension points in the `jtreg` test harness.
          The first one is the [timeout handler](http://hg.openjdk.java.net/code-tools/jtreg/file/5cb7831aea2e/src/share/classes/com/sun/javatest/regtest/TimeoutHandler.java), which `jtreg` runs when a test times out.
          The second one is the [observer](http://hg.openjdk.java.net/code-tools/jtharness/file/6e16c1cff0cd/src/com/sun/javatest/Harness.java#l73), which implements the observer design pattern to track different events in a test run.
          We will use these extension points to gather diagnostic information and develop a custom observer and timeout handler for `jtreg`.

          Information about environment and non-Java processes will be collected by running platform-specific commands.
          Gathering information about Java processes will be done via available diagnostic commands which are heavily extended by [JEP 228](http://openjdk.java.net/jeps/228), e.g., the `print_vm_state` command which collects information similar to `hs_err` files.
          The information gathered will be stored for later inspection together with test results.
          The observer will collect the information on `finishedTest` events when tests fail.

          Since tests may create other processes, information about test processes and their child processes will be collected.
          To find such processes, the library will create a process tree with the original test process at the root.

          Library sources will be placed in the `test` directory in the top-level repository, and makefiles will be updated to build them and bundle them as a part of test bundles.

          Testing
          -------

          We will schedule regular testing which uses this library.
          When the results and test execution become stable, we will extend the use of the library to other components.

          Risks and Assumptions
          ---------------------

          - *Risk that execution of some commands can hang:*
          To minimize this risk a command will be executed only for an allotted time and interrupted after that.
          - *Running out of disk space on a host:*
          The plan is to archive information, restrict the amount of saved information, and check free disk space before information collection.
          - *Tools unavailable on a platform or host:*
          If a tool is not available on a particular host or platform, the commands which depend on the missing tools will be skipped and a warning message will be added to the log file.
          Another possible solution is to download required tools from a known tools repository.
          - *System resource exhaustion:*
          Some failures can cause exhaustion of different types of system resources (CPU, memory, disk-space, etc.) or be caused by a lock of resources.
          Since it won't be possible to run commands to gather information in these situations, command execution will be skipped to prevent further system degradation.
          - *Getting process trees in Java:*
          Getting the process tree in Java requires the new process API described in [JEP 102](http://openjdk.java.net/jeps/102).
          Using the JDK under test as the stable JDK (i.e., the JDK which runs the `jtreg` test harness) may interfere with test results.
          To mitigate this, we will develop an alternative process-tree implementation.
          That implementation will simplify backporting this project into JDK 8.
          rgallard Raymond Gallardo made changes -
          Labels a360_na jdk9-ptt-2015-12-03 no-tck toi=no a360_na docsnoimpact jdk9-ptt-2015-12-03 no-tck toi=no
          iignatyev Igor Ignatyev made changes -
          Resolution Delivered [ 17 ]
          Status Completed [ 10008 ] Closed [ 6 ]

            People

            • Assignee:
              iignatyev Igor Ignatyev
              Reporter:
              iignatyev Igor Ignatyev
              Owner:
              Igor Ignatyev
              Reviewed By:
              Aleksandre Iline, Brian Goetz
              Endorsed By:
              Mikael Vidstedt
            • Votes:
              0 Vote for this issue
              Watchers:
              18 Start watching this issue

              Dates

              • Due:
                Created:
                Updated:
                Resolved:
                Integration Due: