JDK-8254129

IR Test Framework to support regex-based matching on the IR in JTreg compiler tests

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P4
    • Resolution: Fixed
    • Affects Version/s: 16, 17
    • Fix Version/s: 17
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • Resolved In Build: b26

      Description

      Edit: README file can be found at: https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/ir_framework/README.md

      Umbrella RFE for testing based on IR nodes/shapes.

      A lot of tests are written based on an optimization/modification/fix on the IR. However, if these tests pass, we still do not know whether the originally intended IR modifications are still applied correctly. The question is how we could check IR modifications in a test to make sure future changes do not break them (e.g. a node transformation that is no longer performed). Something in this direction was already done in Valhalla (see framework [1] and example test [2]). Maybe this can be used as a starting point.

      [1] https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/InlineTypeTest.java
      [2] https://github.com/openjdk/valhalla/blob/e9c78ce4fcfd01361c35883e0d68f9ae5a80d079/test/hotspot/jtreg/compiler/valhalla/inlinetypes/TestBasicFunctionality.java#L67

      Description of the current framework in Valhalla
      =============================
      The goal of this framework is to verify that the code generated by C2 is optimized as expected.
      Current tests either only verify correctness (jtreg tests) or overall performance (JMH benchmarks),
      but a failing optimization does not necessarily trigger a drop in performance (especially if there
      are no fine-tuned and targeted microbenchmarks). For example, we had many regressions in C2 in the
      past that no one noticed because they did not show up in the larger benchmarks (which doesn't mean
      that these optimizations were useless). Also, such regressions are often hidden/amortized by other
      optimizations in the same build/release and therefore go unnoticed.

      Since inline types are a completely new feature of the Java language, the specific C2 optimizations
      we added were not covered by any existing benchmarks. And even with targeted microbenchmarks, it's
      not easy to capture breakages that naturally happen when prototyping. I therefore wrote a framework
      that allows defining "match rules" on the Intermediate Representation (IR) generated by C2 to make
      sure new optimizations/transformations are applied as expected. For example, one can now write a
      correctness test that is annotated with a match rule that verifies that the compiled code does not
      contain any allocations (which is one of the main optimizations for inline types). If, against
      expectations, the compiled code does contain an allocation, the match rule will fail and report the
      error right away.

      The current implementation of the framework is rather simple (1000 lines of Java code) and works by
      parsing/searching the textual output of the C2 IR (no changes to the VM code required). Of course,
      there are several ways to improve this but for our inline type use cases this turned out to work
      just fine.
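
      The idea of searching the textual IR output can be illustrated with a minimal sketch. This is not
      the Valhalla framework code: the class name FailOnSketch, the method failOn, and the sample IR
      text are assumptions made for illustration, and the sample is only loosely modeled on C2's
      textual output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not the Valhalla or JDK framework code.
final class FailOnSketch {
    /**
     * Returns every forbidden regex that matches somewhere in the IR dump.
     * An empty list means the rule passes.
     */
    static List<String> failOn(String irDump, String... forbiddenRegexes) {
        List<String> violations = new ArrayList<>();
        for (String regex : forbiddenRegexes) {
            if (Pattern.compile(regex).matcher(irDump).find()) {
                violations.add(regex);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Made-up text, loosely modeled on C2's textual IR output.
        String ir = String.join("\n",
            " 3  Start  === 3 0  [[ 3 5 6 ]]",
            " 24  Allocate  === 5 6 7  [[ 25 26 ]]",
            " 40  Return  === 25 6 7  [[ 0 ]]");
        // Reports the allocation rule as violated; the LoadI rule passes.
        System.out.println(failOn(ir, "\\bAllocate\\b", "\\bLoadI\\b"));
    }
}
```

      Because the check only needs the text that the VM already prints, a sketch like this requires no
      changes to the VM itself, which matches the approach described above.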

      Ideally, the inline type specific framework would be refactored or re-written to be more generic and
      upstreamed into mainline such that it can still be used by our inline type specific tests.


      Current framework state summarized
      =======================
      - Started out small and simple, then gradually became more powerful with more functionality and features; handling many testing scenarios also made it a little harder to understand
      - More than just IR matching: it is now a testing framework that can also verify the IR by matching IR node patterns or by making claims about the presence or absence of nodes, the number of specific nodes, etc.
      - Framework and tests are tightly coupled; profound knowledge of the framework's internals is required to write new tests
      - Highly depends on inline types
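
      A claim about the number of specific nodes could be checked roughly as follows. This is a sketch
      only: the class CountRuleSketch and the constraint syntax ("=2", "<=1", ">0") are hypothetical and
      are not taken from the framework.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a node-count rule; the constraint syntax is an assumption.
final class CountRuleSketch {
    /** Counts non-overlapping matches of the regex in the IR dump. */
    static int count(String irDump, String regex) {
        Matcher m = Pattern.compile(regex).matcher(irDump);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Checks a count against a constraint such as "=2", "<=1", or ">0". */
    static boolean satisfies(int actual, String constraint) {
        Matcher m = Pattern.compile("(<=|>=|=|<|>)\\s*(\\d+)").matcher(constraint.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad constraint: " + constraint);
        }
        int expected = Integer.parseInt(m.group(2));
        switch (m.group(1)) {
            case "=":  return actual == expected;
            case "<":  return actual < expected;
            case ">":  return actual > expected;
            case "<=": return actual <= expected;
            default:   return actual >= expected; // ">="
        }
    }
}
```

      For example, satisfies(count(ir, "\\bStoreI\\b"), "=2") would hold only if the dump contains
      exactly two StoreI matches.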

      Goals
      ====
      - Generalize the framework to also use it in mainline (upstream)
      - Still need to support all tests currently present in Valhalla's testing (~750 tests)
        * Update all tests using the old framework to use the new framework without removing functionality (tests should still work the same way)
      - Do not modify old tests but rather add a second version for a test that could benefit from IR validation

      Outline and next steps
      ==============
      - Familiarize with existing framework in Valhalla
      - Define a clear and simple interface between the new framework and the tests to be written using it
        * Should be easy to add new tests without needing to know internals of framework
      - Implement new framework by taking over/adapting/rewriting/refactoring the old framework into a test library and make it more generic such that it can be used by all compiler tests
        * Valhalla probably needs to add some inline type specific things to the framework again like specific inline node types (but that should be easy to add/remove again when moving to mainline)
        * Gradually convert all tests using the old framework to use the new one. The new framework gets "free" testing from all the ~750 inline type tests that are already there and from each newly added one.
        * No need to maintain two frameworks
        * Add separate tests to verify the correctness of the framework (e.g. a test for each matching rule that should match and not match etc.)
      - Add new general non-Valhalla specific tests to further test and utilize the framework
      - Upstream the new framework into mainline
      - Extend the framework to support more match rules and additional functionality.
      - Add match rules to existing mainline tests (or write new tests for existing optimizations) and use it with tests for new features.
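
      The step of verifying each matching rule with one input that should match and one that should
      not could look like the following sketch. The harness RuleSelfTestSketch, its RuleCase type, and
      the sample strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical self-check harness; names and sample strings are illustrative.
final class RuleSelfTestSketch {
    static final class RuleCase {
        final String regex;
        final String positive; // text the rule must match
        final String negative; // text the rule must not match

        RuleCase(String regex, String positive, String negative) {
            this.regex = regex;
            this.positive = positive;
            this.negative = negative;
        }
    }

    /** Returns one message per rule sample that behaves unexpectedly. */
    static List<String> check(List<RuleCase> cases) {
        List<String> problems = new ArrayList<>();
        for (RuleCase c : cases) {
            Pattern p = Pattern.compile(c.regex);
            if (!p.matcher(c.positive).find()) {
                problems.add(c.regex + " missed its positive sample");
            }
            if (p.matcher(c.negative).find()) {
                problems.add(c.regex + " matched its negative sample");
            }
        }
        return problems;
    }
}
```

      A rule written without word boundaries, such as "Allocate", would be flagged here if its negative
      sample contains "AllocateArray", which is the kind of mistake such self-tests are meant to catch.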

      Open Questions
      ==========
      - Current IR matching is done with PrintIdeal output, should we consider IGV xml output?
        * Need to be careful not to do duplicated work if IGV gets an updated format at some point
      - Should we add matching on additional flags?


      Update February 3rd, 2021
      ================
      Compiler team internal presentation of current state: http://cr.openjdk.java.net/~chagedorn/TestFramework/TestFramework.pdf
      GitHub development branch: https://github.com/openjdk/valhalla/compare/lworld...chhagedorn:TestingFramework

      There was a compiler team internal presentation of the current state of the new test framework with IR verification. There was a general agreement to keep things simple in a first version to cover the basics and already provide a good way to start writing tests with the framework. A summary of the framework:
      - Lightweight testing framework
      - IR verification with simple Regex matching on PrintIdeal and PrintOptoAssembly
      - Easy to use, method annotation based
      - Well suited for small/easy to medium sized tests
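
      To illustrate what "method annotation based" can mean, here is a self-contained sketch. The
      annotation ForbidNodes, the classes AnnotatedTests and AnnotationRuleSketch, and the map from
      method name to IR text are all hypothetical stand-ins; the real framework's annotations and the
      way it obtains the IR are described in its README.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical annotation; the name and attribute are illustrative only.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ForbidNodes {
    String[] value();
}

// Example test methods carrying their IR rule as an annotation.
final class AnnotatedTests {
    @ForbidNodes({"\\bAllocate\\b"})
    static int sum(int a, int b) { return a + b; }

    @ForbidNodes({"\\bCallStaticJava\\b"})
    static int twice(int a) { return 2 * a; }
}

final class AnnotationRuleSketch {
    /** Checks each annotated method's rule against the IR text recorded for that method name. */
    static List<String> verify(Class<?> testClass, Map<String, String> irByMethod) {
        List<String> failures = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            ForbidNodes rule = m.getAnnotation(ForbidNodes.class);
            if (rule == null) {
                continue; // method has no IR rule
            }
            String ir = irByMethod.getOrDefault(m.getName(), "");
            for (String regex : rule.value()) {
                if (Pattern.compile(regex).matcher(ir).find()) {
                    failures.add(m.getName() + ": found " + regex);
                }
            }
        }
        return failures;
    }
}
```

      The test author only writes the method and its annotation; collecting the IR and evaluating the
      rule stays inside the harness, which is what keeps this style easy to use for small tests.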

      Summary of discussion/improvement possibilities:
      - A way to parse the IR (where simple regex matching is not expressive enough) to query it. Some examples
        - Search for patterns (e.g. node X after node Y)
        - Search for other IR properties (e.g. offsets)
        - Prune uninteresting nodes
        - Apply matching only in specific loops, for example in the hottest loop
        => would be nice to have an API to query information about IR. PrintIdeal/PrintOptoAssembly are limited.
      - Use annotations on classes instead of on methods, together with interfaces. This could be used for more complex tests where you can implement, for example, IR verification methods yourself to get some more control
      - Forbid/let test fail if it deoptimizes (could be done with JFR, PrintCompilation or LogCompilation)
      - Use drivers in Jtreg to simplify test setups
      - Add vector nodes to standard IR nodes to choose from
      - Use IGV xml files instead of PrintIdeal/PrintOptoAssembly
      - Should not spend time trying to convert old tests but rather encourage people to start writing new tests with it
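
      The suggestion to parse the IR and query it, rather than rely on plain regex matching, might look
      like this in its simplest form. The sketch assumes a simplified line shape of
      "<id> <Name> === <input ids...>"; the class IrQuerySketch and its methods are hypothetical, and
      real output carries considerably more detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a tiny IR query API over parsed text.
final class IrQuerySketch {
    static final class Node {
        final int id;
        final String name;
        final List<Integer> inputs = new ArrayList<>();

        Node(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Assumed line shape: "<id> <Name> === <input ids...>", where "_" marks a missing input.
    private static final Pattern LINE =
        Pattern.compile("^\\s*(\\d+)\\s+(\\w+)\\s+===\\s*([\\d\\s_]*)");

    /** Parses each recognizable line into a node keyed by its id; other lines are skipped. */
    static Map<Integer, Node> parse(String irDump) {
        Map<Integer, Node> nodes = new HashMap<>();
        for (String line : irDump.split("\n")) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue;
            }
            Node n = new Node(Integer.parseInt(m.group(1)), m.group(2));
            for (String tok : m.group(3).trim().split("\\s+")) {
                if (tok.matches("\\d+")) {
                    n.inputs.add(Integer.parseInt(tok));
                }
            }
            nodes.put(n.id, n);
        }
        return nodes;
    }

    /** True if some node named userName has a direct input node named inputName. */
    static boolean hasEdge(Map<Integer, Node> nodes, String inputName, String userName) {
        for (Node user : nodes.values()) {
            if (!user.name.equals(userName)) {
                continue;
            }
            for (int in : user.inputs) {
                Node def = nodes.get(in);
                if (def != null && def.name.equals(inputName)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

      With the nodes in a map, a question such as "is a LoadI a direct input of an AddI?" becomes a
      graph lookup instead of a multi-line regex, and further queries (pruning, restricting to a loop)
      could be layered on the same structure.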

          Issue Links

          1. Convert Valhalla tests using the old framework to the new framework (Sub-task, Resolved, Ekaterina Pavlova)
          2. Add more documentation for the test framework (Sub-task, Closed, Christian Hagedorn)
          3. Provide more default IR regexes (Sub-task, Open, Cesar Soares)
          4. Parse the IR to perform queries on it (Sub-task, Open, Unassigned)
          5. Add verifications for additional/other VM flags (Sub-task, Open, Unassigned)
          6. Provide more stress/debug framework flags (Sub-task, Open, Unassigned)
          7. Explore additional check possibilities with @IR annotations (Sub-task, Open, Unassigned)
          8. Use new IR Test Framework to create tests for C2 IGV transformations (Sub-task, Open, Cesar Soares)
          9. Add check/verification for impossible constraints (Sub-task, Open, Unassigned)
          10. Perform IR verification after specific compiler phases (Sub-task, Open, Unassigned)
          11. Add whitelist matching as opposed to blacklist matching with failOn (Sub-task, Open, Unassigned)
          12. Add attribute to name an IR rule (Sub-task, Open, Unassigned)
          13. Simplify the Arguments annotation when all arguments have the same Argument value (Sub-task, Open, Unassigned)
          14. [IR Framework] Improve exception message string for final exception thrown by framework (Sub-task, Open, Christian Hagedorn)
          15. Add flag to stop execution on the first failure (Sub-task, Open, Christian Hagedorn)
          16. [IR Framework] Make AbstractInfo.getRandom() static (Sub-task, Resolved, Christian Hagedorn)
          17. Allow modification of whitelist via command line (Sub-task, Open, Christian Hagedorn)
          18. Allow adding IR verification rules via a command line flag (Sub-task, Open, Christian Hagedorn)
          19. IR Test Framework should not trigger class loading (Sub-task, Open, Christian Hagedorn)

              People

              Assignee:
              chagedorn Christian Hagedorn
              Reporter:
              chagedorn Christian Hagedorn
