Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: hs14
    • Fix Version/s: hs14
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • Resolved In Build:
      b07
    • CPU:
      generic
    • OS:
      generic

      Backports

        Description

        The server compiler currently computes block frequencies based on branch estimates and probabilities collected from profiling data.

        The frequency data can be leveraged create a block layout that:
          - reduces the number of branches emitted
          - improves icache locality
          - rotates loops to end with a conditional branch
          - naturally moves uncommon code sequences out of line without using an arbitrary frequency cutoff

        While only modest improvement in performance should be expected, a reduction in the methods for which "spaghetti code" is emitted, should make assembly code more readable.

          Issue Links

            Activity

            Hide
            rasbold Chuck Rasbold added a comment -
            BT2:EVALUATION

            Create a post-register allocation pass that drives block layout by edge frequencies.
            Show
            rasbold Chuck Rasbold added a comment - BT2:EVALUATION Create a post-register allocation pass that drives block layout by edge frequencies.
            Show
            jprtbugupd JPRT Bug Updates (Inactive) added a comment - BT2:EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/72c5366e5d86
            Hide
            rasbold Chuck Rasbold added a comment -
            BT2:PUBLIC COMMENTS

            Reorder the basic blocks with the intent of minimizing the number of jumps
            along frequently executed edges. By focusing on keeping frequently executed
            blocks together, infrequent blocks should naturally be moved out of line. The
            new pass is enabled by a flag, otherwise, the old functionality is employed.

            The basic algorithm:

            - Find all CFG edges that could be embodied by a branch or fallthroough
            - Compute an edge frequencies using block frequencies and branch probabilities
            - Make a list of edges sorted the in descending order by frequency
            - Create a Trace (an ordereed list of basic blocks) for every basic block
            - For each edge, in descending frequency
              - if the edge is from the tail of one trace to the head of another
                - if the traces are different, append the traces
                - else, process edges as a back-branch (and possibly rotate loop)
            - Make diamonds, when it is deemed profitable to merge one trace into the middle of another
            - Append remaining traces end-to-end if an unprocessed edge indicates transfer between traces.
            - Order remaining traces by frequency of first block

            There are three new flags:

            - BlockLayoutByFrequency: enables new algorithm (on by default)
            - BlockLayoutRotateLoops: allows a back branch to be embodied as a
              fall-through (off)
            - BlockLayoutMinDiamondPercentage</b>: minimum % of
              the lesser side of a diamond that will allow the two sides to be
              merged into a single trace

            Other changes:

            Add classes for PhaseBlockLayout, Trace, and CFGEdge

            In Estimate_Block_Frequency(), when the layout pass is enabled, do a
            pre-estimate pass that lowers branch probabilities instead of a
            post-pass that adjusts block frequencies. Also, delete code that could
            scale block frequencies against something other than a single method
            entry.

            Divide remove_emopty() into two parts. Empty blocks are moved before
            layout; flow if fixed after layout.

            Revise loop alignment code. Loop alignments are computed before the
            output pass because if a loop is rotated, loop-heads are no longer
            guaranteed to be the backbranch target..

            Add infrastructure to allow frequency based layout on a
            method-by-method basis.
            Show
            rasbold Chuck Rasbold added a comment - BT2:PUBLIC COMMENTS Reorder the basic blocks with the intent of minimizing the number of jumps along frequently executed edges. By focusing on keeping frequently executed blocks together, infrequent blocks should naturally be moved out of line. The new pass is enabled by a flag, otherwise, the old functionality is employed. The basic algorithm: - Find all CFG edges that could be embodied by a branch or fallthroough - Compute an edge frequencies using block frequencies and branch probabilities - Make a list of edges sorted the in descending order by frequency - Create a Trace (an ordereed list of basic blocks) for every basic block - For each edge, in descending frequency   - if the edge is from the tail of one trace to the head of another     - if the traces are different, append the traces     - else, process edges as a back-branch (and possibly rotate loop) - Make diamonds, when it is deemed profitable to merge one trace into the middle of another - Append remaining traces end-to-end if an unprocessed edge indicates transfer between traces. - Order remaining traces by frequency of first block There are three new flags: - BlockLayoutByFrequency: enables new algorithm (on by default) - BlockLayoutRotateLoops: allows a back branch to be embodied as a   fall-through (off) - BlockLayoutMinDiamondPercentage</b>: minimum % of   the lesser side of a diamond that will allow the two sides to be   merged into a single trace Other changes: Add classes for PhaseBlockLayout, Trace, and CFGEdge In Estimate_Block_Frequency(), when the layout pass is enabled, do a pre-estimate pass that lowers branch probabilities instead of a post-pass that adjusts block frequencies. Also, delete code that could scale block frequencies against something other than a single method entry. Divide remove_emopty() into two parts. Empty blocks are moved before layout; flow if fixed after layout. Revise loop alignment code. Loop alignments are computed before the output pass because if a loop is rotated, loop-heads are no longer guaranteed to be the backbranch target.. Add infrastructure to allow frequency based layout on a method-by-method basis.

              People

              • Assignee:
                rasbold Chuck Rasbold
                Reporter:
                rasbold Chuck Rasbold
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Imported:
                  Indexed: