JDK-7054211

No loop unrolling done in jdk7b144 for a test update() while loop

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: 7
    • Fix Version/s: hs22
    • Component/s: hotspot
    • Subcomponent:
    • Resolved In Build:
      b06
    • CPU:
      generic
    • OS:
      generic
    • Verification:
      Not verified

        Description

        A performance regression of approximately 13.5% was observed in jdk7b144 compared with jdk6u25 (553MB/s for
        jdk7b144 vs 623MB/s for jdk6u25). The benchmark comes from the Hadoop Common community and is a pure-Java CRC32
        implementation of update() (I changed the test to report scores only for the 65536-byte size).
        The method has a while loop. When I compared the generated code for jdk7b144 with that for jdk6u25, I saw that
        jdk6 had unrolled the loop, so I set -XX:LoopUnrollLimit=0 for both to put them on common ground. This didn't
        change jdk7 at all (as expected) and dropped jdk6's score to 601MB/s (so the difference is now 8.7%). Then I tried
        -XX:-UseLoopPredicate on its own: the jdk6 score dropped a bit to 610MB/s and the jdk7 score rose a bit to 572MB/s
        (so the difference is now 6.6%). Combining -XX:LoopUnrollLimit=0 and -XX:-UseLoopPredicate, I get 528MB/s for jdk6
        and 542MB/s for jdk7, which is a little confusing to me.
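
For anyone trying to reproduce scores like the ones above, here is a minimal sketch of how an MB/s figure for a 65536-byte buffer could be measured. It is not the Hadoop benchmark: the class name `CrcThroughputSketch`, the iteration counts, and the use of `java.util.zip.CRC32` as a stand-in `Checksum` are all assumptions made so the example is self-contained. The implementation under test would be substituted for the stand-in.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical harness, not the original benchmark.
class CrcThroughputSketch {
    static final int SIZE = 65536; // buffer size used in the report

    // Runs `iterations` update() passes over `data` and returns MB/s
    // (1 MB = 1024 * 1024 bytes).
    static double measure(Checksum sum, byte[] data, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum.update(data, 0, data.length);
        }
        long elapsedNanos = System.nanoTime() - start;
        double megabytes = (double) data.length * iterations / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        byte[] data = new byte[SIZE];
        new Random(42).nextBytes(data);

        Checksum sum = new CRC32(); // stand-in; swap in the class under test
        measure(sum, data, 10_000); // warm-up so the JIT compiles update()
        sum.reset();

        double mbPerSec = measure(sum, data, 10_000);
        // Printing the checksum keeps the work observable to the JIT.
        System.out.printf("%d bytes: %.1f MB/s (checksum %x)%n",
                SIZE, mbPerSec, sum.getValue());
    }
}
```

Running the same class with `-XX:LoopUnrollLimit=0` and/or `-XX:-UseLoopPredicate` on each JDK would give the four configurations compared above. Absolute numbers will differ by machine.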

        I have attached the generated output (Solaris Studio print) for both.

        Here's the source:

        public void update(byte[] b, int off, int len) {
          while (len > 7) {
            int c0 = b[off++] ^ crc;
            int c1 = b[off++] ^ (crc >>>= 8);
            int c2 = b[off++] ^ (crc >>>= 8);
            int c3 = b[off++] ^ (crc >>>= 8);
            crc = (T8_7[c0 & 0xff] ^ T8_6[c1 & 0xff])
                ^ (T8_5[c2 & 0xff] ^ T8_4[c3 & 0xff]);

            crc ^= (T8_3[b[off++] & 0xff] ^ T8_2[b[off++] & 0xff])
                 ^ (T8_1[b[off++] & 0xff] ^ T8_0[b[off++] & 0xff]);

            len -= 8;
          }
          while (len > 0) {
            crc = (crc >>> 8) ^ T8_0[(crc ^ b[off++]) & 0xff];
            len--;
          }
        }
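
The method above depends on a `crc` field and the lookup tables `T8_0` through `T8_7`, which are not shown. The following self-contained sketch fills them in so the loop can be compiled and run in isolation. The table construction, the initial value `0xffffffff`, and the final bit inversion in `getValue()` are assumptions based on the standard reflected CRC-32 polynomial (0xEDB88320) and the usual slicing-by-8 scheme; they are not taken from the attached benchmark. The class name `PureJavaCrc32Sketch` is hypothetical, and `T[k]` stands in for `T8_k`.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical stand-alone version; T[k] plays the role of T8_k.
class PureJavaCrc32Sketch {
    private static final int[][] T = new int[8][256];

    static {
        // T[0] is the standard byte-at-a-time CRC-32 table (assumed polynomial).
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            T[0][i] = c;
        }
        // T[k][i] is T[k-1][i] advanced by one additional zero byte.
        for (int k = 1; k < 8; k++) {
            for (int i = 0; i < 256; i++) {
                int prev = T[k - 1][i];
                T[k][i] = (prev >>> 8) ^ T[0][prev & 0xff];
            }
        }
    }

    private int crc = 0xffffffff; // assumed initial state (inverted CRC)

    long getValue() { return (~crc) & 0xffffffffL; }

    void reset() { crc = 0xffffffff; }

    // Same loop structure as the update() method in the report.
    void update(byte[] b, int off, int len) {
        while (len > 7) {
            int c0 = b[off++] ^ crc;
            int c1 = b[off++] ^ (crc >>>= 8);
            int c2 = b[off++] ^ (crc >>>= 8);
            int c3 = b[off++] ^ (crc >>>= 8);
            crc = (T[7][c0 & 0xff] ^ T[6][c1 & 0xff])
                ^ (T[5][c2 & 0xff] ^ T[4][c3 & 0xff]);

            crc ^= (T[3][b[off++] & 0xff] ^ T[2][b[off++] & 0xff])
                 ^ (T[1][b[off++] & 0xff] ^ T[0][b[off++] & 0xff]);

            len -= 8;
        }
        while (len > 0) {
            crc = (crc >>> 8) ^ T[0][(crc ^ b[off++]) & 0xff];
            len--;
        }
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        PureJavaCrc32Sketch mine = new PureJavaCrc32Sketch();
        mine.update(data, 0, data.length);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        System.out.printf("sketch=%x reference=%x%n",
                mine.getValue(), reference.getValue());
    }
}
```

Checking the result against `java.util.zip.CRC32` is a quick way to confirm that the table layout matches the loop before comparing generated code between JDK builds.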

        Matching sections: line 2b0 onwards in jdk6u25.txt and line e0 onwards for jdk7b144.txt.

                People

                • Assignee:
                  kvn Vladimir Kozlov
                  Reporter:
                  mbeckwit Monica Beckwith (Inactive)
