Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8159052

AArch64: Optimise unaligned copies in pd_disjoint_words and pd_conjoint_words

    Details

    • Subcomponent:
    • CPU:
      aarch64
    • OS:
      linux

      Description

      The asm vsns of pd_disjoint_words and pd_conjoint_words are not optimised for unaligned accesses.

      Some partners' HW has large penalties for unaligned accesses.

      This issue proposes opimising pd_disjoint_words and pd_conjoint_words for unaligned access.

      Performance on platforms which support unaligned accesses without penalty should not be affected by this change.

        Activity

        Hide
        zmajo Zoltan Majo (Inactive) added a comment -
        To me it seems that pd_disjoint_words and pd_conjoint_words have a large set of users (not just the compiler or compiler-related components). Therefore it is most likely better to address this issue within the runtime (and not compiler).
        Show
        zmajo Zoltan Majo (Inactive) added a comment - To me it seems that pd_disjoint_words and pd_conjoint_words have a large set of users (not just the compiler or compiler-related components). Therefore it is most likely better to address this issue within the runtime (and not compiler).
        Hide
        gtriantafill George Triantafillou added a comment -
        Is this change going to be completed for jdk 10?
        Show
        gtriantafill George Triantafillou added a comment - Is this change going to be completed for jdk 10?
        Hide
        kbarrett Kim Barrett added a comment -
        I think there is something else going on here, and that the proposed change to support unaligned copies by the word copy functions should not be made.

        pd_disjoint_words and pd_conjoint_words provide the private implementations of the corresponding Copy::{disjoint,conjoint}_words. That public API has a precondition that the pointers are word-aligned, with asserts to ensure that in debug builds. Hence the pd_ functions never (that I could find) get called with unaligned arguments. If the pd_ functions *are* somehow called with unaligned arguments, *that* is a bug and should be fixed.

        As to why the test timings showed some improvement from the proposed change, I have no idea. Perhaps the process of adding unaligned support "accidentally" fixed some lurking performance bug in the existing code. Or maybe just measurement issues; trying to decently measure something low-level like this by running some Java stress test is difficult. It would be better to directly measure the differences using a focused low-level test, perhaps written using the gtest facilities.
        Show
        kbarrett Kim Barrett added a comment - I think there is something else going on here, and that the proposed change to support unaligned copies by the word copy functions should not be made. pd_disjoint_words and pd_conjoint_words provide the private implementations of the corresponding Copy::{disjoint,conjoint}_words. That public API has a precondition that the pointers are word-aligned, with asserts to ensure that in debug builds. Hence the pd_ functions never (that I could find) get called with unaligned arguments. If the pd_ functions *are* somehow called with unaligned arguments, *that* is a bug and should be fixed. As to why the test timings showed some improvement from the proposed change, I have no idea. Perhaps the process of adding unaligned support "accidentally" fixed some lurking performance bug in the existing code. Or maybe just measurement issues; trying to decently measure something low-level like this by running some Java stress test is difficult. It would be better to directly measure the differences using a focused low-level test, perhaps written using the gtest facilities.
        Hide
        enevill Ed Nevill added a comment -
        Hi Kim,

        Unaligned here refers to 64 bit alignment. Some ARM HW has large penalties copying data that is 4 byte aligned, but not 8 byte aligned.

        However I can find no traction for making this change so I am quite happy to see this closed.

        All the best,
        Ed.
        Show
        enevill Ed Nevill added a comment - Hi Kim, Unaligned here refers to 64 bit alignment. Some ARM HW has large penalties copying data that is 4 byte aligned, but not 8 byte aligned. However I can find no traction for making this change so I am quite happy to see this closed. All the best, Ed.
        Hide
        kbarrett Kim Barrett added a comment - - edited
        [~enevill] The alignment requirement that is being asserted by the public API is "word" alignment, e.g. 4 bytes in a 32bit build and 8 bytes in a 64bit build. (A 32bit build could potentially be run on 64bit hardware, and perhaps see a penalty. But since this change is for aarch64, which is 64bit only, that isn't an issue.)

        The measurement results are interesting though, and suggest there might be something else going on that could be worth investigating further by the relevant platform owners.
        Show
        kbarrett Kim Barrett added a comment - - edited [~enevill] The alignment requirement that is being asserted by the public API is "word" alignment, e.g. 4 bytes in a 32bit build and 8 bytes in a 64bit build. (A 32bit build could potentially be run on 64bit hardware, and perhaps see a penalty. But since this change is for aarch64, which is 64bit only, that isn't an issue.) The measurement results are interesting though, and suggest there might be something else going on that could be worth investigating further by the relevant platform owners.

          People

          • Assignee:
            aph Andrew Haley
            Reporter:
            enevill Ed Nevill
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated: