JDK-8214751

X86: Support for VNNI Instructions

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P4
    • Resolution: Fixed
    • Affects Version/s: 12
    • Fix Version/s: 12
    • Component/s: hotspot
    • Resolved In Build:
      b24
    • CPU:
      x86

        Description

        This enhancement adds support for the VNNI VPDPWSSD instruction through auto-vectorization.

        It can vectorize this operation in the loop:
        out[i] += ((in1[2*i] * in2[2*i]) + (in1[2*i+1] * in2[2*i+1]));
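
A minimal, self-contained sketch of such a loop follows. The class and method names are hypothetical, and the element types (short inputs, int output) are an inference from the 2-byte and 4-byte address scaling in the generated code quoted in this issue, not something the description states.

```java
import java.util.Arrays;

// Hypothetical names; only the loop body is taken from the issue text.
final class PairwiseDotProduct {

    // Adds the products of adjacent 16-bit pairs into 32-bit accumulators.
    // Assumes in1 and in2 each hold at least 2 * out.length elements.
    static void accumulate(int[] out, short[] in1, short[] in2) {
        for (int i = 0; i < out.length; i++) {
            out[i] += ((in1[2 * i] * in2[2 * i]) + (in1[2 * i + 1] * in2[2 * i + 1]));
        }
    }

    public static void main(String[] args) {
        int[] out = {10, 20};
        short[] in1 = {1, 2, 3, 4};
        short[] in2 = {5, 6, 7, 8};
        accumulate(out, in1, in2);
        System.out.println(Arrays.toString(out)); // prints [27, 73]
    }
}
```

Whether the JIT actually vectorizes a given loop depends on the JVM build, the CPU, and the loop shape, so this is only an example of the pattern, not a guarantee of the generated code.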

        This patch is useful for machine learning and deep learning applications such as convolution-based neural networks.

        More information on VNNI can be found here:
        https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
        Code contributed by: razvan.a.lupusoru@intel.com and vdeshpande(vivek.r.deshpande@intel.com)

        The initial performance gain measured with a microbenchmark on Skylake with AVX3 is 10.8x, and it generates:
        vmovdqu xmm3, xmmword ptr [rbp+r8*2+0x10]
        vmovdqu xmm6, xmmword ptr [rdx+r8*2+0x10]
        vpmaddwd xmm3, xmm6, xmm3
        vpaddd xmm3, xmm3, xmmword ptr [r9+rdi*4+0x10]
        vmovdqu xmmword ptr [r9+rdi*4+0x10], xmm3

        On Cascade Lake, it can generate the vpdpwssd instruction instead.
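
To illustrate how the two code shapes relate, here is a scalar Java model of a single 32-bit lane. It is a sketch for illustration only: the class and method names are hypothetical, and it models the arithmetic rather than the JIT's actual implementation. It assumes the non-saturating behavior of vpdpwssd, which matches Java's wrapping int arithmetic.

```java
// Hypothetical scalar model of one 32-bit lane; not code from the patch.
final class VnniLaneModel {

    // Models one lane of vpmaddwd: multiply two signed 16-bit pairs
    // and add the two 32-bit products.
    static int multiplyAddPairs(short a0, short a1, short b0, short b1) {
        return a0 * b0 + a1 * b1;
    }

    // Two-instruction shape: vpmaddwd followed by vpaddd with the accumulator.
    static int twoStep(int acc, short a0, short a1, short b0, short b1) {
        return acc + multiplyAddPairs(a0, a1, b0, b1);
    }

    // Single-instruction shape: vpdpwssd multiplies the pairs and
    // accumulates into the destination lane in one step (no saturation).
    static int fused(int acc, short a0, short a1, short b0, short b1) {
        return acc + a0 * b0 + a1 * b1;
    }

    public static void main(String[] args) {
        int a = twoStep(100, (short) 3, (short) -4, (short) 5, (short) 6);
        int b = fused(100, (short) 3, (short) -4, (short) 5, (short) 6);
        System.out.println(a + " " + b); // prints 91 91
    }
}
```

Both shapes compute the same lane value, so the fused instruction replaces the multiply-add and the separate accumulate with one operation.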

        The webrev is here:
        http://cr.openjdk.java.net/~vdeshpande/8214751/VNNI/webrev.00/

                People

                • Assignee:
                  vdeshpande Vivek Deshpande
                  Reporter:
                  vdeshpande Vivek Deshpande