JDK-8253230

G1 20% slower than Parallel in JRuby rubykon benchmark

    Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 16
    • Fix Version/s: 18
    • Component/s: hotspot
    • Subcomponent: gc

      Description

      The JRuby bug tracker has a report that later JDKs (e.g. JDK 14, the latest at the time) are about 20% slower than earlier ones (https://github.com/jruby/jruby/issues/5789 via https://twitter.com/headius/status/1297992914832769024).

      The main reason is the change of the default GC in JDK 9 (from Parallel to G1); however, the difference is abnormally high, so it is reported here. The typical observed difference for known outliers is around 10%.

      After some tuning, i.e. setting -Xms == -Xmx and using 32M regions, the difference can be reduced somewhat, to ~13-15%.

      One suspicion is the barriers, as reported by [~shade] (in that bug report):

      "Tested with recent JDK 13 EA and multiple collectors. Judging from GC logs, it is heavily-allocating, but fairly young-gc workload. Both Parallel and G1 run very short Young GCs during the run, taking about 1% of total time, which means allocation pressure itself is not the issue here."

      Local results:

       #  GC        score  % of baseline  options
       1  parallel  17,26  100,0%         -Xmx1500m (oob)
       2  g1        13,64   79,0%         -Xmx1500m (oob)

       3  parallel  17,16  100,0%         -Xmx1500m -Xms1500m -Xmn1000m
       4  g1        13,99   81,5%         -Xmx1500m -Xms1500m -Xmn1000m
       5  g1        14,36   83,7%         -Xmx1500m -Xms1500m -Xmn1000m (rerun)
       6  g1        15,13   88,2%         -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m
       7  g1        14,90   86,8%         -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m (rerun)

       8  parallel  13,81  100,0%         graal -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m
       9  g1        13,11   94,9%         graal -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m

      The interesting runs are 8 and 9, with Graal. Graal seems slower overall, but it also shows a much smaller difference between the two collectors (about 5%). So potentially some C2 optimizations only kick in with Parallel GC's (small) barriers.
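
The "% of baseline" figures in the results can be rechecked from the raw scores: each G1 score is divided by the Parallel score of the same group. The following is a minimal sketch of that arithmetic. The helper name pct_of_baseline is hypothetical (it is not part of JRuby or the JDK), and it assumes a POSIX shell with awk and tr available.

```shell
# Hypothetical helper: print SCORE as a percentage of BASELINE, one decimal.
# Accepts decimal commas (as in the results listing) or decimal points.
pct_of_baseline() {
  score=$(printf '%s' "$1" | tr ',' '.')
  baseline=$(printf '%s' "$2" | tr ',' '.')
  # LC_ALL=C keeps awk's output independent of the user's numeric locale.
  LC_ALL=C awk -v s="$score" -v b="$baseline" 'BEGIN { printf "%.1f\n", 100 * s / b }'
}

pct_of_baseline 13,64 17,26   # run 2 vs run 1 (defaults)
pct_of_baseline 15,13 17,16   # run 6 vs run 3 (tuned, 32m regions)
pct_of_baseline 13,11 13,81   # run 9 vs run 8 (Graal)
```

The three calls print 79.0, 88.2 and 94.9, which match the percentages listed for runs 2, 6 and 9.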

      Some initial playing with -XX:MaxInlineSize and -XX:FreqInlineSize did not yield interesting results.

      Reproduction:
      * Download JRuby from https://www.jruby.org/download
      * Clone https://github.com/PragTob/rubykon
      * Run jruby -Xcompile.invokedynamic=true -J-Xmx1500m benchmark/mcts_avg.rb

      JRuby will pick up the VM pointed to by JAVA_HOME; you can check which with "jruby -v".
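
To compare both collectors on the same JDK, the collector can be selected explicitly instead of relying on the JDK default. The report does not say how the collector was chosen for each run, so the explicit -XX:+UseParallelGC and -XX:+UseG1GC flags below are an assumption, and the helper name rubykon_cmd is hypothetical. The sketch only prints the command line, so it can be inspected before running it from the rubykon checkout.

```shell
# Hypothetical helper: print (not run) the jruby command line for one collector.
# Usage: rubykon_cmd parallel|g1 [extra JVM flags...]
rubykon_cmd() {
  case "$1" in
    parallel) gc_flag="-XX:+UseParallelGC" ;;
    g1)       gc_flag="-XX:+UseG1GC" ;;
    *) echo "unknown collector: $1" >&2; return 1 ;;
  esac
  shift
  # Base command taken from the reproduction steps; -J passes a flag to the JVM.
  cmd="jruby -Xcompile.invokedynamic=true -J-Xmx1500m -J$gc_flag"
  for flag in "$@"; do
    cmd="$cmd -J$flag"
  done
  echo "$cmd benchmark/mcts_avg.rb"
}

rubykon_cmd parallel
rubykon_cmd g1 -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m
```

The second call adds the tuning options used in runs 6 and 7. Running the printed command is left to the user, for example with eval "$(rubykon_cmd g1)".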



              People

              Assignee: Unassigned
              Reporter: Thomas Schatzl (tschatzl)
