JDK-8220465

Use shadow regions for faster ParallelGC full GCs

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P4
    • Resolution: Fixed
    • Affects Version/s: 13
    • Fix Version/s: 14
    • Component/s: hotspot
    • Subcomponent: gc
    • Resolved In Build: b27

      Description

      [Problem described and patch submitted by Haoyu Li.]

      ParallelGC uses a compacting algorithm for full GC. We find that this algorithm leads to very poor GC thread utilization (for example, only 8% on the Derby benchmark in the SPECjvm2008 suite). The cause is serious dependencies between heap regions: a region can receive live objects from its source regions only after it has itself been collected. Work stealing does not solve this problem, because idle GC threads have nothing to steal while most regions are still unavailable.

      Optimization:
      We propose using shadow regions to solve this problem. The basic idea is to let GC threads process unavailable regions in advance by copying the live data destined for them into newly allocated empty regions, called shadow regions, which resolves the region dependencies. The contents of each shadow region are copied back to the corresponding heap region later. With this approach, GC threads can keep working most of the time without work stealing failures (except for those that occur at the end of a full GC).
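
      The dependency chain and the effect of shadow regions can be illustrated with a toy model. The sketch below is not the HotSpot implementation: the function names, the round-based scheduling, and the assumption that filling a region and copying a shadow region back each cost one round are all made up for illustration.

```python
def destination_sources(live_words, region_size):
    """Map each destination region to the source regions whose live words land in it."""
    sources = {}
    offset = 0  # next free word in the compacted heap
    for src, words in enumerate(live_words):
        for _ in range(words):
            sources.setdefault(offset // region_size, set()).add(src)
            offset += 1
    return sources


def rounds_to_compact(live_words, region_size, threads, use_shadow):
    """Count scheduling rounds in a toy model; each thread does one task per round."""
    sources = destination_sources(live_words, region_size)
    produced = set()   # destinations whose contents have been read out of their sources
    in_shadow = set()  # produced into a shadow region, waiting to be copied back
    done = set()       # destinations whose final contents are in place

    def available(region):
        # A region may be overwritten once every other destination that needs
        # data living in it has already read that data out.
        return all(dest == region or dest in produced
                   for dest, srcs in sources.items() if region in srcs)

    rounds = 0
    while len(done) < len(sources):
        free = {r for r in sources if available(r)}  # snapshot at start of round
        work = [("copy_back", r) for r in sorted(in_shadow & free)]
        work += [("fill", r) for r in sorted(free - produced)]
        if use_shadow:
            # Otherwise idle threads fill blocked regions into shadow regions.
            work += [("shadow", r) for r in sorted(set(sources) - produced - free)]
        for kind, region in work[:threads]:
            if kind == "copy_back":
                in_shadow.discard(region)
                done.add(region)
            elif kind == "fill":
                produced.add(region)
                done.add(region)
            else:
                produced.add(region)
                in_shadow.add(region)
        rounds += 1
    return rounds


# 16 half-full regions of 4 words each, 8 GC threads (toy numbers only).
print(rounds_to_compact([2] * 16, 4, threads=8, use_shadow=False))  # 4
print(rounds_to_compact([2] * 16, 4, threads=8, use_shadow=True))   # 2
```

      In this toy heap each destination is blocked by an earlier one, so without shadow regions only a few threads have work in each round. The round counts are properties of the model only and say nothing about real speedups.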

      By default, shadow regions are allocated from off-heap memory. However, we notice that the to-space in the young generation is always empty, so we can reduce off-heap allocations by using the empty regions in to-space as shadow regions. If the ScavengeBeforeFullGC option is on, regions in the eden space may also be used as shadow regions.
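
      One way to picture this allocation policy is a small pool that hands out borrowed young-gen regions first and falls back to off-heap memory only when none are left. This is a minimal sketch under stated assumptions: the class name ShadowRegionPool, its methods, and the idea of returning regions to the pool for reuse are hypothetical and are not taken from the patch.

```python
class ShadowRegionPool:
    """Toy pool of shadow regions (hypothetical API, for illustration only)."""

    def __init__(self, borrowed_regions):
        # Empty heap regions that can be borrowed, e.g. from to-space
        # (and from eden if a scavenge ran before the full GC).
        self._free = list(borrowed_regions)
        self.off_heap_allocations = 0

    def acquire(self):
        """Return a borrowed region if one is free, else allocate off-heap."""
        if self._free:
            return self._free.pop()
        self.off_heap_allocations += 1
        return ("off-heap", self.off_heap_allocations)

    def release(self, region):
        """Assumption: a shadow region is reusable after it has been copied back."""
        self._free.append(region)


pool = ShadowRegionPool(["to-space-0", "to-space-1"])
first = pool.acquire()    # borrowed from to-space
second = pool.acquire()   # borrowed from to-space
third = pool.acquire()    # pool is empty, so this one is off-heap
pool.release(first)       # copy-back finished, region can be reused
fourth = pool.acquire()   # reuses the released region, no new allocation
```

      The counter off_heap_allocations makes it easy to see how many regions had to come from off-heap memory for a given sequence of requests.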

      Evaluation:
      We evaluated full GC performance with our patch on the DaCapo, SPECjvm2008, and JOlden benchmark suites. The results show that the shadow region optimization can improve full GC throughput by 2.1X on average and by up to 3.2X.

              People

              • Assignee: Stefan Johansson (sjohanss)
              • Reporter: Kim Barrett (kbarrett)
              • Votes: 0
              • Watchers: 5
