JDK-8272083

G1: Record iterated range for BOT performance during card scan

    Details

    • Type: Enhancement
    • Status: Open
    • Priority: P4
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hotspot
    • Subcomponent: gc

      Description

      Created on behalf of yude.lyd@alibaba-inc.com
      ----
      When we call block_start(addr), we ask the BOT (block offset table) to give us the starting address of the block that covers 'addr'.
      I think the BOT handles these queries inefficiently when block_start(addr) is called sequentially, that is, multiple times
      with ascending addresses that are close to each other.


      For example, for two addresses addr1 and addr2, we might have the following heap layout:


      | q          | addr1       | addr2
      v            v             v
      ------------------------------------------
      |            |             |
      ------------------------------------------
      ^
      | gc alloc block ----------------------- |


      It's possible that the BOT only records the large gc-allocated blocks but not the individual objects inside them.
      So when we call block_start() with either addr1 or addr2, the BOT lookup returns q.
      The BOT has the ability to fix itself. When we call block_start(addr1), it fixes the range from q to addr1 by walking
      all of the real objects in that range and updating the entries. After that, the BOT no longer returns the
      gc-allocated block address q for that range, but the real objects' addresses.
      So calling block_start(addr1) and block_start(addr2) in different orders results in different amounts of work:

      Case 1: {
      call block_start(addr1)

      block_start returns q

      enter slow path {
        fix bot from q to addr1
        return head_of(addr1)
      }

      call block_start(addr2)

      block_start returns q

      enter slow path {
        fix bot from q to addr2
        return head_of(addr2)
      }
      }

      Case 2: {
      call block_start(addr2)

      block_start returns q

      enter slow path {
        fix bot from q to addr2
        return head_of(addr2)
      }

      call block_start(addr1)

      will not enter slow path because range q to addr1 is already fixed
      return head_of(addr1)
      }
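
The two cases can be replayed on a toy model to make the extra work visible. This is only a sketch under stated assumptions, not HotSpot code: the class `ToyBOT`, the helper `replay`, the card size of 8 words, the uniform 2-word objects, and the concrete values for q, addr1 and addr2 are all made up for illustration. Each card entry starts out pointing at q, and a walk refines the entries for the cards it passes over.

```python
CARD = 8  # toy card size in words (assumption; real G1 cards are larger)


class ToyBOT:
    """Toy block offset table over one gc alloc block starting at q."""

    def __init__(self, q, obj_sizes):
        # object start -> object size; objects are laid out back to back from q
        self.size_of = {}
        addr = q
        for size in obj_sizes:
            self.size_of[addr] = size
            addr += size
        end = addr
        # Initially every card only knows the start of the whole gc alloc block.
        self.entry = {c: q for c in range(q // CARD, (end - 1) // CARD + 1)}
        self.objects_walked = 0  # how many objects the slow path stepped over

    def block_start(self, addr):
        start = self.entry[addr // CARD]
        while True:
            end = start + self.size_of[start]
            # "Fix" the BOT: cards whose first word lies in [start, end)
            # now point at this object instead of at q.
            for c in range(-(-start // CARD), -(-end // CARD)):
                self.entry[c] = start
            if addr < end:
                return start
            start = end
            self.objects_walked += 1


def replay(order):
    """Run block_start() for each address in 'order' on a fresh toy BOT."""
    bot = ToyBOT(q=0, obj_sizes=[2] * 64)
    heads = [bot.block_start(a) for a in order]
    return heads, bot.objects_walked


addr1, addr2 = 41, 101
print("Case 1:", replay([addr1, addr2]))
print("Case 2:", replay([addr2, addr1]))
```

In this toy setup Case 1 steps over 70 objects and Case 2 over 50; the 20 extra steps are the second walk over the objects between q and addr1.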


      The difference between Case 1 and Case 2 is the repeated traversal from q to addr1.
      This affects G1ScanHRForRegionClosure::scan_heap_roots and potentially any future code that walks the card table for dirty oops in
      ascending address order. It does not matter much when the addresses are sparse enough. But if we run with, say, -XX:G1ConcRefinementGreenZone=1000000
      and log calls to forward_to_block_containing_addr_slow(), we find several entries with the same q.
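
One possible reading of "record iterated range" in the title is that the scanner remembers where its previous walk ended and resumes from there when the next query is at a higher address. The sketch below illustrates that idea only. It is not the proposed patch, and `CursorScanner`, `scan`, `use_cursor` and `last_start` are hypothetical names. To isolate the effect of the cursor, the BOT lookup is replaced by a stand-in that always answers q.

```python
class CursorScanner:
    """Hypothetical scanner that records where the previous walk ended."""

    def __init__(self, q, obj_sizes, use_cursor):
        # object start -> object size; objects are laid out back to back from q
        self.size_of = {}
        addr = q
        for size in obj_sizes:
            self.size_of[addr] = size
            addr += size
        self.q = q
        self.use_cursor = use_cursor
        self.last_start = None   # head of the object found by the previous query
        self.objects_walked = 0

    def block_start(self, addr):
        # Stand-in for an unrefined BOT entry: the lookup always answers q.
        start = self.q
        # Resume from the previous result when it lies at or below addr,
        # which is the common case for an ascending card scan.
        if (self.use_cursor and self.last_start is not None
                and self.last_start <= addr):
            start = self.last_start
        while addr >= start + self.size_of[start]:
            start += self.size_of[start]
            self.objects_walked += 1
        self.last_start = start
        return start


def scan(addrs, use_cursor):
    """Query each address in order and report how many objects were walked."""
    scanner = CursorScanner(q=0, obj_sizes=[2] * 64, use_cursor=use_cursor)
    heads = [scanner.block_start(a) for a in addrs]
    return heads, scanner.objects_walked


print("ascending, no cursor:  ", scan([41, 101], use_cursor=False))
print("ascending, with cursor:", scan([41, 101], use_cursor=True))
```

With the cursor, the ascending scan in this toy walks each object at most once. A descending query falls back to the BOT answer, so the cursor only helps the ascending pattern described above.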

              People

              Assignee: Unassigned
              Reporter: Yi Yang (yyang)