JDK-8205924

ZGC: Premature OOME due to failure to expand backing file

    Details

    • Subcomponent:
      gc
    • Resolved In Build:
      b22

        Description

        ZGC currently assumes that the backing file system has enough space to hold the max heap size (-Xmx). This is not always true: the backing file system might be misconfigured, or another process might be using space on it. In this situation, ZGC tries (and fails) to map more memory every time a new page needs to be allocated (assuming the request can't be satisfied by the page cache). As a result, we fail to flush the page cache, which in turn means we throw a premature OOME, and we repeatedly pay the cost of fallocate() syscalls that will never succeed. We should instead detect this situation, flush the page cache, and avoid making further fallocate() calls.
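
        The proposed behavior can be illustrated with a minimal sketch. This is not the HotSpot implementation: BackingFile, allocate, and the try_expand callback (which stands in for an fallocate() call) are hypothetical names, and the page cache is reduced to a single byte counter. The sketch assumes that one failed expansion means the file cannot grow any further, so it lowers the limit to the current size and falls back to memory held in the cache.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical sketch, not HotSpot code. All names are illustrative.
class BackingFile {
public:
  // try_expand stands in for an fallocate() call and returns true on success.
  BackingFile(size_t max_size, std::function<bool(size_t)> try_expand)
    : _max_size(max_size), _size(0), _try_expand(std::move(try_expand)) {}

  // Tries to grow the file by 'bytes'. After the first failed attempt the
  // limit is lowered to the current size, so later calls return false
  // without invoking try_expand again.
  bool expand(size_t bytes) {
    if (bytes > _max_size - _size) {
      return false;              // at (possibly lowered) limit, no syscall
    }
    if (!_try_expand(bytes)) {
      _max_size = _size;         // remember that the file cannot grow
      return false;
    }
    _size += bytes;
    return true;
  }

  size_t size() const { return _size; }
  size_t max_size() const { return _max_size; }

private:
  size_t _max_size;
  size_t _size;
  std::function<bool(size_t)> _try_expand;
};

// Returns true if 'bytes' could be provided, either by growing the file or,
// when growth is impossible, by reusing memory released from the cache.
// 'cached_bytes' is a simplified stand-in for the page cache.
bool allocate(BackingFile& file, size_t& cached_bytes, size_t bytes) {
  if (file.expand(bytes)) {
    return true;
  }
  if (cached_bytes >= bytes) {
    cached_bytes -= bytes;       // "flush" part of the cache and reuse it
    return true;
  }
  return false;                  // nothing left: a genuine out-of-memory
}
```

        Lowering the limit on the first failure is a deliberately conservative choice in this sketch: it trades the chance of using space that is freed later for the guarantee that no further failing calls are made.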

        This issue has been seen now and then in various tests (e.g. RunThese30M and Kitchensink), typically on machines running older kernels without support for memfd_create(), where we fall back to using /dev/shm, which sometimes doesn't have enough space to hold the given max heap size (default tmpfs size is 50% of the RAM in the machine).
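
        To see whether a given machine is exposed to this, the space available on the fallback file system can be compared against the intended max heap size. The sketch below uses the POSIX statvfs() call; the helper name has_space_for is hypothetical, and the check is only a diagnostic, since other processes can consume the space after it runs. It is not part of the fix described above.

```cpp
#include <sys/statvfs.h>
#include <cstdint>

// Hypothetical diagnostic helper, not part of ZGC.
// Returns true if the file system containing 'path' currently reports at
// least 'bytes' available to unprivileged users. Returns false if the
// statvfs() call itself fails (for example, if 'path' does not exist).
bool has_space_for(const char* path, uint64_t bytes) {
  struct statvfs st;
  if (statvfs(path, &st) != 0) {
    return false;
  }
  // f_bavail is counted in units of f_frsize bytes.
  const uint64_t avail = static_cast<uint64_t>(st.f_bavail) * st.f_frsize;
  return avail >= bytes;
}
```

        For example, calling has_space_for("/dev/shm", max_heap_bytes) with the value passed to -Xmx (max_heap_bytes being a caller-supplied variable) gives a rough indication of whether the tmpfs can hold the whole heap at that moment.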

                People

                Assignee:
                pliden Per Liden
                Reporter:
                pliden Per Liden
                Votes:
                0
                Watchers:
                6
