Thread A (Relocation):
The membar guarantees that the contents of the forwarded object are visible before its forwarding entry can be observed. Since load_object_content(ref) has a data dependency on the result of load_forwarding_table(), the load_acquire can safely be changed to a plain (relaxed) load.
Thread B (Remapping/Relocation):
ref = load_forwarding_table(); // acquire (current version) -> relaxed (our proposal)
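
The publish/consume pattern described above can be sketched in standalone C++ as follows. This is only an illustration, not HotSpot code: `Object`, `forwarding_entry`, `relocate_and_publish` and `load_forwarded_payload` are hypothetical names, and a single atomic pointer stands in for the forwarding table. The sketch keeps an acquire load on the reader side because ISO C++ does not guarantee ordering for a relaxed load followed by a dependent load; the proposal relies on the hardware preserving that dependency order.

```cpp
#include <atomic>

// Hypothetical stand-in for a heap object; not HotSpot's actual type.
struct Object {
    int payload;
};

// One forwarding-table slot (assumption): null until the object is relocated.
static std::atomic<Object*> forwarding_entry{nullptr};

// Relocation side: copy the object first, then publish its new address.
void relocate_and_publish(const Object& from, Object& to) {
    to.payload = from.payload;                               // copy object contents
    std::atomic_thread_fence(std::memory_order_release);     // the "membar"
    forwarding_entry.store(&to, std::memory_order_relaxed);  // install the entry
}

// Remapping/relocation side: load the entry, then read through it.
// Returns false if the object has not been forwarded yet.
bool load_forwarded_payload(int& out) {
    // Acquire keeps this sketch correct under the ISO C++ memory model.
    // The proposal corresponds to a relaxed load at this point, relying on
    // the hardware ordering the dependent load below after this one.
    Object* ref = forwarding_entry.load(std::memory_order_acquire);
    if (ref == nullptr) {
        return false;
    }
    out = ref->payload;  // the address of this load depends on ref
    return true;
}
```

In a real run, `relocate_and_publish` and `load_forwarded_payload` would execute on different threads; the reader either sees no entry or sees an entry whose contents are fully copied.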
Our experiment on heapothesys shows a reduction of more than 5% in the time spent in concurrent mark/relocation on AArch64.