JDK-8218628: Add detailed message to NullPointerException describing what is null.


    • Understanding:
      Fix Understood


      This Enhancement implements the algorithm to compute the null-detail message specified in JEP JDK-8220715: "Helpful NullPointerExceptions".

      Further parts of the JEP are implemented in JDK-8221077.

      The messages printed are described in the JEP. A list of examples is attached to the JEP and can be found here:

      [Messages if classfiles contain debug info.](http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/12/output_with_debug_info.txt) [Messages if classfiles contain no debug info.](http://cr.openjdk.java.net/~goetz/wr19/8218628-exMsg-NPE/12/output_no_debug_info.txt)

      This RFE gives some details on the implementation on top of what is described in the JEP.

      ### Modifications to class NullPointerException

      The JEP states that the null-detail message is computed lazily and only on demand.

      This is implemented by adding a getMessage() method to NullPointerException. If the field Throwable.detailMessage is empty, it calls the native method `getExtendedNPEMessage()`, which is implemented in the virtual machine. This method returns a String object containing the null-detail message, or null if it failed to compute the message.
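The on-demand pattern described above can be sketched in plain Java. This is a minimal illustration, not the JDK source: the class `LazyMessageException`, the method `computeDetail()` (standing in for the native `getExtendedNPEMessage()`), and the caching of the result are all assumptions made for the example.

```java
// Hypothetical stand-in for the lazy-message pattern; not the JDK implementation.
class LazyMessageException extends RuntimeException {
    private boolean computed;   // true once the detail has been requested
    private String lazyDetail;  // cached result; stays null if computation fails

    LazyMessageException() { super(); }
    LazyMessageException(String message) { super(message); }

    // Placeholder for the native getExtendedNPEMessage(); may return null.
    protected String computeDetail() {
        return "detail computed on demand";
    }

    @Override
    public synchronized String getMessage() {
        String explicit = super.getMessage();
        if (explicit != null) {
            return explicit;    // a message passed to the constructor wins
        }
        if (!computed) {        // compute at most once (an assumption of this sketch)
            computed = true;
            lazyDetail = computeDetail();
        }
        return lazyDetail;
    }
}
```

Nothing is computed when the exception is constructed; the cost is paid only if some caller actually asks for the message.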

      ### Basic algorithm to compute the message

      This algorithm is passed the bytecodes of a method and the bytecode index where the exception occurred.

      First, the information to step backwards over the bytecodes is computed by a dataflow analysis. This is implemented in the constructor of class `ExceptionMessageBuilder`.

      For each bytecode, it builds a stack of `StackSlotAnalysisData` called `SimulatedOperandStack` describing each operand stack slot. The `StackSlotAnalysisData` is the analysis information for a stack slot and contains the bytecode index of the producer and the Java type of the value on the stack. It maintains an array indexed by bytecode indices containing these stacks.

      The analysis walks forward over the bytecodes simulating the effects of the bytecode. The simulation first duplicates the stack assigned to this bytecode. Then it pops as many `StackSlotAnalysisData` entries from the copied `SimulatedOperandStack` as the bytecode would pop stack slots. It then pushes, according to the semantics of the bytecode, new `StackSlotAnalysisData` entries on the stack and assigns the copy to the next bytecode.

      Let's look at a possible bytecode `8: getfield #13` at bytecode index 8. Let's assume there are already 5 slots on the stack for bytecode 8. The analysis then copies the stack of depth 5. The getfield bytecode pops an object reference, thus the entry at the top of the `SimulatedOperandStack` must be a reference. This entry is removed. If the constant pool entry #13 describes a reference field, a new entry <reference, 8> is pushed. The next bytecode is at index 11 and gets the new stack assigned.

      If bytecode 11 is another getfield and causes a NullPointerException, the operand slot 5 contained null. The `StackSlotAnalysisData` of the `SimulatedOperandStack` for that slot is <reference, 8>, which tells us that bytecode 8 pushed the null. This way we can step backwards over the bytecodes.
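The forward simulation can be illustrated with a small Java model. This is a sketch under simplifying assumptions, not the HotSpot code: it handles straight-line code only, every toy instruction pushes at most one slot, and the names `StackSimulationSketch`, `Slot`, and `Insn` are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class StackSimulationSketch {
    // One simulated operand stack slot: value type plus bci of the producer.
    static final class Slot {
        final String type;
        final int producerBci;
        Slot(String type, int producerBci) {
            this.type = type;
            this.producerBci = producerBci;
        }
        @Override public String toString() {
            return "<" + type + ", " + producerBci + ">";
        }
    }

    // Toy instruction: pops `pops` slots, then pushes one slot of
    // `pushType` (or nothing if pushType is null).
    static final class Insn {
        final int bci;
        final String name;
        final int pops;
        final String pushType;
        Insn(int bci, String name, int pops, String pushType) {
            this.bci = bci;
            this.name = name;
            this.pops = pops;
            this.pushType = pushType;
        }
    }

    // Returns the stack valid before each instruction (index i), plus the
    // stack after the last instruction as the final element.
    static List<List<Slot>> simulate(List<Insn> code) {
        List<List<Slot>> stacks = new ArrayList<>();
        stacks.add(new ArrayList<Slot>());           // empty stack at method entry
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            List<Slot> copy = new ArrayList<>(stacks.get(i)); // duplicate incoming stack
            for (int p = 0; p < insn.pops; p++) {
                copy.remove(copy.size() - 1);        // pop consumed operands
            }
            if (insn.pushType != null) {
                copy.add(new Slot(insn.pushType, insn.bci)); // record the producer
            }
            stacks.add(copy);                        // stack of the next instruction
        }
        return stacks;
    }
}
```

For the sequence `7: aload_1`, `8: getfield #13`, `11: getfield #14`, the stack recorded for bytecode 11 has `<reference, 8>` on top, so a NullPointerException at 11 points back to bytecode 8 as the producer of the null.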

      The stack for the first bytecode is initialized to be empty. At bytecodes altering the control flow the stack is just copied to the target. In case of several control flow targets it is copied to all. At a merge point of control flow, the stacks of the predecessors must be merged. All stack slots that are not identical are set to be undefined. Thus, walking backwards past control flow merges is not always possible.
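The merge rule can be sketched as follows. This is a simplified illustration, not the HotSpot code: each stack is reduced to an array of producer bytecode indices (the real entries also carry a type), and the `UNDEFINED` marker and the class name `StackMergeSketch` are assumptions of the example.

```java
class StackMergeSketch {
    static final int UNDEFINED = -1;  // hypothetical marker: producer unknown

    // Merges two predecessor stacks (producer bcis, bottom slot first).
    // Slots that differ between the predecessors become UNDEFINED.
    static int[] merge(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("stack depths differ at merge point");
        }
        int[] merged = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            merged[i] = (a[i] == b[i]) ? a[i] : UNDEFINED;
        }
        return merged;
    }
}
```

For example, merging `{2, 5}` with `{2, 9}` gives `{2, UNDEFINED}`: the bottom slot has the same producer on both paths, while the top slot was pushed by different bytecodes and can no longer be traced back.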

      The access paths computed by this algorithm usually stem from expressions in Java code. Expressions usually do not contain control flow. (An exception is the `?:` operator.) Therefore the algorithm rarely tries to walk back past control flow merges.

      Assembling the first part of the message is implemented in `print_NPE_failed_action()`. It takes the bytecode at the index passed to the algorithm. A switch over the bytecodes listed in Table 1 of the JEP prints this part of the message.

      Assembling the second part of the message is implemented in `print_NPE_cause()`. `print_NPE_cause()` is called with a bytecode index `bci` and an operand stack slot `slot`. It first steps back to the bytecode that pushed this operand stack slot. The index of this bytecode is taken from the corresponding `StackSlotAnalysisData` at `bci` and `slot`.

      When `print_NPE_cause()` is called from the main algorithm, it steps back to the bytecode that pushed the null value. The slot needed for this stepping is computed by `get_NPE_null_slot()` which is, similar to `print_NPE_cause()`, a switch over the bytecodes listed in Table 1 of the JEP.
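A switch of this kind might look like the following Java sketch. It is illustrative only and covers just a handful of bytecodes. The slot numbering (0 = top of the operand stack), the `-1` result for unmodeled bytecodes, and the names `NullSlotSketch`, `nullSlot`, and `valueOrArgSlots` are assumptions of the example; the slot positions themselves follow from the operand stack layout each bytecode expects.

```java
class NullSlotSketch {
    // Returns the operand stack slot, counted from the top (0 = top), that
    // holds the reference whose null value triggers the exception.
    // valueOrArgSlots: slots occupied by the field value (putfield) or by
    // the call arguments (invokes); ignored for the other bytecodes.
    static int nullSlot(String bytecode, int valueOrArgSlots) {
        switch (bytecode) {
            case "getfield":
            case "arraylength":
            case "athrow":
            case "monitorenter":
            case "monitorexit":
                return 0;                // the reference is on top of the stack
            case "aaload":
            case "iaload":
                return 1;                // index on top, array reference below
            case "aastore":
            case "iastore":
                return 2;                // value, index, then array reference
            case "lastore":
            case "dastore":
                return 3;                // two-slot value, index, array reference
            case "putfield":
            case "invokevirtual":
            case "invokespecial":
            case "invokeinterface":
                return valueOrArgSlots;  // receiver lies below value/arguments
            default:
                return -1;               // not modeled in this sketch
        }
    }
}
```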

      The message part of `print_NPE_cause()` is generated by a switch over the bytecodes listed in Table 2 of the JEP. It calls itself recursively to compute access paths covering several bytecodes. For array loads, it even does two recursive calls: to the bytecode that pushed the array reference and to the bytecode that pushed the array index.

      `print_NPE_cause()` keeps a counter of the steps walked back to limit its complexity and to limit the size of the access path printed. It walks back at most 5 steps.

      `print_NPE_cause()` does cover all the bytecodes listed in Table 2 of the JEP. Arithmetic bytecodes such as `iadd` and casts such as `l2i` can be encountered in array index expressions but are not implemented. As stated in the JEP, '...' is printed instead.
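The recursive walk can be sketched in Java as below. This is a toy model, not the HotSpot code: instructions are looked up in a map by bytecode index, each instruction already knows which bytecodes produced its operands, and only a few opcodes are handled. The names `AccessPathSketch`, `Insn`, and `printCause` are invented, and printing '...' when the step budget runs out is an assumption of this sketch.

```java
import java.util.Map;

class AccessPathSketch {
    static final int MAX_STEPS = 5;   // step limit named in the text

    static final class Insn {
        final String op;              // e.g. "aload", "getfield", "aaload", "iadd"
        final String name;            // variable or field name, may be null
        final int[] operandProducers; // bcis that pushed this instruction's operands
        Insn(String op, String name, int... operandProducers) {
            this.op = op;
            this.name = name;
            this.operandProducers = operandProducers;
        }
    }

    // Appends a description of the value pushed by the instruction at `bci`.
    static void printCause(Map<Integer, Insn> code, int bci, int stepsLeft, StringBuilder out) {
        Insn insn = code.get(bci);
        if (insn == null || stepsLeft <= 0) {
            out.append("...");        // unknown producer or budget exhausted
            return;
        }
        switch (insn.op) {
            case "aload":
            case "iload":
                out.append(insn.name);
                break;
            case "getfield":
                printCause(code, insn.operandProducers[0], stepsLeft - 1, out);
                out.append('.').append(insn.name);
                break;
            case "aaload":
                // Two recursive calls: array reference first, then the index.
                printCause(code, insn.operandProducers[0], stepsLeft - 1, out);
                out.append('[');
                printCause(code, insn.operandProducers[1], stepsLeft - 1, out);
                out.append(']');
                break;
            default:
                out.append("...");    // e.g. iadd or l2i are not described
        }
    }
}
```

For the expression `obj.items[i + 1]`, modeled as `0: aload obj`, `1: getfield items`, `4: iload i`, `5: iconst_1`, `6: iadd`, `7: aaload`, calling `printCause(code, 7, MAX_STEPS, out)` yields `obj.items[...]`, because the index was pushed by `iadd`.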

      All message text is printed directly to a StringStream passed to the algorithm.

      ### Computing the message delayed

      The algorithm to compute the message needs the bytecodes and the bytecode index of where the exception occurred. Obviously, both are available in internal data structures of the virtual machine when the exception is thrown.

      As stated in the JEP, the message is computed delayed, though. Thus, the bytecodes and the bytecode index must be preserved until the message is computed.

      Instead of preserving this information for the exception, we rely on an internal data structure of Throwable, the backtrace. A Throwable contains an array of StackTraceElements representing the stack when the exception occurred. This array is also computed lazily and only on demand. The information needed for this delayed computation is preserved in the backtrace data structure, which contains references to the internal representation of methods and bytecode indices.

      The top frame representation of this backtrace data structure contains just what we need to compute the exception message lazily.

      The method `java_lang_Throwable::get_method_and_bci()` accesses this data structure and returns a pointer to the method containing the bytecodes we need and the bci.

      As we compute the message lazily, the method can be unloaded or rewritten in the meantime. If so, the bytecodes are not found and `get_method_and_bci()` fails. No message is returned. Similarly, the flag OmitStackTraceInFastThrow is on by default. If the stack trace is omitted for an NPE, the backtrace data structure is missing and no message is returned.

      The internal backtrace data structure cannot be serialized. Therefore, if a Throwable is serialized, the Java-level StackTraceElement[] is computed. As specified in the JEP, this does not happen for the null-detail message of NPE. Thus, the message of the NPE will be empty after deserialization, and the algorithm to compute it will be called if the message is accessed. `get_method_and_bci()` will fail as the backtrace data structure is not there, and no message is returned.

      ### Detecting cases where NPE is thrown explicitly

      The null-detail message is not printed if the exception was thrown by the user.

      If this is the case, the bytecode at the bytecode index passed to the algorithm is an invoke of the NullPointerException::<init> method. This is recognized in `get_NPE_null_slot()` when we first look at the bytecode and null is returned for the message.

      Alternatively, the NullPointerException can be constructed via JNI. If so, the holder of the method is NativeConstructorAccessorImpl. This is checked after obtaining the method with the bytecodes.
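The difference between the two cases can be observed from Java code. The sketch below is only a demonstration; the class and method names are invented. For the explicitly thrown exception `getMessage()` returns null. What the implicit case returns depends on the JDK version and on whether the feature is enabled, so no particular text is assumed here.

```java
class ExplicitVsImplicitNpe {
    // NPE created and thrown by user code: no null-detail message.
    static String explicitMessage() {
        try {
            throw new NullPointerException();
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    // NPE raised by the virtual machine when dereferencing null.
    static String implicitMessage() {
        String s = null;
        try {
            return String.valueOf(s.length());
        } catch (NullPointerException e) {
            return e.getMessage();   // null or a null-detail message
        }
    }

    public static void main(String[] args) {
        System.out.println("explicit: " + explicitMessage());
        System.out.println("implicit: " + implicitMessage());
    }
}
```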

      ### Handling hidden frames

      Exceptions can print the stack trace of when the exception occurred.
      There can be methods on the stack that have been generated by
      the runtime. To not confuse the reader of the stack trace,
      these are omitted (if -XX:-ShowHiddenFrames is set, which is the default).

      The frames are already dropped when the backtrace data structure is built.

      If a NullPointerException is raised in a method whose stack
      frame will be hidden, a message based on the wrong method
      is printed. This is because the message is generated based on
      the backtrace, which lacks the real top frame.

      This change adds a java.lang.Boolean(true) to the backtrace
      in case the real top frame is a hidden one and is dropped.

      When the NullPointerException message text is generated, this is
      checked and the message is skipped if the proper frame is not available.

      I handle this in a separate bug to keep the discussion of
      the basic feature separate from this technical issue.

      The problem exists independent of the language chosen
      to implement generating the message.

      ### A flag to switch off the feature.

      Add a manageable flag called -XX:SuppressCodeDetailsInExceptionMessages to
      control whether exception messages contain code details such as the new
      null-detail message described above.
      The default value is "true", i.e., the message is off by default.
      This was requested by Oracle.





              • Assignee:
                goetz Goetz Lindenmaier