JDK-8130227

JEP 274: Enhanced Method Handles

    Details

    • Author:
      Michael Haupt
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Subcomponent:
    • Scope:
      SE
    • Discussion:
      mlvm dash dev at openjdk dot java dot net
    • Alert Status:
       Green
    • JEP Number:
      274

      Description

      Summary

      Enhance the MethodHandle, MethodHandles, and MethodHandles.Lookup classes of the java.lang.invoke package to ease common use cases and enable better compiler optimizations by means of new MethodHandle combinators and lookup refinement.

      Goals

      • In the MethodHandles class in the java.lang.invoke package, provide new MethodHandle combinators for loops and try/finally blocks.

      • Enhance the MethodHandle and MethodHandles classes with new MethodHandle combinators for argument handling.

      • Implement new lookups for interface methods and, optionally, super constructors in the MethodHandles.Lookup class.

      Non-Goals

      • With the exception of possibly-required native functionality, VM-level extensions and enhancements, specifically compiler optimizations, are a non-goal.

      • Extensions at the Java language level are explicitly out of scope.

      Motivation

      In a thread on the mlvm-dev mailing list (part 1, part 2) developers have discussed possible extensions to the MethodHandle, MethodHandles, and MethodHandles.Lookup classes in the java.lang.invoke package to make the realization of common use cases easier, and also to allow for use cases that are deemed important but are currently not supported.

      The extensions proposed below not only allow for more concise usage of the MethodHandle API, but also reduce the number of MethodHandle instances created in some cases. This, in turn, will facilitate better optimizations by the VM's compiler.

      Combinators for More Statements

      Loops. The MethodHandles class provides no abstractions for loop construction from MethodHandle instances. There should be a means for constructing loops from MethodHandles representing the loop's body, as well as initialization and condition, or count.

      Try/finally blocks. MethodHandles also provides no abstraction for try/finally blocks. A method to construct such blocks from method handles representing the try and finally parts should be provided.

      Better Argument Handling

      Argument spreading. With MethodHandle.asSpreader(Class<?> arrayType, int arrayLength), there exists an operation to create a method handle that will spread the contents of a trailing array argument to a number of arguments. An additional asSpreader method should be provided that allows a number of arguments contained in an array anywhere in a method signature to be expanded to a number of distinct arguments.

      Argument collection. The method MethodHandle.asCollector(Class<?> arrayType, int arrayLength) produces a handle that collects the trailing arrayLength arguments into an array. There is no means for achieving the same for a number of arguments elsewhere in a method signature. There should be an additional asCollector method that supports this.

      Argument folding. The folding combinator, foldArguments(MethodHandle target, MethodHandle combinator), does not allow control over the position in the argument list at which folding should start. A position argument should be added; the number of arguments to fold is implicitly given as the number of arguments the combinator accepts.

      More Lookup Functions

      Non-abstract methods in interfaces. Currently, a use case such as this one will fail at run-time at the indicated position:

      interface I1 {
          default void m() { System.err.println("I1.m"); }
      }
      
      interface I2 {
          default void m() { System.err.println("I2.m"); }
      }
      
      class C implements I1, I2 {
          public void m() { I2.super.m(); System.err.println("C.m"); }
      }
      
      public class IfcSuper {
          public static void main(String[] args) throws Throwable {
              C c = new C();
              MethodHandles.Lookup l = MethodHandles.lookup();
              MethodType t = MethodType.methodType(void.class);
              // This lookup will fail with an IllegalAccessException.
              MethodHandle ci1m = l.findSpecial(I1.class, "m", t, C.class);
              ci1m.invoke(c);
          }
      }

      It should, however, be possible to construct MethodHandles that bind to non-abstract methods in interfaces.
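
      As a minimal sketch of the intended behavior, assuming a JDK 9 or later runtime in which the findSpecial change has shipped, the following creates the lookup inside the implementing class, so that the lookup class matches the specialCaller argument. The names InterfaceSuperSketch, Greeter, GreeterImpl, and greetViaSuperHandle are hypothetical and chosen only for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class InterfaceSuperSketch {
    // Hypothetical interface with a non-abstract (default) method.
    interface Greeter {
        default String greet() { return "Greeter.greet"; }
    }

    static class GreeterImpl implements Greeter {
        @Override
        public String greet() { return "GreeterImpl.greet"; }

        // Invokes Greeter's default method with invokespecial semantics,
        // bypassing the override declared above.
        String greetViaSuperHandle() {
            try {
                // Created inside GreeterImpl, so the lookup class equals specialCaller.
                MethodHandles.Lookup lookup = MethodHandles.lookup();
                MethodHandle superGreet = lookup.findSpecial(
                        Greeter.class, "greet",
                        MethodType.methodType(String.class), GreeterImpl.class);
                return (String) superGreet.invoke(this);
            } catch (Throwable t) {
                throw new RuntimeException(t);
            }
        }
    }

    public static void main(String[] args) {
        GreeterImpl g = new GreeterImpl();
        System.out.println(g.greet());               // virtual dispatch to the override
        System.out.println(g.greetViaSuperHandle()); // default method via method handle
    }
}
```

      A lookup created in an unrelated class, as in the IfcSuper example above, has no private access to the special caller and is still expected to be rejected.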

      Class lookup. Finally, the lookup API should allow for looking up classes from different contexts, which is currently not possible. In the MethodHandles area, all required access checks are done at lookup time (as opposed to run-time, as is the case with reflection). Classes are passed in terms of their .class instance. To facilitate lookups with a certain control over the context, e.g., across module boundaries, there should be a lookup method that delivers a Class instance with the right restrictions for further use in MethodHandle combinators.

      Description

      Combinators for Loops

      Most Generic Loop Abstraction

      The core abstractions for loops include an initialization of the loop, a predicate to check, and a body to evaluate. The most generic MethodHandle combinator for creating a loop, to be added to MethodHandles, is as follows:

      MethodHandle loop(MethodHandle[]... clauses)

      Constructs a method handle representing a loop with several loop variables that are updated and checked upon each iteration. Upon termination of the loop due to one of the predicates, a corresponding finalizer is run and delivers the loop's result, which is the return value of the resulting handle.

      Intuitively, every loop is formed by one or more "clauses", each specifying a local iteration value and/or a loop exit. Each iteration of the loop executes each clause in order. A clause can optionally update its iteration variable; it can also optionally perform a test and conditional loop exit. In order to express this logic in terms of method handles, each clause will determine four actions:

      • Before the loop executes, the initialization of an iteration variable or loop invariant local.

      • When a clause executes, an update step for the iteration variable.

      • When a clause executes, a predicate execution to test for loop exit.

      • If a clause causes a loop exit, a finalizer execution to compute the loop's return value.

      Some of these clause parts may be omitted according to certain rules, and useful default behavior is provided in this case. See below for a detailed description.

      Each clause function, with the exception of clause initializers, is able to observe the entire loop state, because it will be passed all current iteration variable values, as well as all incoming loop parameters. Most clause functions will not need all of this information, but they will be formally connected as if by dropArguments.

      Given a set of clauses, a number of checks and adjustments are performed to connect all the parts of the loop. They are spelled out in detail in the steps below. In these steps, every occurrence of the word "must" corresponds to a place where IllegalArgumentException may be thrown if the required constraint is not met by the inputs to the loop combinator. The term "effectively identical", applied to parameter type lists, means that they must be identical, or else one list must be a proper prefix of the other.

      Step 0: Determine clause structure.

      • The clause array (of type MethodHandle[][]) must be non-null and contain at least one element.

      • The clause array may not contain nulls or sub-arrays longer than four elements.

      • Clauses shorter than four elements are treated as if they were padded by null elements to length four. Padding takes place by appending elements to the array.

      • Clauses with all nulls are disregarded.

      • Each clause is treated as a four-tuple of functions, called "init", "step", "pred", and "fini".

      Step 1A: Determine iteration variables.

      • Examine init and step function return types, pairwise, to determine each clause's iteration variable type.

      • If both functions are omitted, use void; else if one is omitted, use the other's return type; else use the common return type (they must be identical).

      • Form the list of return types (in clause order), omitting all occurrences of void.

      • This list of types is called the "common prefix".

      Step 1B: Determine loop parameters.

      • Examine init function parameter lists.

      • Omitted init functions are deemed to have null parameter lists.

      • All init function parameter lists must be effectively identical.

      • The longest parameter list (which is necessarily unique) is called the "common suffix".

      Step 1C: Determine loop return type.

      • Examine fini function return types, disregarding omitted fini functions.

      • If there are no fini functions, use void as the loop return type.

      • Otherwise, use the common return type of the fini functions; they must all be identical.

      Step 1D: Check other types.

      • There must be at least one non-omitted pred function.

      • Every non-omitted pred function must have a boolean return type.

      (Implementation Note: Steps 1A, 1B, 1C, 1D are logically independent of each other, and may be performed in any order.)

      Step 2: Determine parameter lists.

      • The parameter list for the resulting loop handle will be the "common suffix".

      • The parameter list for init functions will be adjusted to the "common suffix". (Note that their parameter lists are already effectively identical to the common suffix.)

      • The parameter list for non-init (step, pred, and fini) functions will be adjusted to the common prefix followed by the common suffix, called the "common parameter sequence".

      • Every non-init, non-omitted function parameter list must be effectively identical to the common parameter sequence.

      Step 3: Fill in omitted functions.

      • If an init function is omitted, use a constant function of the appropriate null/zero/false/void type. (For this purpose, a constant void is simply a function which does nothing and returns void; it can be obtained from another constant function by type conversion via MethodHandle.asType.)

      • If a step function is omitted, use an identity function of the clause's iteration variable type; insert dropped argument parameters before the identity function parameter for the non-void iteration variables of preceding clauses. (This will turn the loop variable into a local loop invariant.)

      • If a pred function is omitted, the corresponding fini function must also be omitted.

      • If a pred function is omitted, use a constant true function. (This will keep the loop going, as far as this clause is concerned.)

      • If a fini function is omitted, use a constant null/zero/false/void function of the loop return type.

      Step 4: Fill in missing parameter types.

      • At this point, every init function parameter list is effectively identical to the common suffix, but some lists may be shorter. For every init function with a short parameter list, pad out the end of the list by dropping arguments.

      • At this point, every non-init function parameter list is effectively identical to the common parameter sequence, but some lists may be shorter. For every non-init function with a short parameter list, pad out the end of the list by dropping arguments.

      Final observations.

      • After these steps, all clauses have been adjusted by supplying omitted functions and arguments.

      • All init functions have a common parameter type list, which the final loop handle will also have.

      • All fini functions have a common return type, which the final loop handle will also have.

      • All non-init functions have a common parameter type list, which is the common parameter sequence, of (non-void) iteration variables followed by loop parameters.

      • Each pair of init and step functions agrees in their return types.

      • Each non-init function will be able to observe the current values of all iteration variables, by means of the common prefix.

      Loop execution.

      • When the loop is called, the loop input values are saved in locals, to be passed (as the common suffix) to every clause function. These locals are loop invariant.

      • Each init function is executed in clause order (passing the common suffix) and the non-void values are saved (as the common prefix) into locals. These locals are loop varying (unless their steps are identity functions, as noted above).

      • All function executions (except init functions) will be passed the common parameter sequence, consisting of the non-void iteration values (in clause order) and then the loop inputs (in argument order).

      • The step and pred functions are then executed, in clause order (step before pred), until a pred function returns false.

      • The non-void result from a step function call is used to update the corresponding loop variable. The updated value is immediately visible to all subsequent function calls.

      • If a pred function returns false, the corresponding fini function is called, and the resulting value is returned from the loop as a whole.

      The semantics of a MethodHandle l returned from loop are as follows:

      l(arg*) =>
      {
          let v* = init*(arg*);
          for (;;) {
              for ((v, s, p, f) in (v*, step*, pred*, fini*)) {
                  v = s(v*, arg*);
                  if (!p(v*, arg*)) {
                      return f(v*, arg*);
                  }
              }
          }
      }
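
      The following is a minimal sketch of a two-clause loop, assuming a JDK 9 or later runtime in which MethodHandles.loop is available. It computes a factorial with one counter clause (init omitted, so the counter starts at zero) and one accumulator clause that supplies all four functions. The class name GenericLoopSketch and the helper method names are assumptions made for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class GenericLoopSketch {
    // Non-init functions receive (i, acc, k): both iteration variables
    // (the common prefix), followed by the loop parameter (the common suffix).
    static int one(int k) { return 1; }                          // init of acc
    static int inc(int i, int acc, int k) { return i + 1; }      // step of i
    static int mult(int i, int acc, int k) { return i * acc; }   // step of acc
    static boolean pred(int i, int acc, int k) { return i < k; } // continue while true
    static int fin(int i, int acc, int k) { return acc; }        // loop result

    static MethodHandle factorialLoop() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        Class<?> self = GenericLoopSketch.class;
        MethodType iii = MethodType.methodType(int.class, int.class, int.class, int.class);
        MethodHandle one = l.findStatic(self, "one", MethodType.methodType(int.class, int.class));
        MethodHandle inc = l.findStatic(self, "inc", iii);
        MethodHandle mult = l.findStatic(self, "mult", iii);
        MethodHandle pred = l.findStatic(self, "pred", iii.changeReturnType(boolean.class));
        MethodHandle fin = l.findStatic(self, "fin", iii);
        // Clause layout is (init, step, pred, fini); short clauses are padded with nulls.
        MethodHandle[] counterClause = { null, inc };
        MethodHandle[] accumulatorClause = { one, mult, pred, fin };
        return MethodHandles.loop(counterClause, accumulatorClause);
    }

    static int factorial(int k) {
        try {
            return (int) factorialLoop().invoke(k);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));
    }
}
```

      Because clauses run in order within each iteration, mult already sees the value of i that inc produced in the same iteration.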

      Based on this most generic abstraction of loops, several convenient combinators should be added to MethodHandles. They are discussed in the following.

      Simple while and do-while Loops

      These combinators will be added to MethodHandles:

      MethodHandle whileLoop(MethodHandle init, MethodHandle pred, MethodHandle body)
      
      MethodHandle doWhileLoop(MethodHandle init, MethodHandle body, MethodHandle pred)

      The semantics of invoking the MethodHandle object wl returned from whileLoop are as follows:

      wl(arg*) =>
      {
          let r = init(arg*);
          while (pred(r, arg*)) { r = body(r, arg*); }
          return r;
      }

      For a MethodHandle dwl returned from doWhileLoop, the semantics are as follows:

      dwl(arg*) =>
      {
          let r = init(arg*);
          do { r = body(r, arg*); } while (pred(r, arg*));
          return r;
      }

      This scheme imposes some restrictions on the signatures that the three constituent MethodHandles can have:

      1. The return type of the initializer init is also the return type of the body body and of the entire loop, as well as the type of the first argument of the predicate pred and of the body body.

      2. The return type of the predicate pred must be boolean.
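
      As an illustration of these two combinators, here is a small sketch, assuming a JDK 9 or later runtime. It doubles a value until it reaches a limit, once as a while loop and once as a do-while loop. WhileLoopSketch, start, below, and twice are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class WhileLoopSketch {
    static int start(int limit) { return 1; }                    // init: (A...)V
    static boolean below(int v, int limit) { return v < limit; } // pred: (V, A...)boolean
    static int twice(int v, int limit) { return v * 2; }         // body: (V, A...)V

    // Builds either a do-while or a while loop from the same three handles and runs it.
    static int run(boolean doWhile, int limit) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> self = WhileLoopSketch.class;
            MethodHandle init = l.findStatic(self, "start",
                    MethodType.methodType(int.class, int.class));
            MethodHandle pred = l.findStatic(self, "below",
                    MethodType.methodType(boolean.class, int.class, int.class));
            MethodHandle body = l.findStatic(self, "twice",
                    MethodType.methodType(int.class, int.class, int.class));
            MethodHandle loop = doWhile
                    ? MethodHandles.doWhileLoop(init, body, pred)
                    : MethodHandles.whileLoop(init, pred, body);
            return (int) loop.invoke(limit);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false, 10)); // while loop
        System.out.println(run(true, 1));   // do-while loop runs the body at least once
    }
}
```

      With a limit of 1 the two variants differ: the while loop returns the initial value untouched, whereas the do-while loop applies the body once before testing the predicate.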

      Counting Loops

      For convenience, the following loop combinators will also be provided:

      • MethodHandle countedLoop(MethodHandle iterations, MethodHandle init, MethodHandle body)

        A MethodHandle cl returned from countedLoop has the following semantics:

        cl(arg*) =>
        {
            let end = iterations(arg*);
            let r = init(arg*);
            for (int i = 0; i < end; i++) {
                r = body(i, r, arg*);
            }
            return r;
        }
      • MethodHandle countedLoop(MethodHandle start, MethodHandle end, MethodHandle init, MethodHandle body)

        A MethodHandle cl returned from this variant of countedLoop has the following semantics:

        cl(arg*) =>
        {
            let s = start(arg*);
            let e = end(arg*);
            let r = init(arg*);
            for (int i = s; i < e; i++) {
                r = body(i, r, arg*);
            }
            return r;
        }

      In these two cases, the type of the first argument of body must be int, and the return types of init and body as well as the second argument of body must be the same.
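
      A short sketch of the first variant follows, assuming a JDK 9 or later runtime. Note one difference from the pseudocode above: in the API as shipped in the JDK, the body receives the loop variable first and the counter second, that is body(r, i, arg*), and the sketch follows that shipped order. CountedLoopSketch and its helper methods are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class CountedLoopSketch {
    static int count(int n, String s) { return n; }     // iterations: (A...)int
    static String empty(int n, String s) { return ""; } // init: (A...)V
    // Shipped JDK order: loop variable, then counter, then the loop arguments.
    static String append(String acc, int i, int n, String s) { return acc + s + i; }

    // Appends s followed by the counter value, n times.
    static String repeat(int n, String s) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> self = CountedLoopSketch.class;
            MethodHandle iterations = l.findStatic(self, "count",
                    MethodType.methodType(int.class, int.class, String.class));
            MethodHandle init = l.findStatic(self, "empty",
                    MethodType.methodType(String.class, int.class, String.class));
            MethodHandle body = l.findStatic(self, "append",
                    MethodType.methodType(String.class, String.class, int.class,
                            int.class, String.class));
            MethodHandle loop = MethodHandles.countedLoop(iterations, init, body);
            return (String) loop.invoke(n, s);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(repeat(3, "x"));
    }
}
```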

      Iteration Over Data Structures

      Furthermore, a loop combinator for iteration is helpful:

      • MethodHandle iteratedLoop(MethodHandle iterator, MethodHandle init, MethodHandle body)

        A MethodHandle it returned from iteratedLoop has the following semantics:

        it(arg*) =>
        {
            let it = iterator(arg*);
            let v = init(arg*);
            for (T t : it) {
                v = body(t, v, arg*);
            }
            return v;
        }
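
      The following sketch, assuming a JDK 9 or later runtime, sums the lengths of the strings in a list. As with the counted loops, the shipped API passes the loop variable before the current element, that is body(v, t, arg*), which differs from the order in the pseudocode above. IteratedLoopSketch and its helper methods are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Iterator;
import java.util.List;

class IteratedLoopSketch {
    static Iterator<String> iterate(List<String> xs) { return xs.iterator(); } // iterator
    static int zero(List<String> xs) { return 0; }                             // init
    // Shipped JDK order: loop variable, then current element, then loop arguments.
    static int addLength(int total, String s, List<String> xs) { return total + s.length(); }

    static int totalLength(List<String> xs) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> self = IteratedLoopSketch.class;
            MethodHandle iterator = l.findStatic(self, "iterate",
                    MethodType.methodType(Iterator.class, List.class));
            MethodHandle init = l.findStatic(self, "zero",
                    MethodType.methodType(int.class, List.class));
            MethodHandle body = l.findStatic(self, "addLength",
                    MethodType.methodType(int.class, int.class, String.class, List.class));
            MethodHandle loop = MethodHandles.iteratedLoop(iterator, init, body);
            return (int) loop.invoke(xs);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("a", "bb", "ccc")));
    }
}
```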

      Remarks

      More convenience loop combinators are conceivable.

      While the semantics of continue can easily be emulated by returning from the body, it is an open question how the semantics of break can be emulated. This could be achieved by using a dedicated exception (e.g., LoopMethodHandle.BreakException).

      Combinator for try/finally Blocks

      To facilitate the construction of functionality with try/finally semantics from MethodHandles, the following new combinator will be introduced to MethodHandles:

      MethodHandle tryFinally(MethodHandle target, MethodHandle cleanup)

      The semantics of invoking a MethodHandle tf returned from tryFinally are as follows:

      tf(arg*) =>
      {
          Throwable t;
          Object r;
          try {
              r = target(arg*);
          } catch (Throwable x) {
              t = x;
              throw x;
          } finally {
              r = cleanup(t, r, arg*);
          }
          return r;
      }

      That is, the return type of the resulting MethodHandle will be that of the target handle. The target and the cleanup must have matching argument lists, except that cleanup additionally accepts, ahead of those arguments, one Throwable argument and the (possibly intermediate) result. If an exception was thrown during the execution of target, the Throwable argument will hold that exception.
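
      A minimal sketch of this combinator, assuming a JDK 9 or later runtime: the cleanup handle records whether the target completed normally or threw, and passes the result through unchanged. TryFinallySketch, half, cleanup, and LOG are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

class TryFinallySketch {
    static final List<String> LOG = new ArrayList<>();

    // Target: fails for odd inputs.
    static int half(int x) {
        if (x % 2 != 0) throw new IllegalArgumentException("odd");
        return x / 2;
    }

    // Cleanup: (Throwable, result, target arguments...) -> result.
    // The Throwable is null when the target completed normally.
    static int cleanup(Throwable t, int result, int x) {
        LOG.add(t == null ? "ok:" + result : "failed:" + t.getMessage());
        return result;
    }

    static int halfWithCleanup(int x) {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> self = TryFinallySketch.class;
            MethodHandle target = l.findStatic(self, "half",
                    MethodType.methodType(int.class, int.class));
            MethodHandle cleanup = l.findStatic(self, "cleanup",
                    MethodType.methodType(int.class, Throwable.class, int.class, int.class));
            return (int) MethodHandles.tryFinally(target, cleanup).invoke(x);
        } catch (RuntimeException | Error e) {
            throw e; // let the target's own exception propagate unchanged
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(halfWithCleanup(10));
        System.out.println(LOG);
    }
}
```

      When the target throws, the cleanup still runs, but its return value is discarded and the original exception propagates to the caller.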

      Combinators for Argument Handling

      As additions to the existing API in MethodHandles, the following methods will be introduced:

      • Addition to the class MethodHandle - new instance method:

        MethodHandle asSpreader(int pos, Class<?> arrayType, int arrayLength)

        In the signature of this, at position pos, expect arrayLength arguments of the component type of arrayType. In the signature of the result, these arguments are replaced by a single array parameter of type arrayType at position pos; upon invocation, the array's elements are spread into the corresponding arguments of this MethodHandle. If the signature of this does not have enough arguments at that position, or if the position does not exist in the signature, raise an appropriate exception.

        For example, if the signature of this is (Ljava/lang/String;IIILjava/lang/Object;)V, calling asSpreader(1, int[].class, 3) will lead to the resulting signature (Ljava/lang/String;[ILjava/lang/Object;)V.

      • Addition to the class MethodHandle - new instance method:

        MethodHandle asCollector(int pos, Class<?> arrayType, int arrayLength)

        In the signature of this, at position pos, expect an array argument. In the signature of the result, at position pos, there will be arrayLength arguments of the component type of that array. All arguments before pos are not affected. All arguments after pos are shifted to the right by arrayLength - 1 positions. It is expected that the arguments to be spread are available in the array at run-time; in case they are not, an ArrayIndexOutOfBoundsException is thrown.

        For example, if the signature of this is (Ljava/lang/String;[ILjava/lang/Object;)V, calling asCollector(1, int[].class, 3) will lead to the resulting signature (Ljava/lang/String;IIILjava/lang/Object;)V.

      • Addition to the class MethodHandles - new static method:

        MethodHandle foldArguments(MethodHandle target, int pos, MethodHandle combiner)

        The resulting MethodHandle will, when invoked, act like the existing method foldArguments(MethodHandle target, MethodHandle combiner) with the difference that the already existing method implies a folding position of 0, while the proposed new method allows for specifying a folding position other than 0.

        For example, if the target signature is (ZLjava/lang/String;ZI)I, and the combiner signature is (ZI)Ljava/lang/String;, calling foldArguments(target, 1, combiner) will lead to the resulting signature (ZZI)I, and the second and third (boolean and int) arguments will be folded into a String upon each invocation.

      These new combinators will be implemented using existing abstractions and API. If required, non-public API will be modified.
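
      The three argument-handling combinators can be sketched together as follows, assuming a JDK 9 or later runtime. The signatures mirror the examples above, except that the targets return String so that the effect is visible. ArgumentHandlingSketch, describe, show, and combine are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ArgumentHandlingSketch {
    static String describe(String label, int a, int b, int c, Object tail) {
        return label + ":" + (a + b + c) + ":" + tail;
    }
    static String show(boolean flag, String s, boolean b, int i) {
        return flag + "|" + s + "|" + b + "|" + i;
    }
    static String combine(boolean b, int i) { return b + "/" + i; }

    static MethodHandle find(String name, MethodType type) throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(ArgumentHandlingSketch.class, name, type);
    }

    // Returns the results of the spreader, collector, and folded handles, in that order.
    static String[] demo() {
        try {
            MethodHandle describe = find("describe", MethodType.methodType(String.class,
                    String.class, int.class, int.class, int.class, Object.class));
            // (String,int,int,int,Object)String -> (String,int[],Object)String
            MethodHandle spreader = describe.asSpreader(1, int[].class, 3);
            String spread = (String) spreader.invoke("sum", new int[] {1, 2, 3}, "end");
            // (String,int[],Object)String -> (String,int,int,int,Object)String
            MethodHandle collector = spreader.asCollector(1, int[].class, 3);
            String collected = (String) collector.invoke("sum", 1, 2, 3, "end");

            MethodHandle show = find("show", MethodType.methodType(String.class,
                    boolean.class, String.class, boolean.class, int.class));
            MethodHandle combine = find("combine", MethodType.methodType(String.class,
                    boolean.class, int.class));
            // (boolean,String,boolean,int)String -> (boolean,boolean,int)String;
            // the String at position 1 is computed by combine from the two arguments after it.
            MethodHandle folded = MethodHandles.foldArguments(show, 1, combine);
            String foldedResult = (String) folded.invoke(true, false, 42);
            return new String[] { spread, collected, foldedResult };
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        for (String s : demo()) System.out.println(s);
    }
}
```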

      Lookups

      The implementation of the method MethodHandles.Lookup.findSpecial(Class<?> refc, String name, MethodType type, Class<?> specialCaller) will be modified to allow for finding super-callable methods on interfaces. While this is not a change of the API as such, its documented behaviour changes significantly.

      Also, the MethodHandles.Lookup class will be extended with the following two methods:

      • Class<?> findClass(String targetName)

        This retrieves an instance of Class<?> representing the desired target class identified by the targetName. The lookup applies the restrictions defined by the implicit access context. In case the access is not possible, the method raises an appropriate exception.

      • Class<?> accessClass(Class<?> targetClass)

        This attempts to access the given class, applying the restrictions defined by the implicit access context. In case the access is not possible, the method raises an appropriate exception.
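
      As a small sketch of both methods, assuming a JDK 9 or later runtime: findClass resolves a name in the context of the lookup class, and accessClass reports whether a given class is accessible to a particular lookup. ClassLookupSketch, Hidden, and the helper methods are hypothetical names.

```java
import java.lang.invoke.MethodHandles;

class ClassLookupSketch {
    // Not public, so it is only visible to lookups with package access.
    private static class Hidden { }

    // Resolves a class by name relative to this class's own lookup.
    static Class<?> find(String name) {
        try {
            return MethodHandles.lookup().findClass(name);
        } catch (ClassNotFoundException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    // Returns true if the given lookup is allowed to access the target class.
    static boolean accessible(MethodHandles.Lookup lookup, Class<?> target) {
        try {
            lookup.accessClass(target);
            return true;
        } catch (IllegalAccessException e) {
            return false;
        }
    }

    static boolean hiddenViaOwnLookup() {
        return accessible(MethodHandles.lookup(), Hidden.class);
    }

    static boolean hiddenViaPublicLookup() {
        return accessible(MethodHandles.publicLookup(), Hidden.class);
    }

    public static void main(String[] args) {
        System.out.println(find("java.util.ArrayList"));
        System.out.println(hiddenViaOwnLookup());
        System.out.println(hiddenViaPublicLookup());
    }
}
```

      In both methods the access context comes from the Lookup object itself rather than from the caller's stack frame, so a lookup can be handed to another party without changing the outcome of the check.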

      Risks and Assumptions

      As this is a purely additive API extension, no code that existing clients of the MethodHandle API use will be negatively affected. The proposed extensions also do not rely on any other ongoing development.

      Unit tests for all of the above API extensions will be provided.

      Dependences

      This JEP is related to JEP 193 (Variable Handles), and a certain amount of overlap is possible since VarHandles depend on the MethodHandle API. This will be addressed in collaboration with the owner of JEP 193.

      The JBS issue on JSR 292 enhancements for maintenance releases can be considered a starting point for this JEP, which distills from that issue those points upon which agreement has been reached.

          Activity

          alanb Alan Bateman added a comment (edited)
          [ I see lookupClass(Class<?>) has been renamed to findClass(String) in the latest revision so updating a comment on that ]

          I assume the proposed findClass(String) will use the loader of the caller as the initiating loader, is that right? I could imagine this needing variants like findClass(ClassLoader, String) or the proposed findClass(Module, String) to be widely useful.

          In any case, it would be good for the lookup to have at least a prototype user, ServiceLoader is one possible candidate, it is currently using Class.findClass(Module, String) in the jake forest.
          jrose John Rose added a comment (edited)
          [~alanb] asked: "I assume the proposed findClass(String) will use the loader of the caller as the initiating loader, is that right? I could imagine this needing variants like findClass(ClassLoader, String) or the proposed findClass(Module, String) to be widely useful."

          The short answer is "no". The long answer follows...

          The Lookup object itself contains a securely bound class (Lookup.lookupClass), which is used not only for access checking but also for scoping of names. Therefore, Lookup.findClass(String) will derive the initiating loader from the lookup class.

          If the Lookup object has private access (is full-strength) then there is no stack walking and no security manager call, beyond the natural actions performed by resolution of a constant pool reference to a CONSTANT_Class. This is a corollary of the basic design principle, that Lookup operations are competent to emulate any bytecode behavior.

          If the Lookup object does not have private access (is a weak lookup), then there may be a security manager check, to see if the class loader for the lookup-class can (in fact) be accessed. This provides a way to get at the functionality of ternary Class.forName, if you already have a Class object in the ClassLoader from which you are trying to initiate a load.

          From another perspective, Lookup.findClass on a full-strength lookup has exactly the same power as Class.forName (the unary version). The difference between the two calls is that Class.forName takes its caller class as a fixed parameter (via the CallerSensitive convention), whereas Lookup.findClass takes the corresponding parameter from the lookup class of the Lookup object. Both designs allow securable, authenticated lookups, but only the newer API supports delegation, and presents the scope parameter as an explicit value (instead of an indirect stack walk). The newer API is both more powerful and easier to reason about.

          N.B. It would be a grave error to introduce Lookup API points which are CallerSensitive. The only CallerSensitive API point related to Lookup is MethodHandles.lookup(), which is a factory that converts the current caller into a securely bound Lookup object, which can then be delegated to any trusted party (such as a bootstrap method or stack walker).
          alanb Alan Bateman added a comment
          Thanks for the clarification, I think it's clear now.

            People

            • Assignee:
              mhaupt Michael Haupt
              Reporter:
              mhaupt Michael Haupt
              Owner:
              Michael Haupt
              Reviewed By:
              Alex Buckley, Paul Sandoz, Vladimir Ivanov
              Endorsed By:
              John Rose