  1. JDK
  2. JDK-8260244

Record Patterns and Array Patterns (Preview)

    Details

    • Type: JEP
    • Status: Draft
    • Priority: P3
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: specification
    • Labels:
      None
    • Author:
      Gavin Bierman
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Subcomponent:
    • Scope:
      SE
    • Discussion:
      amber dash dev at openjdk dot java dot net

      Description

      Summary

      Enhance the Java programming language with record patterns, to deconstruct record values, and array patterns, to deconstruct array values. Record patterns, array patterns, and type patterns (from Java 16) can be nested (patterns within patterns) to significantly enhance the expressiveness and utility of pattern matching.

      Goals

      • To extend pattern matching to express more sophisticated, composable data queries

      Non-Goals

      • This JEP does not change the syntax or semantics of type patterns, introduced in Java 16.

      Motivation

      JEP 394 extended the instanceof expression to take a pattern operand and to perform pattern matching. The only pattern supported was the type pattern, but even this modest start allows occurrences of the extremely common "instanceof-and-cast" idiom, such as the following:

      if (obj instanceof String) {
          String s = (String)obj;
          // use s
      }

      to be replaced with the simpler use of a pattern:

      if (obj instanceof String s) {
          // use s
      }

      The operand String s is a type pattern. A value matches this type pattern if, at run-time, the value can be cast to String without raising a ClassCastException. In this case, the result of the instanceof expression is true and the pattern variable s is initialized to the String value (and is available for use in the "then" block).
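      The behavior of the type pattern described above can be sketched with features already available in Java 16. The class and method names below (TypePatternDemo, describe) are hypothetical and chosen only for illustration. The sketch also shows that null does not match in an instanceof test.

```java
// A minimal sketch of type-pattern matching with instanceof (Java 16+).
// TypePatternDemo and describe are illustrative names, not part of the JEP.
final class TypePatternDemo {

    static String describe(Object obj) {
        if (obj instanceof String s) {
            // s is in scope only where the match is known to have succeeded
            return "String of length " + s.length();
        }
        return "not a String";
    }

    public static void main(String[] args) {
        System.out.println(describe("hello")); // String of length 5
        System.out.println(describe(42));      // not a String
        System.out.println(describe(null));    // not a String
    }
}
```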

      Type patterns remove many occurrences of casting at a stroke. However, they are only the first step towards a more declarative, null-safe style of programming. As Java supports new and more expressive ways of modeling data, pattern matching can streamline the use of such data by recognizing the semantic intent of the model.

      Pattern Matching and Record Classes

      Record classes are transparent carriers for data. Any code that receives an instance of a record class will usually want to extract the data, known as "components". For example, imagine that we use a type pattern to test whether a value is an instance of record class Point, and if so, we extract the x and y components from the value:

      record Point(int x, int y){ }
      
      static void printSum(Object o) {
          if (o instanceof Point p) {
              int x = p.x();
              int y = p.y();
              System.out.println(x+y);
          }
      }

      The variable p is somewhat redundant: it is used solely to invoke the accessor methods x() and y(), which return the components x and y. (Every record class has a 1:1 correspondence between accessor methods and components.) It would be better if the pattern could not only test whether a value is an instance of Point, but also extract the x and y components from the value directly, invoking their accessor methods on our behalf. In other words, we would like to write code such as the following:

      record Point(int x, int y){ }
      
      static void printSumWithPattern(Object o) {
          if (o instanceof Point(int x, int y)) {
              System.out.println(x+y);
          }
      }

      Point(int x, int y) is a record pattern. It lifts the declaration of local variables for extracted components into the pattern itself, and initializes those variables by invoking accessor methods when a value is matched against the pattern. In effect, a record pattern "disaggregates" an instance of a record class into the components of the record class. (Note that names are introduced only for the components, not for the Point itself. See "Named record and array patterns" for more information.)

      The real power of pattern matching, however, is that it scales to match more complicated object graphs. For example, consider the following declarations:

      record Point(int x, int y) {}
      enum Color { RED, ORANGE, YELLOW, GREEN, BLUE, INDIGO, VIOLET }
      record ColoredPoint(Point p, Color c) {}
      record Rectangle(ColoredPoint ul, ColoredPoint lr) {}

      As we have seen, extracting the components from an object can be achieved with a record pattern, as follows:

      static void printUpperLeftColoredPoint(Rectangle r) {
          if (r instanceof Rectangle(ColoredPoint ul, ColoredPoint lr)){
              System.out.println(ul);
          }
      }

      If this method instead had to print the color of the upper-left colored point, the code would become more cumbersome, because it has to deal with the possibility that ul is null:

      static void printColorOfUpperLeftPoint(Rectangle r) {
          if (r instanceof Rectangle(ColoredPoint ul, ColoredPoint lr)){
              if (ul == null) {
                  return;
              }
              Color c = ul.c();
              System.out.println(c);
          }
      }

      Pattern matching lets us decompose objects without writing explicit null checks, because a record pattern simply fails to match a null value. This makes the code clearer and less error-prone. For example, we can decompose the object graph starting at a ColoredPoint with a nested record pattern:

      static void printColorOfUpperLeftPointWithNestedPattern(Rectangle r) {
          if (r instanceof Rectangle(ColoredPoint(Point p, Color c), ColoredPoint lr)){
              System.out.println(c);
          }
      }

      The record pattern Rectangle(ColoredPoint(Point p, Color c), ColoredPoint lr) contains within it the nested record pattern ColoredPoint(Point p, Color c). A value r matches this pattern if (i) it is an instance of Rectangle, and (ii) (recursively) the value of the upper-left ColoredPoint component of r matches the pattern ColoredPoint(Point p, Color c).

      The readability of pattern matching scales with the complexity of the object graph. This is because the nesting of record patterns can extract data from objects far more smoothly and concisely than traditional imperative code. For example, to drill all the way down to a rectangle's corner coordinate, we would traditionally write:

      static void printXCoordOfUpperLeftPointBeforePatterns(Rectangle r) {
          if (r == null) {
              return;
          }
          ColoredPoint c = r.ul();
          if (c == null) {
              return;
          }
          Point p = c.p();
          if (p == null) {
              return;
          }
          int x = p.x();
          System.out.println("Top-left Corner: " + x);
      }

      Pattern matching elides the "accidental complexity" involved in querying the object graph. Nesting patterns inside other patterns lets us express such queries directly and declare local variables that, if matching succeeds, are initialized to the appropriate values:

      static void printXCoordOfUpperLeftPointWithPatterns(Rectangle r) {
          if (r instanceof Rectangle(ColoredPoint(Point(var x, var y), var c), var lr)) {
              System.out.println("Top-left Corner: " + x);
          }
      }

      In summary, record patterns promote a more declarative, null-safe, expression-oriented style of programming in Java.

      Pattern Matching and Array Types

      In a similar vein, we can consider extending pattern matching to other structural types. The next obvious candidate is array types. For example, suppose we wish to check that an Object is a String array, with at least two elements that we wish to extract and print. Using a type pattern, this can be written as follows:

      static void printFirstTwoStrings(Object o) {
          if (o instanceof String[] sa && sa.length >= 2){
              String s1 = sa[0];
              String s2 = sa[1];
              System.out.println(s1 + s2);
          }
      }

      The flow-sensitive scoping of pattern variables means we can use the pattern variable sa on the right-hand side of the && operator and inside the if block. However, it is laborious to check the array length before extracting array elements (similar to checking for null before accessing record components). Since extracting elements is so common, it would be better if the pattern could not only test whether a value is a String array, but also denote the expected elements directly, dereferencing the array on our behalf (and implicitly checking the length). In other words, we would like to write code such as the following:

      static void printFirstTwoStringsWithArrayPattern(Object o) {
          if (o instanceof String[] { String s1, String s2, ... }){
              System.out.println(s1 + s2);
          }
      }

      String[] {String s1, String s2, ...} is an array pattern. A value matches this pattern if (1) it is a String array, and (2) it has at least two elements (the ... in the pattern means "zero or more elements"). If pattern matching succeeds then s1 is initialized to the first component of the array, and s2 is initialized to the second component. A String array value would only match the pattern String[] {String s1, String s2 } (without the ...) if it has exactly two elements.

      The syntax of an array pattern mirrors the syntax used to initialize arrays. In other words, the value of the expression new String[] { "One", "Two", "Three" } matches the pattern String[] { String s1, String s2, String s3 }.
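      The difference between an array pattern with and without the trailing ... can be sketched in terms of today's Java. The code below is only a hand-written stand-in, since the proposed syntax is not available; the names ArrayPatternSketch, matchesAtLeastTwo, and matchesExactlyTwo are hypothetical. It models only the type test and the length check, and assumes nothing about how individual elements are matched.

```java
// Hand-written stand-ins for the two array patterns discussed above (Java 16+).
// All names here are illustrative assumptions, not part of the JEP.
final class ArrayPatternSketch {

    // Models: o instanceof String[] { String s1, String s2, ... }
    static boolean matchesAtLeastTwo(Object o) {
        return o instanceof String[] sa && sa.length >= 2;
    }

    // Models: o instanceof String[] { String s1, String s2 }
    static boolean matchesExactlyTwo(Object o) {
        return o instanceof String[] sa && sa.length == 2;
    }

    public static void main(String[] args) {
        Object two = new String[] { "One", "Two" };
        Object three = new String[] { "One", "Two", "Three" };

        System.out.println(matchesAtLeastTwo(two));   // true
        System.out.println(matchesExactlyTwo(two));   // true
        System.out.println(matchesAtLeastTwo(three)); // true
        System.out.println(matchesExactlyTwo(three)); // false
        System.out.println(matchesAtLeastTwo(null));  // false
    }
}
```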

      The compositionality of patterns allows us to freely nest array and record patterns. For example, the following method prints the sum of the x co-ordinates of the first two points stored in an array:

      static void printSumOfFirstTwoXCoords(Object o) {
          if (o instanceof Point[]{ Point(var x1, var y1), Point(var x2, var y2), ... }) {
              System.out.println(x1 + x2);
          }
      }

      Here we use an array pattern containing two nested record patterns.
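      To make the meaning of this nested pattern concrete, the following sketch spells out by hand the checks that the pattern would perform. It uses only records and type patterns from Java 16. The class name NestedPatternSketch and the choice to return null when there is no match are assumptions made for illustration. The null checks on the elements follow from the rule that null does not match any record pattern.

```java
// Hand-written equivalent of the nested array/record pattern (Java 16+).
// NestedPatternSketch is an illustrative name, not part of the JEP.
final class NestedPatternSketch {
    record Point(int x, int y) { }

    // Models: o instanceof Point[] { Point(var x1, var y1), Point(var x2, var y2), ... }
    // Returns the sum of the first two x coordinates, or null when there is no match.
    static Integer sumOfFirstTwoXCoords(Object o) {
        if (o instanceof Point[] pa
                && pa.length >= 2   // "..." permits further elements
                && pa[0] != null    // a record pattern does not match null
                && pa[1] != null) {
            return pa[0].x() + pa[1].x();
        }
        return null;
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1, 2), new Point(3, 4), new Point(5, 6) };
        System.out.println(sumOfFirstTwoXCoords(points));                            // 4
        System.out.println(sumOfFirstTwoXCoords(new Point[] { new Point(1, 2) }));   // null
        System.out.println(sumOfFirstTwoXCoords(new Point[] { null, null }));        // null
    }
}
```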

      Description

      The purpose of this JEP is to extend the language of patterns beyond the simple type patterns that appear in Java SE 16, as delivered in JEP 394, and provide two new patterns -- record patterns and array patterns -- both of which support nesting of patterns.

      The grammar for patterns will become:

      Pattern:
        TypePattern
        ArrayPattern
        RecordPattern

      TypePattern:
        LocalVariableDeclaration

      ArrayPattern:
        ArrayType ArrayComponentsPattern

      ArrayComponentsPattern:
        { [ ComponentPatternList [ , ... ] ] }

      ComponentPatternList:
        ComponentPattern { , ComponentPattern }

      ComponentPattern:
        Pattern
        ArrayComponentsPattern

      RecordPattern:
        ReferenceType ( [ ArgumentPatternList ] [ , ... ] )

      ArgumentPatternList:
        ArgumentPattern { , ArgumentPattern }

      ArgumentPattern:
        Pattern

      Array Patterns

      An array pattern consists of the type of the array and a (possibly empty) list of component patterns, which are used to match against the corresponding array components, ending optionally with a special ... pattern that matches any number of remaining array components (including zero).

      For example, a value successfully matching the array pattern String[] { String s1, String s2} must be a String array with exactly two elements. In contrast, a value successfully matching the array pattern String[] { String s1, String s2, ... } must be a String array containing at least two elements. The null value does not match any array pattern.

      The set of pattern variables declared by an array pattern is given by the union of the sets of pattern variables declared by the component patterns.

      Array patterns also support matching of multidimensional arrays. A value matching the array pattern String[][]{ { String s1, String s2, ...}, {String s3, String s4, ...}, ...} must be a String[][] with at least two rows, each of the first two rows having at least two elements.
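      Because Java arrays of arrays may be jagged, it can help to see the checks implied by such a multidimensional pattern written out by hand. The sketch below is a stand-in in Java 16 and not the proposed feature; the name MatrixPatternSketch is hypothetical, and the sketch assumes that a nested component pattern in braces, like any array pattern, does not match a null row.

```java
// Hand-written stand-in for a two-dimensional array pattern (Java 16+).
// MatrixPatternSketch is an illustrative name, not part of the JEP.
final class MatrixPatternSketch {

    // Models: o instanceof String[][] { { String s1, String s2, ... },
    //                                   { String s3, String s4, ... }, ... }
    static boolean matches(Object o) {
        return o instanceof String[][] rows
                && rows.length >= 2                        // at least two rows
                && rows[0] != null && rows[0].length >= 2  // first row: two or more elements
                && rows[1] != null && rows[1].length >= 2; // second row: two or more elements
    }

    public static void main(String[] args) {
        // Later rows may have any length, including zero
        System.out.println(matches(new String[][] { { "a", "b" }, { "c", "d", "e" }, { } })); // true
        // Second row is too short
        System.out.println(matches(new String[][] { { "a", "b" }, { "c" } }));                // false
        // Rows are null
        System.out.println(matches(new String[2][]));                                          // false
    }
}
```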

      A var component pattern can be used to match against a component of an array without stating the type of the component. The type of the pattern variable is inferred from the pattern itself. For example, if a value matches the array pattern String[] {var s1, ...}, then the pattern variable s1 is inferred to be of type String and will be initialized to the value of the first component of the array.

      An expression is compatible with an array pattern if it is downcast compatible with the array type contained in the array pattern.

      Record Patterns

      A record pattern consists of a type and a (possibly empty) list of argument patterns, which are used to match against the corresponding record components. If the record class has a variable arity record component (which is always restricted to be the last component), the list may end with a special ... pattern that matches against any number of remaining components (including zero).

      For example, given a record declaration:

      record Point(int i, int j) {}

      A value matches the record pattern Point(int a, int b) if it is an instance of the record type Point, and if so the pattern variable a is initialized to the result of invoking the accessor method corresponding to i on the value, and the pattern variable b is initialized to the result of invoking the accessor method corresponding to j on the value. The null value does not match any record pattern.

      A record pattern may use a var pattern to match against record components. In this case, the compiler infers the type of the pattern variable introduced by the var pattern. For example, the pattern Point(var a, var b) is shorthand for the pattern Point(int a, int b).

      The set of pattern variables declared by a record pattern is given by the union of the sets of pattern variables declared by the argument patterns.

      Record classes support variable arity record components. For example:

      record MultiColoredPoint(int i, int j, Color... cs) { }

      A value matches the pattern MultiColoredPoint(var a, var b, var first, ... ) if it is an instance of the type MultiColoredPoint and its cs component is an array with at least one element.

      Variable arity record patterns are actually a shorthand for a record pattern containing a nested array pattern. Thus the record pattern:

      MultiColoredPoint(var a, var b, var firstColor, var secondColor, ...)

      is, in fact, shorthand for the following record pattern containing a nested array pattern:

      MultiColoredPoint(var a, var b, Color[]{ var firstColor, var secondColor, ... })

      This shorthand mirrors the same shorthand available when creating instances of a variable arity record type. For example, the expression:

      new MultiColoredPoint(42, 0, RED, GREEN, BLUE)

      is shorthand for:

      new MultiColoredPoint(42, 0, new Color[]{ RED, GREEN, BLUE })
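      The expansion of a variable arity record pattern into a nested array pattern can be approximated by hand in Java 16. In the sketch below, the class name VarArityPatternSketch, the method firstTwoColors, and the "no match" result are hypothetical. The null check on the cs component reflects the rule that null does not match any array pattern.

```java
// Hand-written stand-in for a variable arity record pattern (Java 16+).
// VarArityPatternSketch and firstTwoColors are illustrative names only.
final class VarArityPatternSketch {
    enum Color { RED, GREEN, BLUE }

    record MultiColoredPoint(int i, int j, Color... cs) { }

    // Models: o instanceof MultiColoredPoint(var a, var b, var firstColor, var secondColor, ...)
    // which is shorthand for a nested array pattern on the cs component.
    static String firstTwoColors(Object o) {
        if (o instanceof MultiColoredPoint p) {
            Color[] cs = p.cs();                 // accessor for the variable arity component
            if (cs != null && cs.length >= 2) {  // Color[] { var firstColor, var secondColor, ... }
                return cs[0] + "," + cs[1];
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(firstTwoColors(
                new MultiColoredPoint(42, 0, Color.RED, Color.GREEN, Color.BLUE))); // RED,GREEN
        System.out.println(firstTwoColors(
                new MultiColoredPoint(42, 0, Color.RED)));                           // no match
    }
}
```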

      An expression is compatible with a record pattern if it is downcast compatible with the record type contained in the pattern.

      Future Work

      Adding new pattern forms is an important step in a comprehensive program of enriching Java with pattern matching. Possible areas for future work (to be the subject of other JEPs) include:

      Named record and array patterns

      Both record and array patterns provide a way to deconstruct a value, but they currently provide no means to also name the value being deconstructed. In other languages with similar deconstruction patterns, experience has shown that needing both to name a value and to deconstruct it is relatively rare. Requiring a name by default would force developers to pick dummy names or to use don't care patterns (see below), both of which would add a lot of syntactic clutter.

      Other languages introduce a new pattern form, commonly referred to as an 'as' pattern, to allow a value being deconstructed to be named.

      Don't care Patterns

      It is frequently the case that there are components of a structured object for which we don't want to explicitly declare a pattern variable. For example in this method:

      int getXfromPoint(Object o) {
          if (o instanceof Point(var x, var y)){
              return x;
          }
          return -1;
      }

      Here the pattern variable y is completely redundant. It has been proposed in other contexts that Java use the _ symbol to denote parameters that need not be named, so one possible extension would be to allow patterns such as Point(var x, var _). However, it might be possible to remove the var, or add syntactic sugar for var _.

      Enhanced Array Patterns

      Whilst the array patterns described above are useful, there are clearly other features that could be added. For example, imagine matching a String array, where we are only interested in the eighth and ninth elements of the array. Currently the pattern would be something like String[]{ var dummy1, var dummy2, var dummy3, var dummy4, var dummy5, var dummy6, var dummy7, var eighthElement, var ninthElement, ... } which is quite cumbersome. Some sort of index-based element would be better, e.g. String[] { [8] -> var eighthElement, [9] -> var ninthElement}.

      Deconstruction Patterns

      Record patterns allow for the "disaggregation" of values of a record type. In a future version, we hope to support this feature for all classes, not just record classes. We call such a process deconstruction, to suggest its duality with the process of construction.

      Unlike for record classes, where it is automatic to see how an instance should be deconstructed, for general classes, this will require the explicit declaration of a deconstruction pattern for the class. This pattern will declare how an instance of the class can be deconstructed.

      Setting aside the syntactic details of declaring a deconstruction pattern, it is clear that deconstruction patterns allow for very elegant code. For example, if we have a class Node along with subclasses IntNode (containing a single int), AddNode and MulNode (containing two nodes), and NegNode (containing a single node), we can match against a Node and act on the specific subtypes all in one step:

      int eval(Node n) {
          return switch(n) {
              case IntNode(int i) -> i;
              case NegNode(Node operand) -> -eval(operand);
              case AddNode(Node left, Node right) -> eval(left) + eval(right);
              case MulNode(Node left, Node right) -> eval(left) * eval(right);
              default -> throw new IllegalStateException("Unexpected node: " + n);
          };
      }

      (We might also imagine that, were the class Node in fact a sealed class that permits only the four subclasses above, the compiler could deduce that the default rule does not need to be provided.)

      Today, to express ad-hoc polymorphic calculations like this, we would use the "Visitor" pattern. Using pattern matching leads to code that is transparent and straightforward.
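      For comparison, a similar evaluator can be approximated with features that already exist in Java 16, using type patterns and explicit accessor calls in place of deconstruction patterns. This is only a sketch under stated assumptions: Node is modeled here as an interface implemented by records (records cannot extend a class), and the component names value, operand, left, and right are hypothetical.

```java
// An evaluator written with Java 16 type patterns instead of the proposed
// deconstruction patterns. The Node hierarchy is modeled with records, which
// is an assumption of this sketch and not the class hierarchy in the text.
final class EvalSketch {
    interface Node { }
    record IntNode(int value) implements Node { }
    record NegNode(Node operand) implements Node { }
    record AddNode(Node left, Node right) implements Node { }
    record MulNode(Node left, Node right) implements Node { }

    static int eval(Node n) {
        // Each branch tests the type, then pulls out components via accessors
        if (n instanceof IntNode i) return i.value();
        if (n instanceof NegNode neg) return -eval(neg.operand());
        if (n instanceof AddNode add) return eval(add.left()) + eval(add.right());
        if (n instanceof MulNode mul) return eval(mul.left()) * eval(mul.right());
        throw new IllegalStateException("Unexpected node: " + n);
    }

    public static void main(String[] args) {
        // (2 + 3) * -4
        Node expr = new MulNode(new AddNode(new IntNode(2), new IntNode(3)),
                                new NegNode(new IntNode(4)));
        System.out.println(eval(expr)); // -20
    }
}
```

      Compared with the switch above, each branch has to name the matched node and call its accessors explicitly, which is the boilerplate that deconstruction patterns would remove.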

      Dependencies

      This JEP builds on JEP 394, which was delivered in JDK 16.

            People

            • Assignee:
              gbierman Gavin Bierman
              Reporter:
              gbierman Gavin Bierman
              Owner:
              Gavin Bierman
              Reviewed By:
              Brian Goetz