Details

    • Type: JEP
    • Status: Candidate
    • Priority: P3
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: tools
    • Labels:
      None
    • Author:
      Maurizio Cimadamore
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Subcomponent:
    • Scope:
      SE
    • Discussion:
      platform dash jep dash discuss at openjdk dot java dot net
    • Effort:
      M
    • Duration:
      M
    • JEP Number:
      301

      Description

      Summary

      Enhance the expressiveness of the enum construct in the Java Language by allowing type-variables in enums (generic enums), and performing sharper type-checking for enum constants.

      Goals

      These two enhancements work together to enable enum constants to carry constant-specific type information as well as constant-specific state and behavior. There are many situations where developers have to refactor enums into classes in order to achieve the desired result; these enhancements should reduce this need.

      The following example shows how the two enhancements work together:

      enum Argument<X> { // declares generic enum
         STRING<String>(String.class), 
         INTEGER<Integer>(Integer.class), ... ;
      
         Class<X> clazz;
      
         Argument(Class<X> clazz) { this.clazz = clazz; }
      
         Class<X> getClazz() { return clazz; }
      }
      
      Class<String> cs = Argument.STRING.getClazz(); //uses sharper typing of enum constant
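
      For comparison, here is a minimal sketch of the refactoring into a class that developers typically fall back on in current Java. The names Argument, getClazz and ArgumentDemo follow the example above or are made up for illustration; this is not part of the proposal. Note that such a class gives up what enums provide for free (switch support, values(), EnumSet, and the enum serialization guarantees).

```java
// Workaround in current Java: a generic class with typed constants
// standing in for the generic enum shown above.
final class Argument<X> {
    static final Argument<String> STRING = new Argument<>("STRING", String.class);
    static final Argument<Integer> INTEGER = new Argument<>("INTEGER", Integer.class);

    private final String name;
    private final Class<X> clazz;

    // Private constructor: the set of instances is fixed, as for an enum.
    private Argument(String name, Class<X> clazz) {
        this.name = name;
        this.clazz = clazz;
    }

    Class<X> getClazz() { return clazz; }

    @Override public String toString() { return name; }
}

class ArgumentDemo {
    public static void main(String[] args) {
        // No cast needed: the constant carries its own type argument.
        Class<String> cs = Argument.STRING.getClazz();
        System.out.println(Argument.STRING + " -> " + cs.getSimpleName());
    }
}
```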

      Non-Goals

      This JEP targets specific enhancements to how enum constants are type-checked. As such, other enum-related features such as:

      • allow enum subclassing
      • allow enum in non-static contexts

      are outside the scope of this JEP.

      Motivation

      Java enums are a powerful construct. They allow grouping of constants - where each constant is a singleton object. Each constant can optionally declare a body, which can be used to override the behavior of the base enum declaration. In the following we will try to model the set of Java primitive types using an enum. Here's a start:

      enum Primitive {
          BYTE,
          SHORT,
          INT,
          FLOAT,
          LONG,
          DOUBLE,
          CHAR,
          BOOLEAN;
      }

      An enum declaration is like a class, and can have constructors; we can use this feature to keep track of the boxed class and the default value of each primitive:

      enum Primitive {
          BYTE(Byte.class, 0),
          SHORT(Short.class, 0),
          INT(Integer.class, 0),
          FLOAT(Float.class, 0f),
          LONG(Long.class, 0L),
          DOUBLE(Double.class, 0d),
          CHAR(Character.class, 0),
          BOOLEAN(Boolean.class, false);
      
          final Class<?> boxClass;
          final Object defaultValue;
      
          Primitive(Class<?> boxClass, Object defaultValue) {
             this.boxClass = boxClass;
             this.defaultValue = defaultValue;
          }
      
      }

      While this is rather nice, there are some limitations. The field boxClass is loosely typed as Class<?>, because the field type needs to be compatible with all the sharper types used by the enum constants. As a result, any attempt to do something like this:

      Class<Short> cs = SHORT.boxClass; //error

      will fail with a compile-time error. Even worse, the field defaultValue has a type of Object. This is unavoidable since the field needs to be shared across multiple constants modelling different primitive types. Hence, static safety is lost, as the compiler allows code like the following:

      String s = (String)INT.defaultValue; //ok
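
      The loss of static safety can be reproduced in current Java with a reduced version of the enum. The sketch below keeps only two constants; the class LooseTypingDemo and its method castFails are hypothetical names used for illustration.

```java
// Reduced version of the Primitive enum, as it can be written today.
enum Primitive {
    SHORT(Short.class, (short) 0),
    INT(Integer.class, 0);

    final Class<?> boxClass;
    final Object defaultValue;

    Primitive(Class<?> boxClass, Object defaultValue) {
        this.boxClass = boxClass;
        this.defaultValue = defaultValue;
    }
}

class LooseTypingDemo {
    // The cast below compiles because defaultValue is typed as Object,
    // but it fails at run time because the value is an Integer.
    static boolean castFails() {
        try {
            String s = (String) Primitive.INT.defaultValue;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Class<Short> cs = Primitive.SHORT.boxClass; // rejected: Class<?> is not Class<Short>
        Class<?> c = Primitive.SHORT.boxClass;          // only the loose type is available
        System.out.println(c.getSimpleName());
        System.out.println("cast fails at run time: " + castFails());
    }
}
```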

      Let's now try to extend the enum and add some operations to the constants modelling primitive types (for the sake of brevity, in the remainder we will only show a subset of the constants):

      enum Primitive {
          INT(Integer.class, 0) {
             int mod(int x, int y) { return x % y; }
             int add(int x, int y) { return x + y; }
          },
          FLOAT(Float.class, 0f)  {
             float add(float x, float y) { return x + y; }
          }, ... ;
      
          final Class<?> boxClass;
          final Object defaultValue;
      
          Primitive(Class<?> boxClass, Object defaultValue) {
             this.boxClass = boxClass;
             this.defaultValue = defaultValue;
          }
      
      }

      Again, this results in problems, as there's no way to do something like this:

      int seven = INT.add(3, 4); //error

      That's because the static type of INT is simply Primitive and Primitive has no member named add. So, in order to add operations to our enum, we need to add the members to the enum declaration itself, as follows:

      enum Primitive {
          INT(Integer.class, 0),
          FLOAT(Float.class, 0f), ... ;
      
          final Class<?> boxClass;
          final Object defaultValue;
      
          Primitive(Class<?> boxClass, Object defaultValue) {
             this.boxClass = boxClass;
             this.defaultValue = defaultValue;
          }
      
          int mod(int x, int y) {
             if (this == INT) {
                return x % y;
             } else {
                throw new IllegalStateException();
             }
          }
      
          int add(int x, int y) {
             if (this == INT) {
                return x + y;
             } else {
                throw new IllegalStateException();
             }
          }
      
          float add(float x, float y) {
             if (this == FLOAT) {
                return x + y;
             } else {
                throw new IllegalStateException();
             }
          }
          ...
      
      }

      But the code above has, again, several problems. First, this breaks encapsulation: suddenly, Primitive acquires a bunch of members, none of which make sense for all the constants. As a result, the implementation of each method becomes more convoluted, as the methods must check whether they have been called on the right enum constant. Type-safety is also lost, as the compiler will not detect bad usages such as:

      int zero = FLOAT.mod(50, 2); //ok
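
      A reduced, runnable version of this situation in current Java is sketched below. Only the mod operation is kept, and MisuseDemo is a hypothetical name. The misuse compiles and is detected only when the call is executed.

```java
// Reduced version of the enum above: mod lives on the enum itself
// and must check at run time which constant it was called on.
enum Primitive {
    INT, FLOAT;

    int mod(int x, int y) {
        if (this == INT) {
            return x % y;
        } else {
            throw new IllegalStateException(this + " does not support mod");
        }
    }
}

class MisuseDemo {
    // Returns true if calling mod on FLOAT is rejected, which can
    // only happen at run time under the current typing rules.
    static boolean misuseDetectedAtRunTime() {
        try {
            Primitive.FLOAT.mod(50, 2); // accepted by the compiler
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Primitive.INT.mod(50, 2));
        System.out.println("misuse detected at run time: " + misuseDetectedAtRunTime());
    }
}
```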

      All the problems described above can be addressed by removing specific asymmetries between enums and classes, and by refining the way in which enum constants are type-checked. More precisely:

      • allow type parameters in enum declarations
      • do not prematurely erase sharp type-information associated with enum constants

      With these enhancements, the Primitive enum can be rewritten as follows:

      enum Primitive<X> {
          INT<Integer>(Integer.class, 0) {
             int mod(int x, int y) { return x % y; }
             int add(int x, int y) { return x + y; }
          },
          FLOAT<Float>(Float.class, 0f)  {
             float add(float x, float y) { return x + y; }
          }, ... ;
      
          final Class<X> boxClass;
          final X defaultValue;
      
          Primitive(Class<X> boxClass, X defaultValue) {
             this.boxClass = boxClass;
             this.defaultValue = defaultValue;
          }
      }

      This generic declaration is clearly more expressive than the previous one - now the enum constant Primitive.INT has a sharper parameterized type Primitive<Integer> which means that its members are also sharply typed:

      Class<Short> cs = SHORT.boxClass; //ok!

      Also, since type information on enum constants is not prematurely erased, the compiler can reason about membership of constants - as demonstrated below:

      int zero_int = INT.mod(50, 2); //ok
      int zero_float = FLOAT.mod(50, 2); //error

      The compiler is now able to reject the second statement as there's no member mod in the enum constant FLOAT - which guarantees extra type-safety.

      Description

      Generic enums

      As discussed in JDK-6408723, an important requirement for allowing generics in enums is that type-parameters are fully bound in the enum constant declaration. This allows for a straightforward translation scheme which can augment the one we have today - for instance, given an enum declaration like the following:

      enum Foo<X> {
         ONE<String>,
         TWO<Integer>;
      }

      The corresponding desugared code will look as follows:

      /* enum */ class Foo<X> {
         static Foo<String> ONE = ...
         static Foo<Integer> TWO = ...
      
         ...
      }

      That is, it is still possible to map each constant to a static field declaration, as type bindings are all statically known.
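
      The existing translation (one static field per constant) can be observed through core reflection in current Java. The sketch below uses a non-generic Foo, since generic enums are not available today; TranslationDemo and describe are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

enum Foo { ONE, TWO }

class TranslationDemo {
    // Describes the field that backs an enum constant.
    static String describe(String constantName) {
        try {
            Field f = Foo.class.getField(constantName);
            return (Modifier.isStatic(f.getModifiers()) ? "static " : "")
                    + f.getType().getSimpleName() + " " + f.getName()
                    + (f.isEnumConstant() ? " (enum constant)" : "");
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such constant: " + constantName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("ONE"));
        System.out.println(describe("TWO"));
    }
}
```

      Today the declared type of each such field is the enum type itself; under the scheme described above it would carry the constant's type arguments instead.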

      It might be desirable to allow diamond on enum constant initialization - for instance:

      enum Bar<X> {
         ONE<>(Integer.class),
         TWO<>(String.class);
      
         Bar(X x) { ... }
      }

      If the diamond syntax is used, special care is required if the enum constant has a body (i.e. it is translated into an anonymous class) and the inferred type is non-denotable. As in the case for diamond with anonymous inner classes, the compiler will have to reject that case.
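
      For reference, this is how diamond already behaves with anonymous classes in Java 9 and later: it is accepted when the inferred type argument is denotable, as in the sketch below. DiamondAnonymousDemo and byLength are hypothetical names, and the example is an analogy, not the enum syntax proposed here.

```java
import java.util.Comparator;

class DiamondAnonymousDemo {
    static Comparator<String> byLength() {
        // Diamond with an anonymous class: the type argument is inferred
        // as String from the return type, and String is denotable.
        return new Comparator<>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(byLength().compare("a", "bbb") < 0);
    }
}
```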

      Sharper typing of enum constants

      Under current rules, the static type of an enum constant is the enum type itself. Under such rules, the constants Foo.ONE and Foo.TWO above will both have the same type, namely Foo. This is undesirable for at least two reasons:

      • in case of a generic enum (as Foo), the static type of a constant is not sharp enough to capture the full type info carried by that constant
      • even in the absence of generic enum, the constant type is not sharp enough to let a client access a member that is only defined on that enum constant (see the example at the beginning of this page)

      To overcome this limitation, typing of enum constants should be redefined so that a given enum constant gets its own type. Let E be an enum declaration, and C be a (possibly generic) enum constant declaration in E. The constant C is associated with a sharper type if either of the following conditions is satisfied:

      • C is of the kind C<T1, T2 ... Tn> but declares no body; the constant sharper type is E<T1, T2 ... Tn>
      • C has a body; the constant sharper type is an anonymous type (written E.C) whose supertype is either
        • E<T1, T2, ... Tn> if C is of the kind C<T1, T2, ... Tn> and E is a generic enum
        • E, if E is non-generic

      These enhanced typing rules allow the static types of Foo.ONE and Foo.TWO to be different.
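
      The information that the current static typing discards is already present at run time. In the sketch below (current Java; RuntimeClassDemo and hasOwnClass are hypothetical names), a constant with a body has its own class, while Enum.getDeclaringClass() still reports the enum type.

```java
enum Test {
    A { void a() { } }, // constant with a body: gets its own class
    B;                  // constant without a body: an instance of Test itself
}

class RuntimeClassDemo {
    // True if the constant is implemented by a class other than the enum.
    static boolean hasOwnClass(Test t) {
        return t.getClass() != t.getDeclaringClass();
    }

    public static void main(String[] args) {
        Test x = Test.A; // the static type is Test under current rules
        System.out.println(hasOwnClass(x));
        System.out.println(hasOwnClass(Test.B));
        System.out.println(x.getDeclaringClass() == Test.class);
    }
}
```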

      Additional Considerations

      Binary compatibility

      Let's assume we have the following enum:

      enum Test {
         A { void a() { } },
         B { void b() { } }
      }

      As we have seen, this would be translated as follows:

      /* enum */ class Test {
         static Test A = new Test() { void a() { } }
         static Test B = new Test() { void b() { } }
      }

      If we allow sharper type for enum constants, a naive approach would translate the code as follows:

      /* enum */ class Test {
         static Test$1 A = new Test() { void a() { } }
         static Test$2 B = new Test() { void b() { } }
      }

      Here, the binary incompatibility is manifest: the type of the enum constant A just changed from Test to Test$1 upon recompilation. This change is going to break non-recompiled clients using Test.

      To overcome this problem, it is better to take an erasure-based approach: while the static type of A might be the sharper type Test.A - any reference to the type of the constant gets erased to the base enum type Test. This leads to code that is binary compatible with respect to what we had before. However, if everything gets erased to Test, how is access to members of a specific enum constant implemented?

      Test.A.a();

      It is easy to see that, if in the code above, symbolic references to A are erased to Test, the method call will not be well-typed (as Test does not have a member named a). To overcome this problem, the compiler has to insert a synthetic cast:

      checkcast Test$1
      invokevirtual Test$1::a

      This is not dissimilar to what happens when accessing members of an intersection type through erasure.
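
      The intersection-type case can be sketched in current Java as follows. A type variable erases to its leftmost bound (Number here), so the call to compareTo needs a cast to Comparable that the compiler inserts. IntersectionErasureDemo and max are hypothetical names used for illustration.

```java
class IntersectionErasureDemo {
    // T erases to Number, its leftmost bound. Number has no compareTo,
    // so the compiler emits a cast to Comparable before the call.
    static <T extends Number & Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));
        System.out.println(max(2.5, 1.5));
    }
}
```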

      Another orthogonal observation is that the current naming scheme for enum constant classes is too fragile - the names Test$1 and Test$2 shown above are essentially order-dependent - this means that changing the order in which enum constants are declared could lead to binary compatibility issues. More specifically, if in the code above A is swapped with B and the enum is recompiled, the client bytecode above would fail to link, as Test$1 would no longer have a member method named a. This is in stark contrast with what the JLS has to say about binary compatible evolution of enums:

      Adding or reordering constants in an enum will not break compatibility with pre-existing binaries.

      One way to preserve binary compatible evolution would be to emit order insensitive class names, such as Test$A and Test$B instead of Test$1 and Test$2. The impact of such a change with respect to reflection and serialization is discussed below.
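
      The order-dependent names can be observed directly. The sketch below (BinaryNameDemo and constantClassName are hypothetical names) prints the binary name of the class behind each constant; with javac the names currently end in $1 and $2, following declaration order, so swapping A and B swaps them.

```java
enum Test {
    A { void a() { } },
    B { void b() { } };
}

class BinaryNameDemo {
    // Binary name of the class that implements the given constant.
    static String constantClassName(Test t) {
        return t.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("A is implemented by " + constantClassName(Test.A));
        System.out.println("B is implemented by " + constantClassName(Test.B));
    }
}
```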

      Serialization

      In Java, all enums are implicitly serializable, as Enum implements Serializable. We would like the changes proposed here to be serialization-compatible; they should not change the serialized form. The serialization specification:

      http://docs.oracle.com/javase/6/docs/platform/serialization/spec/serial-arch.html#6469

      provides special treatment for enums; the serialized form of an enum constant is its name only, and it is not possible to customize serialization/deserialization of an enum constant. (Note that all enum constants are initialized during the <clinit>, and the Enum.valueOf method that is used by deserialization calls the enum's static values() method, which implicitly forces initialization of the base enum class (and of all the constants)).

      In other words, no compatibility problem with respect to the serialized form exists, as the serialized form already does not depend on the class name generated by the compiler.
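
      A small round trip in current Java illustrates this behavior. EnumSerializationDemo and roundTrip are hypothetical names; the constant A has a body, so it is implemented by a compiler-named class, yet deserialization resolves it by name and returns the same instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

enum Test {
    A { void a() { } },
    B;
}

class EnumSerializationDemo {
    // Serializes the object to a byte array and reads it back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Enum constants are deserialized by name, so identity is preserved.
        System.out.println(roundTrip(Test.A) == Test.A);
    }
}
```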

      Reflection

      Another place where binary names come up is reflection. The following is perfectly legal reflective code:

      Class<?> c = Class.forName("Test$1");
      System.err.println(c.getName()); //prints Test$1

      While reflection has restrictions that prevent an enum constant from being instantiated reflectively, there is no restriction on inspecting the members of an enum constant class. Therefore, existing code using the idiom above would cease to work should we change the binary form of enum constants.
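
      Reflective code that needs the members of an enum constant class does not have to hard-code the binary name. A minimal sketch, assuming the class is obtained from the constant itself (ConstantReflectionDemo and declares are hypothetical names):

```java
import java.util.Arrays;

enum Test {
    A { void a() { } },
    B { void b() { } };
}

class ConstantReflectionDemo {
    // True if the class implementing the constant declares a method
    // with the given name. No binary name such as "Test$1" is needed.
    static boolean declares(Test constant, String methodName) {
        return Arrays.stream(constant.getClass().getDeclaredMethods())
                .anyMatch(m -> m.getName().equals(methodName));
    }

    public static void main(String[] args) {
        System.out.println(declares(Test.A, "a"));
        System.out.println(declares(Test.A, "b"));
    }
}
```

      Code written this way keeps working whichever naming scheme the compiler uses for enum constant classes.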

      Denotability

      Currently, an enum constant is a value, not a type. So, a legitimate question is whether enum constants should also be denotable types.

      The usual arguments apply here - on the one hand, having a denotable type for an enum constant makes it less magic, and allows programmers to declare variables with that type. But there are also disadvantages:

      • could make the code less readable (e.g. A a = A) - as the same identifier could mean both value and type
      • not clear as to whether all enum constants get their own type; what about an enum constant that does not declare any additional member? Is its type just an alias for the base enum type?

      On the other hand, if the enum constant type is a non-denotable type, it becomes an opaque thing that programmers can only interact with indirectly (e.g. through type inference). To mitigate some of the drawbacks of a non-denotable type, it is important to note that the proposal to add local variable type inference could technically allow programmers to declare variables with the sharper enum type, even though it is non-denotable (e.g. var a = A).
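
      Local variable type inference, available since Java 10, already behaves this way for anonymous classes, which gives a feel for what a non-denotable enum constant type could look like to a programmer. The sketch below is an analogy only; NonDenotableVarDemo and viaVar are hypothetical names.

```java
class NonDenotableVarDemo {
    static int viaVar() {
        // The type of counter is the anonymous class type, which cannot
        // be written in source, yet its member twice is accessible.
        var counter = new Object() {
            int twice(int x) { return 2 * x; }
        };
        return counter.twice(21);
    }

    public static void main(String[] args) {
        System.out.println(viaVar());
    }
}
```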

      Accessibility

      There is one corner case with respect to accessibility of members through the enum sharper type. Consider the following case:

      package a;
      
      public enum Foo {
        A() { 
          public String s = "Hello!";
        };
      }
      
      package b;
      
      class Client {
         public static void main(String[] args) {
            String s = Foo.A.s; //IllegalAccessError
         }
      }

      When executing this code, the VM will issue an IllegalAccessError; the problem is that the anonymous class for the enum constant Foo$A is package-private; as a result, an attempt to access a public field in a package-private class from another package will result in an access error. To overcome this problem, the enum constant class should have the same access modifier as the enum class in which it is defined.

      Source compatibility

      From a source compatibility perspective, there are cases in which sharper typing could leak out as a result of an interaction between this feature and type inference - consider the following code:

      EnumSet<Test> e = EnumSet.of(Test.A);

      The code above used to behave in a relatively straightforward fashion: the static type of Test.A is simply Test, meaning that inferring the type-variable of EnumSet.of was simple, as both constraints named the type Test. But if we change the way in which Test.A is type-checked, the behavior gets more interesting: the type-variable of EnumSet.of will get two competing constraints: it must be equal to Test (from the target-type) and it must be a supertype of Test.A. Luckily, in such a scenario, type inference is smart enough to prefer the stricter equality constraint, and ends up inferring Test. All things considered, the source compatibility impact of this change is not too different from the one in JDK-8075793, where the change caused capture variables to appear in more places instead of their upper bounds.
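
      The same interplay between an equality constraint from the target type and a supertype constraint from the argument can be seen in current Java with ordinary generic methods. In the sketch below (InferenceDemo, withTarget and widened are hypothetical names), List.of("text") receives the constraints E = Object from the return type and String <: E from the argument, and inference picks Object.

```java
import java.util.EnumSet;
import java.util.List;

enum Test {
    A { void a() { } },
    B;
}

class InferenceDemo {
    // Under current rules both constraints name Test.
    static EnumSet<Test> withTarget() {
        return EnumSet.of(Test.A);
    }

    // The target type forces E = Object, while the argument only
    // requires String <: E. The equality constraint wins.
    static List<Object> widened() {
        return List.of("text");
    }

    public static void main(String[] args) {
        System.out.println(withTarget());
        System.out.println(widened());
    }
}
```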

      Risks and Assumptions

      This proposal has two main risks outlined in the sections above:

      • change in binary names of enum constants could lead to issues with core reflection
      • change in typing of enum constants could result in subtle changes in method type inference, especially in the absence of a target-type

      The first problem is probably nothing to be concerned about; as it has been shown, binary names of enum constants are currently very fragile and prone to re-ordering issues. As a result, any code that is relying on the binary name of an enum constant is inherently fragile, as it is essentially relying on the output of a specific compiler.

      The second problem is more worrisome, as it could cause potential source incompatibilities. In order to detect how frequent the source incompatibility scenario described above could be, we have measured how many times the EnumSet.of method was called with various arities; for each call we kept track of whether the call occurred in a context where a target type was available. Below are the results (the measurements have been taken against the full OpenJDK forest).

      • Total calls to EnumSet.of: 150
        • calls with arity = 1 : 69
          • of which, without target-type: 0

      In other words, the source compatibility scenario described above does not seem to pose any serious threat.

      Dependencies

      The sharper types used for enum constants are not necessarily denotable; these would constitute another category of non-denotable types. This may interact with the treatment of non-denotable types in JEP-286 (Local Variable Type Inference). Depending on decisions made in JEP-286 regarding non-denotable types, one might be able to say:

      var a = Argument.STRING;

      and have the type of a be the sharper type Argument<String> rather than the coarser type Argument.

          Activity

          jrose John Rose added a comment - - edited
          If enum nested classes are going to be distinct static subtypes,
          they should also be able to carry resolvable static members.
          For example, an enumeration of primitive types should allow
          independent "public static final" constants on each enum.
          Such constants should be useable, e.g., as switch labels.
          This is a natural consequence of distinguishing enum subtypes.
          It is also a useful way to get more static information from enums.

          enum EnumCon {
              BYTE {
                  @Override Class<?> type() { return byte.class; }
                  static final int BITSIZE = 8;
              },
              INT {
                  @Override Class<?> type() { return int.class; }
                  static final int BITSIZE = 32;
              };

              abstract Class<?> type();

              public static void main(String... av) {
                  // these next two lines fail to compile under current rules,
                  // because typeof(BYTE) = EnumCon, not EnumCon$$BYTE.
                  System.out.println("BYTE.BITSIZE = "+BYTE.BITSIZE); // should print 8
                  System.out.println("INT.BITSIZE = "+INT.BITSIZE); // should print 32
              }
          }
          vromero Vicente Arturo Romero Zaldivar added a comment -
          [~jrose] I have compiled the test case you provided and will add it as a regression test. Using the current implementation of enhanced enums, it compiles and prints the expected output.
          jrose John Rose added a comment -
          Since the enum subtypes are separately defined classes, is there any reason not to allow them to implement different interfaces, as well as have different type parameters? Both are refinements of the common enum type. Allowing interfaces would enable the subtypes to share code between themselves (via default methods). Example:

          enum Primitive {
              BYTE(Byte.class) extends Bitwise<BYTE> { },
              INT(Integer.class) extends Bitwise<INT> { } ,
              FLOAT(Float.class) extends Floating<FLOAT> { },
              BOOLEAN(Boolean.class); // unique kind of type, no special supers
              interface Bitwise<T extends Primitive> {
                T bitwiseAnd(T x, T y);
                T bitwiseOr(T x, T y); ...
              } ...
          }

          Code sharing like this also pushes on the question of denoting the enum subtype inside the body of the subtype and elsewhere. This example assumes that the enum name is in scope as a type name in the body itself.
          mcimadamore Maurizio Cimadamore added a comment -
          @John - I think extending custom interfaces is totally doable from a language/translation strategy perspective (an enum with a body is just a class), but I fear it would be messy from a programming model perspective.

          First, enums can already implement interfaces - so you can't really opt out from the supertypes you get from the enum declaration itself:

          enum Foo implements A {
             CONST() extends B { }
          }

          Realistically, this can only mean that CONST implements _both_ A and B, which can be a tad confusing when looking at this from a syntactic perspective. But maybe that's just a matter of finding a better syntax.

          Another - deeper - consequence is that allowing custom _additional_ supertypes would result in a proliferation of intersection types when doing inference - for instance, cases like:

          EnumSet.of(BYTE, INT)

          would return not Primitive, but Primitive & Bitwise<Primitive>.

          Speaking more generally, I think that while enum constants (with body) are modelled as classes from a classfile perspective, their 'classness' was never meant to be too exposed in the programming model. Giving enum constants sharper types is mostly about not throwing away the static information the compiler knows about a given constant. Adding generic enums is really adding generics _to the enum declaration_, not to the constant part (that is, a constant can only instantiate the type-variables declared in the enum). So, if this proposal was about adding type-var declarations to the enum constants themselves, I'd agree with you that from there to adding custom supertypes would be a short hop. But that's not what this proposal is about.
          jjg Jonathan Gibbons added a comment -
          Can I suggest a SubTask for "javadoc updates for Enhanced Enums" ? It's not obvious to me that the existing javadoc code will support enhanced enums without any change to the standard doclet.
          vromero Vicente Arturo Romero Zaldivar added a comment -
          The current implementation does update javadoc code. Of course this is an area for which more tests are needed and could change in the future, but it has been covered as part of the first iteration of the implementation.

            People

            • Assignee:
              mcimadamore Maurizio Cimadamore
              Reporter:
              mcimadamore Maurizio Cimadamore
              Owner:
              Maurizio Cimadamore