JDK-8217938

Support new Japanese era in java.lang.Character for Java SE 12

    Details

    • Type: CSR
    • Status: Closed
    • Priority: P2
    • Resolution: Approved
    • Fix Version/s: 12
    • Component/s: core-libs
    • Labels:
      None
    • Subcomponent:
    • Compatibility Kind:
      behavioral
    • Compatibility Risk:
      low
    • Compatibility Risk Description:
      Programs that rely on the behavior of Java SE implementations before 12, which generally do not recognize the new Japanese era code point, may behave differently on Java SE 12 implementations.
    • Interface Kind:
      Java API
    • Scope:
      SE

      Description

      Summary

      Mandate support in Java SE 12 for the new Japanese era code point.

      Problem

      The Java SE 12 Platform supports Unicode 11.0 (JDK-8212120) but must also support the new Japanese era code point, U+32FF, which Unicode 11.0 does not include. The Java SE 11 Platform was updated to support the new code point at the discretion of the implementation (JDK-8216594), but it is appropriate for the Java SE 12 Platform to mandate support for the new code point in all Java SE 12 implementations.

      Solution

      Mandate support for the new Japanese era code point in all Java SE 12 implementations. This will make every method in the Character class recognize the code point and behave consistently, which will improve the maintainability of all Java SE 12 implementations. (In contrast, the Java SE 11 Platform disallowed the code point in the isJavaIdentifierStart/Part methods, which made the implementations of those methods harder to maintain.)
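
      As an illustration of what "behave consistently" could mean in practice, the following sketch compares a handful of Character queries for U+32FF against an existing era character, U+337E. The class name EraCodePointProbe and the helper sameProperties are hypothetical names introduced for this example, and the set of methods compared is an arbitrary sample, not the full list covered by the specification.

```java
public class EraCodePointProbe {

    /**
     * Hypothetical helper: returns true if the sampled Character
     * queries give the same answer for both code points.
     */
    public static boolean sameProperties(int a, int b) {
        return Character.getType(a) == Character.getType(b)
            && Character.isDefined(a) == Character.isDefined(b)
            && Character.isLetterOrDigit(a) == Character.isLetterOrDigit(b)
            && Character.isJavaIdentifierStart(a) == Character.isJavaIdentifierStart(b)
            && Character.isJavaIdentifierPart(a) == Character.isJavaIdentifierPart(b);
    }

    public static void main(String[] args) {
        final int newEra = 0x32FF; // new Japanese era code point
        final int meizi = 0x337E;  // existing era character ("Meizi")

        // The results depend on the Java SE version and implementation in use.
        System.out.printf("U+%04X defined: %b%n", newEra, Character.isDefined(newEra));
        System.out.printf("U+%04X matches U+%04X on sampled properties: %b%n",
                newEra, meizi, sameProperties(newEra, meizi));
    }
}
```

      On an implementation that does not recognize U+32FF, the first line is expected to report the code point as undefined, so the comparison is only meaningful where isDefined returns true.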

      Specification

      Change the first half (before "Unicode Character Representations" header) of the Character class specification from:

        * The {@code Character} class wraps a value of the primitive
        * type {@code char} in an object. An object of type
        * {@code Character} contains a single field whose type is
        * {@code char}.
        * <p>
        * In addition, this class provides several methods for determining
        * a character's category (lowercase letter, digit, etc.) and for converting
        * characters from uppercase to lowercase and vice versa.
        * <p>
        * Character information is based on the Unicode Standard, version 11.0.0.
        * <p>
        * The methods and data of class {@code Character} are defined by
        * the information in the <i>UnicodeData</i> file that is part of the
        * Unicode Character Database maintained by the Unicode
        * Consortium. This file specifies various properties including name
        * and general category for every defined Unicode code point or
        * character range.
        * <p>
        * The file and its description are available from the Unicode Consortium at:
        * <ul>
        * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
        * </ul>
        * <p>
        * The code point, U+32FF, is reserved by the Unicode Consortium
        * to represent the Japanese square character for the new era that begins
        * May 2019. Relevant methods in the Character class return the same
        * properties as for the existing Japanese era characters (e.g., U+337E for
        * "Meizi"). For the details of the code point, refer to
        * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
        * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.

      to:

       * The {@code Character} class wraps a value of the primitive
       * type {@code char} in an object. An object of class
       * {@code Character} contains a single field whose type is
       * {@code char}.
       * <p>
       * In addition, this class provides a large number of static methods for
       * determining a character's category (lowercase letter, digit, etc.)
       * and for converting characters from uppercase to lowercase and vice
       * versa.
       *
       * <h3><a id="conformance">Unicode Conformance</a></h3>
       * <p>
       * The fields and methods of class {@code Character} are defined in terms
       * of character information from the Unicode Standard, specifically the
       * <i>UnicodeData</i> file that is part of the Unicode Character Database.
       * This file specifies properties including name and category for every
       * assigned Unicode code point or character range. The file is available
       * from the Unicode Consortium at
       * <a href="http://www.unicode.org">http://www.unicode.org</a>.
       * <p> 
       * The Java SE 12 Platform uses character information from version 11.0
       * of the Unicode Standard, plus the Japanese Era code point,
       * {@code U+32FF}, from the first version of the Unicode Standard
       * after 11.0 that assigns the code point.

      Change the second paragraph of isJavaIdentifierPart(char) and isJavaIdentifierPart(int) method description from:

           * A character may be part of a Java identifier if any of the following
           * are true:

      to:

           * A character may be part of a Java identifier if any of the following
           * conditions are true:

      Change the last list item of conditions in isJavaIdentifierPart(int) method description from:

           * <li> {@link #isIdentifierIgnorable(int)
           * isIdentifierIgnorable(codePoint)} returns {@code true} for
           * the character

      to:

           * <li> {@link #isIdentifierIgnorable(int)
           * isIdentifierIgnorable(codePoint)} returns {@code true} for
           * the code point
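
      The wording change from "character" to "code point" matters for the int overloads, which accept supplementary code points that do not fit in a single char. A minimal sketch of walking a string by code point follows; the class IdentifierPartDemo and the helper isJavaIdentifier are hypothetical, and the helper deliberately does not reject reserved words such as "class".

```java
public class IdentifierPartDemo {

    /**
     * Hypothetical helper: true if every code point in s is acceptable
     * at its position in a Java identifier. Reserved words are NOT rejected.
     */
    public static boolean isJavaIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Advance by charCount so surrogate pairs are read as one code point.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // U+1D400 is a supplementary letter; it needs the int overloads.
        String supplementary = new String(Character.toChars(0x1D400)) + "1";
        System.out.println(isJavaIdentifier(supplementary));

        // U+0000 is identifier-ignorable, so it may appear as a part but not as a start.
        System.out.println(Character.isIdentifierIgnorable(0x0000));
        System.out.println(Character.isJavaIdentifierPart(0x0000));
        System.out.println(Character.isJavaIdentifierStart(0x0000));
    }
}
```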

      Change the second paragraph of isJavaLetter(char) method description from:

           * A character may start a Java identifier if and only if
           * one of the following is true:

      to:

           * A character may start a Java identifier if and only if
           * one of the following conditions is true:

      Change the second paragraph of isJavaLetterOrDigit(char) method description from:

           * A character may be part of a Java identifier if and only if any
           * of the following are true:

      to:

           * A character may be part of a Java identifier if and only if one
           * of the following conditions is true:
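
      The isJavaLetter(char) and isJavaLetterOrDigit(char) methods are deprecated in favor of isJavaIdentifierStart(char) and isJavaIdentifierPart(char), so the wording changes above are expected to describe the same behavior in both places. The sketch below checks that agreement over every char value; the class name DeprecatedLetterCheck and the method countMismatches are hypothetical, and the check says nothing about supplementary code points, which the char overloads cannot represent.

```java
public class DeprecatedLetterCheck {

    /**
     * Hypothetical helper: counts char values for which a deprecated
     * method disagrees with its documented replacement.
     */
    @SuppressWarnings("deprecation")
    public static int countMismatches() {
        int mismatches = 0;
        // Iterate with an int so the loop terminates at Character.MAX_VALUE.
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            char ch = (char) c;
            if (Character.isJavaLetter(ch) != Character.isJavaIdentifierStart(ch)) {
                mismatches++;
            }
            if (Character.isJavaLetterOrDigit(ch) != Character.isJavaIdentifierPart(ch)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("Mismatches: " + countMismatches());
    }
}
```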

              People

              • Assignee: Naoto Sato (naoto)
              • Reporter: Deepak Kejriwal (dkejriwal, Inactive)
              • Reviewed By: Alex Buckley, Chris Hegarty