JDK-8273259

Character.getName doesn't follow Unicode spec for ideographs


    Details

    • Subcomponent:
    • Resolved In Build:
      b15
    • CPU:
      generic
    • OS:
      generic

      Description

      A DESCRIPTION OF THE PROBLEM :
      Chapter 4 of the Unicode Standard, at
      https://www.unicode.org/versions/Unicode13.0.0/ch04.pdf, defines a naming rule (NR2, page 182) that systematically derives names for Unicode code points in a set of ranges.

      Character.getName does not follow this naming scheme. Instead, most of these ranges are treated as if the characters have no name, and the block-based derivation rules seem to be used.
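
      As a rough illustration of what NR2 asks for, the sketch below derives a name by appending the code point in uppercase hexadecimal to a range-specific prefix. The class name Nr2NameSketch, the method derivedName, and the range table are hypothetical and cover only part of NR2; the range bounds are assumed from the Unicode 13.0 table and should be checked against the spec. This is not how Character.getName is implemented.

```java
class Nr2NameSketch {
    // Illustrative subset of the NR2 ranges; bounds are assumptions taken from
    // Unicode 13.0 and are NOT the complete list given in the specification.
    private static final int[][] RANGES = {
        {0x3400, 0x4DBF},   // CJK Unified Ideographs Extension A
        {0x4E00, 0x9FFC},   // CJK Unified Ideographs
        {0x20000, 0x2A6DD}, // CJK Unified Ideographs Extension B
    };

    // Every range in this subset shares the same derived-name prefix.
    private static final String PREFIX = "CJK UNIFIED IDEOGRAPH-";

    /** Returns the NR2-style derived name, or null if the code point is outside the sketch's ranges. */
    static String derivedName(int codePoint) {
        for (int[] range : RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                // Prefix followed by the code point as at least four uppercase hex digits.
                return PREFIX + String.format("%04X", codePoint);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int cp = 0x2000A;
        System.out.println("NR2 sketch:        " + derivedName(cp));
        // Output of the next line depends on the JDK build in use.
        System.out.println("Character.getName: " + Character.getName(cp));
    }
}
```

      Running main prints the derived name next to whatever Character.getName returns on the installed JDK, which makes the discrepancy described in this report easy to spot.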

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Character.getName(0x2000A)

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Return "CJK UNIFIED IDEOGRAPH-2000A"
      ACTUAL -
      Returns "CJK UNIFIED IDEOGRAPHS EXTENSION B 2000A"

      ---------- BEGIN SOURCE ----------
      Character.getName(0x2000A)
      ---------- END SOURCE ----------

      FREQUENCY : always


              People

              Assignee:
              Naoto Sato
              Reporter:
              Webbug Group