JDK-8217097

Correct UnicodeDecoder U+FFFE handling

    Details

    • Type: CSR
    • Status: Closed
    • Priority: P3
    • Resolution: Approved
    • Fix Version/s: 13
    • Component/s: core-libs
    • Labels:
      None
    • Subcomponent:
    • Compatibility Kind:
      behavioral
    • Compatibility Risk:
      low
    • Compatibility Risk Description:
      Client code that *expects* U+FFFE to be reported as "malformed" will no longer work after this change. Reporting it as malformed is no longer recommended under the Unicode Consortium corrigendum.
    • Interface Kind:
      Java API
    • Scope:
      SE

      Description

      Summary

      Correct how UnicodeDecoder subclasses handle the U+FFFE code point when it appears in the middle of the input buffer.

      Problem

      Currently, UnicodeDecoder treats U+FFFE in the middle of a string as "malformed" because it is a non-character. This was correct prior to Unicode 7. However, Unicode 7 includes Corrigendum 9 (http://www.unicode.org/versions/corrigendum9.html), which changed the definition of non-characters. UnicodeDecoder's behavior should be modified to conform to it.

      Solution

      Remove the code in UnicodeDecoder that detects U+FFFE in the middle of the input and returns a "malformed" CoderResult, so that the UTF-16 decoders (StandardCharsets.UTF_16[LE/BE]) pass the code point through.

      Specification

      As required by the Unicode 7 Corrigendum 9, U+FFFE is passed through as a code point.
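
      The behavioral difference can be observed with a small strict-decoding check. The following is a minimal sketch, not part of the change itself: the class name FffeDecodeDemo and the helper decodeStrict are hypothetical names introduced for illustration. It decodes 'A', U+FFFE, 'B' as big-endian UTF-16 with error reporting enabled. On a release that includes this fix (13 or later) the helper should return the three-character string. On earlier releases the decoder reports malformed input and the helper returns an empty Optional.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical demo class; not part of the JDK.
class FffeDecodeDemo {

    // Hypothetical helper: decodes with errors reported rather than replaced.
    // Returns Optional.empty() if the decoder reports malformed or unmappable input.
    static Optional<String> decodeStrict(byte[] bytes, Charset cs) {
        try {
            return Optional.of(
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(bytes))
                  .toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // 'A', U+FFFE, 'B' encoded as big-endian UTF-16 (no byte order mark).
        byte[] input = {0x00, 0x41, (byte) 0xFF, (byte) 0xFE, 0x00, 0x42};
        Optional<String> result = decodeStrict(input, StandardCharsets.UTF_16BE);

        if (result.isPresent()) {
            boolean hasFffe = result.get().indexOf((char) 0xFFFE) >= 0;
            System.out.println("decoded " + result.get().length()
                + " chars, contains U+FFFE: " + hasFffe);
        } else {
            System.out.println("U+FFFE reported as malformed");
        }
    }
}
```

      Client code that relied on the old "malformed" result, as described under Compatibility Risk Description, would take the empty-Optional branch before the fix and the decoded branch after it.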

              People

              • Assignee:
                Naoto Sato
              • Reporter:
                Naoto Sato
              • Reviewed By:
                Roger Riggs