JDK-8252984

Remove the implNote in the DOM package description added by JDK-8249643

    Details

    • Type: CSR
    • Status: Closed
    • Priority: P3
    • Resolution: Approved
    • Fix Version/s: 16
    • Component/s: xml
    • Labels:
      None
    • Subcomponent:
    • Compatibility Risk:
      minimal
    • Compatibility Risk Description:
      None. This change removes an implNote in the package description.
    • Interface Kind:
      Java API
    • Scope:
      SE

      Description

      Summary

      Remove the implNote in the DOM package description added through JDK-8249643.

      Problem

      The implNote added through JDK-8249643 was intended to explain why the JDK implementation deviated from the DOM specification. It rested on the assessment that the DOM specification did not follow the XML specification with regard to characters in the surrogate block. That assessment was incorrect: the XML specification excludes the surrogate block itself but does include the character range [#x10000-#x10FFFF], which covers the supplementary characters that UTF-16 encodes as surrogate pairs.
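The distinction can be checked with a minimal sketch that transcribes the Char production from the XML 1.0 specification. The class and method names (`XmlCharCheck`, `isXmlChar`, `utf8Hex`) are hypothetical helpers for illustration, not part of the JDK. The sketch shows that a lone surrogate code unit is outside the production, while the code point U+1F6A9 that the surrogate pair encodes is inside it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only; not JDK code.
class XmlCharCheck {

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Returns the UTF-8 encoding of a code point as lowercase hex digits.
    static String utf8Hex(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        int flag = 0x1F6A9;
        char[] units = Character.toChars(flag); // UTF-16 code units

        System.out.println(units.length);                    // 2 (a surrogate pair)
        System.out.println(Character.isSurrogate(units[0])); // true
        System.out.println(isXmlChar(units[0]));             // false: surrogate block is excluded
        System.out.println(isXmlChar(flag));                 // true: within [#x10000-#x10FFFF]
        System.out.println(utf8Hex(flag));                   // f09f9aa9
    }
}
```

Testing the code point rather than the individual UTF-16 code units is the design choice that matters here: the production is defined over characters, not over the code units Java uses to store them.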

      Solution

      Remove the implNote added through JDK-8249643.

      Specification

      - Implementation Note:
      - The JDK implementation of LSSerializer follows the Characters section of the XML Specification in handling
      - characters output. In particular, the specification defined a character range that excluded the surrogate blocks.
      - As a result, the JDK LSSerializer writes characters in the surrogate blocks as Character References.
      - Character 0xf0 0x9f 0x9a 0xa9 (Unicode code point U+1F6A9) for example will be written as &#x1f6a9;.
      - This behavior is different from what is defined in the class description of LSSerializer. The relevant section is quoted below:
      - Within the character data of a document (outside of markup), any characters that cannot be represented directly
      - are replaced with character references... Any characters that cannot be represented directly in the output character
      - encoding are serialized as numeric character references
      - The JDK implementation does not follow this definition because it is not consistent with the XML Specification
      - that defined an explicit character range with no association to the setting of the output character encoding.
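How a given JDK build serializes a supplementary character can be observed with a sketch like the one below. It uses only the standard DOM Load and Save API; the class name `SerializeSupplementary`, the `serialize` helper, and the `root` element name are hypothetical. Whether U+1F6A9 appears as a character reference or as the character itself depends on the JDK version, so no expected output is shown.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class, for illustration only; not JDK code.
class SerializeSupplementary {

    // Builds <root>text</root> and serializes it with LSSerializer,
    // declaring the given output encoding on the LSOutput.
    static String serialize(String text, String encoding) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
        Element root = doc.createElement("root");
        root.setTextContent(text);
        doc.appendChild(root);

        // Obtain the Load and Save feature from the DOM implementation.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        LSOutput output = ls.createLSOutput();
        output.setEncoding(encoding);
        StringWriter writer = new StringWriter();
        output.setCharacterStream(writer);

        serializer.write(doc, output);
        return writer.toString();
    }

    public static void main(String[] args) {
        String flag = new String(Character.toChars(0x1F6A9));
        // Inspect the result for either the raw character or a character reference.
        System.out.println(serialize(flag, "UTF-8"));
    }
}
```

Running the same sketch on builds before and after this change is one way to compare the behavior the removed note described with the behavior defined in the LSSerializer class description.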


              People

              • Assignee:
                joehw Joe Wang
                Reporter:
                joehw Joe Wang
                Reviewed By:
                Lance Andersen, Naoto Sato, Stuart Marks
              • Votes:
                0
                Watchers:
                3