Add API and Impl notes to the package description of org.w3c.dom to clarify the javadoc for get and set methods, and the discrepancy between the specification and implementation for org.w3c.dom.ls.LSSerializer.
There are two issues to be covered in this change.
The first concerns the Javadoc for the get and set methods of properties. The format and style of these comments in the org.w3c.dom package do not follow the Javadoc standard. Instead of describing the action each method performs, they are written as a definition of the field or attribute that the get and set methods operate on, and a single text covers both the get and set actions. Because the two comments are copies of each other, readers may conclude that one was copied by mistake and that the intended text is missing.
The second concerns the specification of org.w3c.dom.ls.LSSerializer. The specification requires that an LSSerializer output characters or character references depending on the output encoding. This requirement contradicts the XML specification, which defines a character range that has no association with the output encoding.
Since the DOM L3 specification and the org.w3c.dom package have not been actively maintained in 16 years, it is unlikely that they will be updated in the future. A general clarification within the Java SE specification is therefore necessary to document these issues.
Add the following to the org.w3c.dom package description.
— An API Note explaining the structure of the existing Javadoc.
— An Impl Note explaining the deviation between the LSSerializer specification and JDK implementation.
Add the following to the org.w3c.dom package description:
The documentation comments for the get and set methods within this API are written as property definitions and are shared between both methods. These methods do not follow the standard Java SE specification format. Take the Node TextContent property as an example: both getTextContent and setTextContent share the same content, which defines the TextContent property itself.
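For illustration only (this is not part of the proposed package description), the following minimal sketch shows the two accessors of the TextContent property acting on the same underlying value, which is why a single shared property definition can document both. The class name `TextContentPropertyDemo`, the helper `roundTrip`, and the element name `note` are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TextContentPropertyDemo {

    // Hypothetical helper: sets the TextContent property and reads it back.
    static String roundTrip(String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element note = doc.createElement("note");

        // setTextContent and getTextContent carry the same Javadoc text,
        // because both are documented as the one "textContent" property.
        note.setTextContent(value);
        return note.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```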
The JDK implementation of LSSerializer follows the Characters section of the XML Specification in handling character output. In particular, the specification defines a character range that excludes the surrogate blocks. As a result, the JDK LSSerializer writes characters in the surrogate blocks as Character References. Character 0xf0 0x9f 0x9a 0xa9 (Unicode code point U+1F6A9), for example, will be written as &#x1f6a9;. This behavior is different from what is defined in the class description of LSSerializer. The relevant sections are quoted below:
"Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references..."
"Any characters that cannot be represented directly in the output character encoding are serialized as numeric character references"
The JDK implementation does not follow this definition because it is not consistent with the XML Specification, which defines an explicit character range with no association to the setting of the output character encoding.
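For illustration only (this is not part of the proposed package description), the behavior described in the Impl Note can be observed with a sketch along the following lines. The class name `SurrogateSerializationDemo`, the helper `serialize`, and the element name `flag` are hypothetical, and the result depends on running with the JDK's built-in DOM implementation rather than a third-party one on the class path.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class SurrogateSerializationDemo {

    // Hypothetical helper: wraps the text in a <flag> element and serializes
    // the document with the LSSerializer of the default DOM implementation.
    static String serialize(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("flag");
        root.setTextContent(text);
        doc.appendChild(root);

        // Obtain the Load and Save feature from the DOM implementation.
        DOMImplementationLS ls = (DOMImplementationLS)
                doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        // U+1F6A9 lies outside the BMP, so a Java String stores it as a
        // surrogate pair.
        String flag = new String(Character.toChars(0x1F6A9));

        // Per the Impl Note, the JDK serializer is expected to emit a
        // character reference here instead of the raw character.
        System.out.println(serialize(flag));
    }
}
```

Note that `writeToString` always produces UTF-16 output, an encoding that can represent U+1F6A9 directly, so a character reference in the result reflects the deviation from the LSSerializer class description discussed above.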
Attached specdiffs. For convenience, the specdiff and webrevs can be viewed at: