JDK-8136602

Seemingly valid XML fails to get parsed with org.xml.sax.SAXParseException

    Details

      Description

      FULL PRODUCT VERSION :
      openjdk version "1.8.0_45-internal"
      OpenJDK Runtime Environment (build 1.8.0_45-internal-b14)
      OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)


      ADDITIONAL OS VERSION INFORMATION :
      Linux rei2-wt 3.19.0-28-generic #30-Ubuntu SMP Mon Aug 31 15:52:51 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

      A DESCRIPTION OF THE PROBLEM :
      Parsing a simple XML file fails. This program:

          public static void main(String[] args) throws Exception {
              DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("/home/vyzivus/Downloads/KD.xml");
          }

      will fail to parse the attached XML with the following error message:
      [Fatal Error] KD.xml:972:25: An invalid XML character (Unicode: 0xd840) was found in the comment.
      Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:///home/vyzivus/Downloads/KD.xml; lineNumber: 972; columnNumber: 25; An invalid XML character (Unicode: 0xd840) was found in the comment.
      at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
      at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:348)
      at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:177)
      at com.company.Main.main(Main.java:8)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:497)
      at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)


      When the comment line in question is removed, the parser parses the XML successfully. Apparently, the comment parser handles the Unicode character incorrectly and even reports the wrong code point (0xd840 instead of U+2000B).
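
      The relationship between the two values above can be checked directly. The following is a minimal sketch, not part of the original report; the class name SurrogateDemo and its helper methods are hypothetical. It shows that U+2000B is stored in UTF-16 as the surrogate pair 0xD840 0xDC0B, and that the full code point satisfies the XML 1.0 Char production while the high surrogate on its own does not. This suggests, without proving it, that the parser examined the first half of the pair in isolation.

```java
public class SurrogateDemo {

    /** XML 1.0 "Char" production, applied to a full Unicode code point. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the UTF-16 code units of a code point as hex strings such as "0xD840". */
    static String[] utf16Units(int codePoint) {
        char[] units = Character.toChars(codePoint);
        String[] hex = new String[units.length];
        for (int i = 0; i < units.length; i++) {
            hex[i] = "0x" + Integer.toHexString(units[i]).toUpperCase();
        }
        return hex;
    }

    public static void main(String[] args) {
        // U+2000B lies outside the Basic Multilingual Plane, so it needs two code units.
        System.out.println(String.join(" ", utf16Units(0x2000B))); // 0xD840 0xDC0B
        System.out.println(isXmlChar(0x2000B)); // true: the full code point is a legal XML character
        System.out.println(isXmlChar(0xD840));  // false: a lone high surrogate is not
    }
}
```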

      You can download the XML in question here: http://www.baka.sk/KD.xml


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Download the KD.xml file from http://www.baka.sk/KD.xml
      2. Parse the attached XML: DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("KD.xml");
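
      For cases where KD.xml cannot be downloaded, the steps above can be approximated with an in-memory document. This is only a sketch under assumptions the report does not confirm: that a comment containing U+2000B is enough to exercise the same code path, and that the position of the character in the input might matter. The class name CommentSurrogateRepro, its methods, and the padding values are hypothetical and chosen arbitrarily; the program may well report success for every padding, in which case the original file is still needed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class CommentSurrogateRepro {

    /** Builds a small document whose comment holds `padding` filler characters followed by U+2000B. */
    static String buildXml(int padding) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>\n<!-- ");
        for (int i = 0; i < padding; i++) {
            sb.append('x');
        }
        sb.appendCodePoint(0x2000B);
        sb.append(" -->\n</root>\n");
        return sb.toString();
    }

    /** Returns true if the default JAXP DOM parser accepts the document, false on a SAX parse error. */
    static boolean parses(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary offsets; they are guesses, not values taken from the report.
        for (int padding : new int[] {0, 8180, 8190, 16380}) {
            System.out.println("padding=" + padding + " parses=" + parses(buildXml(padding)));
        }
    }
}
```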

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The parse succeeds and throws no exception
      ACTUAL -
      An exception is thrown: [Fatal Error] KD.xml:972:25: An invalid XML character (Unicode: 0xd840) was found in the comment.

      REPRODUCIBILITY :
      This bug can always be reproduced.

      ---------- BEGIN SOURCE ----------
      package com.company;

      import javax.xml.parsers.DocumentBuilderFactory;

      public class Main {

          public static void main(String[] args) throws Exception {
              DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("/home/vyzivus/Downloads/KD.xml");
          }
      }

      ---------- END SOURCE ----------

              People

              • Assignee:
                joehw Joe Wang
                Reporter:
                webbuggrp Webbug Group