JDK-4843787

org.xml.sax.SAXException was thrown when parsing large file

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P4
    • Resolution: Not an Issue
    • Affects Version/s: 1.4.2
    • Fix Version/s: None
    • Component/s: xml
    • Labels:

      Description


      Name: gm110360 Date: 04/07/2003


      FULL PRODUCT VERSION :
      java version "1.4.2-beta"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
      Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)

      FULL OS VERSION :
      Microsoft Windows XP [Version 5.1.2600]

      (Note: the error also occurs on Windows 98 Second Edition)

      A DESCRIPTION OF THE PROBLEM :

      Parsing a large file with many entities using SAX or DOM throws an exception: org.xml.sax.SAXException: Fatal Error: URI=null Line=595: Parser has reached the entity expansion limit "64,000" set by the Application.


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :

      Run the source. Please email me for an example test file (testfile.xml).

      In case you don't want to email me for the file, here is how to create one:

      1) Create a testfile.xml in the same directory where you run the code
      2) Paste the following:

      <?xml version='1.0' encoding='utf-8'?>
      <!--DTD for vocab -->
      <!DOCTYPE FirstNode [
      <!ELEMENT FirstNode (ChildNode)*>
      <!ELEMENT ChildNode (#PCDATA)>
      ]>

      <FirstNode>
      <ChildNode>
      <html><body><a name="1"></a>
      <p><b>concinnity</b></p>
      <blockquote>concinnity was Word of the Day on <a href="http://www.dictionary.com/wordoftheday/archive/2001/08/18.html">August 18, 2001</a>.</blockquote><br>
      <table border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="src"><a href="/search?q=00-database-info&amp;db=wotd" title="Click for more information about this dictionary">Source</a>: <cite>Dictionary.com Word of the Day</cite></td></tr></table>
      <a name="2"></a>

      <TABLE><TR><TD><A NAME="C0548200"><B>con&#183;cin&#183;ni&#183;ty</B></A> &nbsp;&nbsp;<A TITLE="Click for guide to symbols." onClick="ahdpop();return false;" HREF="/help/ahd4/pronkey.html" CLASS="linksrc"><b>Pronunciation Key</b></A>&nbsp;&nbsp;(k<IMG ALT="" SRC="pronkey_files/schwa.gif" height="15" width="6" ALIGN="ABSBOTTOM">n-s<IMG
      ALT="" SRC="pronkey_files/ibreve.gif" height="15" width="7" ALIGN="ABSBOTTOM">n<IMG ALT="" SRC="pronkey_files/prime.gif" height="22" width="4" ALIGN="ABSBOTTOM"><IMG ALT="" SRC="pronkey_files/ibreve.gif" height="15" width="7" ALIGN^F
      quot; SRC="pronkey_files/emacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">)<BR>
       <I>n.</I> <I>pl.</I> <B>con&#183;cin&#183;ni&#183;ties </B><OL><LI> Harmony in the arrangement or interarrangement of parts with respect to a whole.</LI>
      <LI> Studied elegance and facility in style of expression: &#147;He has what one character calls &#145;the gifts of concinnity and concision,&#146; that deft swipe with a phrase that can be so
      devastating in children&#148; (Elizabeth Ward).
      </LI>
      <LI>An instance of harmonious arrangement or studied elegance and facility.</LI>
      </OL><BR>
      <HR ALIGN="left" WIDTH="25%">[From Latin<TT> concinnit<IMG ALT="" SRC="pronkey_files/amacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">s</TT>, from<TT> concinn<IMG ALT="" SRC="pronkey_files/amacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">re</TT>, <I>to put in order</I>,
      from<TT> concinnus</TT>, <I>deftly joined</I>.]</TD>
      </TR></TABLE>
      <a name="3"></a>
      <b>concinnity</b><br><br>
       \Con*cin"ni*ty\, n. [L. concinnitas, fr. concinnus
         skillfully put together, beautiful. Of uncertain origin.]
         Internal harmony or fitness; mutual adaptation of parts;
         elegance; -- used chiefly of style of discourse. [R.]
      <br><br>
               An exact concinnit
      ;<table border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="src"><a href="/search?q=00-database-info&amp;db=web1913" title="Click for more information about this dictionary">Source</a>: <cite>Webster's Revised Unabridged Dictionary, &copy; 1996, 1998 MICRA, Inc.</cite></td></tr></table>
      </body></html>
      </ChildNode>

      </FirstNode>

      3) Copy and paste the <ChildNode>...</ChildNode> content about 196 times inside <FirstNode>...</FirstNode>

      When you run the code, the error happens after about 195 ChildNode elements have been read.
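
      The copy-and-paste step can also be scripted. What follows is only a sketch, written against a current JDK rather than 1.4.2. The class name TestFileGenerator, the method buildDocument, and the short CHILD_BODY string are hypothetical. It assumes the HTML inside each ChildNode is escaped with entity references such as &lt; and &gt;. The body used here is far shorter than the one in this report, so the number of copies needed to reach the limit will differ from 196.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TestFileGenerator {

    // Hypothetical stand-in for the escaped HTML body of one ChildNode.
    // Assumption: markup is written with entity references (&lt; and &gt;).
    static final String CHILD_BODY =
        "&lt;p&gt;&lt;b&gt;concinnity&lt;/b&gt;&lt;/p&gt;";

    /** Builds a document with the DTD from this report and `copies` ChildNode elements. */
    static String buildDocument(int copies) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version='1.0' encoding='utf-8'?>\n");
        sb.append("<!DOCTYPE FirstNode [\n");
        sb.append("<!ELEMENT FirstNode (ChildNode)*>\n");
        sb.append("<!ELEMENT ChildNode (#PCDATA)>\n");
        sb.append("]>\n");
        sb.append("<FirstNode>\n");
        for (int i = 0; i < copies; i++) {
            sb.append("<ChildNode>").append(CHILD_BODY).append("</ChildNode>\n");
        }
        sb.append("</FirstNode>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The number of copies defaults to the 196 mentioned in the report.
        int copies = args.length > 0 ? Integer.parseInt(args[0]) : 196;
        Path out = Paths.get("testfile.xml");
        Files.write(out, buildDocument(copies).getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + copies + " ChildNode elements to " + out);
    }
}
```

      Pass a larger count on the command line (for example, java TestFileGenerator 20000) to produce a file with more entity references.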

      You can change lines 30 and 31 of the source:
              test.DOMRead();
              //test.SAXRead();
      to:
              //test.DOMRead();
              test.SAXRead();

      to test the SAX case. In both cases, an exception is thrown.


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Expected: no error.
      Actual: an exception is thrown when the code is run.

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      org.xml.sax.SAXException: Fatal Error: URI=null Line=595: Parser has reached the entity expansion limit "64,000" set by the Application.
              at TErrorHandler.fatalError(XMLError.java:198)
              at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3342)
              at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3333)
              at org.apache.crimson.parser.Parser2.expandEntityInContent(Parser2.java:2667)
              at org.apache.crimson.parser.Parser2.maybeReferenceInContent(Parser2.java:2569)
              at org.apache.crimson.parser.Parser2.content(Parser2.java:1980)
              at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1654)
              at org.apache.crimson.parser.Parser2.content(Parser2.java:1926)
              at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1654)
              at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:634)
              at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
              at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
              at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
              at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
              at XMLError.DOMRead(XMLError.java:101)
              at XMLError.main(XMLError.java:30)


      REPRODUCIBILITY :
      This bug can always be reproduced.

      ---------- BEGIN SOURCE ----------
      import java.util.*;
      import org.w3c.dom.*;
      import java.io.*;

      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.FactoryConfigurationError;
      import javax.xml.parsers.ParserConfigurationException;
      import javax.xml.parsers.*;

      import org.xml.sax.SAXException;
      import org.xml.sax.SAXParseException;
      import org.xml.sax.*;
      import org.xml.sax.helpers.*;
      import org.w3c.dom.*;
      import org.w3c.dom.Document;
      import org.w3c.dom.DOMException;


      public class XMLError {

          private String fname = null;

          public XMLError(String fname) {
              this.fname = fname;
          }
          
          public static void main(String [] argv){
              XMLError test = new XMLError("testfile.xml");
              test.DOMRead();
              //test.SAXRead();
          }

          public void SAXRead(){
              System.out.println("Reading " + fname + "...");
              String data = readFile(fname);
              if(data == null){
                  System.out.println("There is no such file as " + fname);
                  return;
              }
              try{
                  SAXParserFactory factory = SAXParserFactory.newInstance();
                  factory.setValidating(true);
                  SAXParser parser = factory.newSAXParser();
                  //org.xml.sax.helpers.DefaultHandler

                  parser.parse(new ByteArrayInputStream(data.getBytes()), new DefaultHandler(){
                      private CharArrayWriter contents = new CharArrayWriter();
                      private int count;

                      public void characters(char[] ch, int start, int length){
                          contents.write( ch, start, length );
                      }
                      public void endDocument(){
                          System.out.println("Finish: " + count);
                      }
                      public void endElement(String uri, String localName, String qName) {
                          if ( qName.equals( "ChildNode" ) ) {
                              count++;
                              String str = contents.toString();
                              System.out.println("Importing... " + count + " : " + str);
                          }
                      }
                      public void startDocument(){
                          //contents.reset();
                          count = 0;

                      }
                      public void startElement(String uri, String localName, String qName, Attributes attributes){
                          contents.reset();
                          //System.out.println("The name: " + localName + ", qName: " + qName);
                      }

                  });
              }catch(Exception ee){
                  ee.printStackTrace();
              }
          }
          
          public void DOMRead(){
              System.out.println("Reading " + fname + "...");
              String data = readFile(fname);
              if(data == null){
                  System.out.println("There is no such file as " + fname);
                  return;
              }
              int count = 0;
              try {
                  TErrorHandler error = new TErrorHandler();
                  DocumentBuilderFactory factory =
                  DocumentBuilderFactory.newInstance();
                  factory.setValidating(true);
                  factory.setIgnoringElementContentWhitespace(true);

                  //factory.setNamespaceAware(true);
                  //factory.setExpandEntityReferences(false);

                  System.out.println("Parsing xml data...");
                  DocumentBuilder builder = factory.newDocumentBuilder();
                  builder.setErrorHandler(error);
                  Document document = builder.parse(new ByteArrayInputStream(data.getBytes()));
                  Node node;
                  node = document.getFirstChild();
                  if(node == null){
                      return;
                  }
                  System.out.println("Start importing data: ");
                  while(node != null){
                      if(node.getNodeType() == Node.ELEMENT_NODE){
                          if("FirstNode".equalsIgnoreCase(node.getNodeName())) break;
                      }
                      node = node.getNextSibling();
                  }
                  node = node.getFirstChild();
                  String str = null;
                 
                  boolean done = false;
                  while((node != null) && (!done)){
                      str = getValue(node);
                      if(str == null) break;
                      node = node.getNextSibling();
                      count++;
                      if((count % 10) == 0){
                          System.out.print(".");
                      }
                  }
              }catch(Exception e){
                  e.printStackTrace();
              }
              
              System.out.println("\n\nDone: " + count);
          }
          static public String getValue(Node node){
              if(node == null) return null;
              Node node2 = node.getFirstChild();
              if(node2 == null){
                  return "";
              }
              if(node2.getNodeType() != Node.TEXT_NODE) return null;
              return node2.getNodeValue();
          }

          public static String readFile(String fname){
              if((fname == null) || (fname.trim().length() <= 0)){
                  return null;
              }
              BufferedReader in = null;
              String str;
              StringBuffer buf = new StringBuffer();
              try{
                  in = new BufferedReader(new FileReader(fname));
                  while(in.ready()){
                      str = in.readLine();
                      if(str == null) break;
                      buf.append(str + "\n");
                  }
                  in.close();
              }catch(IOException e){
                  //e.printStackTrace();
                  return null;
              }
              return buf.toString();
          }
      }

      class TErrorHandler implements ErrorHandler {
          int errNo = 0;
          String errMessage = "";
          public void resetError(){
              errNo = 0;
              errMessage = "";
          }
          public void setError(String mesg){
              errNo = 1;
              if(mesg == null) return;
              errMessage = errMessage + "\n" + mesg;
          }
          TErrorHandler() {
          }
          private String getParseExceptionInfo(SAXParseException spe) {
              String systemId = spe.getSystemId();
              if (systemId == null) {
                  systemId = "null";
              }
              String info = "URI=" + systemId + " Line=" + spe.getLineNumber() +
              ": " + spe.getMessage();
              return info;
          }
          public void warning(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
              setError("Warning: " + getParseExceptionInfo(sAXParseException));
          }
          public void error(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
              String message = "Error: " + getParseExceptionInfo(sAXParseException);
              throw new SAXException(message);
          }
          public void fatalError(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
              String message = "Fatal Error: " + getParseExceptionInfo(sAXParseException);
              throw new SAXException(message);
          }
      }

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :

      None
      (Review ID: 183616)
      ======================================================================
      ###@###.### 2004-07-13

        Activity

        rmandavasunw Ramesh Mandava (Inactive) added a comment -
        BT2:EVALUATION

        This is not a bug but a feature that we introduced to guard against denial-of-service attacks. Users can now set the "entityExpansionLimit" system property if they want to change the default limit, which is 64000.
        Users can also add this property to
        <JRE_HOME>/lib/jaxp.properties
        (for example, /home/x/jdk-1_4_2/jre/lib/jaxp.properties)
        along with the other factory information.

        We need to inform the user who filed this bug, and we also need to make this more visible in the documentation.


        ###@###.### 2003-04-16
        nbajajsunw Neeraj Bajaj (Inactive) added a comment -
        BT2:PUBLIC COMMENTS

        This is not a bug but a feature added to guard against denial-of-service attacks.
        Users can set the system property "entityExpansionLimit" to a value other than the default of 64000.
        Users can also add this property to
        <JRE_HOME>/lib/jaxp.properties
        (for example, /home/x/jdk-1_4_2/jre/lib/jaxp.properties)



        ###@###.### 2003-04-16

        Hello,

        To add to the information Ramesh has already given: this check was added to make
        the Java platform more secure. 64000 is considered a very large number of entity expansions for any real-life application to have in a single XML document. However, an application that does need a higher limit can always set the system property 'entityExpansionLimit'. This system property can be used as follows:

        java -DentityExpansionLimit=100000 <command>

        You can also add it to the jaxp.properties file.
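
        The same property can also be set from code before the parser is created. This is only a sketch, written against a current JDK rather than 1.4.2; the class name RaiseEntityLimit and the method parseWithLimit are hypothetical. It assumes the parser reads the "entityExpansionLimit" system property when the factory or parser is created, which is why the property is set first. Whether a given JDK release still honors this property name is also an assumption to verify.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RaiseEntityLimit {

    /** Sets the limit property, then parses `xml` with a newly created parser. */
    static Document parseWithLimit(String xml, int limit) {
        // Assumption: the property is read when the factory/parser is created,
        // so it is set before DocumentBuilderFactory.newInstance().
        System.setProperty("entityExpansionLimit", Integer.toString(limit));
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return builder.parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FirstNode><ChildNode>a &lt; b</ChildNode></FirstNode>";
        Document doc = parseWithLimit(xml, 100000);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

        Setting the property in code affects every parser created afterwards in the same JVM, so the command-line flag or jaxp.properties may be the better choice when the limit should apply to the whole application.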

        You can even set the limit to a number lower than 64000 if you think this limit is too large and could affect the performance of your application.

        It seems the Release Notes do not give the right details, which should be fixed. I will work with the documentation team to get this fixed.
        ###@###.### 2004-07-13
        defectconv Defect Conversion BT2 (Inactive) added a comment -
        BT2:WORK AROUND

        Set the system property "entityExpansionLimit", or set the "entityExpansionLimit" property in
        JRE_HOME/lib/jaxp.properties

          People

          • Assignee:
            nbajajsunw Neeraj Bajaj (Inactive)
            Reporter:
            gmanwanisunw Girish Manwani (Inactive)
          • Votes:
            0
            Watchers:
            0
