JDK-7156085

ArrayIndexOutOfBoundsException thrown in UTF8Reader of SAXParser

        Description

        The following exception is thrown by the SAXParser when parsing the XML file at
        http://download.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2

         
        Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8192
                        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:546)
                        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1750)
                        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.arrangeCapacity(XMLEntityScanner.java:1626)
                        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipString(XMLEntityScanner.java:1664)
                        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1707)
                        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2898)
                        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:607)
                        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:488)
                        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:835)
                        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
                        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:123)
                        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1210)
                        at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:568)
                        at org.xml.sax.helpers.ParserAdapter.parse(ParserAdapter.java:429)
                        at com.inet.jorthodictionaries.Parser.<init>(Parser.java:63)
                        at com.inet.jorthodictionaries.BookGenerator.start(BookGenerator.java:94)
                        at com.inet.jorthodictionaries.BookGenerator.main(BookGenerator.java:72)


        This problem occurs with both Java 6 and Java 7.


        The code looks like this:


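        // Raise the JAXP entity expansion limit for this very large document.
        System.setProperty("entityExpansionLimit", "100000000");
        InputSource input = new InputSource(stream);
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        // Wrap the SAX1 Parser in a SAX2 adapter so this class can act as the ContentHandler.
        ParserAdapter pa = new ParserAdapter(sp.getParser());
        pa.setContentHandler(this);
        // The ArrayIndexOutOfBoundsException is thrown from this call.
        pa.parse(input);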
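
For anyone trying to reproduce or sidestep this, the snippet above can be turned into a small self-contained sketch. The class name `SaxParseSketch` and its helper methods are hypothetical and do not come from the JOrtho sources. `fromBytes` mirrors the reported setup, where the parser decodes the byte stream itself (the `UTF8Reader` path in the stack trace). `fromReader` passes a character stream instead, so the JDK's own UTF-8 decoder runs first. Whether that actually avoids the exception on the affected Java 6 and Java 7 builds is an assumption and has not been confirmed in this report.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.ParserAdapter;

// Hypothetical helper class for illustration only; not part of the reported code.
class SaxParseSketch {

    /** Byte-stream input: the parser decodes the bytes itself, as in the report. */
    static InputSource fromBytes(InputStream stream) {
        return new InputSource(stream);
    }

    /** Character-stream input: decoding is done by InputStreamReader before parsing. */
    static InputSource fromReader(InputStream stream) {
        return new InputSource(new InputStreamReader(stream, StandardCharsets.UTF_8));
    }

    /** Parses the input with the same parser setup as the report; returns the element count. */
    static int countElements(InputSource input) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        // Same SAX1-to-SAX2 adapter construction as in the reported snippet.
        ParserAdapter pa = new ParserAdapter(sp.getParser());
        pa.setContentHandler(handler);
        pa.parse(input);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in document containing one supplementary character (4 bytes in UTF-8).
        byte[] xml = "<pages><page>\uD834\uDD1E</page><page>text</page></pages>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(countElements(fromBytes(new ByteArrayInputStream(xml))));
        System.out.println(countElements(fromReader(new ByteArrayInputStream(xml))));
    }
}
```

A document this small is not expected to trigger the exception. To test against the real data, replace the in-memory bytes with the decompressed Wiktionary dump.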


        The complete code can be found in the public repository at:

        http://jortho.svn.sourceforge.net/viewvc/jortho/trunk/JOrtho/src/com/inet/jorthodictionaries/Parser.java?revision=241&view=markup

                People

                • Assignee: martin Martin Buchholz
                • Reporter: tyao Ting-Yun Ingrid Yao (Inactive)