Details

    • Type: Enhancement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 1.0
    • Fix Version/s: 5.0
    • Component/s: xml
    • Labels:

      Description

      The following program:

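          import javax.xml.stream.XMLOutputFactory;
          import javax.xml.stream.XMLStreamWriter;

          public void test1() throws Exception {
              XMLOutputFactory xof = XMLOutputFactory.newInstance();
              // Let the writer declare namespaces (and invent prefixes) on its own.
              xof.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, true);
              XMLStreamWriter w = xof.createXMLStreamWriter(System.out);
              w.writeStartDocument();
              // Namespace URI "foo", local name "root", no prefix supplied.
              w.writeStartElement("foo", "root");
              w.writeEndElement();
              w.writeEndDocument();
              w.flush();
          }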

      produces an output like this:

        <?xml version="1.0" ?><zdef1485862575:root xmlns:zdef1485862575="foo"></zdef1485862575:root>

      ... which is rather ugly. Looking at the code, I see that you have the following code in XMLStreamWriterImpl:

                  if(tmpPrefix == null ){
                      StringBuffer genPrefix = new StringBuffer("zdef");
                      for(int i=0; i<1 ; i++)
                          genPrefix.append(fPrefixGen.nextInt());
                      prefix = genPrefix.toString();
                      prefix = fSymbolTable.addSymbol(prefix);
                  }else{

      ... where fPrefixGen is a Random object. Besides the obviously redundant for-loop,
      why don't we just use a sequence generator that simply counts up?

      1. A Random object is rather expensive to allocate, because it needs to obtain a
         seed value every time. A sequence counter is very cheap to allocate (just
         counter=0). Even worse, you pay this cost even if you aren't repairing namespaces.
      2. Computing a random value is expensive. If you look at Random.nextInt(),
         you'll see the use of multiplications, a native method call, etc. Computing a
         sequence value from a counter is just "counter++".
      3. Random doesn't guarantee that prefixes are unique
         (although the chance of a collision is certainly small, given that you've got 32 bits).
         A sequence counter is guaranteed to produce a unique prefix within the same document.
      4. The document gets larger than necessary.
      5. The same program produces different output every time it runs.
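
      To make the suggestion concrete, here is a minimal sketch of a counter-based prefix
      generator. It is not the actual XMLStreamWriterImpl code: the class name
      SequentialPrefixGenerator, the method next(), and the use of a plain Set to stand in
      for the prefixes already bound in scope are all assumptions made for illustration.
      The "zdef" stem is kept from the existing code.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of XMLStreamWriterImpl. */
public class SequentialPrefixGenerator {
    // Allocating the generator costs one int field, nothing more.
    private int counter = 0;

    /**
     * Returns the next generated prefix ("zdef0", "zdef1", ...) that is not
     * already in use. The inUse set is an assumed stand-in for whatever
     * namespace context the writer consults.
     */
    public String next(Set<String> inUse) {
        String prefix;
        do {
            prefix = "zdef" + counter++;
        } while (inUse.contains(prefix)); // skip prefixes the application bound itself
        return prefix;
    }

    public static void main(String[] args) {
        SequentialPrefixGenerator gen = new SequentialPrefixGenerator();
        Set<String> inUse = new HashSet<>();
        inUse.add("zdef1"); // pretend the application already declared this prefix
        for (int i = 0; i < 3; i++) {
            String p = gen.next(inUse);
            inUse.add(p);
            System.out.println(p);
        }
    }
}
```

      With one generator per writer, the same program yields the same prefixes on every
      run, and the prefixes stay short. The in-use check is only there in case the
      application has declared a prefix of the same shape itself.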

      I could very well be missing some reasons why a random number is preferable, in which
      case I'd be happy to be corrected.

            People

            • Assignee:
              joehw Joe Wang
              Reporter:
              kkawagucsunw Kohsuke Kawaguchi (Inactive)