JDK-6484409

Use string edit distance to provide better error diagnostics

    Details

    • Type: Enhancement
    • Status: Closed
    • Priority: P3
    • Resolution: Future Project
    • Affects Version/s: 6
    • Fix Version/s: 5.0
    • Component/s: xml

      Description

      One of the most common validity errors in a document is a typo, so a validator should pay particular attention to that kind of error and try to point it out.

      I quote one example below, which is what one of the JAXB users hit:

      ------------------
      [xjc] [ERROR] cvc-complex-type.2.4.a: Invalid content was found starting with element 'xjc:javaType'. One of '{"http://java.sun.com/xml/ns/jaxb":javaType, "http://java.sun.com/xml/ns/jaxb":serializable, "http://java.sun.com/xml/ns/jaxb/xjc":serializable, "http://java.sun.com/xml/ns/jaxb/xjc":superClass, "http://java.sun.com/xml/ns/jaxb/xjc":superInterface, "http://java.sun.com/xml/ns/jaxb/xjc":typeSubstitution, "http://java.sun.com/xml/ns/jaxb/xjc":smartWildcardDefaultBinding, "http://java.sun.com/xml/ns/jaxb/xjc":simple, "http://java.sun.com/xml/ns/jaxb/xjc":javaType, "http://java.sun.com/xml/ns/jaxb/xjc":generateElementProperty, "http://java.sun.com/xml/ns/jaxb/xjc":noMarshaller, "http://java.sun.com/xml/ns/jaxb/xjc":noUnmarshaller, "http://java.sun.com/xml/ns/jaxb/xjc":noValidator, "http://java.sun.com/xml/ns/jaxb/xjc":noValidatingUnmarshaller}' is expected.
      [xjc] line 6 of file:/C:/Projects/ASTProto/ASTSupplemental/binding-extensions.xml
      ------------------

      First, the error message should have mentioned the expanded name of the offending element rather than just a QName, since people often misunderstand XML namespaces. In this case the prefix xjc in the instance expanded to "http://java.sun.com/xml/ns/jaxb/xjc/", with a trailing slash that none of the expected names have.

      Then the validator should use string edit distance to check whether the offending name is close to one of the allowed names. If it is particularly close to one of them, the validator should add something like 'Perhaps you meant "http://java.sun.com/xml/ns/jaxb/xjc":javaType?' The error message should also use newline characters so that the offending name and the suggested name can easily be compared character by character. For example,

      ------------------
      [xjc] [ERROR] cvc-complex-type.2.4.a: Invalid content was found starting with element
      "http://java.sun.com/xml/ns/jaxb/xjc/":javaType. Perhaps you meant
      "http://java.sun.com/xml/ns/jaxb/xjc":javaType? One of '{"http://java.sun.com/xml/ns/jaxb":javaType, "http://java.sun.com/xml/ns/jaxb":serializable, "http://java.sun.com/xml/ns/jaxb/xjc":serializable, "http://java.sun.com/xml/ns/jaxb/xjc":superClass, "http://java.sun.com/xml/ns/jaxb/xjc":superInterface, "http://java.sun.com/xml/ns/jaxb/xjc":typeSubstitution, "http://java.sun.com/xml/ns/jaxb/xjc":smartWildcardDefaultBinding, "http://java.sun.com/xml/ns/jaxb/xjc":simple, "http://java.sun.com/xml/ns/jaxb/xjc":javaType, "http://java.sun.com/xml/ns/jaxb/xjc":generateElementProperty, "http://java.sun.com/xml/ns/jaxb/xjc":noMarshaller, "http://java.sun.com/xml/ns/jaxb/xjc":noUnmarshaller, "http://java.sun.com/xml/ns/jaxb/xjc":noValidator, "http://java.sun.com/xml/ns/jaxb/xjc":noValidatingUnmarshaller}' is expected.
      ------------------

      The JAXB RI already implements and uses a string edit distance algorithm, so feel free to take the code from http://fisheye5.cenqua.com/browse/jaxb2-sources/jaxb-ri/runtime/src/com/sun/xml/bind/v2/util/EditDistance.java?r=1.1
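
      To make the proposal concrete, here is a minimal sketch of the suggestion step. It is not the JAXB RI code linked above; the class name NameSuggester, the method names, and the distance threshold of 2 are all assumptions chosen for illustration. It computes the plain Levenshtein distance between the offending expanded name and each allowed name and proposes the closest one only if it is within the threshold.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the JDK or the JAXB RI.
final class NameSuggester {

    /** Levenshtein distance (insert, delete, substitute), using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int substitute = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int delete = prev[j] + 1;
                int insert = cur[j - 1] + 1;
                cur[j] = Math.min(substitute, Math.min(delete, insert));
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the allowed name closest to the offending one, but only when it
     * is within maxDistance edits (an assumed cut-off for "particularly close").
     */
    static Optional<String> suggest(String offending, List<String> allowed, int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : allowed) {
            int d = editDistance(offending, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return bestDistance <= maxDistance ? Optional.ofNullable(best) : Optional.empty();
    }

    public static void main(String[] args) {
        // Expanded names written in the same "namespace":local form as the error message.
        String offending = "\"http://java.sun.com/xml/ns/jaxb/xjc/\":javaType";
        List<String> allowed = List.of(
                "\"http://java.sun.com/xml/ns/jaxb\":javaType",
                "\"http://java.sun.com/xml/ns/jaxb/xjc\":javaType",
                "\"http://java.sun.com/xml/ns/jaxb/xjc\":superClass");
        suggest(offending, allowed, 2).ifPresent(name ->
                System.out.println("Invalid content was found starting with element\n"
                        + offending + ". Perhaps you meant\n" + name + "?"));
    }
}
```

      Comparing full expanded names, rather than local names alone, is what lets the sketch catch a typo in the namespace URI such as the trailing slash in the example above. A real implementation would probably want a threshold that scales with the name length.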

            People

            • Assignee:
              joehw Joe Wang
              Reporter:
              kkawagucsunw Kohsuke Kawaguchi (Inactive)