JDK-6609854

Regex does not match correctly for negative nested character classes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: 6
    • Fix Version/s: 9
    • Component/s: core-libs
    • Labels: None
    • Subcomponent:
    • Resolved In Build: b119
    • CPU: generic
    • OS: generic

      Description

      > >> I have been looking into the definition of [character set]
      > >> expressions in Java regular expressions, to understand what needs to
      > >> be done to make ICU be compatible, or more compatible at least.
      > >>
      > >> There does not appear to be any formal definition for [set
      > >> expressions], or at least not that I can find.
      > >>
      > >> Trying tests, one aspect of the behavior seems really odd. It would
      > >> be good if we could find out from Sun whether it was really intended
      > >> to work the way that it does.
      > >>
      > >> The question concerns the negation of a set,
      > >> [^0-9], to get everything except for the ASCII digits, for example.
      > >>
      > >> In Java, the negation does _not_ apply to anything appearing in
      > >> nested [brackets]
      > >>
      > >> So [^c] does not match "c", as you would expect.
      > >> [^[c]] does match "c". Not what I would expect.
      > >> [[^c]] does not match "c"
      > >>
      > >> The same holds true for ranges or property expressions - if they're
      > >> inside brackets, a negation at an outer level does not affect them.
      > >>
      > >> [^a-z] is the opposite of [^[a-z]]
      > >>
      > >> And the same seems to hold for set expressions with &&, although the
      > >> cases become hard to understand.
      > >>
      > >> Perl and POSIX behavior provides no guidance here, as neither
      > >> supports nested brackets at all - a '[' is not special within a
      > >> set, and simply becomes another member of the set.
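
The cases quoted in the description can be checked with a small program such as the sketch below. The class name `NestedNegationDemo` and the helper `matches` are hypothetical names introduced only for illustration; the only library call used is `java.util.regex.Pattern.matches`. The results for the nested-negation patterns (`[^[c]]`, `[^[a-z]]`) are expected to depend on the JDK in use, since this issue lists Fix Version 9, so the program prints them together with the runtime version rather than assuming a particular outcome.

```java
import java.util.regex.Pattern;

public class NestedNegationDemo {

    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Patterns taken from the report; each is tested against "c".
        String[] regexes = { "[^c]", "[^[c]]", "[[^c]]", "[^a-z]", "[^[a-z]]" };

        // The nested-negation results may differ before and after the fix,
        // so record which runtime produced them.
        System.out.println("java.version = " + System.getProperty("java.version"));

        for (String regex : regexes) {
            System.out.println(regex + " matches \"c\": " + matches(regex, "c"));
        }
    }
}
```

Running the same program on a release from before and after the fix should show whether an outer `^` applies to the nested brackets on that runtime.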

              People

              • Assignee: sherman Xueming Shen
              • Reporter: sherman Xueming Shen