    Details

      Description

      The change JDK-8235812 in Java 15 introduced incorrect behavior for matching of the `\R` Unicode linebreak sequence when using the `java.util.regex.Pattern` API. The `\R` sequence should match CR (U+000D) or LF (U+000A) individually, but it should not match an individual CR if it occurs in a CRLF sequence. An example of the erroneous behavior is that the pattern `\R{2}` matches a CRLF sequence, but it should not. A possible workaround is to match linebreaks using individual characters instead of `\R`, using negative lookahead to prevent matching of an individual CR within a CRLF sequence. To do this, replace the `\R` sequence with the following:
      ```
          (?:(\u000D\u000A)|((?!\u000D\u000A)[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]))
      ```
      A simpler sequence can be used if matching all of the Unicode-specified linebreak characters is not required, or if special treatment for the CRLF sequence is not required.
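
      The workaround can be exercised with a small program. The following is a minimal sketch, not part of the original report: the class name `LinebreakWorkaround` and the helper `isTwoLinebreaks` are hypothetical names chosen for illustration. It applies the `{2}` quantifier from the example above to the replacement sequence and checks that a single CRLF is no longer counted as two linebreaks.

      ```java
      import java.util.regex.Pattern;

      // Illustrative sketch only; class and method names are assumptions.
      class LinebreakWorkaround {
          // Replacement for the linebreak matcher: CRLF as one unit, otherwise a
          // single linebreak character that does not begin a CRLF sequence.
          static final String LINEBREAK =
              "(?:(\\u000D\\u000A)|((?!\\u000D\\u000A)[\\u000A\\u000B\\u000C\\u000D\\u0085\\u2028\\u2029]))";

          // Returns true if the entire input consists of exactly two linebreaks.
          static boolean isTwoLinebreaks(String input) {
              return Pattern.matches(LINEBREAK + "{2}", input);
          }

          public static void main(String[] args) {
              System.out.println(isTwoLinebreaks("\r\n"));     // false: CRLF is one linebreak
              System.out.println(isTwoLinebreaks("\r\n\r\n")); // true: two CRLF sequences
              System.out.println(isTwoLinebreaks("\n\n"));     // true: two LF characters
              System.out.println(isTwoLinebreaks("\r\r"));     // true: two standalone CR characters
          }
      }
      ```

      The negative lookahead is what stops the second alternative from consuming the CR of a CRLF pair when the engine backtracks, so a lone CRLF cannot be split into two matches.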

            People

            Assignee:
            smarks Stuart Marks
            Reporter:
            smarks Stuart Marks