JDK-8278587

StringTokenizer(String, String, boolean) documentation bug

    Details

    • Subcomponent:
    • Resolved In Build:
      b03
    • CPU:
      generic
    • OS:
      generic

        Description

        A DESCRIPTION OF THE PROBLEM :
        https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/StringTokenizer.html#%3Cinit%3E(java.lang.String,java.lang.String,boolean) says: "Each delimiter is returned as a string of length one." This is not correct when a delimiter is a supplementary character encoded as a valid Unicode surrogate pair: the returned string then has length two, because the delimiter is represented by two code units.

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        "Each delimiter is returned as a string of the code unit(s) of the delimiter."

        Or remove "Each delimiter is returned as a string of length one." and clarify that "characters" in the StringTokenizer documentation refers to Unicode code points, as in other documentation, e.g., that of String: "The String class provides methods for dealing with Unicode code points (i.e., characters), in addition to those for dealing with Unicode code units (i.e., char values)." - https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html.
        ACTUAL -
        "Each delimiter is returned as a string of length one."

        ---------- BEGIN SOURCE ----------
        import java.util.StringTokenizer;

        public class StringTokenizerPlayground {

          public static void main(String[] args) {
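            final var s = "\uD83D\uDE00"; // Grinning Face (U+1F600), one code point, two code units
            // Use the same string as input and delimiter, and ask for delimiters to be returned.
            final var tokenizer = new StringTokenizer(s, s, true);

            final var tokenCount = tokenizer.countTokens();

            // The whole surrogate pair is treated as a single delimiter token.
            if (tokenCount != 1) {
              throw new AssertionError();
            }

            final var token = tokenizer.nextToken();

            // The returned delimiter has length two, not one as documented.
            if (token.length() != 2) {
              throw new AssertionError();
            }

            if (!token.equals(s)) {
              throw new AssertionError();
            }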
          }
        }
        ---------- END SOURCE ----------
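
A further sketch, not part of the original report, that mixes an ordinary delimiter with a surrogate-pair delimiter may make the difference easier to see. The class name DelimiterLengthSketch, the helper tokens, and the sample input are hypothetical and chosen only for illustration; the sketch assumes a JDK whose StringTokenizer handles supplementary delimiters the way the report describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterLengthSketch {

  // Hypothetical helper: collects every token, delimiters included.
  public static List<String> tokens(String input, String delims) {
    final var tokenizer = new StringTokenizer(input, delims, true);
    final var result = new ArrayList<String>();
    while (tokenizer.hasMoreTokens()) {
      result.add(tokenizer.nextToken());
    }
    return result;
  }

  public static void main(String[] args) {
    final var grinningFace = "\uD83D\uDE00"; // U+1F600, a surrogate pair
    final var input = "a," + grinningFace + "b";
    final var delims = "," + grinningFace;

    // Print the char length and the code point count of each token.
    for (final var token : tokens(input, delims)) {
      System.out.println(
          "length=" + token.length()
              + " codePoints=" + token.codePointCount(0, token.length()));
    }
  }
}
```

Under that assumption, the comma delimiter comes back with length one, while the Grinning Face delimiter comes back with length two but a code point count of one, which is the case the documented sentence does not cover.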

        FREQUENCY : always


                People

                Assignee:
                naoto Naoto Sato
                Reporter:
                webbuggrp Webbug Group
                Votes:
                0
                Watchers:
                5
