JDK-8176371

(scanner) Scanner fails when string length equals buffer size and the last characters are the delimiter

    Details

    • Subcomponent:
    • CPU:
      generic
    • OS:
      generic

      Description

      FULL PRODUCT VERSION :


      ADDITIONAL OS VERSION INFORMATION :
      Microsoft Windows 8.x

      A DESCRIPTION OF THE PROBLEM :
      I have found strange behaviour in the java.util.Scanner class. I tried to split a String into a set of tokens separated by the delimiter ";" using a Scanner.

      For a string of the form "<any_char>[*1022]" + ";[*n]" (1022 arbitrary characters followed by n delimiters), I expect Scanner to return n tokens. However, when n=3, Scanner fails: it "sees" only 2 tokens instead of 3. I think this is related to the internal char buffer size of the Scanner class (1024 characters). I have observed the issue only when the last characters are exactly the delimiter set for the Scanner.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Generate a string composed of 2 parts:
      1- 1022 random characters (these may even include the delimiter)
      2- a trailing run of 3 characters, each exactly the delimiter set for the Scanner (in my case ";;;")

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      For a string of the form "a[*1022]" + ";[*n]" I expect n tokens. However, if n=3, Scanner fails: it "sees" only 2 tokens instead of 3. I think this is related to the internal char buffer size of the Scanner class.

      a[x1022]; -> 1 token

      a[x1022];; -> 2 tokens

      a[x1022];;; -> 3 tokens

      a[x1022];;;; -> 4 tokens
      ACTUAL -
      a[x1022]; -> 1 token: correct

      a[x1022];; -> 2 tokens: correct

      a[x1022];;; -> 2 tokens: wrong (I expect 3 tokens)

      a[x1022];;;; -> 4 tokens: correct
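
The cases listed above can be checked with a small harness. This is a minimal sketch and not part of the original report: the class name `ScannerBoundaryRepro` and the helpers `buildInput` and `countTokens` are hypothetical names introduced for illustration. It builds 1022 "a" characters followed by n delimiters and counts the tokens that Scanner yields, for n from 1 to 4. No expected output is shown, because the count for n=3 depends on whether the JDK in use is affected.

```java
import java.util.Scanner;

// Hypothetical harness; names are illustrative, not taken from the report.
final class ScannerBoundaryRepro {

    /** Builds prefixLength 'a' characters followed by n copies of the delimiter. */
    static String buildInput(int prefixLength, String delimiter, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < prefixLength; i++) {
            sb.append('a');
        }
        for (int i = 0; i < n; i++) {
            sb.append(delimiter);
        }
        return sb.toString();
    }

    /** Counts the tokens Scanner returns; useDelimiter treats the string as a regex. */
    static int countTokens(String input, String delimiterRegex) {
        int count = 0;
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 1022 matches the prefix length used in the report.
        for (int n = 1; n <= 4; n++) {
            String input = buildInput(1022, ";", n);
            System.out.println("n=" + n
                    + " length=" + input.length()
                    + " tokens=" + countTokens(input, ";"));
        }
    }
}
```

Varying the first argument of `buildInput` (for example 1021 or 1023) is one way to see whether the miscount is tied to the 1024-character boundary suggested in the description.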

      REPRODUCIBILITY :
      This bug can always be reproduced.

      ---------- BEGIN SOURCE ----------
      // A simple example that reproduces the problem.
      // The enclosing class (name chosen here) is added so the example compiles.

      import java.util.Scanner;

      public class ScannerTest {

          public static void main(String[] args) {

              // generate test string: (1022x "a") + (3x ";")
              StringBuilder builder = new StringBuilder();
              for (int i = 0; i < 1022; i++) {
                  builder.append("a");
              }
              builder.append(";;;");
              String testLine = builder.toString();

              // set up the Scanner variable
              String delimiter = ";";
              Scanner lineScanner = new Scanner(testLine);
              lineScanner.useDelimiter(delimiter);
              int p = 0;

              // tokenization: count and print each token
              while (lineScanner.hasNext()) {
                  p++;
                  String currentToken = lineScanner.next();
                  System.out.println("token" + p + ": '" + currentToken + "'");
              }
              lineScanner.close();
          }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Use the String.split method.
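
The report does not show the workaround code, so the following is only a sketch of one way it might look; the class name `SplitWorkaround` and the helper `splitTokens` are hypothetical. Two details of `String.split` matter here. The one-argument form discards trailing empty strings, so a negative limit is passed to keep them. With that limit, the text after the final delimiter counts as one more (empty) element, so the sketch drops it, on the assumption that the goal is to match the n tokens expected from Scanner for inputs like the ones in this report. It does not reproduce Scanner semantics in general (for example, for empty input or leading delimiters).

```java
import java.util.Arrays;

// Hypothetical workaround sketch; names are illustrative, not from the report.
final class SplitWorkaround {

    /**
     * Splits input on delimiterRegex, keeping empty tokens between delimiters
     * and dropping the single empty element that follows a trailing delimiter.
     */
    static String[] splitTokens(String input, String delimiterRegex) {
        // Negative limit: keep trailing empty strings.
        String[] parts = input.split(delimiterRegex, -1);
        if (parts.length > 1 && parts[parts.length - 1].isEmpty()) {
            // Assumption: the empty element after the last delimiter is not a token.
            return Arrays.copyOf(parts, parts.length - 1);
        }
        return parts;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1022; i++) {
            sb.append('a');
        }
        String testLine = sb.append(";;;").toString();

        System.out.println(testLine.split(";").length);           // prints 1
        System.out.println(testLine.split(";", -1).length);       // prints 4
        System.out.println(splitTokens(testLine, ";").length);    // prints 3
    }
}
```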

              People

              • Assignee:
                sherman Xueming Shen
                Reporter:
                webbuggrp Webbug Group