JDK-4086919

line terminators represented with Unicode-escapes


    Details

    • Subcomponent:
    • Resolved In Build:
      1.2beta2
    • CPU:
      generic, sparc
    • OS:
      solaris_2.5.1
    • Verification:
      Verified

      Description



      Name: laC46010 Date: 10/17/97


      The Java language permits Unicode escapes to represent
      various source characters, including line terminators.
      This means, for example, that \u000D may be used to terminate
      a single-line comment:

      // text text \u000D int i=1;

      However, the Java compiler does not recognize \u000D (CR represented
      as a Unicode escape) as a line terminator.
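
The expected behavior can be checked with a small program. This is only a sketch: the class name CommentTerminator and the method value() are hypothetical and are not part of the JCK tests listed in this report. If the compiler translates the escape to a carriage return before it recognizes comments, the comment ends at the escape and the assignment after it is compiled as code.

```java
public class CommentTerminator {
    static int value() {
        int i = 0;
        // The escape on the next line should end the comment, so "i = 1;" is code.
        // text text \u000D i = 1;
        return i;
    }

    public static void main(String[] args) {
        // Prints 1 when the escape is treated as a line terminator.
        System.out.println(value());
    }
}
```

A compiler that instead left the rest of the line inside the comment would make this program print 0.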

      The JLS specifies that Unicode escapes are processed before anything else
      (3.2 Lexical Translations, p. 12):

      A raw Unicode character stream is translated into a sequence of
      Java tokens, using the following three lexical translation
      steps, which are applied in turn:

      1. A translation of Unicode escapes in the raw stream of
      Unicode characters to the corresponding Unicode character. A
      Unicode escape of the form \uxxxx, where xxxx is a hexadecimal
      value, represents the Unicode character whose encoding is
      xxxx. This translation step allows any Java program to be
      expressed using only ASCII characters.

      2. A translation of the Unicode stream resulting from step 1
      into a stream of input characters and line terminators.

      3. A translation of the stream of input characters and line
      terminators resulting from step 2 into a sequence of Java
      input elements which, after white space and comments are
      discarded, comprise the tokens that are the terminal symbols
      of the syntactic grammar.
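
The ordering of steps 1 and 2 above can be modeled with a short sketch. It is a simplified illustration, not the compiler's implementation: the class LexicalSteps and its methods are hypothetical names, and the escape handling covers only the common cases (an even number of preceding backslashes, one or more u characters, four hex digits).

```java
import java.util.ArrayList;
import java.util.List;

public class LexicalSteps {

    // Step 1 (simplified): replace each eligible Unicode escape with its character.
    static String translateEscapes(String raw) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0; // contiguous backslashes immediately before position i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            boolean startsEscape = c == '\\' && backslashes % 2 == 0
                    && i + 1 < raw.length() && raw.charAt(i + 1) == 'u';
            if (startsEscape) {
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++; // any number of 'u' characters may follow the backslash
                }
                if (j + 4 > raw.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int digit = Character.digit(raw.charAt(k), 16);
                    if (digit < 0) {
                        throw new IllegalArgumentException("bad hex digit at index " + k);
                    }
                    value = value * 16 + digit;
                }
                out.append((char) value);
                i = j + 4;
                backslashes = 0;
                continue;
            }
            backslashes = (c == '\\') ? backslashes + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // Step 2 (simplified): split on the line terminators CR, LF, and CR LF.
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' || c == '\n') {
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++; // CR LF counts as a single terminator
                }
                lines.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        lines.add(current.toString());
        return lines;
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as six literal characters in this string.
        String raw = "// text text \\u000D int i=1;";
        List<String> lines = splitLines(translateEscapes(raw));
        System.out.println(lines.size()); // prints 2
        System.out.println(lines.get(1)); // prints " int i=1;"
    }
}
```

Because the escape is translated in step 1, step 2 sees a real carriage return and ends the line there, which is why the text after the escape should no longer belong to the comment.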

      Note that a similar bug (bug ID 4063147), concerning line terminators
      represented with Unicode escapes within string literals, was fixed in jdk1.2beta1.

      The following JCK12beta1 tests fail because of this bug:

      lang/LEX/lex005/lex00591/lex00591.html
      lang/LEX/lex054/lex05402/lex05402.html
      lang/LEX/lex054/lex05403/lex05403.html
      lang/LEX/lex058/lex05891/lex05891.html

      See "lex00503" source and results below:

      > /export/ld14/java/dest/jdk1.2P/solaris/bin/java -fullversion
      java full version "JDK1.2P"
      > /export/ld14/java/dest/jdk1.2P/solaris/bin/javac -d . lex00503.java
      lex00503.java:17: Invalid character in input.
      int a; \u000D
      ^
      1 error
      ----------------------lex00503.java----------------------
      // Ident: %Z%%M% %I% %E%
      // Copyright %G% Sun Microsystems, Inc. All Rights Reserved
      // Auto-generated with Jmpp
      // Template Ident: @(#)lex00591.jmpp 1.1 97/10/10

      package javasoft.sqe.tests.lang.lex005.lex00503;

      import java.io.PrintStream;
        
      public class lex00503 {
          public static void main(String argv[]) {
              System.exit(run(argv, System.out) + 95/*STATUS_TEMP*/);
          }

          public static int run(String argv[],PrintStream out) {
              /*--- Line terminator `carriage return` as Unicode-escape. ---*/
              int a; \u000D
              return 0/*STATUS_PASSED*/;
          }
      }
      -----------------------------------------------------------

      ======================================================================

              People

              Assignee:
              dstoutamsunw David Stoutamire (Inactive)
              Reporter:
              leosunw Leo Leo (Inactive)