JDK-4833373

Charset.decode() returns incorrect character sequence with UTF-8

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P4
    • Resolution: Not an Issue
    • Affects Version/s: 5.0
    • Fix Version/s: None
    • Component/s: core-libs
    • Subcomponent:
    • CPU: sparc
    • OS: solaris_8

      Description

      Name: auR10023 Date: 03/17/2003



      java.nio.charset.Charset.decode(ByteBuffer bb) returns an incorrect character
      sequence for the unmappable character 0x00010000.

      Here is the example:

      -------test.java---------

      import java.io.*;
      import java.nio.*;
      import java.nio.charset.*;

      public class test {
          public static void main (String [] args) {
              byte bArray [] = new byte [] {
                  // UTF-8 representation of 0x10000
                  (byte)0xf0, (byte)0x90, (byte)0x80, (byte)0x80
              };

              try {
                  Charset c = Charset.forName("UTF-8");
                  ByteBuffer bbuf = ByteBuffer.allocate(bArray.length);
                  bbuf.put(bArray);
                  bbuf.position(0);
                  CharBuffer res = c.decode(bbuf);
       
                  if (res.length() != 1) {
                      System.out.println("Incorrect character sequence");
                      for (int j = 0; j < res.length(); j++) {
                          System.out.println((int)res.get(j));
                      }
                      return;
                  }
                  System.out.println("OKAY");
              } catch(IllegalCharsetNameException e) {
                  System.out.println("Unexpected " + e);
              } catch (UnsupportedCharsetException e) {
                  System.out.println("Unexpected " + e);
              }
          }
      }

      Here is the result:
      #java -version

      java version "1.4.2-beta"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b16)
      Java HotSpot(TM) Client VM (build 1.4.2-beta-b16, mixed mode)

      #java test

      Incorrect character sequence
      55296
      56320
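
The two values printed above, 55296 and 56320, are 0xD800 and 0xDC00 in hexadecimal, which together form the UTF-16 surrogate pair for U+10000. A Java char is a 16-bit UTF-16 code unit, so a decoded length of 2 is what a correct decode of this input would produce, and that would be consistent with the "Not an Issue" resolution. The following is a minimal sketch for checking this, not part of the original report. It assumes Java 7 or later (for StandardCharsets; the code point methods on Character were added in 5.0), and the class name SurrogatePairDemo and its helper methods are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative class, not from the original report.
final class SurrogatePairDemo {

    /** Decodes the given bytes as UTF-8 using the standard charset. */
    static CharBuffer decodeUtf8(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    }

    /** Counts Unicode code points rather than UTF-16 code units. */
    static int codePointCount(CharSequence s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        // Same four bytes as in the report: UTF-8 encoding of 0x10000.
        byte[] utf8 = { (byte) 0xf0, (byte) 0x90, (byte) 0x80, (byte) 0x80 };
        CharBuffer res = decodeUtf8(utf8);

        char high = res.charAt(0);
        char low = res.charAt(1);

        // length() counts 16-bit chars, so a supplementary character counts twice.
        System.out.println("chars       = " + res.length());
        System.out.println("code points = " + codePointCount(res));
        System.out.println("is pair     = " + Character.isSurrogatePair(high, low));
        System.out.println("code point  = 0x"
                + Integer.toHexString(Character.toCodePoint(high, low)));
    }
}
```

A test that wants "one character" in the Unicode sense could compare the code point count, rather than res.length(), against 1.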

      ======================================================================

            People

            Assignee:
            ilittlesunw Ian Little (Inactive)
            Reporter:
            avusunw Avu Avu (Inactive)