JDK-6727466

java.exe/JRE1.6.0_10-b25/b27 doesn't seem to handle international characters

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P4
    • Resolution: Duplicate
    • Affects Version/s: 6u10
    • Fix Version/s: None
    • Component/s: tools
    • Labels:
    • Subcomponent:
    • CPU:
      x86
    • OS:
      windows_xp

      Description

      FULL PRODUCT VERSION :
      java version "1.6.0_10-rc"
      Java(TM) SE Runtime Environment (build 1.6.0_10-rc-b27)
      Java HotSpot(TM) 64-Bit Server VM (build 11.0-b14, mixed mode)

      ADDITIONAL OS VERSION INFORMATION :
      Microsoft Windows [Version 5.2.3790]
      (Windows XP x64 Edition, all service packs and updates applied)

      Microsoft Windows XP [Version 5.1.2600]


      EXTRA RELEVANT SYSTEM CONFIGURATION :
      Intel Q6600 CPU

      A DESCRIPTION OF THE PROBLEM :
      When I pass java.exe or javaw.exe in JDK/JRE 1.6.0_10-b25 or b27 an argument containing international characters, such as Chinese, the characters show up as question marks in the arguments passed to my program's main function.
      I have written a small program to demonstrate this, which can be downloaded from http://hem.bredband.net/unsound/temp/TestInternationalArgs.java .

      For instance, when I execute this command in the Windows "Run" dialog (Windows-R on the keyboard)

      java -cp "C:\Temp" TestInternationalArgs "C:\Temp\<some international character>\file.txt"

      (assuming TestInternationalArgs.class is present in C:\Temp)
      I get the following output in my demonstration program:

      -----
      Arguments passed to program:
        args[0]: "C:\Temp\??\file.txt"

      Argument strings as hexadecimal representations of UTF-16 values:
        args[0]: 'C' (0x43) ':' (0x3a) '\' (0x5c) 'T' (0x54) 'e' (0x65) 'm' (0x6d) 'p' (0x70) '\' (0x5c) '?' (0x3f) '?' (0x3f) '\' (0x5c) 'f' (0x66) 'i' (0x69) 'l' (0x6c) 'e' (0x65) '.' (0x2e) 't' (0x74) 'x' (0x78) 't' (0x74)
      -----

      The limitation seems to be within the java.exe executable. Creating a Java VM manually with JNI and invoking the main class with arguments containing international characters works fine (obviously, since you have a lot more control that way).
      I'm running Windows XP x64 when testing this, and tested both the x64 version and the x86 version of JDK 1.6.0_10-b25 (they behaved the same way). I also tested on another computer with regular 32-bit Windows XP installed and with b25, and the result was the same.

      According to the OpenJDK source code (openjdk-6-src-b10_30_may_2008), the functions main and JavaMain in java.c treat argv as multibyte character strings encoded in the system ANSI code page (usually Cp1252 on Western European systems). This is not the way to do it on Windows nowadays. Instead of using argc and argv, you get the command line with GetCommandLineW ( http://msdn.microsoft.com/en-us/library/ms683156(VS.85).aspx ) and convert it into argv-style wchar_t strings using CommandLineToArgvW ( http://msdn.microsoft.com/en-us/library/bb776391.aspx ) in order to get a proper Unicode (UTF-16) argument array. This is how I do it in my custom launcher.
      These functions are present in Windows 2000 and later. I'm not sure whether you currently support NT4 / 9x; if you do, you would need some conditional function calls.
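
The loss described above can be imitated in plain Java. This is only a sketch of the mechanism, not the launcher's actual code: it assumes the argument is squeezed through a single-byte code page (windows-1252 here, with ISO-8859-1 as a fallback if that charset is not installed), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustrates how a round trip through a single-byte code page loses characters. */
public class LossyArgDemo {

    /** Code page to simulate; falls back to ISO-8859-1 if windows-1252 is unavailable. */
    static Charset legacyCharset() {
        return Charset.isSupported("windows-1252")
                ? Charset.forName("windows-1252")
                : StandardCharsets.ISO_8859_1;
    }

    /** Encodes the argument into the given charset and decodes it again. */
    static String roundTrip(String arg, Charset cs) {
        // String.getBytes(Charset) substitutes the charset's replacement byte
        // (0x3f, '?', for these single-byte charsets) for unmappable characters.
        byte[] narrow = arg.getBytes(cs);
        return new String(narrow, cs);
    }

    public static void main(String[] args) {
        // Two CJK characters stand in for "some international character".
        String original = "C:\\Temp\\" + "\u4e2d\u6587" + "\\file.txt";
        String mangled = roundTrip(original, legacyCharset());
        System.out.println(mangled); // prints C:\Temp\??\file.txt
    }
}
```

Each character that the code page cannot represent comes back as 0x3f, which matches the two '?' (0x3f) entries in the hexadecimal dump above.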

      Either way, may I suggest that you add a function such as GetCommandLineAsUTF8(int argc, char** argv) in java_md.h and implement this for the different platforms? This is a very machine-dependent operation, so I think it belongs there.
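
To show why UTF-8 would be a safe interchange format for such a function, here is a small Java sketch. It is not the proposed C function: toUtf8 and fromUtf8 are hypothetical stand-ins that only demonstrate that UTF-16 arguments survive a round trip through UTF-8 bytes unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Shows that UTF-16 argument strings survive a round trip through UTF-8 bytes. */
public class Utf8ArgsDemo {

    /** Hypothetical stand-in for the proposed function: one UTF-8 byte array per argument. */
    static byte[][] toUtf8(String[] args) {
        byte[][] raw = new byte[args.length][];
        for (int i = 0; i < args.length; i++) {
            raw[i] = args[i].getBytes(StandardCharsets.UTF_8);
        }
        return raw;
    }

    /** Decodes the UTF-8 byte arrays back into Java (UTF-16) strings. */
    static String[] fromUtf8(byte[][] raw) {
        String[] args = new String[raw.length];
        for (int i = 0; i < raw.length; i++) {
            args[i] = new String(raw[i], StandardCharsets.UTF_8);
        }
        return args;
    }

    public static void main(String[] unused) {
        String[] original = { "C:\\Temp\\" + "\u4e2d\u6587" + "\\file.txt" };
        String[] restored = fromUtf8(toUtf8(original));
        // UTF-8 can encode every Unicode code point, so nothing is replaced by '?'.
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```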



      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import javax.swing.*;

      /** Small program which pops up a JTextArea showing the argument list passed to main in detail. */
      public class TestInternationalArgs {
          public static void main(final String[] args) {
              // Build a scrollable text area inside a centered frame.
              JTextArea jta = new JTextArea(50, 80);
              jta.setLineWrap(true);
              JScrollPane jtaScroller = new JScrollPane(jta);
              JFrame jf = new JFrame("TestInternationalArgs");
              jf.add(jtaScroller);
              jf.pack();
              jf.setLocationRelativeTo(null);
              jf.setVisible(true);
              jf.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

              /* Print args to the text area to ensure international characters
               * are displayed correctly. */
              jta.append("Arguments passed to program:\n");
              for (int i = 0; i < args.length; ++i) {
                  String cur = args[i];
                  jta.append(" args[" + i + "]: \"" + cur + "\"\n");
              }
              jta.append("\n");

              // Dump every UTF-16 code unit of each argument as a hexadecimal value.
              jta.append("Argument strings as hexadecimal representations of UTF-16 values:\n");
              for (int i = 0; i < args.length; ++i) {
                  char[] cur = args[i].toCharArray();
                  jta.append(" args[" + i + "]:");
                  for (char c : cur) {
                      jta.append(" '" + c + "' (0x" + Integer.toHexString(c) + ")");
                  }
                  jta.append("\n");
              }
          }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Write your own java launcher which looks up jvm.dll, creates the VM, finds the main class and executes it.

              People

              • Assignee:
                ksrini Kumar Srinivasan
                Reporter:
                ryeung Roger Yeung (Inactive)