JDK-8005947

Windows builds in Chinese environment create generated files with Chinese characters

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P4
    • Resolution: Duplicate
    • Affects Version/s: 8
    • Fix Version/s: 9
    • Component/s: infrastructure
    • Labels:

      Description

      It appears that somewhere in the jdk8 builds, the system encoding is leaking into the build process. This makes builds unpredictable.

      -kto

      From Frank Ding:


      On Jan 7, 2013, at 9:59 PM, Frank Ding wrote:

      Hi Kelly,
       I have filed a bug whose internal review id is 2421470. It was filed with "Product/Category" being "JDK/JRE" and "Subcategory" being "Problems common to more than one tool". Since it only happens when building OpenJDK, I am wondering whether it is eligible as a Java bug.
       In addition, do you have any idea on how to force java programs such as idlj to use ascii?

      Best regards,
      Frank

      On 1/8/2013 3:42 AM, Kelly O'Hair wrote:
      Did a bug report get filed for this issue?

      -kto

      On Jan 4, 2013, at 9:37 PM, Frank Ding wrote:

      Hi Volker,
       Yes, I think so. The comment is pasted below.
      /**
      * org/omg/PortableServer/Current.java .
      * 由IDL-to-Java 编译器 (可移植), 版本 "3.2"生成
      * 从../../../../src/share/classes/org/omg/PortableServer/poa.idl
      * 2013年1月4日 星期五 下午01时21分01秒 CST
      */

      It's in Chinese; translated to English, it says "Generated by IDL-to-Java compiler (portable), version 3.2", followed by the source path and the date. You can also view a "normal" English one under the openjdk folder "corba\gensrc\" after performing a build.
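
The timestamp line in that header is the kind of text that changes with the default locale. The sketch below is not idlj's actual code; it only illustrates, with a hypothetical class `HeaderDateDemo` and an assumed date pattern, how the same instant formats differently depending on the `Locale` passed in, and how pinning the locale keeps the output stable.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class HeaderDateDemo {
    /** Formats the day of the given instant in UTC using the given locale. */
    static String formatDay(Date when, Locale locale) {
        // The pattern is an assumption for illustration, not the one idlj uses.
        SimpleDateFormat fmt = new SimpleDateFormat("EEEE, MMMM d, yyyy", locale);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(when);
    }

    /** Noon UTC on the date shown in the generated comment (4 January 2013). */
    static Date sampleDate() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.clear();
        cal.set(2013, Calendar.JANUARY, 4, 12, 0, 0);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date when = sampleDate();
        // Depends on the machine the build runs on.
        System.out.println(formatDay(when, Locale.getDefault()));
        // Same on every machine.
        System.out.println(formatDay(when, Locale.ENGLISH));
    }
}
```

A generator that always passes an explicit locale, as in the second call, would produce the same header text on every build machine.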

      A related question: in my env (export LANG=C), running java without any parameters always gives me the Chinese help text. Even if we find out which env var or value the JVM reads on startup, it may be impossible to change it. Any insight or idea on the mechanism is welcome.

      Best regards,
      Frank
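
One way to see what the JVM picked up on startup is to print its locale-related defaults. The sketch below uses a hypothetical class name, `LocaleProbe`. It rests on the assumption that on Windows the JVM derives its default locale from the operating system's regional settings rather than from POSIX variables such as LANG, which would explain why `export LANG=C` in cygwin has no effect.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class LocaleProbe {
    /** Describes the locale-related defaults the running JVM has picked up. */
    static String describe() {
        return "user.language=" + System.getProperty("user.language")
            + " user.country=" + System.getProperty("user.country")
            + " file.encoding=" + System.getProperty("file.encoding")
            + " defaultLocale=" + Locale.getDefault()
            + " defaultCharset=" + Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it as `java -Duser.language=en -Duser.country=US LocaleProbe` should report an English default locale. Since the idlj command line in this thread already forwards JVM flags with a `-J` prefix, adding `-J-Duser.language=en -J-Duser.country=US` is a plausible thing to try. Whether idlj's generated comments then come out in English is not confirmed anywhere in this thread.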

      On 1/4/2013 4:59 PM, Volker Simonis wrote:
      This is just a wild guess, but perhaps idlj uses the value of some environment variables (or values derived from them; check System.getProperties()) which contain non-ASCII characters? This could be something like PATH, HOSTNAME, USER. What exact characters are there in the comment and what kind of comment is it? How does this comment look on a "normal" system?

      Regards,
      Volker
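
The check suggested above can be sketched as a small program that lists every system property and environment variable whose name or value contains a character outside 7-bit ASCII. The class and method names (`NonAsciiSettings`, `hasNonAscii`, `nonAsciiEntries`) are made up for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class NonAsciiSettings {
    /** True if the string contains any character outside 7-bit ASCII. */
    static boolean hasNonAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return true;
            }
        }
        return false;
    }

    /** Returns the entries whose key or value contains a non-ASCII character, sorted by key. */
    static Map<String, String> nonAsciiEntries(Map<?, ?> source) {
        Map<String, String> hits = new TreeMap<>();
        for (Map.Entry<?, ?> e : source.entrySet()) {
            String key = String.valueOf(e.getKey());
            String value = String.valueOf(e.getValue());
            if (hasNonAscii(key) || hasNonAscii(value)) {
                hits.put(key, value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("properties:  " + nonAsciiEntries(System.getProperties()));
        System.out.println("environment: " + nonAsciiEntries(System.getenv()));
    }
}
```

An empty result for both maps would point away from environment values and toward the JVM's default locale as the source of the Chinese text.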


      On Fri, Jan 4, 2013 at 6:29 AM, Frank Ding <dingxmin@linux.vnet.ibm.com> wrote:

         Hi Kelly
           I investigated how locale-specific characters get into generated
         sources in the corba module. Those classes are generated by the
         following idlj command:
         c:/openjdk/dep/jdk1.7.0_02/bin/idlj -J-XX:-PrintVMOptions
         -J-XX:+UnlockDiagnosticVMOptions -J-XX:-LogVMOutput -J-Xmx512m
         -J-Xms512m -J-XX:PermSize=32m -J-XX:MaxPermSize=160m -td
         "c:/openjdk/ojdk8_ojdk_739/../ojdk8_ojdk_739-debug/corba/gensrc"
         -i "../../../../src/share/classes/org/omg/PortableServer" -i
         "../../../../src/share/classes/org/omg/PortableInterceptor" -corba
         3.0 -fall -pkgPrefix PortableServer org.omg
         ../../../../src/share/classes/org/omg/PortableServer/poa.idl

           I checked the idlj help but there is no encoding-specific option.
         My locale environment vars are listed below:

         $ locale
         LANG=C
         LC_CTYPE="C"
         LC_NUMERIC="C"
         LC_TIME="C"
         LC_COLLATE="C"
         LC_MONETARY="C"
         LC_MESSAGES="C"
         LC_ALL=

           Could you give me any hint about how to force idlj to generate
         ascii chars only?

         Best regards,
         Frank


         On 1/1/2013 12:47 AM, Kelly O'Hair wrote:

             In the past, the "-encoding ascii" was important, for
             reasons I can't completely list right now. But it is important
             that regardless of the locale, the bits created during the
             build should be the same for everyone.
             The definition of "same" might not be bit for bit, but by
             minimizing the potential differences we have a fighting
             chance of measuring "the same".

             But my question is, how are any locale specific characters
             getting into generated sources? That's what we need to find out.

             Removing "-encoding ascii" is probably not the right answer,
             and if it is, will require some debate.

             -kto

             On Dec 30, 2012, at 9:25 PM, Frank Ding wrote:

                 Hi
                   I have an encoding problem when building openjdk 8 on my
                 Windows 7. My Windows runs in a Chinese environment, but I
                 exported LANG=C in cygwin bash before building. The issue
                 is that in the corba and jdk modules, some java classes are
                 generated by a program. They happen to contain Chinese
                 characters in comments. However, they are compiled with
                 the explicit option "-encoding ascii" in the makefiles.
                 As a result, javac complains about unrecognizable chars
                 (Error: encoding ascii unmappable chars). I have a patch
                 that removes all unnecessary "-encoding ascii" but I am
                 not sure of all its side effects. Shall I submit a bug?

                   Best regards,
                   Frank
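
To find which generated files would trip "-encoding ascii" before javac sees them, a byte-level scan is enough, because any byte with the high bit set is outside 7-bit ASCII whatever the file's real encoding is. The following is a minimal sketch with a hypothetical class name, `AsciiSourceCheck`; it is not part of the JDK build.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class AsciiSourceCheck {
    /** Returns the 1-based numbers of lines that contain at least one non-ASCII byte. */
    static List<Integer> nonAsciiLines(byte[] content) {
        List<Integer> lines = new ArrayList<>();
        int line = 1;
        boolean flagged = false; // whether the current line is already recorded
        for (byte b : content) {
            if (b == '\n') {
                line++;
                flagged = false;
            } else if (b < 0 && !flagged) { // high bit set: not 7-bit ASCII
                lines.add(line);
                flagged = true;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a source file to check.
        for (String arg : args) {
            List<Integer> bad = nonAsciiLines(Files.readAllBytes(Paths.get(arg)));
            if (!bad.isEmpty()) {
                System.out.println(arg + ": non-ASCII bytes on lines " + bad);
            }
        }
    }
}
```

Pointing it at files under the gensrc output directory would show whether only the header comments are affected, which bears on whether removing "-encoding ascii" or fixing the generator is the better remedy.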





              People

              Assignee:
              Unassigned
              Reporter:
              Kelly Ohair (Inactive)