JDK-8232161

Align some one-way conversion in MS950 charset with Windows

        Description

        make/data/charsetmapping/MS950.nr lists 10 one-way (non-round-trip) mapping entries.

        I ran the following steps on Traditional Chinese Windows.

        1. Run WriteMS950Data to create a data file containing the one-way entries
        >java WriteMS950Data.java ms950data.txt

        2. Check byte data
        >java DumpText.java ms950data.txt
        A2 A4 F9 F9 0D 0A
        A2 A5 F9 E9 0D 0A
        A2 A7 F9 EB 0D 0A
        A2 A6 F9 EA 0D 0A
        A2 7E F9 FA 0D 0A
        A2 A1 F9 FB 0D 0A
        A2 A3 F9 FD 0D 0A
        A2 A2 F9 FC 0D 0A
        A2 CC A4 51 0D 0A
        A2 CE A4 CA 0D 0A

        3. Open ms950data.txt in Notepad and save it without editing
        >notepad ms950data.txt

        4. Check data
        >java DumpText.java ms950data.txt
        F9 F9 F9 F9 0D 0A
        F9 E9 F9 E9 0D 0A
        F9 EB F9 EB 0D 0A
        F9 EA F9 EA 0D 0A
        A2 7E A2 7E 0D 0A
        A2 A1 A2 A1 0D 0A
        A2 A3 A2 A3 0D 0A
        A2 A2 A2 A2 0D 0A
        A4 51 A4 51 0D 0A
        A4 CA A4 CA 0D 0A

        I think the data above is what the MS950 charset is expected to produce.
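        The byte listings in steps 2 and 4 can be reproduced with a small hex dumper. The following is a minimal sketch, not the attached DumpText.java; the class name HexDumpSketch and its dump method are made up for illustration. It prints each byte as two hex digits and starts a new output line after every LF.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

class HexDumpSketch {
    // Formats bytes as upper-case hex pairs separated by spaces,
    // starting a new output line after each LF (0x0A).
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        boolean lineStart = true;
        for (byte b : data) {
            if (!lineStart) {
                out.append(' ');
            }
            out.append(String.format("%02X", b & 0xFF));
            lineStart = false;
            if (b == 0x0A) {
                out.append('\n');
                lineStart = true;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            // Dump the file named on the command line.
            System.out.print(dump(Files.readAllBytes(Paths.get(args[0]))));
        } else {
            // No file given: dump the first line from step 2 of the report.
            byte[] sample = {(byte) 0xA2, (byte) 0xA4, (byte) 0xF9, (byte) 0xF9, 0x0D, 0x0A};
            System.out.print(dump(sample));
        }
    }
}
```

        Running it against ms950data.txt before and after the Notepad save should show the same change as above: the first two bytes of each line are rewritten to match the last two.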

        >java -showversion TestMS950
        openjdk version "14-ea" 2020-03-17
        OpenJDK Runtime Environment (build 14-ea+17-721)
        OpenJDK 64-Bit Server VM (build 14-ea+17-721, mixed mode, sharing)
        \u2550\u2550, expected: \xF9\xF9\xF9\xF9, result: \xA2\xA4\xA2\xA4
        \u255E\u255E, expected: \xF9\xE9\xF9\xE9, result: \xA2\xA5\xA2\xA5
        \u2561\u2561, expected: \xF9\xEB\xF9\xEB, result: \xA2\xA7\xA2\xA7
        \u256A\u256A, expected: \xF9\xEA\xF9\xEA, result: \xA2\xA6\xA2\xA6
        Exception in thread "main" java.lang.Exception: Failed
                at TestMS950.main(TestMS950.java:83)
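        The check that fails above can be approximated with the sketch below. It is not the attached TestMS950.java; the class name Ms950EncodeCheck is hypothetical, and the expected bytes are the ones listed in this report. It assumes the runtime provides the MS950 charset (it lives in the jdk.charsets module) and does nothing otherwise.

```java
import java.nio.charset.Charset;

class Ms950EncodeCheck {
    // Formats bytes in the backslash-x style used in the test output above.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("MS950")) {
            System.out.println("MS950 is not available in this runtime");
            return;
        }
        Charset ms950 = Charset.forName("MS950");
        // Character and the bytes Windows produces for it, taken from this report.
        String[][] cases = {
            {"\u2550", "\\xF9\\xF9"},
            {"\u255E", "\\xF9\\xE9"},
            {"\u2561", "\\xF9\\xEB"},
            {"\u256A", "\\xF9\\xEA"},
        };
        for (String[] c : cases) {
            String result = hex(c[0].getBytes(ms950));
            System.out.printf("U+%04X expected: %s, result: %s%s%n",
                    (int) c[0].charAt(0), c[1], result,
                    result.equals(c[1]) ? "" : "  <- differs");
        }
    }
}
```

        Which of the two byte pairs is printed as the result depends on the JDK build, so the sketch only reports differences instead of throwing.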

        I also checked
        ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit950.txt

        $ egrep '^0x255[0e]|^0x256[1a]' bestfit950.txt
        0x2550 0xf9f9 ;?? Forms Double Horizontal
        0x255e 0xf9e9 ;?? Forms Vertical Single And Right Double
        0x2561 0xf9eb ;?? Forms Vertical Single And Left Double
        0x256a 0xf9ea ;?? Forms Vertical Single And Horizontal Double

        I think the JDK's current encoding of the above 4 entries is unexpected.
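        To compare such entries programmatically, lines shaped like the egrep output above can be parsed into a map from code point to byte value. This is a rough sketch under that assumption only: BestFitLineParser is a hypothetical name, and the code does not handle the section structure of the full bestfit950.txt file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BestFitLineParser {
    // Matches "0x<hex> 0x<hex>" at the start of a line; anything after is ignored.
    private static final Pattern ENTRY =
            Pattern.compile("^0x([0-9A-Fa-f]+)\\s+0x([0-9A-Fa-f]+)");

    // Returns code point -> byte value for every line that matches ENTRY.
    static Map<Integer, Integer> parse(Iterable<String> lines) {
        Map<Integer, Integer> map = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find()) {
                map.put(Integer.parseInt(m.group(1), 16),
                        Integer.parseInt(m.group(2), 16));
            }
        }
        return map;
    }

    public static void main(String[] args) {
        // The four lines quoted in the report.
        List<String> lines = Arrays.asList(
                "0x2550 0xf9f9 ;?? Forms Double Horizontal",
                "0x255e 0xf9e9 ;?? Forms Vertical Single And Right Double",
                "0x2561 0xf9eb ;?? Forms Vertical Single And Left Double",
                "0x256a 0xf9ea ;?? Forms Vertical Single And Horizontal Double");
        parse(lines).forEach((cp, bytes) ->
                System.out.printf("U+%04X -> 0x%04X%n", cp, bytes));
    }
}
```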

        JTreg output is as follows:
        Without Fix
        =========
        $ jtreg -v TestMS950.java
        runner starting test: sun/nio/cs/TestMS950.java
        runner finished test: sun/nio/cs/TestMS950.java
        Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed
        Test results: failed: 1

        With Fix
        =======
        $ jtreg -v TestMS950.java
        runner starting test: sun/nio/cs/TestMS950.java
        runner finished test: sun/nio/cs/TestMS950.java
        Passed. Execution successful
        Test results: passed: 1

        The fix was submitted at
        https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-October/062862.html

        Test case 2

        1. Create test data ms950.txt
        >jdk-14\bin\java WriteMS950Data1.java ms950.txt

        >type ms950.txt

        2. Copy the character shown on the first line and pass it to the findstr command
        >findstr /N ═ ms950.txt
        2:═

        3. Create backup data file
        >copy ms950.txt ms950.txt.orig
        1 file(s) copied.

        4. Open ms950.txt in Notepad, then save it without editing
        >notepad ms950.txt

        5. Run the findstr command again; lines 1 and 2 are now displayed
        >findstr /N ═ ms950.txt
        1:═
        2:═

        6. Create test data via java
        >jdk-14\bin\java cat.java ms950.txt.orig ms950-java.txt

        7. Run the findstr command again; nothing is displayed (this is the issue)
        >findstr /N ═ ms950-java.txt

        8. Check the differences; 4 characters are displayed
        >fc ms950.txt ms950-java.txt
        Comparing files ms950.txt and MS950-JAVA.TXT
        ***** ms950.txt

        ***** MS950-JAVA.TXT

        *****

        9. Check for the character in each file
        >jdk-14\bin\java lookfor.java ═ ms950-java.txt
          1: ═
          2: ═

        >jdk-14\bin\java lookfor.java ═ ms950.txt
          1: ═
          2: ═

        >jdk-14\bin\java lookfor.java ═ ms950.txt.orig
          1: ═
          2: ═
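        The difference between steps 5 and 7 comes from decoding the file and encoding it again. The sketch below illustrates that round trip; it is not the attached cat.java, and the class name Ms950RoundTrip is hypothetical. It assumes the MS950 charset is available in the runtime. Both A2 A4 and F9 F9 decode to the same character, so after re-encoding they come out as the same byte pair. According to the TestMS950 output in this report, the affected build yields A2 A4, while Notepad yields F9 F9.

```java
import java.nio.charset.Charset;

class Ms950RoundTrip {
    // Decodes the bytes with the given charset, then encodes the text again.
    static byte[] roundTrip(byte[] input, Charset cs) {
        return new String(input, cs).getBytes(cs);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("MS950")) {
            System.out.println("MS950 is not available in this runtime");
            return;
        }
        Charset ms950 = Charset.forName("MS950");
        // The two byte pairs from the first line of ms950data.txt in the report.
        byte[][] inputs = {
            {(byte) 0xA2, (byte) 0xA4},
            {(byte) 0xF9, (byte) 0xF9},
        };
        for (byte[] in : inputs) {
            System.out.println(hex(in) + " -> " + hex(roundTrip(in, ms950)));
        }
    }
}
```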

          Attachments

          1. cat.java (0.6 kB)
          2. DumpText.java (0.4 kB)
          3. lookfor.java (0.6 kB)
          4. ms950-01.png (25 kB)
          5. ms950-02.png (37 kB)
          6. TestMS950.java (4 kB)
          7. WriteMS950Data.java (1 kB)
          8. WriteMS950Data1.java (1 kB)

                People

                • Assignee: Ichiroh Takiguchi (itakiguchi)
                • Reporter: Ichiroh Takiguchi (itakiguchi)