  1. JDK
  2. JDK-8179597

Handle cut and paste of 1, 2 and 4 byte characters

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P3
    • Resolution: Fixed
    • Affects Version/s: 8, 9, 10
    • Fix Version/s: 10
    • Component/s: javafx
    • Labels:
      None

      Description

      Glass should correctly pass through any character it receives in a cut and paste operation. I found that for 4-byte characters in particular (emoji), our FX applications can display the characters but cannot cut/paste them on Linux and Mac.

      Some data points I gathered when chasing this:
        1) GTK works in UTF-8.
        1a) Java says it works in UTF-8, but...
        1b) Glass GTK has two entry points that are affected:
             get_data_text()
             get_data_raw()
        2) The "problem" character(s) can be anywhere in the character stream.
        3) JNI's NewStringUTF() will handle most things, but not any 4-byte character - which means any emoji, for example.
        4) For reasons I don't understand, Java uses a "modified" (read: non-standard) UTF-8.
           See http://banachowski.com/deprogramming/2012/02/working-around-jni-utf-8-strings/
           For most operations the modifications don't matter, but for some chars like emoji it matters a lot.
        5) To convert to a String, don't use NewStringUTF(), as it has not been fixed yet. The same byte array passed to new String(mychars, "UTF-8"); will handle 4-byte chars properly.
        6) We have a problem going the other way with cut and paste too.
           After I created a test app that proved the byte array conversion did work (and got 2 emoji in my text area), using that as the source of a paste into a gedit window showed that we did not build the paste string properly: the values were displayed as \ud83d and so forth. This gedit window showed a paste from Firefox with emoji just fine.
        7) I doubt this is a Linux/Mac/Windows issue; this is a Java "modified" UTF-8 issue.
        8) The solution seems to be to do all conversion to/from String/bytes in Java.
           The solution will require some thought and rework in Glass, hopefully avoiding an upcall into Java from JNI to perform a conversion just to make another upcall to hand off the result.
        9) Swing handles the cut and paste operation correctly (both ways...).
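
      The difference between standard and "modified" UTF-8 described in the data points above can be sketched in plain Java. This is only an illustration, not Glass code: the class Utf8Demo and its method names are hypothetical, and DataOutputStream.writeUTF() is used here merely as a convenient way to produce the same modified UTF-8 form that the JNI string functions work with.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Glass.
class Utf8Demo {
    // U+1F600 (grinning face) as the four standard UTF-8 bytes
    // a native toolkit such as GTK would supply.
    static final byte[] EMOJI_UTF8 = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80};

    // Decode standard UTF-8 on the Java side rather than via NewStringUTF().
    static String decodeStandard(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Encode to standard UTF-8, the form a native clipboard expects.
    static byte[] encodeStandard(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encode to modified UTF-8 via writeUTF(), dropping its 2-byte length prefix.
    static byte[] encodeModified(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Format bytes as space-separated lowercase hex for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = decodeStandard(EMOJI_UTF8);
        System.out.println("UTF-16 units:   " + s.length());
        System.out.println("standard UTF-8: " + hex(encodeStandard(s)));
        System.out.println("modified UTF-8: " + hex(encodeModified(s)));
    }
}
```

      In standard UTF-8 the emoji is a single 4-byte sequence; in modified UTF-8 each half of the surrogate pair is encoded separately as 3 bytes, giving 6 bytes in total. That mismatch is consistent with the \ud83d values seen when pasting into gedit.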

      I found these three bugs below without looking too hard. I suspect there might be another one hiding somewhere.

      As part of this bug, the owner will need to see how Swing handles this, likely using GTK as an example.

      From there the owner will need to trace the cut and paste logic in Glass further to suggest the cleanest fix.

      Lastly, verify the proper behavior of the emoji-laced text I have in my two sample programs (maybe with some 2-byte kanji tossed in later).
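
      A possible shape for that verification text, as a minimal sketch: the class ClipboardRoundTrip, its SAMPLE string and its method names are all hypothetical and are not taken from the sample programs mentioned above. It mixes characters of different encoded widths and checks that a String survives a trip through standard UTF-8 bytes. Note that a kanji such as U+6F22 takes 2 bytes in UTF-16 but 3 bytes in UTF-8, so the sample covers all four UTF-8 widths.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building and checking test text; not part of Glass.
class ClipboardRoundTrip {
    // 'A' (U+0041), e-acute (U+00E9), a kanji (U+6F22) and grinning face (U+1F600),
    // written as escapes so the source file encoding does not matter.
    static final String SAMPLE = "A\u00E9\u6F22\uD83D\uDE00";

    // Number of bytes one code point occupies in standard UTF-8.
    static int utf8Width(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the text is unchanged after String -> UTF-8 bytes -> String.
    static boolean roundTrips(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return text.equals(new String(utf8, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        SAMPLE.codePoints().forEach(cp ->
                System.out.printf("U+%04X -> %d byte(s)%n", cp, utf8Width(cp)));
        System.out.println("round trip ok: " + roundTrips(SAMPLE));
    }
}
```

      A lone surrogate such as "\uD83D" does not survive this round trip, which makes the check a cheap way to catch strings that were split or converted half a pair at a time.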

      And after that... modify one of our test toys to include some "fun" text by default to use for cut and paste testing.

       

          Issue Links

              People

              • Assignee:
                ssadetsky Semyon Sadetsky (Inactive)
                Reporter:
                ddhill David Hill (Inactive)
              • Votes:
                0
                Watchers:
                5
