JDK-4095325

[BI] RFE: Need special word-break tables for Chinese

    Details

    • Type: Enhancement
    • Status: Closed
    • Priority: P4
    • Resolution: Won't Fix
    • Affects Version/s: 1.1.5, 1.2.0, 5.0
    • Fix Version/s: None
    • Component/s: core-libs
    • CPU: generic, x86, sparc
    • OS: generic, solaris_2.5, windows_95

      Description

      Name: bb33257 Date: 11/25/97


      The word-break tables (i.e., the tables used by the BreakIterator
      returned by BreakIterator.getWordInstance(); the line-breaking
      tables are fine) treat CJK characters in a Japanese-specific way:
      an arbitrary run of Kanji characters, followed by an optional
      arbitrary run of Hiragana characters, followed by an optional
      arbitrary run of Katakana characters, is treated as a single
      "word" by the word-break iterator. Chinese text, however, does
      not use hiragana or katakana, so nothing ends the run of
      ideographs. As a result, whole paragraphs (instead of individual
      ideographs) are treated as "words" for the purposes of
      double-click selection and "find whole words" operations.
      Chinese will therefore require its own state tables for word
      breaking.
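
The behavior described above can be inspected with a small probe. This is a minimal sketch, not part of the original report: the class name WordBreakProbe, the helper segments(), and the sample sentence are hypothetical, and the segments printed depend on the rule tables shipped with the JDK in use, so no particular output is claimed here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBreakProbe {
    /** Returns the text pieces lying between consecutive word boundaries. */
    public static List<String> segments(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Walk boundary to boundary until the iterator reports DONE.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample: six ideographs with no punctuation or kana.
        String chinese = "\u4eca\u5929\u5929\u6c14\u5f88\u597d";
        for (String s : segments(chinese, Locale.CHINESE)) {
            System.out.println("[" + s + "] length=" + s.length());
        }
    }
}
```

If the tables behave as reported, the whole sample comes back as a single segment instead of one segment per ideograph.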
      ======================================================================

      Dictionary-based break iterators may also be needed for Korean and Japanese.
      ###@###.### 11/2/04 18:15 GMT
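
For illustration only, one common form of dictionary-based segmentation is greedy forward maximum matching. The sketch below is an assumption about what such an iterator might do, not the approach considered for the JDK: the class DictionarySegmenter and its tiny word list are hypothetical, and any ideograph not covered by the dictionary falls back to a one-character word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class DictionarySegmenter {
    private final Set<String> dictionary;
    private final int maxWordLength;

    public DictionarySegmenter(Set<String> dictionary) {
        this.dictionary = dictionary;
        int max = 1;
        for (String w : dictionary) {
            max = Math.max(max, w.length());
        }
        this.maxWordLength = max;
    }

    /** Greedy longest match from the left; unknown characters become single-character words. */
    public List<String> segment(String text) {
        List<String> words = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Fallback: exactly one code point (handles surrogate pairs).
            int oneChar = pos + Character.charCount(text.codePointAt(pos));
            int end = oneChar;
            // Try the longest candidate first, shrinking until a dictionary hit.
            int longest = Math.min(text.length(), pos + maxWordLength);
            for (int candidate = longest; candidate > oneChar; candidate--) {
                if (dictionary.contains(text.substring(pos, candidate))) {
                    end = candidate;
                    break;
                }
            }
            words.add(text.substring(pos, end));
            pos = end;
        }
        return words;
    }
}
```

With a dictionary containing only "\u4eca\u5929" and "\u5929\u6c14", segmenting "\u4eca\u5929\u5929\u6c14\u5f88\u597d" yields four words: the two dictionary entries followed by two single ideographs. Greedy matching can choose the wrong split for ambiguous text, which is one reason real implementations use larger dictionaries and more careful scoring.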

              People

              • Assignee: peytoia Yuka Kamiya (Inactive)
              • Reporter: bcbeck Brian Beck (Inactive)
