Different representation of Unicode code points in Japanese and Chinese


Question

I am trying to display the glyph corresponding to Unicode code point U+95E8. This code point belongs to the CJK (Chinese, Japanese, Korean) Unified Ideographs block.

I am trying to find out whether the glyph for this particular code point can differ between Japanese and Chinese.

When I display U+95E8 in a JTextArea, I see the character "门" on Linux and Windows. But when I display the same code point on my embedded device, the displayed character changes to a different glyph (shown in an image in the original post).

I want to know whether code point U+95E8 should have a uniform representation in all CJK (Chinese, Japanese, Korean) locales, or whether it differs for some of them. Could this behavior be caused by different fonts being installed on different devices? I apologize for my ignorance; I have little experience with internationalization.

import java.awt.*;
import java.awt.event.*;
import java.util.Locale;

import javax.swing.*;

public class TextDemo extends JPanel implements ActionListener {

    public TextDemo() {
    }

    public void actionPerformed(ActionEvent evt) {
    }

    /**
     * Create the GUI and show it.  For thread safety,
     * this method should be invoked from the
     * event dispatch thread.
     * @throws InterruptedException 
     */
    private static void createAndShowGUI() throws InterruptedException {

        JFrame frame = new JFrame(java.util.Locale.getDefault().getDisplayName());

        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

        Container contentPane = frame.getContentPane();
        contentPane.setLayout(new SpringLayout());

        Dimension size = new Dimension(500, 500);
        frame.setSize(size);
        JTextArea textArea = new JTextArea();

        //Font font1 = new Font("SansSerif", Font.BOLD, 20);
        //textArea.setFont(font1);

        textArea.setEditable(true);
        textArea.setSize(new Dimension(400,400));
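        // setDefaultLocale is a static JComponent method: it sets the locale used to
        // initialize components created later, not the locale of this text area.
        JComponent.setDefaultLocale(java.util.Locale.SIMPLIFIED_CHINESE);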

        textArea.setText("Printing U+95E8 : \u95e8");                
        contentPane.add(textArea);        
        frame.setVisible(true);
    }

    public static void main (String[] args) {
        java.util.Locale.setDefault(java.util.Locale.JAPANESE);
        javax.swing.SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                try {
                    createAndShowGUI();
                } catch (InterruptedException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        });
    }
}

Recommended answer

Generally, CJK characters in Unicode are "unified": a single code point is used even when the character has traditionally been written somewhat differently in the different languages. In theory, a single font can contain multiple glyphs for one code point, together with some selection mechanism. In practice, a font that contains CJK characters typically has a single design for them, reflecting the conventions of Traditional Chinese, Simplified Chinese, Japanese, or Korean. In this sense, some fonts might be called "Traditional Chinese", "Japanese", and so on.
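To make the "unified" point concrete, here is a minimal sketch that asks the Java runtime what it knows about U+95E8. The class name `CodePointInfo` and the method `describe` are made up for illustration. The standard library reports a block and a script for the code point but nothing about a language, which is consistent with the glyph shape being left to the font.

```java
import java.lang.Character.UnicodeBlock;
import java.lang.Character.UnicodeScript;

// Illustrative helper, not part of the original question's code.
class CodePointInfo {

    /** Summarizes what the Unicode character data says about a code point. */
    static String describe(int codePoint) {
        return String.format("U+%04X block=%s script=%s ideographic=%b",
                codePoint,
                UnicodeBlock.of(codePoint),    // the block the code point belongs to
                UnicodeScript.of(codePoint),   // the script, shared across CJK languages
                Character.isIdeographic(codePoint));
    }

    public static void main(String[] args) {
        // The same information is returned regardless of the default locale.
        System.out.println(describe(0x95E8));
    }
}
```

Nothing in this output changes when the default locale is switched between Japanese and Chinese; only the font used for rendering decides which regional design appears.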

Obviously, you should select the font according to the language of the text.
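One possible way to do that in Java is sketched below. The class `CjkFontPicker` and its methods are hypothetical, and the font family names are only assumptions about what might be installed on a given system; an embedded device will likely need its own list. The sketch picks the first preferred family that is actually installed and falls back to the logical "SansSerif" font otherwise.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative helper, not part of the original question's code.
class CjkFontPicker {

    /** Hypothetical preference lists; the family names are assumptions, adjust per system. */
    static List<String> preferredFamilies(Locale locale) {
        String lang = locale.getLanguage();
        if (lang.equals("ja")) return Arrays.asList("Noto Sans CJK JP", "MS Gothic");
        if (lang.equals("ko")) return Arrays.asList("Noto Sans CJK KR", "Malgun Gothic");
        if (lang.equals("zh")) {
            String country = locale.getCountry();
            boolean traditional = country.equals("TW") || country.equals("HK");
            return traditional
                    ? Arrays.asList("Noto Sans CJK TC", "Microsoft JhengHei")
                    : Arrays.asList("Noto Sans CJK SC", "Microsoft YaHei");
        }
        return Arrays.asList();
    }

    /** Returns the first preferred family that is installed, else the logical sans-serif font. */
    static String pickFamily(Locale locale, List<String> installed) {
        for (String family : preferredFamilies(locale)) {
            if (installed.contains(family)) return family;
        }
        return Font.SANS_SERIF;
    }

    public static void main(String[] args) {
        List<String> installed = Arrays.asList(
                GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames());
        for (Locale locale : new Locale[] {Locale.SIMPLIFIED_CHINESE, Locale.JAPANESE}) {
            Font font = new Font(pickFamily(locale, installed), Font.PLAIN, 20);
            // canDisplay reports whether this font has a glyph for the code point.
            System.out.println(locale + " -> " + font.getFamily()
                    + ", can display U+95E8: " + font.canDisplay(0x95E8));
        }
    }
}
```

In the question's program, the chosen font could then be applied with `textArea.setFont(font)` before the text is set. Printing the resolved family on both the desktop and the embedded device is a quick way to check whether the two systems are rendering with different fonts.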

The glyph in the image in the question looks somewhat odd: it deviates from the glyphs for U+95E8 in some common fonts, which generally show rather similar designs for this character. So for this specific character, the variation can be expected to be only in general style (e.g., serif vs. sans-serif, stroke width). It seems that the font being used is somehow oddly designed, at least for this character.
