Reading Unicode characters in Java

Problem description

I am trying to read Unicode characters from a text file saved as UTF-8 using Java. My text file is as follows:

अ, अदेबानि ,अन, अनसुला, अनसुलि, अनफावरि, अनजालु, अनद्ला, अमा, अर, अरगा, अरगे, अरन, अराय, अलखद, असे, अहा, अहिंसा, अग्रं, अन्थाइ, अफ्रि, बियन, खियन, फियन, बन, गन, थन, हर, हम, जम, गल, गथ, दरसे, दरनै, थनै, थथाम, सथाम, खफ, गल, गथ, मिख, जथ, जाथ, थाथ, दद, देख, न, नेथ, बर, बुंथ, बिथ, बिख, बेल, मम, आ, आइ, आउ, आगदा, आगसिर

I have tried the following code:

import java.io.*;
import java.util.*;
import java.lang.*;
public class UcharRead
{
    public static void main(String args[])
    {
        try
        {
            String str;
            BufferedReader bufReader = new BufferedReader( new InputStreamReader(new FileInputStream("research_words.txt"), "UTF-8"));
            while((str=bufReader.readLine())!=null)
            {
                System.out.println(str);
            }
        }
        catch(Exception e)
        {
        }
    }
}

The output I get is ????????????????????????. Can anyone help me?

Solution

You are (most likely) reading the text correctly, but the output must also be encoded as UTF-8 when you write it out. Otherwise every character that cannot be represented in your default encoding is turned into a question mark.
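
To illustrate where the question marks come from, here is a minimal sketch. The class name EncodingDemo and the helper roundTrip are hypothetical names chosen for this example, and US-ASCII stands in for a default encoding that cannot represent Devanagari. The sketch encodes one word from the sample file with two different charsets and decodes it again.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encodes the text with the given charset and
    // decodes it again, mimicking text that is printed through that encoding.
    static String roundTrip(String text, Charset charset) {
        // String.getBytes replaces characters the charset cannot represent
        // with the charset's replacement bytes (a question mark for US-ASCII).
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // One word from the sample file, written with Unicode escapes so the
        // example does not depend on the encoding of the source file.
        String word = "\u092C\u0928";

        // Wrap System.out so that it encodes its output as UTF-8.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");

        out.println(roundTrip(word, StandardCharsets.US_ASCII)); // lossy
        out.println(roundTrip(word, StandardCharsets.UTF_8));    // lossless
    }
}
```

The first line should come out as question marks and the second as the original word. Whether the second line is displayed correctly still depends on the terminal or IDE console being configured for UTF-8 and having a suitable font.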

Try writing it to a File instead of System.out (and specify the proper encoding):

Writer w = new OutputStreamWriter(
   new FileOutputStream("x.txt"), "UTF-8");
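
The fragment above can be expanded into a complete program along these lines. This is only a sketch: the class name Utf8Copy and the method copyLines are hypothetical, the file names are the ones used in the question and the answer, and the java.nio.file API (Java 7 or later) is swapped in for the stream constructors so that the reader and writer are closed automatically.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8Copy {

    // Hypothetical helper: reads every line of `in` as UTF-8 and writes it
    // to `out` as UTF-8, so no character passes through the default encoding.
    static void copyLines(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        } // try-with-resources closes (and flushes) both streams here
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the question and the answer above.
        copyLines(Paths.get("research_words.txt"), Paths.get("x.txt"));
    }
}
```

If x.txt shows the Devanagari text when opened in a UTF-8 aware editor, the reading code was correct and only the console output was losing the characters.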
