Java console not reading in Chinese characters correctly


Problem description

I am struggling to get Eclipse to read in Chinese characters correctly, and I am not sure where I may be going wrong.

Specifically, somewhere between reading in a string of Chinese (simplified or traditional) from the console and outputting it, it gets garbled. Even when outputting a large string of mixed text (English/Chinese characters), it appears to only alter the appearance of the Chinese characters.

I have cut it down to the following test example and explicitly annotated it with what I believe is happening at each stage - note that I am a student and would very much like to confirm my understanding (or otherwise) :)

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

//Wrapper class added so the example compiles; the name is arbitrary
public class ConsoleEncodingTest {
    public static void main(String[] args) {
        try
        {
            boolean isRunning = true;

            //Raw flow of input data from the console
            InputStream inputStream = System.in;
            //Allows you to read the stream, using either the default character encoding, else the specified encoding;
            InputStreamReader inputStreamReader = new InputStreamReader(inputStream, "UTF-8");
            //Adds functionality for converting the stream being read in, into Strings(?)
            BufferedReader input_BufferedReader = new BufferedReader(inputStreamReader);


            //Raw flow of output data to the console
            OutputStream outputStream = System.out;
            //Write a stream, from a given bit of text
            OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
            //Adds functionality to the base ability to write to a stream
            BufferedWriter output_BufferedWriter = new BufferedWriter(outputStreamWriter);



            while(isRunning) {
                System.out.println();//force extra newline
                System.out.print("> ");

                //To read in a line of text (as a String):
                String userInput_asString = input_BufferedReader.readLine();

                //To output a line of text:
                String outputToUser_fromString_englishFromCode = "foo"; //outputs correctly
                output_BufferedWriter.write(outputToUser_fromString_englishFromCode);
                output_BufferedWriter.flush();

                System.out.println();//force extra newline

                String outputToUser_fromString_ChineseFromCode = "之謂甚"; //outputs correctly
                output_BufferedWriter.write(outputToUser_fromString_ChineseFromCode);
                output_BufferedWriter.flush();

                System.out.println();//force extra newline

                String outputToUser_fromString_userSupplied = userInput_asString; //outputs correctly when given English text, garbled when given Chinese text
                output_BufferedWriter.write(outputToUser_fromString_userSupplied);
                output_BufferedWriter.flush();

                System.out.println();//force extra newline

            }
        }
        catch (Exception e) {
            // TODO: handle exception
        }
    }
}

Sample output:

> 之謂甚
foo
之謂甚
ä¹‹è¬‚ç”š

> oaea
foo
之謂甚
oaea

> mixed input - English: fubar; Chinese: 之謂甚;
foo
之謂甚
mixed input - English: fubar; Chinese: ä¹‹è¬‚ç”š;

> 

What is seen on this Stack Overflow post matches exactly what I see in the Eclipse console and within the Eclipse debugger (when viewing/editing the variable values). Altering the variable values manually via the Eclipse debugger makes the code that depends on those values behave as I would normally expect, suggesting that the problem lies in how the text is read IN.
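The garbled characters in the sample output look like what you get when UTF-8 bytes are decoded with a single-byte Windows code page. Below is a minimal sketch of that effect, not a diagnosis of this exact setup: the choice of windows-1252 as the wrong charset is an assumption (the post does not say which encoding the console used), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Assumption: the mismatched decoder is windows-1252; the post does not name it.
    static final Charset WRONG = Charset.forName("windows-1252");

    // Encodes the text as UTF-8, then decodes those bytes with the wrong charset.
    public static String garble(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, WRONG);
    }

    // Reverses garble() for strings whose bytes all exist in windows-1252;
    // this is not lossless for arbitrary input.
    public static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(WRONG);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three characters from the question, written as escapes so the
        // source file's own encoding cannot interfere with the demonstration.
        String original = "\u4E4B\u8B02\u751A";
        String garbled = garble(original);

        // Each Chinese character is three UTF-8 bytes, and each byte is
        // decoded as a separate character by the single-byte charset.
        System.out.println(original.length() + " chars -> " + garbled.length() + " chars");

        // Print code points instead of glyphs, so the result does not depend
        // on how the console itself renders non-ASCII text.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
        System.out.println("round trip ok: " + repair(garbled).equals(original));
    }
}
```

Printing the code points of userInput_asString in the same way inside the test loop would show whether the string is already wrong right after readLine(), which is what the debugger observation suggests.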

I have tried many different combinations of scanners and buffered stream readers/writers to read in and output, with and without explicit character encodings, though this wasn't done particularly systematically and I could easily have missed something.

I have tried to set the Eclipse environment to use UTF-8 wherever possible, but I guess I could have missed a place or two. Note that the console will correctly output hard-coded Chinese characters.

Any assistance / guidance on this matter is greatly appreciated :)

Solution

It looks like the console is not reading the input correctly. Here is a link that I believe describes your problem and some workarounds.

http://paranoid-engineering.blogspot.com/2008/05/getting-unicode-output-in-eclipse.html

Simple answer: try adding the JVM argument -Dfile.encoding=UTF-8 to your eclipse.ini (it is a JVM system property rather than an environment variable, so it belongs after the -vmargs line). Before enabling this for the whole of Eclipse, you could try setting it as a VM argument in the debug configuration for this program and see if it works.

The link has a lot more suggestions.
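To check whether the -Dfile.encoding setting actually reached the program, it can help to print what the running JVM reports. This is only a small sketch; the class and method names are made up for illustration, and the values printed depend on the JVM version and launch configuration.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    // Builds a one-line summary of the encoding settings the JVM is using.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run this with the same launch configuration as the failing program.
        System.out.println(describe());
    }
}
```

If the reported charset is not UTF-8 when the program is launched from Eclipse, the VM argument was probably not picked up by that launch configuration.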
