How can I read Chinese characters correctly using Scanner in Java?


Question

Programming language: Java
Task: design a hash function that maps Chinese strings to numbers
Problem: reading and displaying Chinese characters correctly

This is a homework question, but I'm not asking how to do the assignment; I'm just having trouble reading Chinese characters.

A short description of my task: design a hash function that maps the (Chinese) names of the students in our class to their student IDs and other satellite data (gender, phone number, and the like).

I'm still thinking about it, but, as in other languages, I believe this involves using each character's encoding to compute a unique value via the hash function, if I'm not mistaken.

Here's what I have to test the validity of this train of thought:

// test whether console can read chinese characters
Scanner s = new Scanner(System.in);

System.out.print("Please enter a Chinese character: ");
int chi = (int)s.next().toCharArray()[0];

System.out.println("\nThe string entered is " + chi);

If I use a simple System.out.println("character") statement, the correct character is displayed.

But as seen above, when I use Scanner to read the input, I convert the String into a char array and then cast the first char to its int Unicode value. It comes up with a ridiculous number, and I can't display the character correctly.

I realize I could just use this erroneous value to design a hash function, but to avoid possible collisions (I don't know whether these erroneous values are unique), and for the sake of learning, could you point out how I might unify the input of Chinese characters across different machines?

Always grateful for your thoughts. :D

Baggio.

Recommended answer

You are overthinking this. Every String is already (conceptually) a sequence of characters, including Chinese characters. Encoding only comes into it when you need to convert the String into bytes, which you don't need to do for your assignment. Just use the String's hashCode(). In fact, when you create a HashMap<String,YourObject>, that's exactly what happens behind the scenes.
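
As a rough illustration of that suggestion, here is a minimal sketch that reads a name with Scanner and looks it up in a HashMap keyed by String. Several things in it are assumptions rather than part of the original question: the class name ChineseNameLookup, the Student type, the sample name, ID and phone number are all hypothetical, and the input is assumed to arrive as UTF-8 bytes. The "ridiculous number" in the question is probably a mismatch between the console's encoding and the charset Scanner decodes with, so the sketch passes the charset name explicitly; on a real console you would pass System.in together with whatever charset the terminal actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class ChineseNameLookup {

    // Hypothetical record type standing in for the "satellite data" in the question.
    static final class Student {
        final String id;
        final String phone;

        Student(String id, String phone) {
            this.id = id;
            this.phone = phone;
        }
    }

    // Reads one whitespace-delimited token, decoding the bytes as UTF-8.
    // Assumption: the input really is UTF-8; swap in the console's charset otherwise.
    static String readName(InputStream in) {
        Scanner scanner = new Scanner(in, "UTF-8");
        return scanner.next();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of the compiler's encoding.
        // "\u5f20\u4f1f" is a two-character sample name (U+5F20, U+4F1F).
        String sampleName = "\u5f20\u4f1f";

        // Simulated console input so the sketch runs the same way on any machine.
        byte[] typed = (sampleName + "\n").getBytes(StandardCharsets.UTF_8);
        String name = readName(new ByteArrayInputStream(typed));

        // If decoding worked, the first code point is U+5F20.
        System.out.println(Integer.toHexString(name.codePointAt(0))); // prints 5f20

        // HashMap calls String.hashCode() and equals() internally; no manual encoding work.
        Map<String, Student> students = new HashMap<>();
        students.put(sampleName, new Student("2012001", "555-0100"));

        Student found = students.get(name);
        System.out.println(found.id);                                  // prints 2012001
        System.out.println(name.hashCode() == sampleName.hashCode()); // prints true
    }
}
```

Because String.hashCode() is computed from the decoded char values, two machines produce the same hash for the same name once the input has been decoded correctly, whatever encoding each console uses. Equal strings always hash equally, but different strings can still collide; HashMap resolves those cases with equals().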
