Parse a CSV file containing Unicode characters using OpenCSV
Question
I'm trying to parse a .csv file with OpenCSV in NetBeans 6.0.1. My file contains some Unicode characters. When I write them to the output, the characters appear in another form, like (HJ1'-E/;). When I open the file in Notepad, it looks OK.
The code I use:
CSVReader reader = new CSVReader(new FileReader("d:\\a.csv"), ',', '\'', 1);
String[] line;
while ((line = reader.readNext()) != null) {
    StringBuilder stb = new StringBuilder(400);
    for (int i = 0; i < line.length; i++) {
        stb.append(line[i]);
        stb.append(";");
    }
    System.out.println(stb);
}
Recommended answer
First you need to know what encoding your file is in, such as UTF-8 or UTF-16. What's generating this file to start with?
After that, it's relatively straightforward: you need to create a FileInputStream wrapped in an InputStreamReader instead of just a FileReader. (FileReader always uses the default encoding for the system; only Java 11 and later offer FileReader constructors that take a charset.) Specify the encoding to use when you create the InputStreamReader, and if you've picked the right one, everything should start working.
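To see why the charset passed to InputStreamReader matters, here is a minimal sketch that decodes the same bytes twice, once with the right encoding and once with the wrong one. It assumes Java 8 or later. The class name DecodeDemo, the decode helper and the Arabic sample text are made up for illustration, and a ByteArrayInputStream stands in for the FileInputStream you would use on a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes raw bytes through an InputStreamReader with an explicit charset.
    // ByteArrayInputStream stands in for the FileInputStream used on a real file.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u0645\u0631\u062d\u0628\u0627"; // hypothetical sample: five Arabic letters
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // true: the charset matches the bytes
        System.out.println(wrong.equals(original)); // false: every byte became its own character
        System.out.println(wrong.length());         // 10: two bytes per letter, decoded one by one
    }
}
```

The wrong decoding does not fail; it silently yields different characters, which is the kind of garbling described in the question.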
Note that you don't need to use OpenCSV to check this: you could just read the text of the file yourself and print it all out. I'm not sure I'd trust System.out to be able to handle non-ASCII characters though. You may want to find a different way of examining strings, such as printing out the individual values of characters as integers (preferably in hex) and then comparing them with the charts at unicode.org. On the other hand, you could try the right encoding and see what happens to start with...
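One way to examine a string without relying on the console's encoding is to print each character's numeric value in hex. The sketch below is only an illustration: the class name HexDump and the helper toCodeUnits are made up here. It prints UTF-16 code units, so a character outside the Basic Multilingual Plane shows up as two values (a surrogate pair).

```java
class HexDump {
    // Formats each UTF-16 code unit of the string as U+XXXX, separated by spaces.
    static String toCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output is pure ASCII, so it survives any console encoding.
        System.out.println(toCodeUnits("a\u00e9\u0645")); // prints "U+0061 U+00E9 U+0645"
    }
}
```

Calling toCodeUnits on each field returned by the CSV reader gives values you can look up directly in the Unicode code charts.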
EDIT: Okay, so if you're using UTF-8:
CSVReader reader = new CSVReader(
        new InputStreamReader(new FileInputStream("d:\\a.csv"), "UTF-8"),
        ',', '\'', 1);
String[] line;
while ((line = reader.readNext()) != null) {
    StringBuilder stb = new StringBuilder(400);
    for (int i = 0; i < line.length; i++) {
        stb.append(line[i]);
        stb.append(";");
    }
    System.out.println(stb);
}
(I hope you have a try/finally block to close the file in your real code.)
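The try/finally pattern mentioned here might look like the following sketch. It uses only standard-library readers so it runs on its own; the class name CloseDemo and the helpers countLines and closeQuietly are hypothetical, and UncheckedIOException assumes Java 8 or later. With OpenCSV, the same shape would apply to the CSV reader, calling its close() method in the finally block.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CloseDemo {
    // Counts the lines available from the source and always closes it,
    // even when reading throws part-way through.
    static int countLines(Reader source) {
        BufferedReader reader = new BufferedReader(source);
        try {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            closeQuietly(reader); // closing the wrapper also closes the underlying source
        }
    }

    // Closes a resource, ignoring any failure raised by close() itself.
    static void closeQuietly(Closeable resource) {
        try {
            resource.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if closing fails.
        }
    }

    public static void main(String[] args) {
        // A StringReader stands in for the InputStreamReader over a real file.
        System.out.println(countLines(new StringReader("a,b\nc,d\n"))); // prints 2
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code.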