Read/write .txt file with special characters


Question

I open Notepad (Windows) and write:

Some lines with special characters
Special: Žđšćč

and go to Save As... "someFile.txt" with Encoding set to UTF-8.

In Java I have:

FileInputStream fis = new FileInputStream(new File("someFile.txt"));
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(isr);

String line;
while ((line = in.readLine()) != null) {
    printLine(line);
}
in.close();

But I get question marks and similar "special" characters. Why?

I have this input (one line in the .txt file):

665,Žđšćč

and this code:

FileInputStream fis = new FileInputStream(new File(fileName));
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(isr);

String line;
while((line = in.readLine()) != null) {
    Toast.makeText(mContext, line, Toast.LENGTH_LONG).show();

    Pattern p = Pattern.compile(",");
    String[] article = p.split(line);

    Toast.makeText(mContext, article[0], Toast.LENGTH_LONG).show();
    Toast.makeText(mContext, Integer.parseInt(article[0]), Toast.LENGTH_LONG).show();
}
in.close();

The Toast output is fine (for those who aren't familiar with Android, a Toast is just a way to show a pop-up on screen with some text in it). The console shows "weird characters", probably because of the encoding of the console window. But parsing the integer fails, as the console reports, even though the Toast output looks just fine.

It seems the String contains some "weird" characters that Toast can't show or render, but when I try to parse it, it crashes. Suggestions?

If I choose ANSI in Notepad, the integer parsing works and there are no weird characters as in the picture above, but of course my special characters don't work.

Recommended answer

It's the output console that doesn't support those characters. Since you're using Eclipse, you need to ensure that it's configured to use UTF-8 for this. You can do this via Window > Preferences > General > Workspace > Text file encoding, set to UTF-8.
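To see why an output channel with the wrong encoding produces question marks, here is a minimal sketch in plain Java. The class and method names are made up for illustration; it simply round-trips the five letters from the question through a charset that cannot represent them, which is roughly what a console with a non-UTF-8 encoding does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleEncodingDemo {
    // Encodes text to bytes and decodes it again with the same charset,
    // similar to what happens when text is written to a console in that encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The five special letters from the question, written as Unicode escapes.
        String special = "\u017D\u0111\u0161\u0107\u010D";

        // US-ASCII cannot represent them, so each one is replaced by '?'.
        System.out.println(roundTrip(special, StandardCharsets.US_ASCII));

        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(special, StandardCharsets.UTF_8).equals(special));
    }
}
```

The String in memory is correct in both cases; only the encoding used on output differs.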

Update: as per the updated question and the comments, the UTF-8 BOM is apparently the culprit. Notepad adds a UTF-8 BOM by default on save. It looks like the JRE on your HTC doesn't swallow it, so the first line starts with an invisible U+FEFF character and Integer.parseInt() rejects it. You may want to consider using the UnicodeReader example as outlined in this answer instead of InputStreamReader in your code. It autodetects and skips the BOM.

FileInputStream fis = new FileInputStream(new File(fileName));
UnicodeReader ur = new UnicodeReader(fis, "UTF-8");
BufferedReader in = new BufferedReader(ur);
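
If you'd rather not pull in UnicodeReader (it is not part of the standard library), a rough alternative is to drop the BOM yourself after decoding. The sketch below is only an illustration under that assumption: BomSkipDemo, stripBom and readFirstLine are hypothetical names, and it handles the UTF-8 case only, without the encoding autodetection that UnicodeReader does.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomSkipDemo {
    // Removes a leading U+FEFF, which is what the UTF-8 BOM decodes to.
    static String stripBom(String line) {
        if (line != null && line.startsWith("\uFEFF")) {
            return line.substring(1);
        }
        return line;
    }

    // Reads the first line of a UTF-8 stream and strips the BOM if present.
    static String readFirstLine(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return stripBom(reader.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 BOM that Notepad prepends, followed by "665,abc".
        byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '6', '6', '5', ',', 'a', 'b', 'c'};
        String line = readFirstLine(new ByteArrayInputStream(bytes));
        // Without stripBom, parseInt would throw NumberFormatException here.
        System.out.println(Integer.parseInt(line.split(",")[0]));
    }
}
```

Only the first line of a file can carry the BOM, so the check is needed once per file, not once per line.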


Unrelated to the actual problem, it's good practice to close resources in a finally block so that they are closed even if an exception is thrown.

BufferedReader reader = null;
try {
    reader = new BufferedReader(new UnicodeReader(new FileInputStream(fileName), "UTF-8"));
    // ...
} finally {
    if (reader != null) try { reader.close(); } catch (IOException logOrIgnore) {}
}
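
Where Java 7 language features are available, try-with-resources gives the same guarantee with less code. The following is a sketch under that assumption, not a drop-in replacement: LineCounter and countLines are illustrative names, and it uses a plain InputStreamReader so that it compiles without UnicodeReader.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LineCounter {
    // Counts the lines in a UTF-8 stream. The reader (and the wrapped stream)
    // is closed automatically when the try block exits, normally or by exception.
    static int countLines(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(countLines(new ByteArrayInputStream(data)));
    }
}
```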

Also unrelated, I'd suggest putting Pattern p = Pattern.compile(","); outside the loop, or even making it a static constant, because compiling a pattern is relatively expensive and there is no need to do it on every iteration.
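
A minimal sketch of the static-constant variant follows. ArticleParser, COMMA and splitLine are names chosen for this example, not something from the question's code.

```java
import java.util.regex.Pattern;

class ArticleParser {
    // Compiled once when the class is initialized, then reused for every line.
    private static final Pattern COMMA = Pattern.compile(",");

    // Splits one "id,name" line into its fields.
    static String[] splitLine(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        String[] article = splitLine("665,abc");
        System.out.println(Integer.parseInt(article[0]));
        System.out.println(article[1]);
    }
}
```

Pattern instances are immutable and safe to share, so a single constant can serve every call.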
