File encoded as UCS-2 Little Endian reports 2x too many lines to Java

Views: 108 Published: 2020/10/1 0:46:28 java character-encoding

Problem description

I was processing several txt files with a simple Java program, and the first step of my process was counting the lines of each file:

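import java.io.BufferedReader;
import java.io.FileReader;

int count = 0;
// myFile is the txt file in question; FileReader decodes it with the platform default charset
try (BufferedReader br = new BufferedReader(new FileReader(myFile))) {
    while (br.readLine() != null) {
        count++;
    }
}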

For one of my files, Java counted exactly twice as many lines as there really were! This confused me greatly at first. I opened each file in Notepad++ and could see that the miscounted file ended every line in exactly the same way as the other files, with a CR and LF. I poked around a little more and noticed that all my "ok" files were ANSI encoded, while the one problem file was encoded as UCS-2 Little Endian (which I know nothing about). I got these files from elsewhere, so I have no idea why that one was encoded that way, but of course switching it to ANSI fixed the issue.

But now curiosity remains. Why was the encoding causing a double line count report?

Thanks!
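
A likely explanation, sketched here rather than taken from the original thread: in UCS-2 Little Endian every character occupies two bytes, so a CR LF pair is stored as the bytes 0D 00 0A 00. If those bytes are decoded with a single-byte charset, the reader sees CR, NUL, LF, NUL. Because the CR is no longer immediately followed by LF, readLine() treats the CR and the LF as two separate line terminators, which roughly doubles the count. The class below is a minimal reproduction under stated assumptions: LineCountDemo, writeSample and countLines are hypothetical names, the sample text is invented, and ISO-8859-1 stands in for whatever "ANSI" code page is the platform default on the asker's machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineCountDemo {

    // Writes a two-line sample with CR LF line endings, encoded as UTF-16LE
    // (the same byte layout as UCS-2 Little Endian for these characters).
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("ucs2-demo", ".txt");
            Files.write(file, "ab\r\ncd\r\n".getBytes(StandardCharsets.UTF_16LE));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same counting loop as in the question, but with the charset made explicit.
    static int countLines(Path file, Charset charset) {
        int count = 0;
        try (BufferedReader br = Files.newBufferedReader(file, charset)) {
            while (br.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        Path file = writeSample();
        // Decoded with the file's real encoding: one count per real line.
        System.out.println("UTF-16LE:   " + countLines(file, StandardCharsets.UTF_16LE));
        // Decoded as single bytes: CR and LF are split by NUL and counted separately.
        System.out.println("ISO-8859-1: " + countLines(file, StandardCharsets.ISO_8859_1));
    }
}
```

If this is indeed the cause, passing the file's actual charset to the reader should give the right count without converting the file to ANSI first.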
