All inclusive Charset to avoid "java.nio.charset.MalformedInputException: Input length = 1"?
Problem description
I'm creating a simple wordcount program in Java that reads through a directory's text-based files.
However, I keep on getting the error:
java.nio.charset.MalformedInputException: Input length = 1
from this line of code:
BufferedReader reader = Files.newBufferedReader(file,Charset.forName("UTF-8"));
I know I probably get this because I used a Charset that didn't include some of the characters in the text files, some of which included characters of other languages. But I want to include those characters.
I later learned from the JavaDocs that the Charset is optional and only used for more efficient reading of the files, so I changed the code to:
BufferedReader reader = Files.newBufferedReader(file);
But some files still throw the MalformedInputException. I don't know why.
I was wondering whether there is an all-inclusive Charset.
Thanks.
Recommended answer
You probably want to have a list of supported encodings. For each file, try each encoding in turn, maybe starting with UTF-8. Every time you catch the MalformedInputException, try the next encoding.
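The approach described above could look roughly like the sketch below. The class name CharsetFallbackReader, the readLines helper, and the particular candidate list are assumptions made for illustration, not part of the original answer. The sketch catches CharacterCodingException, the parent of MalformedInputException, so that unmappable characters also trigger the fallback. Note that the exception is raised while reading, not when the reader is opened, so the whole read loop sits inside the try block.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this example.
final class CharsetFallbackReader {

    // Assumed candidate list, tried in order. ISO-8859-1 is placed last
    // because it maps every possible byte to a character and so never
    // fails to decode (it may still produce the wrong characters).
    static final List<Charset> CANDIDATES = Arrays.asList(
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1);

    // Reads all lines of the file using the first candidate charset
    // that decodes the entire file without a coding error.
    static List<String> readLines(Path file) throws IOException {
        for (Charset charset : CANDIDATES) {
            try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
                List<String> lines = new ArrayList<>();
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
                return lines;
            } catch (CharacterCodingException e) {
                // Covers MalformedInputException: discard the partial
                // result and try the next candidate encoding.
            }
        }
        throw new IOException("No candidate charset could decode " + file);
    }
}
```

A successful decode only means the bytes are valid in that encoding, not that the encoding is the one the file was written in, so the order of the candidate list matters: stricter encodings such as UTF-8 should come before permissive single-byte ones.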