MalformedInputException with Files.readAllLines()

Question

I was iterating over some files, 5328 to be precise. They are ordinary XML files of roughly 60 to 200 lines at most. They are first filtered by a simple method, isXmlTestSourceFile, that parses the path.

    Files.walk(Paths.get("/home/me/development/projects/myproject"), FileVisitOption.FOLLOW_LINKS)
            .filter(V3TestsGenerator::isXmlTestSourceFile)
            .filter(V3TestsGenerator::fileContainsXmlTag)

My question is about the second filter, specifically the method fileContainsXmlTag. For each file, I want to detect whether a pattern occurs in at least one of its lines:

private static boolean fileContainsXmlTag(Path path) {
    try {
        return Files.readAllLines(path).stream().anyMatch(line -> PATTERN.matcher(line).find());
    } catch (IOException e) {
        e.printStackTrace();
    }
    return false;
}

For some files, I then get this exception:

java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at java.nio.file.Files.readAllLines(Files.java:3205)
at java.nio.file.Files.readAllLines(Files.java:3242)

But when I use FileUtiles.readLines() instead of Files.readAllLines(), everything works fine.

I am asking out of curiosity, so if anyone has a clue about what is going on, I would be glad to hear it.

Thanks

Recommended answer

The single-argument method Files.readAllLines(Path) assumes that the file you are reading is encoded in UTF-8.

If you get this exception, then the file you are reading is most likely encoded in a character encoding other than UTF-8.
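
To make the failure concrete, here is a minimal sketch that reproduces it. It assumes a temporary file written in ISO-8859-1; the class name MalformedInputDemo and the helper failsAsUtf8 are made up for illustration and do not come from the question.

```java
import java.io.IOException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MalformedInputDemo {

    // Hypothetical helper: returns true if the file cannot be decoded as UTF-8.
    static boolean failsAsUtf8(Path path) throws IOException {
        try {
            Files.readAllLines(path); // the one-argument overload decodes as UTF-8
            return false;
        } catch (MalformedInputException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("latin1-sample", ".xml");
        try {
            // U+00E9 (e with acute accent) is the single byte 0xE9 in ISO-8859-1.
            // In UTF-8, 0xE9 must be followed by continuation bytes, and '<' is not one.
            byte[] latin1 = "<name>caf\u00e9</name>".getBytes(StandardCharsets.ISO_8859_1);
            Files.write(tmp, latin1);
            System.out.println("fails as UTF-8: " + failsAsUtf8(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A file that contains only ASCII characters decodes identically in both encodings, which would explain why only some of the files trigger the exception.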

Find out which character encoding is used, and use the other readAllLines method, which lets you specify the character encoding.

For example, if the files are encoded in ISO-8859-1:

return Files.readAllLines(path, StandardCharsets.ISO_8859_1).stream()... // etc.
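
Put together, the filter method from the question might look like the sketch below. The pattern is a placeholder, because the question does not show the real PATTERN, and the class name XmlTagFilter and the extra charset parameter are assumptions added for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

public class XmlTagFilter {

    // Placeholder pattern; the question does not show the real PATTERN.
    static final Pattern PATTERN = Pattern.compile("<testcase\\b");

    // Reads the file with an explicit charset and checks whether any line matches.
    static boolean fileContainsXmlTag(Path path, Charset charset) {
        try {
            return Files.readAllLines(path, charset).stream()
                    .anyMatch(line -> PATTERN.matcher(line).find());
        } catch (IOException e) {
            // Fail loudly instead of silently treating the file as a non-match.
            throw new UncheckedIOException("Could not read " + path, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("tag-sample", ".xml");
        try {
            Files.write(tmp, "<testcase name=\"caf\u00e9\"/>".getBytes(StandardCharsets.ISO_8859_1));
            System.out.println(fileContainsXmlTag(tmp, StandardCharsets.ISO_8859_1));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Rethrowing as UncheckedIOException is a design choice of this sketch: the original method prints the stack trace and returns false, which makes an unreadable file look the same as a file without the tag.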

The method FileUtiles.readLines() (where does that come from?) probably assumes something else: most likely that the files are in your system's default character encoding, which would be something other than UTF-8.
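
One way to check this guess is to print the JVM's default charset and compare it with the encoding of the files. The sketch below also shows a second possible explanation: a reader built with new InputStreamReader(stream, charset) replaces malformed bytes with U+FFFD instead of throwing. Whether FileUtiles.readLines() works this way is an assumption, since the question does not say which library it comes from. The class name LenientReadDemo and the helper readLinesLeniently are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LenientReadDemo {

    // Hypothetical helper: decodes the bytes line by line. An InputStreamReader
    // created from a Charset substitutes U+FFFD for malformed input rather than throwing.
    static List<String> readLinesLeniently(byte[] bytes, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // The value depends on the JVM version, OS, and locale settings.
        System.out.println("default charset: " + Charset.defaultCharset());

        // 0xE9 followed by 'b' is not a valid UTF-8 sequence.
        byte[] bytes = {'a', (byte) 0xE9, 'b'};
        String line = readLinesLeniently(bytes, StandardCharsets.UTF_8).get(0);
        System.out.println("contains U+FFFD: " + line.contains("\uFFFD"));
    }
}
```

If the second explanation applies, the lenient read succeeds but the decoded text is silently wrong for non-ASCII characters, so specifying the correct charset remains the safer fix.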
