Files.readAllBytes vs Files.lines getting MalformedInputException

Question

I would have expected the following two approaches to reading a file to behave identically, but they don't: the second one throws a MalformedInputException.

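import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class Test {
    public static void main(String[] args) {
        // First approach: read all bytes and build a String from them
        try {
            String content = new String(Files.readAllBytes(Paths.get("_template.txt")));
            System.out.println(content);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Second approach: stream the file line by line (this one throws)
        try (Stream<String> lines = Files.lines(Paths.get("_template.txt"))) {
            lines.forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}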

Here is the stack trace:

Exception in thread "main" java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
    at java.util.Iterator.forEachRemaining(Iterator.java:115)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at Test.main(Test.java:19)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
    ... 4 more

What is the difference here, and how do I fix it?

Recommended answer

This has to do with character encoding. Computers only deal with numbers. To store text, the characters in the text have to be converted to and from numbers, using some scheme. That scheme is called the character encoding. There are many different character encodings; some of the well-known standard ones are ASCII, ISO-8859-1 and UTF-8.
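
As a rough illustration of an encoding being a mapping between characters and numbers, the sketch below encodes the same one-character string with three standard charsets and prints the resulting byte values. The class and method names (EncodingBytesDemo, toNumbers) are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingBytesDemo {
    // Encodes the text with the given charset and renders the bytes as unsigned decimal numbers.
    static String toNumbers(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        int[] unsigned = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            unsigned[i] = bytes[i] & 0xFF; // Java bytes are signed; mask to the range 0..255
        }
        return Arrays.toString(unsigned);
    }

    public static void main(String[] args) {
        // "é", written as an escape so the encoding of this source file does not matter
        String text = "\u00e9";
        System.out.println(toNumbers(text, StandardCharsets.ISO_8859_1)); // [233]
        System.out.println(toNumbers(text, StandardCharsets.UTF_8));      // [195, 169]
        System.out.println(toNumbers(text, StandardCharsets.US_ASCII));   // [63], i.e. '?', because ASCII has no "é"
    }
}
```

The single byte 233 that ISO-8859-1 produces is not a valid UTF-8 sequence on its own, which is the kind of mismatch that matters for the question.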

In the first example, you read all the bytes (numbers) in the file and then convert them to characters by passing them to the constructor of class String. This uses your system's default character encoding (whatever your operating system and JVM are configured with) to convert the bytes to characters.
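
The following sketch suggests why the first approach does not fail even when the bytes do not match the charset. It assumes the documented behaviour of the String(byte[], Charset) constructor, which replaces malformed input with a replacement character instead of throwing; the documentation of the one-argument constructor used in the question leaves that case unspecified, so treat this as an illustration rather than a guarantee. LenientDecodeDemo and decodeLeniently are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LenientDecodeDemo {
    // Decodes via the String constructor: malformed input is replaced, not reported.
    static String decodeLeniently(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The charset that new String(bytes) would use on this machine
        System.out.println("Default charset: " + Charset.defaultCharset());

        // 0xE9 is "é" in ISO-8859-1, but on its own it is not a valid UTF-8 sequence.
        byte[] bytes = { 'c', 'a', 'f', (byte) 0xE9 };
        String asUtf8 = decodeLeniently(bytes, StandardCharsets.UTF_8);
        String asLatin1 = decodeLeniently(bytes, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals("caf\uFFFD"));   // true: the bad byte became U+FFFD, no exception
        System.out.println(asLatin1.equals("caf\u00e9")); // true: decoded as intended
    }
}
```

So a wrong charset in the first approach tends to show up as garbled or replaced characters in the output rather than as an exception.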

In the second example, where you use Files.lines(...), the UTF-8 character encoding is used, according to the documentation. When the file contains a sequence of bytes that is not valid UTF-8, you get a MalformedInputException.
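
A minimal way to reproduce this, assuming a temporary file is an acceptable stand-in for _template.txt: the sketch writes a byte that is valid ISO-8859-1 but not valid UTF-8, then reads the file with Files.lines(path). StrictLinesDemo and failsAsUtf8 are names invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.MalformedInputException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class StrictLinesDemo {
    // Returns true if reading the file with Files.lines(path), i.e. as UTF-8, hits malformed input.
    static boolean failsAsUtf8(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            lines.forEach(line -> { }); // the stream is lazy; consume it to force decoding
            return false;
        } catch (UncheckedIOException e) {
            // Errors raised while the stream is consumed arrive wrapped in UncheckedIOException
            if (e.getCause() instanceof MalformedInputException) {
                return true;
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("template", ".txt"); // hypothetical stand-in file
        try {
            // "caf" followed by 0xE9 ("é" in ISO-8859-1) and a newline
            Files.write(temp, new byte[] { 'c', 'a', 'f', (byte) 0xE9, '\n' });
            System.out.println(failsAsUtf8(temp));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Note that the exception is an UncheckedIOException thrown while the stream is being consumed, so a catch (IOException e) block around it, as in the question, does not catch it. That matches the "Exception in thread "main"" line in the stack trace.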

Your system's default character encoding may or may not be UTF-8, which would explain the difference in behaviour.

You'll have to find out which character encoding the file uses, and then specify it explicitly. For example:

String content = new String(Files.readAllBytes(Paths.get("_template.txt")),
        StandardCharsets.ISO_8859_1);

And for the second example:

Stream<String> lines = Files.lines(Paths.get("_template.txt"),
        StandardCharsets.ISO_8859_1);
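
If you are unsure about the file's encoding, one possible check is to decode the bytes with a strict CharsetDecoder and see whether it reports an error. The sketch below does that and then reads the file with the chosen charset. The names CharsetCheckDemo and decodesCleanly are hypothetical, and the fallback to ISO-8859-1 is purely an assumption for illustration; it is not a reliable way to detect an encoding.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class CharsetCheckDemo {
    // Returns true if the bytes decode in the given charset without any malformed or unmappable input.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("_template.txt");
        byte[] bytes = Files.readAllBytes(path);

        // Assumption for this sketch: anything that is not valid UTF-8 is treated as ISO-8859-1.
        Charset charset = decodesCleanly(bytes, StandardCharsets.UTF_8)
                ? StandardCharsets.UTF_8
                : StandardCharsets.ISO_8859_1;

        // try-with-resources closes the underlying file when the stream is done
        try (Stream<String> lines = Files.lines(path, charset)) {
            lines.forEach(System.out::println);
        }
    }
}
```

Keep in mind that ISO-8859-1 assigns a character to every possible byte value, so decoding with it never fails; a clean decode only rules an encoding out when it fails, it does not prove which encoding the file was written in.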
