Why is the File.ReadAllBytes result different than when using File.ReadAllText?


Question

I have a text file (UTF-8 encoding) with the contents "test". I read the file into a byte array and convert it to a string, but the result contains one strange character. I use the following code:

using System;
using System.IO;
using System.Text;

var path = @"C:\Users\Tester\Desktop\test\test.txt"; // UTF-8

var bytes = File.ReadAllBytes(path);
var contents1 = Encoding.UTF8.GetString(bytes);

var contents2 = File.ReadAllText(path);

Console.WriteLine(contents1); // result is "?test"
Console.WriteLine(contents2); // result is "test"

contents1 and contents2 are different. Why?

Recommended answer

From the documentation for ReadAllText:


This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. Encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.

So the file contains a BOM (byte order mark). ReadAllText detects the BOM and drops it from the returned string. File.ReadAllBytes returns the raw bytes, BOM included, and Encoding.UTF8.GetString decodes them without any special handling, so the BOM ends up as the character U+FEFF at the start of the string. The console shows that character as "?".
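To make the difference concrete, here is a minimal sketch. It assumes the file was saved as UTF-8 with a BOM, and uses an in-memory byte array as a stand-in for what File.ReadAllBytes would return. The names BomDemo and DecodeUtf8SkippingBom are hypothetical helpers, not part of the original question.

```csharp
using System;
using System.Text;

static class BomDemo
{
    // Hypothetical helper: decodes UTF-8 bytes, skipping a leading
    // UTF-8 BOM (EF BB BF) if one is present.
    public static string DecodeUtf8SkippingBom(byte[] bytes)
    {
        bool hasBom = bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
        int offset = hasBom ? 3 : 0;
        return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
    }

    static void Main()
    {
        // Assumed file contents: UTF-8 BOM followed by "test".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, (byte)'t', (byte)'e', (byte)'s', (byte)'t' };

        // Plain decoding keeps the BOM as the character U+FEFF.
        string raw = Encoding.UTF8.GetString(bytes);
        Console.WriteLine(raw.Length);          // 5
        Console.WriteLine(raw[0] == '\uFEFF');  // True

        // Skipping the BOM first yields just the text.
        string clean = DecodeUtf8SkippingBom(bytes);
        Console.WriteLine(clean == "test");     // True
    }
}
```

If the file has no BOM, the helper decodes from offset 0 and both approaches give the same string.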

The documentation for Encoding.GetString says that it only:


decodes all the bytes in the specified byte array into a string

(emphasis mine). This is of course not entirely conclusive, but your example shows that it is to be taken literally.
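If the bytes are already in memory and a BOM-aware decode is wanted, one option is to wrap them in a StreamReader, which handles a leading BOM when reading. This is only a sketch under that assumption. BomAwareDecode and Decode are hypothetical names, and the byte array again stands in for the result of File.ReadAllBytes.

```csharp
using System;
using System.IO;
using System.Text;

static class BomAwareDecode
{
    // Hypothetical helper: decodes bytes through a StreamReader so that
    // a leading byte order mark is consumed instead of being returned.
    // UTF-8 is the fallback when no BOM is found.
    public static string Decode(byte[] bytes)
    {
        using var stream = new MemoryStream(bytes);
        using var reader = new StreamReader(
            stream, Encoding.UTF8, detectEncodingFromByteOrderMarks: true);
        return reader.ReadToEnd();
    }

    static void Main()
    {
        // Assumed file contents: UTF-8 BOM followed by "test".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, (byte)'t', (byte)'e', (byte)'s', (byte)'t' };

        string text = Decode(bytes);
        Console.WriteLine(text);         // test
        Console.WriteLine(text.Length);  // 4
    }
}
```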
