How Windows Notepad interprets characters
Question
I was wondering how Windows interprets characters. For instance:
I made a file with a hex editor containing the 3 bytes E3 81 81. Those bytes are the character "ぁ" encoded as UTF-8.
I opened the file in Notepad and it displayed "ぁ". I didn't specify the encoding of the file, I just created the bytes, and Notepad interpreted them correctly.
Is Notepad guessing what the encoding probably is? Or is the hex editor saving those bytes with a specific encoding?
Recommended answer
If the file only contains these three bytes, then there is no information at all about which encoding to use.
A byte is just a byte, and there is no way to include any encoding information in it. Besides, the hex editor doesn't even know that you intended the data to be decoded as text.
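To illustrate that the bytes themselves carry no encoding information, here is a small Python sketch (Python is my choice for illustration, it is not part of the question) that decodes the same three bytes in different ways. The two single-byte encodings used for comparison, Latin-1 and code page 437, are arbitrary picks.

```python
# The three bytes from the question, with no metadata attached.
data = bytes.fromhex("E3 81 81")

# Decoded as UTF-8, the three bytes form a single character.
as_utf8 = data.decode("utf-8")

# Decoded with single-byte encodings, each byte becomes its own character.
as_latin1 = data.decode("latin-1")
as_cp437 = data.decode("cp437")

print(repr(as_utf8), repr(as_latin1), repr(as_cp437))
```

All three decodings succeed, so nothing in the data says which one was intended.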
Notepad normally uses ANSI encoding (the system's default code page), so if it reads the file as UTF-8, it must be guessing the encoding from the data in the file.
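As a rough idea of what guessing from the data can look like, here is a minimal Python sketch. It is not Notepad's actual algorithm, which I don't know in detail. The function name guess_encoding and the cp1252 fallback (standing in for "ANSI") are assumptions made for the example.

```python
import codecs


def guess_encoding(data: bytes, ansi: str = "cp1252") -> str:
    """Hypothetical helper: return a plausible encoding name for data."""
    # An explicit BOM is the strongest hint, so check for it first.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: accept UTF-8 only if the whole input is well-formed UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Otherwise fall back to the assumed ANSI code page.
        return ansi


print(guess_encoding(bytes.fromhex("E3 81 81")))
```

A heuristic like this can be wrong: a short ANSI file whose bytes happen to form valid UTF-8 sequences would be misdetected.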
If you save a file as UTF-8, Notepad will put the BOM (byte order mark) EF BB BF at the beginning of the file.
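For comparison, the difference between the two byte sequences can be reproduced in Python, whose "utf-8-sig" codec prepends the same BOM when encoding. This only illustrates the byte layout; it says nothing about how Notepad is implemented.

```python
import codecs

text = "\u3041"  # the character "ぁ"

# "utf-8-sig" prepends the UTF-8 BOM when encoding; plain "utf-8" does not.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

print(with_bom.hex(" ").upper())     # EF BB BF E3 81 81
print(without_bom.hex(" ").upper())  # E3 81 81
```

A file created in a hex editor with only E3 81 81 corresponds to the second case, which is why a reader has to guess.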