PowerShell and UTF-8
Question
I have an HTML file, test.html, created with Atom, which contains:
Testé encoding utf-8
When I read it in the PowerShell console (I'm using French Windows):
Get-Content -Raw test.html
I get back:
TestÃ© encoding utf-8
Why is the accented character not printing correctly?
Recommended answer
The Atom editor creates UTF-8 files without a pseudo-BOM by default (which is the right thing to do, from a cross-platform perspective).
- Other popular cross-platform editors, such as Visual Studio Code and Sublime Text, behave the same way.
Windows PowerShell[1] only recognizes UTF-8 files with a pseudo-BOM.
- In the absence of the pseudo-BOM, PowerShell interprets files as being formatted according to the system's legacy codepage, such as Windows-1252 on US systems, for instance.
(This is also the default encoding used by Notepad, which it calls "ANSI", not just when reading files, but also when creating them. By contrast, Windows PowerShell's > redirection and Out-File create UTF-16LE-encoded files by default.)
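The misreading described above can be reproduced without any file. The following is a minimal sketch, assuming a system where code page 1252 is available (as in Windows PowerShell on French or US Windows); the variable names are illustrative only.

```powershell
# Build the string 'Testé' without relying on this script's own file encoding.
$text = 'Test' + [char]0xE9

# Encode it as UTF-8: the é becomes the two bytes C3 A9.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($text)
$hex = ($utf8Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex       # 54 65 73 74 C3 A9

# Decode the same bytes as Windows-1252, which is what happens to a
# BOM-less file on such a system: each byte becomes its own character.
$misread = [System.Text.Encoding]::GetEncoding(1252).GetString($utf8Bytes)
$misread   # TestÃ©
```

The two bytes of the UTF-8 encoded é map to Ã and © in Windows-1252, which is exactly the garbled output shown in the question.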
Therefore, in order for Get-Content to recognize a BOM-less UTF-8 file correctly in Windows PowerShell, you must use -Encoding utf8.
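As a rough end-to-end sketch of that fix: the snippet below writes a BOM-less UTF-8 sample file and reads it back with an explicit encoding. The temporary file location and the variable names are assumptions made for illustration; with the file from the question, only the final Get-Content call is needed.

```powershell
# Hypothetical sample file in the temp directory, written as UTF-8 without a BOM.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'test.html'
$noBom = New-Object System.Text.UTF8Encoding $false
$content = 'Test' + [char]0xE9 + ' encoding utf-8'
[System.IO.File]::WriteAllText($path, $content, $noBom)

# Without -Encoding, Windows PowerShell falls back to the legacy code page
# and the accented character comes back garbled.
$default = Get-Content -Raw $path

# With -Encoding utf8, the file is decoded as UTF-8 regardless of the BOM.
$fixed = Get-Content -Raw -Encoding utf8 $path
$fixed
```

Passing -Encoding utf8 is also harmless in PowerShell Core, so the same command can be used in both editions.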
[1] By contrast, the cross-platform PowerShell Core edition commendably defaults to UTF-8, both on reading and writing, so it does interpret UTF-8-encoded files correctly even without a BOM and by default also creates files without a BOM.