UTF-8 output from PowerShell
Question
I'm trying to use Process.Start with redirected I/O to call PowerShell.exe with a string and get the output back, all in UTF-8, but I can't seem to make this work.
What I've tried:
- Passing the command to run via the -Command parameter
- Writing the PowerShell script as a file to disk with UTF-8 encoding
- Writing the PowerShell script as a file to disk with UTF-8 with BOM encoding
- Writing the PowerShell script as a file to disk with UTF-16
- Setting Console.OutputEncoding in both my console application and in the PowerShell script
- Setting $OutputEncoding in PowerShell
- Setting Process.StartInfo.StandardOutputEncoding
- Doing it all with Encoding.Unicode instead of Encoding.UTF8
In every case, when I inspect the bytes I'm given, I get different values from those of my original string. I'd really love an explanation as to why this doesn't work.
Here's my code:
// Namespaces used below: System, System.Diagnostics, System.Linq, System.Text, System.Threading
static void Main(string[] args)
{
    DumpBytes("Héllo");
    ExecuteCommand("PowerShell.exe", "-Command \"$OutputEncoding = [System.Text.Encoding]::UTF8 ; Write-Output 'Héllo';\"",
        Environment.CurrentDirectory, DumpBytes, DumpBytes);
    Console.ReadLine();
}
// Prints the text followed by the hex values of its UTF-8 bytes.
static void DumpBytes(string text)
{
    Console.Write(text + " " + string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X"))));
    Console.WriteLine();
}
// Runs the executable with redirected stdout/stderr and passes each line to the callbacks.
static int ExecuteCommand(string executable, string arguments, string workingDirectory, Action<string> output, Action<string> error)
{
    try
    {
        using (var process = new Process())
        {
            process.StartInfo.FileName = executable;
            process.StartInfo.Arguments = arguments;
            process.StartInfo.WorkingDirectory = workingDirectory;
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.CreateNoWindow = true;
            process.StartInfo.RedirectStandardOutput = true;
            process.StartInfo.RedirectStandardError = true;
            // Encoding used to decode the bytes the child process writes.
            process.StartInfo.StandardOutputEncoding = Encoding.UTF8;
            process.StartInfo.StandardErrorEncoding = Encoding.UTF8;
            using (var outputWaitHandle = new AutoResetEvent(false))
            using (var errorWaitHandle = new AutoResetEvent(false))
            {
                process.OutputDataReceived += (sender, e) =>
                {
                    // A null Data value signals the end of the stream.
                    if (e.Data == null)
                    {
                        outputWaitHandle.Set();
                    }
                    else
                    {
                        output(e.Data);
                    }
                };
                process.ErrorDataReceived += (sender, e) =>
                {
                    if (e.Data == null)
                    {
                        errorWaitHandle.Set();
                    }
                    else
                    {
                        error(e.Data);
                    }
                };
                process.Start();
                process.BeginOutputReadLine();
                process.BeginErrorReadLine();
                process.WaitForExit();
                // Wait until both streams have been fully drained.
                outputWaitHandle.WaitOne();
                errorWaitHandle.WaitOne();
                return process.ExitCode;
            }
        }
    }
    catch (Exception ex)
    {
        throw new Exception(string.Format("Error when attempting to execute {0}: {1}", executable, ex.Message),
            ex);
    }
}
Update
I found that if I create this script:
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Write-Host "Héllo!"
[Console]::WriteLine("Héllo")
Then invoke it via:
ExecuteCommand("PowerShell.exe", "-File C:\\Users\\Paul\\Desktop\\Foo.ps1",
Environment.CurrentDirectory, DumpBytes, DumpBytes);
The first line is corrupted, but the second isn't:
H?llo! 48,EF,BF,BD,6C,6C,6F,21
Héllo 48,C3,A9,6C,6C,6F
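One way to read the corrupted line: EF,BF,BD is the UTF-8 encoding of U+FFFD, the replacement character a UTF-8 decoder substitutes for a byte it cannot decode. The sketch below reproduces both dumps from raw bytes. It assumes the first line arrived in an OEM code page such as 437 or 850, where é is the single byte 0x82; the actual code page on the asker's machine is not stated. ReplacementCharDemo and DecodeAsUtf8AndDump are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

static class ReplacementCharDemo
{
    // Hypothetical helper: decodes raw bytes as UTF-8, then dumps the UTF-8 bytes
    // of the resulting string in the same comma-separated hex format as DumpBytes.
    public static string DecodeAsUtf8AndDump(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X")));
    }

    public static void Main()
    {
        // Assumption: "Héllo!" as code page 437 or 850 would encode it (é is 0x82).
        byte[] oemBytes = { 0x48, 0x82, 0x6C, 0x6C, 0x6F, 0x21 };
        // 0x82 is not valid UTF-8 on its own, so it decodes to U+FFFD (EF,BF,BD).
        Console.WriteLine(DecodeAsUtf8AndDump(oemBytes));   // 48,EF,BF,BD,6C,6C,6F,21

        // "Héllo" as genuine UTF-8: é is the two-byte sequence C3 A9 and survives the round trip.
        byte[] utf8Bytes = { 0x48, 0xC3, 0xA9, 0x6C, 0x6C, 0x6F };
        Console.WriteLine(DecodeAsUtf8AndDump(utf8Bytes));  // 48,C3,A9,6C,6C,6F
    }
}
```

If this reading is right, the original é is already lost by the time the string reaches DumpBytes, so the fix has to happen at the decoding step rather than afterwards.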
This suggests to me that my redirection code is all working fine; when I use Console.WriteLine in PowerShell I get UTF-8 as I expect.
This means that PowerShell's Write-Output and Write-Host commands must be doing something different with the output, and not simply calling Console.WriteLine.
I've even tried the following to force the PowerShell console code page to UTF-8, but Write-Host and Write-Output continue to produce broken results while [Console]::WriteLine works.
$sig = @'
[DllImport("kernel32.dll")]
public static extern bool SetConsoleCP(uint wCodePageID);
[DllImport("kernel32.dll")]
public static extern bool SetConsoleOutputCP(uint wCodePageID);
'@
$type = Add-Type -MemberDefinition $sig -Name Win32Utils -Namespace Foo -PassThru
$type::SetConsoleCP(65001)
$type::SetConsoleOutputCP(65001)
Write-Host "Héllo!"
& chcp # Tells us 65001 (UTF-8) is being used
Solution
Lee's answer was correct. As Lee says, I was trying everything to force PowerShell to produce UTF-8, but that doesn't seem to be possible. Instead, we just need to read the stream using the same encoding PowerShell uses (the default OEM encoding). There's no need to tell Process.StartInfo to read with a different encoding, since it already reads with the default.
Actually, that's not true. I think Process.Start uses whatever the current encoding is; when I ran it under a console application it used the OEM encoding and so could read the output. But when running under a Windows service, it didn't. So I had to force it explicitly.
You can get the code page used by the console via the link @andyb posted:
I needed to use the signatures here: http://www.pinvoke.net/default.aspx/kernel32.getcpinfoex
Then specify:
CPINFOEX info;
if (GetCPInfoEx(CP_OEMCP, 0, out info))
{
    var oemEncoding = Encoding.GetEncoding(info.CodePage);
    process.StartInfo.StandardOutputEncoding = oemEncoding;
}
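The snippet above depends on the CPINFOEX, GetCPInfoEx and CP_OEMCP declarations from pinvoke.net, which are not reproduced here. As a smaller sketch of the same idea, the code below swaps in kernel32's GetOEMCP function, which returns the OEM code page identifier directly. This is a substitution for illustration, not what the author used. It assumes Windows and .NET Framework 4.7.1 or later; OemEncodingSketch and TryGetOemEncoding are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class OemEncodingSketch
{
    // kernel32 function that returns the current OEM code page identifier.
    [DllImport("kernel32.dll")]
    private static extern uint GetOEMCP();

    // Hypothetical helper: returns the OEM encoding on Windows, or null elsewhere.
    public static Encoding TryGetOemEncoding()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return null;
        }
        // Assumption: running on .NET Framework, where legacy code pages are built in.
        // On .NET Core / .NET 5+, Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        // would have to be called first, otherwise GetEncoding throws for code pages such as 437.
        return Encoding.GetEncoding((int)GetOEMCP());
    }
}
```

The result could then be assigned to process.StartInfo.StandardOutputEncoding (and StandardErrorEncoding) before calling process.Start(), in place of Encoding.UTF8.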
Recommended answer
This is a bug in .NET. When PowerShell launches, it caches the output handle (Console.Out). The Encoding property of that text writer does not pick up the value of the StandardOutputEncoding property.
When you change it from within PowerShell, the Encoding property of the cached output writer returns the cached value, so the output is still encoded with the default encoding.
As a workaround, I would suggest not changing the encoding. It will be returned to you as a Unicode string, at which point you can manage the encoding yourself.
Caching example:
102 [C:\Users\leeholm]
>> $r1 = [Console]::Out
103 [C:\Users\leeholm]
>> $r1
Encoding                          FormatProvider
--------                          --------------
System.Text.SBCSCodePageEncoding  en-US
104 [C:\Users\leeholm]
>> [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
105 [C:\Users\leeholm]
>> $r1
Encoding                          FormatProvider
--------                          --------------
System.Text.SBCSCodePageEncoding  en-US