PowerShell Invoke-RestMethod Umlauts issues with UTF-8 and Windows-1252

Question

Upon executing Confluence REST API calls, I get back a response encoded in UTF-8. However, when I export the results with either Out-File or Export-CSV, even with the -Encoding utf8 parameter, German umlauts are not correctly represented. For example, 'ü' still comes out as 'Ã¼'.

From what I could gather it's due to the fact that PowerShell 5.1 natively relies on Windows-1252. I verified that Umlauts are preserved when using PowerShell Core by executing

[psobject].Assembly.GetTypes() | Where-Object { $_.Name -eq 'ClrFacade'} | ForEach-Object { $_.GetMethod('GetDefaultEncoding', [System.Reflection.BindingFlags]'nonpublic,static').Invoke($null, @()) }

Even changing the script file itself to use UTF-8 with BOM or Windows-1252 encoding does not preserve umlauts, either in the PowerShell console or in the exported output.

Do you know of any way to tell PowerShell 5.1 to preserve Umlauts while executing the REST call?

I cannot use PowerShell Core, as further operations require cmdlets which do not yet exist for PowerShell Core.

Thanks!

Recommended answer

As discussed in the comments, it looks like the Confluence API encodes HTTP responses using UTF-8, but does not include a "Content-Type" header to indicate that.

The HTTP specification for the charset parameter says that, in the absence of this header, the client should assume the body is encoded with the ISO-8859-1 character set, so what is happening in your request is something like this:

# server (Confluence API) encodes response text using utf8
PS> $text = "ü";
PS> $bytes = [System.Text.Encoding]::UTF8.GetBytes($text);
PS> write-host $bytes;
195 188

# client (Invoke-RestMethod) decodes bytes as ISO-8859-1
PS> $text = [System.Text.Encoding]::GetEncoding("ISO-8859-1").GetString($bytes);
PS> write-host $text;
Ã¼

Given that you can't control what the server sends, you'll either need to capture the raw bytes yourself (e.g. using System.Net.Http.HttpClient) and decode them using UTF8, or modify the existing response to compensate for the encoding mismatch (e.g. below).

PS> $text = "Ã¼"
PS> $bytes = [System.Text.Encoding]::GetEncoding("ISO-8859-1").GetBytes($text)
PS> $text = [System.Text.Encoding]::UTF8.GetString($bytes)
PS> write-host $text
ü
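
The first option, decoding the raw bytes yourself, does not strictly need HttpClient. The following is a minimal sketch, assuming Windows PowerShell 5.1, where the object returned by Invoke-WebRequest exposes the undecoded body as RawContentStream. The function name ConvertFrom-Utf8JsonBytes and the URL in the comment are hypothetical, and the sample bytes only simulate a server response. The umlaut is built with [char]0xFC so that the script does not depend on its own file encoding.

```powershell
# Hypothetical helper: decode raw response bytes as UTF-8, then parse the JSON.
function ConvertFrom-Utf8JsonBytes {
    param(
        [Parameter(Mandatory)]
        [byte[]] $Bytes
    )
    $json = [System.Text.Encoding]::UTF8.GetString($Bytes)
    return $json | ConvertFrom-Json
}

# Simulated response body: the UTF-8 bytes of {"title":"München"}.
$sampleJson = '{"title":"M' + [char]0xFC + 'nchen"}'
$bytes = [System.Text.Encoding]::UTF8.GetBytes($sampleJson)

$result = ConvertFrom-Utf8JsonBytes -Bytes $bytes
$result.title

# Against a real endpoint (placeholder URL, untested assumption):
# $response = Invoke-WebRequest -Uri 'https://confluence.example.com/rest/api/content' -UseBasicParsing
# $result   = ConvertFrom-Utf8JsonBytes -Bytes $response.RawContentStream.ToArray()
```

Because the string is decoded correctly before it ever reaches Out-File or Export-CSV, the usual -Encoding utf8 parameter should then be enough for the export step.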

Note that if you use the -OutFile parameter of Invoke-RestMethod, it presumably streams the response bytes directly to disk without decoding or re-encoding them, so the resulting file already contains the UTF-8 $bytes, rather than the result of the round trip UTF-8 $bytes -> string decoded using ISO-8859-1 -> file bytes encoded using UTF-8.
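
If that presumption about -OutFile holds, the file can be read back with an explicit encoding. This is only a sketch: the Invoke-RestMethod call is left as a comment with a placeholder URL, and the temporary file is filled with simulated UTF-8 bytes instead, so the read-back step can be tried in isolation.

```powershell
# Temporary file standing in for the -OutFile target.
$path = [System.IO.Path]::GetTempFileName()

# Real call (placeholder URL, assumption):
# Invoke-RestMethod -Uri 'https://confluence.example.com/rest/api/content' -OutFile $path

# Simulate what the server would have sent: UTF-8 bytes of {"title":"München"}.
$sampleJson = '{"title":"M' + [char]0xFC + 'nchen"}'
[System.IO.File]::WriteAllBytes($path, [System.Text.Encoding]::UTF8.GetBytes($sampleJson))

# Read the file as UTF-8 explicitly instead of relying on the default encoding.
$data = Get-Content -Path $path -Encoding UTF8 -Raw | ConvertFrom-Json
$data.title

Remove-Item -Path $path
```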