Character-encoding problem with string literal in source code

Question

$logstring = Invoke-Command -ComputerName $filesServer -ScriptBlock {
    param(
        $logstring,
        $grp
    )

    $Klassenbuchordner = "KB " + $grp.Gruppe
    $Gruppenordner = $grp.Gruppe
    $share = $grp.Gruppe
    $path = "D:\Gruppen\$Gruppenordner"

    if (Test-Path "D:\Dozenten\01_Klassenbücher\$Klassenbuchordner") {
        $logstring += "Verzeichnis für Klassenbücher existiert bereits"
    }
    else {
        mkdir "D:\Dozenten\01_Klassenbücher\$Klassenbuchordner"
        $logstring += "Klassenbuchordner wurde erstellt!"
    }
} -ArgumentList $logstring, $grp

My goal is to test the existence of a directory and create it on demand.

The problem is that the path contains German letters (umlauts), which aren't seen correctly by the target server.

For instance, the server receives path "D:\Dozent\01_KlassenbÃ¼cher" instead of the expected "D:\Dozent\01_Klassenbücher".

How can I force proper UTF-8 encoding?

Recommended answer

Note: Remoting and use of Invoke-Command are incidental to your problem.

Since the problem occurs with a string literal in your source code (...\01_Klassenbücher\...), the likeliest explanation is that your script file is misinterpreted by PowerShell.

In Windows PowerShell (as opposed to PowerShell Core), if your script file is de facto UTF-8-encoded but lacks a BOM, PowerShell will misinterpret any non-ASCII-range characters (such as ü) in the script.[1]
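The misreading can be reproduced in a few lines. The following is a minimal sketch, not part of the original answer: it assumes the "ANSI" code page is Windows-1252 (the Western European case from footnote [1]) and is meant to be run in Windows PowerShell. The variable names are illustrative only. The ü is built from its code point so that the snippet itself contains only ASCII characters.

```powershell
# Build the sample string without typing a non-ASCII character,
# so this snippet is immune to the very problem it demonstrates.
$uUmlaut = [char]0xFC                      # U+00FC, the letter ü
$text    = "Klassenb${uUmlaut}cher"

# What a BOM-less UTF-8 file holds on disk: ü becomes the two bytes C3 BC.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($text)

# What happens without a BOM: the bytes are decoded with the "ANSI" code page.
# Assumption: code page 1252; your system may use a different one.
$ansi    = [System.Text.Encoding]::GetEncoding(1252)
$misread = $ansi.GetString($utf8Bytes)

$misread   # KlassenbÃ¼cher
```

Each of the two UTF-8 bytes is decoded as a separate Windows-1252 character, which is why one letter turns into two.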

Therefore: Re-save your script as UTF-8 with BOM.
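If you would rather re-save from the command line than from an editor, something along these lines should work. This is a sketch under assumptions: ConvertTo-Utf8Bom is a hypothetical helper name, not a built-in cmdlet, and it presumes the file really is UTF-8 already (with or without a BOM). It uses .NET file APIs because the BOM behavior of Set-Content -Encoding UTF8 differs between Windows PowerShell and PowerShell Core.

```powershell
# Hypothetical helper: rewrites a UTF-8 file in place as UTF-8 with BOM.
function ConvertTo-Utf8Bom {
    param([Parameter(Mandatory)][string] $Path)

    # .NET methods resolve relative paths against the process directory,
    # not PowerShell's current location, so resolve to a full path first.
    $fullPath = Convert-Path -LiteralPath $Path

    # Decode as UTF-8; this is only correct if the file really is UTF-8.
    $text = [System.IO.File]::ReadAllText($fullPath, [System.Text.Encoding]::UTF8)

    # UTF8Encoding($true) writes the BOM bytes EF BB BF at the start of the file.
    $utf8WithBom = New-Object System.Text.UTF8Encoding $true
    [System.IO.File]::WriteAllText($fullPath, $text, $utf8WithBom)
}
```

Usage, with a placeholder file name: ConvertTo-Utf8Bom -Path .\MyScript.ps1. Keep a backup if you are unsure of the file's current encoding, because decoding an "ANSI" file as UTF-8 would corrupt its non-ASCII characters.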

Visual Studio Code and other modern editors create UTF-8 files without BOM by default, which is what causes the problem in Windows PowerShell.
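To find out whether a given script was saved with a BOM, you can look at its first three bytes. The following is a minimal sketch; Test-Utf8Bom is a hypothetical helper name introduced here for illustration, not an existing cmdlet.

```powershell
# Hypothetical helper: returns $true if the file starts with the UTF-8 BOM.
function Test-Utf8Bom {
    param([Parameter(Mandatory)][string] $Path)

    # Resolve to a full path, because .NET ignores PowerShell's current location.
    $fullPath = Convert-Path -LiteralPath $Path

    # Script files are small, so reading the whole file is acceptable here.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    # The UTF-8 BOM is the byte sequence EF BB BF.
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

A result of $false for a script that contains umlauts or other non-ASCII characters means Windows PowerShell will decode it with the "ANSI" code page.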

By contrast, the PowerShell ISE creates "ANSI"-encoded[1] files, which Windows PowerShell - but not PowerShell Core - reads correctly.

You can only get away with "ANSI"-encoded files:

  • if your scripts will never be run in PowerShell Core - where all future development effort will go.

  • if your scripts will never run on a machine where a different "ANSI" code page is in effect.

  • if your script doesn't contain characters - e.g., emoji - that cannot be represented with your "ANSI" code page.
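The code-page dependency in the second point can be illustrated as follows. This is a sketch under assumptions: it picks code pages 1252 (Western European) and 1251 (Cyrillic) as examples and is meant to be run in Windows PowerShell. In an "ANSI"-encoded file, ü is stored as the single byte FC, and what that byte means depends on the code page of the machine reading it.

```powershell
# In a Windows-1252 "ANSI" file, the letter ü is stored as the single byte FC.
$byte = [byte[]] @(0xFC)

# Decode the same byte under two different "ANSI" code pages (assumed examples).
$western  = [System.Text.Encoding]::GetEncoding(1252).GetString($byte)
$cyrillic = [System.Text.Encoding]::GetEncoding(1251).GetString($byte)

# Show the resulting Unicode code points rather than the characters themselves.
'{0:X4}' -f [int][char]$western    # 00FC (the letter ü)
'{0:X4}' -f [int][char]$cyrillic   # 044C (a Cyrillic letter)
```

The file is byte-for-byte identical in both cases; only the interpretation changes, which is why a BOM that pins the encoding is the safer choice.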

Given these limitations, it's safest - and future-proof - to always create PowerShell scripts as UTF-8 with BOM.
(Alternatively, you can use UTF-16 (which is always saved with a BOM), but that bloats the file size if you're primarily using ASCII/"ANSI"-range characters, which is likely in PS scripts).

Note: The following is still required as of v1.11.0 of the PowerShell extension for VSCode, but note that there's a suggestion on GitHub to make the extension default PowerShell files to UTF-8 with BOM.

Add the following to your settings.json file (to open it, press Ctrl+Shift+P for the command palette, type settings, and select Preferences: Open Settings (JSON)):

"[powershell]": {
  "files.encoding": "utf8bom"
}

Note that the setting is intentionally scoped to PowerShell files only, because you wouldn't want all files to default to UTF-8 with BOM, given that many utilities on Unix platforms neither expect nor know how to handle such a BOM.

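[1] In the absence of a BOM, Windows PowerShell defaults to the encoding of the system's current "ANSI" code page, as determined by the legacy system locale; for example, Windows-1252 in Western European cultures.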
