What is the Windows command line parameter encoding?
Question
What encoding does Windows use for command line parameters passed to programs started in a cmd.exe window?
The encoding of command line parameters doesn't seem to be affected by the console code page set using chcp (I set it to UTF-8, code page 65001, and use the Lucida Console font).
If I paste an EN DASH, encoded as hex E28093, from a UTF-8 file into a command line, it is displayed correctly in the cmd.exe window. However, it seems to be translated to hex 96 (an ANSI representation) when it is passed to the program. If I paste Cyrillic characters into a command line, they are also displayed correctly, but appear in the program as question marks (hex 3F).
If I copy a command line and paste it into a text file, the resulting file is UTF-8; it contains the same encoding of the EN DASH and Cyrillic characters as the source file.
It appears that the characters pasted into the cmd.exe window are captured and displayed using the code page selected with chcp, but some ANSI code page is used to translate the characters into a different encoding before they are passed as parameters to a program. Characters that cannot be converted are apparently silently replaced with question marks.
So, if I want to correctly handle command line parameters in a program, I need to know exactly what the encoding of the parameters is. For example, if I wish to compare command line parameters with known UTF-8 data read from a file, I need to convert the parameters from the correct encoding to UTF-8. Thanks.
Recommended answer
If your goal is to compare Unicode characters, call GetCommandLineW in your program (or use wmain so that argv uses wchar_t), then convert this UTF-16LE command line string to UTF-8, or convert the UTF-8 file data to UTF-16 instead.
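As a rough illustration of the wmain route, here is a minimal sketch in C. The helper utf16_to_utf8 is a hypothetical name written for this example, not a Windows API; on Windows the usual way to do the same conversion is WideCharToMultiByte with CP_UTF8. The hand-written version is used only so the conversion step is visible and the helper compiles on any platform. The wmain part is Windows-only and assumes wchar_t is a 16-bit type, as it is with MSVC and MinGW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a Windows API): convert a NUL-terminated UTF-16
 * string to UTF-8. Returns the number of bytes written, excluding the
 * terminating NUL, or (size_t)-1 if the output buffer is too small.
 * Unpaired surrogates are replaced with U+FFFD. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return (size_t)-1;

    while (*in) {
        uint32_t cp = *in++;

        if (cp >= 0xD800 && cp <= 0xDBFF && *in >= 0xDC00 && *in <= 0xDFFF) {
            /* Valid surrogate pair: combine into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* unpaired surrogate */
        }

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + len + 1 > out_size) /* +1 keeps room for the NUL */
            return (size_t)-1;

        if (len == 1) {
            out[n++] = (char)cp;
        } else if (len == 2) {
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

#ifdef _WIN32
#include <wchar.h>

/* Windows-only entry point: argv arrives as UTF-16, untouched by CP_ACP.
 * Prints each argument as UTF-8 bytes in hex. */
int wmain(int argc, wchar_t **argv)
{
    for (int i = 1; i < argc; i++) {
        char utf8[1024];
        /* Assumes wchar_t is 16 bits wide, as on Windows. */
        size_t len = utf16_to_utf8((const uint16_t *)argv[i], utf8, sizeof utf8);

        if (len == (size_t)-1) {
            fprintf(stderr, "argument %d is too long\n", i);
            continue;
        }
        printf("argv[%d]:", i);
        for (size_t j = 0; j < len; j++)
            printf(" %02X", (unsigned char)utf8[j]);
        printf("\n");
    }
    return 0;
}
#endif
```

With MinGW-w64 the program needs the -municode option so that wmain is used as the entry point; MSVC picks it up without extra flags. The resulting UTF-8 bytes can then be compared with memcmp or strcmp against data read from a UTF-8 file.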
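GetCommandLineA probably converts the Unicode source string with CP_ACP, the system ANSI code page. That would account for the symptoms in the question: on a system whose ANSI code page is Windows-1252, an EN DASH maps to hex 96, and Cyrillic letters, which have no Windows-1252 equivalent, become question marks.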