UTF-8 output on Windows console
Problem description
The following code shows unexpected behaviour on my machine (tested with Visual C++ 2008 SP1 on Windows XP and VS 2012 on Windows 7):
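#include <cstdio>     // fputc, fputs
#include <iostream>
#include <windows.h>  // SetConsoleOutputCP, CP_UTF8

int main() {
    SetConsoleOutputCP( CP_UTF8 );
    std::cout << "\xc3\xbc";                  // "ü" as UTF-8 (C3 BC) via iostreams
    int fail = std::cout.fail() ? '1' : '0';  // record the stream state
    fputc( fail, stdout );
    fputs( "\xc3\xbc", stdout );              // the same bytes via C stdio
}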
I simply compiled with cl /EHsc test.cpp.
Windows XP: Output in a console window is ü0ü (translated to Codepage 1252; it originally shows some line-drawing characters in the default codepage, perhaps 437). When I change the settings of the console window to use the "Lucida Console" character set and run my test.exe again, the output changes to 1ü, which means
- the character ü can be written using fputs and its UTF-8 encoding C3 BC
- std::cout does not work for whatever reason
- the stream's failbit is set after trying to write the character
Windows 7: Output using Consolas is ��0ü. Even more interesting: the correct bytes are probably written (at least when redirecting the output to a file) and the stream state is ok, but the two bytes are written as separate characters.
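One way to check the "correct bytes are written" part without involving the console is to capture what operator<< produces and dump it as hex. The following is only a minimal sketch: hex_dump and bytes_written_by_stream are hypothetical helper names, and an in-memory std::ostringstream stands in for the redirected file, so it says nothing about how the console itself treats the bytes.

```cpp
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>

// Render every byte of `bytes` as two lowercase hex digits, separated by spaces.
std::string hex_dump(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x",
                      static_cast<unsigned char>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// Send `text` through operator<< into an in-memory stream and return the raw
// bytes that arrived; `failed` reports whether the stream set failbit/badbit.
std::string bytes_written_by_stream(const std::string& text, bool& failed) {
    std::ostringstream capture;
    capture << text;
    failed = capture.fail();
    return capture.str();
}
```

If the dump of "\xc3\xbc" comes back as c3 bc with no failure flag, the stream layer passed the bytes through unchanged, which would point at the hand-off to the console rather than at the formatting step.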
I tried to raise this issue on "Microsoft Connect" (see here), but MS has not been very helpful. You might as well look here as something similar has been asked before.
Can you reproduce this problem?
What am I doing wrong? Shouldn't std::cout and fputs have the same effect?
SOLVED (sort of): Following mike.dld's idea, I implemented a std::stringbuf that does the conversion from UTF-8 to Windows-1252 in sync() and replaced the streambuf of std::cout with this converter (see my comment on mike.dld's answer).
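The converter itself is not shown here, so the following is only a rough sketch of what such a buffer might look like, not the actual implementation. The names Cp1252Buf and utf8_to_cp1252_subset are made up for illustration. To stay portable, the sketch only converts code points that have the same value in Windows-1252 (U+0000..U+007F and U+00A0..U+00FF) and replaces everything else with '?'; it also assumes that every flush ends on a complete UTF-8 character. On Windows, MultiByteToWideChar and WideCharToMultiByte would probably be the more complete choice for the conversion step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>

// Convert UTF-8 to Windows-1252 for the code points that map 1:1
// (U+0000..U+007F and U+00A0..U+00FF); anything else becomes '?'.
std::string utf8_to_cp1252_subset(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        const bool has_next = i + 1 < in.size();
        const unsigned char next =
            has_next ? static_cast<unsigned char>(in[i + 1]) : 0;
        if (c < 0x80) {
            out += static_cast<char>(c);                 // plain ASCII
        } else if ((c == 0xC2 || c == 0xC3) && has_next &&
                   (next & 0xC0) == 0x80) {
            // Two-byte sequence encoding U+0080..U+00FF.
            const unsigned cp = ((c & 0x1Fu) << 6) | (next & 0x3Fu);
            out += cp >= 0xA0 ? static_cast<char>(cp) : '?';
            ++i;
        } else {
            out += '?';                                  // outside the subset
            // Skip the continuation bytes that belong to this character.
            while (i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
                ++i;
            }
        }
    }
    return out;
}

// Collects output in a std::stringbuf and, on every sync(), writes the
// converted bytes to a C stdio sink in a single fwrite call.
class Cp1252Buf : public std::stringbuf {
public:
    explicit Cp1252Buf(std::FILE* sink) : sink_(sink) {
        assert(sink_ != nullptr);
    }

protected:
    int sync() override {
        const std::string converted = utf8_to_cp1252_subset(str());
        str("");  // drop the bytes that were just converted
        if (!converted.empty()) {
            std::fwrite(converted.data(), 1, converted.size(), sink_);
        }
        return std::fflush(sink_) == 0 ? 0 : -1;
    }

private:
    std::FILE* sink_;
};
```

To try it, construct a Cp1252Buf with stdout, install it with std::cout.rdbuf(&buf) while keeping the pointer that rdbuf returns, write with std::flush or std::endl so that sync() runs, and restore the old buffer before buf goes out of scope. The console would then need to use code page 1252 rather than CP_UTF8.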
It's time to close this now. Stephan T. Lavavej says the behaviour is "by design", although I cannot follow this explanation.
My current knowledge is: Windows XP console in UTF-8 codepage does not work with C++ iostreams.
Windows XP is getting out of fashion now, and so is VS 2008. I'd be interested to hear if the problem still exists on newer Windows systems.
On Windows 7 the effect is probably due to the way the C++ streams output characters. As seen in an answer to "Properly print utf8 characters in windows console", UTF-8 output also fails with C stdio when printing one byte after another, like putc('\xc3'); putc('\xbc');. Perhaps this is what C++ streams do here.
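If the byte-by-byte hand-off really is the cause, one possible workaround is to hold bytes back until a whole UTF-8 character is available and then pass it on in a single call, as fputs does with the two-byte string. The sketch below only illustrates that idea under this assumption: the helper names are hypothetical, it has not been verified against a Windows console, and whether one fwrite call reaches the console as one write also depends on the CRT's buffering.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Number of bytes a UTF-8 sequence should have, judged by its lead byte.
// Returns 1 for ASCII and for bytes that are not valid lead bytes, so that
// malformed input is passed through instead of being held back forever.
std::size_t utf8_sequence_length(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

// Length of the longest prefix of `pending` that ends on a character boundary.
std::size_t complete_prefix_length(const std::string& pending) {
    std::size_t i = 0;
    while (i < pending.size()) {
        const std::size_t n =
            utf8_sequence_length(static_cast<unsigned char>(pending[i]));
        if (i + n > pending.size()) break;  // last character is still incomplete
        i += n;
    }
    return i;
}

// Append new bytes to `pending`, write all complete characters with a single
// fwrite call, and keep any incomplete tail for the next call.
void write_whole_characters(std::string& pending, const std::string& bytes,
                            std::FILE* out) {
    pending += bytes;
    const std::size_t n = complete_prefix_length(pending);
    if (n > 0) {
        std::fwrite(pending.data(), 1, n, out);
        pending.erase(0, n);
    }
}
```

Calling write_whole_characters first with "\xc3" and then with "\xbc" writes nothing on the first call and both bytes together on the second.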