UTF-8 to wide char conversion
Question
#ifndef UNICODE
#define UNICODE
#endif
#include <Windows.h>
#include <cstdio>
#include <fstream>
using namespace std;
int main()
{
    FILE* resFile;
    char multiByteStr[256];
    ifstream oFile;
    FILE* exampleFile;
    TCHAR buffer[256];

    system("chcp 65001");

    resFile = _wfopen(L"foo",L"w, ccs=UTF-8");
    fwprintf(resFile,L"%s",L"C:\\exsistingFolder\\zażółć gęśłą jaźń ☺☻♥♦• ć.txt");
    fclose(resFile);

    oFile.open(L"foo");
    oFile.getline(multiByteStr,256,'\n');
    oFile.close();

    MultiByteToWideChar(CP_UTF8,0,multiByteStr,256,buffer,256);
    wprintf(L"%s",buffer);

    exampleFile = _wfopen(buffer,L"w, ccs=UTF-16LE");
    fwprintf(exampleFile,L"%s",buffer);
    fclose(exampleFile);

    system("pause");
    return 0;
}
As you can see, the program should create a file "foo" (resFile) that contains the full path of a file to be created, and this new file (exampleFile) should contain its own path. Although the Autos window in the Visual Studio 2010 debugger shows that buffer holds the correct string, exampleFile isn't created. Why?
Another thing: why doesn't wprintf output the extended characters, even though I've switched the console font to Lucida Console, which can display Unicode characters?
P.S. exampleFile is NULL even after the _wfopen call, and the last character of buffer is '\0'.
Recommended answer
The solution is simple: _wfopen with ccs=UTF-8 creates a file encoded in UTF-8 with a BOM, and MultiByteToWideChar does not remove the BOM. The converted buffer therefore begins with U+FEFF rather than with the drive letter, so the BOM has to be stripped manually before the string is used as a path.
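One way to strip the BOM is to skip it in the byte string before conversion. The sketch below is only an illustration: skipUtf8Bom is a hypothetical helper name that does not appear in the original code, and it assumes the input is a null-terminated string such as the one read by getline.

```cpp
#include <cstring>

// Hypothetical helper (not part of the original code): returns a pointer
// past a leading UTF-8 BOM (bytes EF BB BF) if one is present, otherwise
// returns the input pointer unchanged.
// Assumes s is a valid null-terminated string.
const char* skipUtf8Bom(const char* s)
{
    static const char bom[] = "\xEF\xBB\xBF";
    // strncmp stops at a null terminator, so strings shorter than
    // three bytes are handled safely.
    return std::strncmp(s, bom, 3) == 0 ? s + 3 : s;
}
```

In the question's program this would presumably be used as MultiByteToWideChar(CP_UTF8, 0, skipUtf8Bom(multiByteStr), -1, buffer, 256). Passing -1 as the length makes the function stop at the null terminator instead of converting all 256 bytes of the array, including the uninitialized ones after the string.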