Why does wide file-stream in C++ narrow written data by default?
Question
Honestly, I just don't get the following design decision in the C++ Standard library. When writing wide characters to a file, wofstream converts wchar_t into char characters:
#include <fstream>
#include <string>
int main()
{
    using namespace std;
    wstring someString = L"Hello StackOverflow!";
    wofstream file(L"Test.txt");
    file << someString; // the output file will consist of ASCII characters!
}
I am aware that this has to do with the standard codecvt. There is a codecvt for UTF-8 in Boost, and Martin York posted a codecvt for UTF-16 here on SO. The question is: why does the standard codecvt convert wide characters? Why not write the characters as they are?
Also, are we going to get real Unicode streams with C++0x, or am I missing something here?
Answer
The model used by C++ for charsets is inherited from C, and so dates back to at least 1989.
Two main points:
- IO is done in terms of char.
- It is the job of the locale to determine how wide chars are serialized.
- The default locale (named "C") is very minimal (I don't remember the constraints from the standard; here it is able to handle only 7-bit ASCII as both the narrow and the wide character set).
- There is an environment-determined locale named "".
So to get anything, you have to set the locale.
If I use the simple program
#include <locale>
#include <fstream>
#include <ostream>
#include <iostream>
int main()
{
    wchar_t c = 0x00FF;
    std::locale::global(std::locale(""));
    std::wofstream os("test.dat");
    os << c << std::endl;
    if (!os) {
        std::cout << "Output failed\n";
    }
}
which uses the environment locale and outputs the wide character with code 0x00FF to a file. If I ask for the "C" locale, I get
$ env LC_ALL=C ./a.out
Output failed
the locale has been unable to handle the wide character, and we are notified of the problem because the IO failed. If I instead run with a UTF-8 locale, I get
$ env LC_ALL=en_US.utf8 ./a.out
$ od -t x1 test.dat
0000000 c3 bf 0a
0000003
(od -t x1 just dumps the file in hex), exactly what I expect for a UTF-8 encoded file.