UTF-8-compliant IOstreams

Question

Does GCC's standard library, Boost, or any other library implement iostream-compliant versions of ifstream or ofstream that support conversion between UTF-8-encoded (file) streams and a std::vector<wchar_t> or std::wstring?

Recommended answer

Your question doesn't quite work. UTF-8 is a specific encoding, while wchar_t is a data type. Moreover, the standard intends wchar_t to represent the system's character set, but the details are left entirely to the platform, and the standard imposes no requirements on the encoding.

Therefore, the correct thing to ask for is, first of all, conversion between the system's narrow multibyte encoding and the system's fixed-width wide encoding. This functionality is provided by std::mbstowcs and std::wcstombs. There may also be a locale facet somewhere that wraps this, but that's a bit of a niche area of the library.
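
As a rough illustration of that narrow/wide round trip, here is a minimal sketch. The helper names to_wide and to_narrow are made up for this example, and the result depends on whatever C locale is active, so non-ASCII input only works if the caller has selected a suitable locale first (for instance with std::setlocale(LC_ALL, "")).

```cpp
#include <clocale>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode a narrow multibyte string (in the encoding of
// the current C locale) into a wide string.
std::wstring to_wide(const std::string& narrow) {
    // A multibyte string never decodes to more wide characters than it has
    // bytes, so size() + 1 (for the terminator) is always enough room.
    std::vector<wchar_t> buf(narrow.size() + 1);
    std::size_t n = std::mbstowcs(buf.data(), narrow.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence for this locale");
    return std::wstring(buf.data(), n);
}

// Hypothetical helper: encode a wide string into the narrow multibyte
// encoding of the current C locale.
std::string to_narrow(const std::wstring& wide) {
    // Each wide character needs at most MB_CUR_MAX bytes in this locale.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in this locale");
    return std::string(buf.data(), n);
}
```

Note that neither function knows anything about UTF-8 as such: the narrow side is UTF-8 only if the active locale happens to use it.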

If you want to convert between the opaque "system's encoding" prescribed by the standard and a definite encoding prescribed by your serialized data source/sink, you need an extra library. I'd recommend POSIX's iconv(), which is widely available. (The Windows API takes a different approach and offers special functions for conversion.)
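
A sketch of what the iconv() route might look like for UTF-8 input follows. The function name utf8_to_wide is invented for this example, and it assumes the iconv implementation accepts "WCHAR_T" as an encoding name (glibc and GNU libiconv do; others may not) and declares the input buffer as char**.

```cpp
#include <iconv.h>

#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert a UTF-8 byte string to the platform's
// wchar_t encoding using POSIX iconv().
std::wstring utf8_to_wide(const std::string& utf8) {
    // Assumption: "WCHAR_T" is a supported target name on this platform.
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == reinterpret_cast<iconv_t>(-1))
        throw std::runtime_error("iconv_open failed");

    // iconv() takes non-const char pointers, so work on a mutable copy.
    std::vector<char> in(utf8.begin(), utf8.end());
    // UTF-8 never yields more wide characters than it has bytes.
    std::vector<wchar_t> out(utf8.size() + 1);

    char* in_ptr = in.data();
    std::size_t in_left = in.size();
    char* out_ptr = reinterpret_cast<char*>(out.data());
    std::size_t out_left = out.size() * sizeof(wchar_t);

    std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1))
        throw std::runtime_error("iconv failed: invalid or incomplete UTF-8");

    // out_left counts the unused bytes remaining in the output buffer.
    std::size_t written = out.size() - out_left / sizeof(wchar_t);
    return std::wstring(out.data(), written);
}
```

To use this with a file, one option is to read the bytes through a plain std::ifstream into a std::string and then pass that string to the helper.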

C++11 alleviates the issue slightly by adding an explicit family of UTF-encoded string types and literals, and presumably also transcoding facilities among those (though I've never seen them implemented by anyone).
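
For reference, the explicit UTF string types and literals look roughly like this. The variable names are just for illustration, and the sketch shows only the types and literals, not any transcoding between them.

```cpp
#include <string>

// U+00E9 (e with acute accent) followed by U+1F600 (an emoji outside the
// Basic Multilingual Plane), written with universal character names so the
// example does not depend on the source file's encoding.
const std::u16string utf16_text = u"\u00e9\U0001F600";  // UTF-16 code units
const std::u32string utf32_text = U"\u00e9\U0001F600";  // UTF-32 code units

// A u8 literal is stored as UTF-8: U+00E9 takes two bytes, plus the
// terminating null.
static_assert(sizeof(u8"\u00e9") == 3, "two UTF-8 code units plus terminator");
```

Here utf32_text holds one element per code point, while utf16_text needs a surrogate pair for the emoji and therefore holds three code units.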

Here are my standard responses from past posts on the subject: Q1, Q2, Q3. C++11 will be a joy once it's fully available :-)
