Read Unicode UTF-8 file into wstring

Problem description

How can I read a Unicode (UTF-8) file into wstring(s) on the Windows platform?

Recommended answer

With C++11 support, you can use the std::codecvt_utf8 facet, which encapsulates conversion between a UTF-8 encoded byte string and a UCS-2 or UCS-4 character string (UCS-2 where wchar_t is 16 bits wide, as on Windows; UCS-4 where it is 32 bits wide). It can be used to read and write UTF-8 files, both text and binary.
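To make the conversion itself concrete, here is a minimal sketch that applies the same facet to in-memory strings through std::wstring_convert. The helper names utf8ToWide and wideToUtf8 are hypothetical and not part of the original answer. Note that std::wstring_convert and the <codecvt> header are deprecated since C++17, so treat this as an illustration for C++11/14 code rather than a recommendation.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into a wide string.
// from_bytes throws std::range_error if the input is not valid UTF-8.
std::wstring utf8ToWide(const std::string& bytes)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(bytes);
}

// Hypothetical helper: encode a wide string as UTF-8 bytes.
std::string wideToUtf8(const std::wstring& text)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(text);
}
```

For example, the two bytes 0xC3 0xA9 decode to the single wide character U+00E9, and encoding that character yields the same two bytes again.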

To use a facet, you usually create a locale object, which encapsulates culture-specific information as a set of facets that collectively define a specific localized environment. Once you have a locale object, you can imbue your stream with it:

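#include <codecvt>
#include <fstream>
#include <locale>
#include <sstream>
#include <string>

// Reads a whole UTF-8 encoded file into a wide string.
std::wstring readFile(const char* filename)
{
    std::wifstream wif(filename);
    // Swap in a UTF-8 codecvt facet; the locale takes ownership of it.
    // std::locale() is used here because std::locale::empty() is an
    // MSVC extension and is not part of standard C++.
    wif.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
    std::wstringstream wss;
    wss << wif.rdbuf();  // copy the decoded contents in one go
    return wss.str();
}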

It can be used like this:

std::wstring wstr = readFile("a.txt");
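Since the facet is described as usable for writing as well as reading, the opposite direction can be sketched the same way. The function name writeFile is hypothetical and not part of the original answer; opening the file in binary mode is an assumption made here so that the bytes on disk are exactly the UTF-8 encoding, with no newline translation on Windows.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical counterpart to readFile: writes a wide string as UTF-8.
// Returns false if the file could not be opened or written.
bool writeFile(const char* filename, const std::wstring& text)
{
    std::wofstream wof(filename, std::ios::binary);
    if (!wof)
        return false;
    // Imbue before any output so every character goes through the facet.
    wof.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
    wof << text;
    wof.flush();
    return static_cast<bool>(wof);
}
```

A call such as writeFile("a.txt", wstr) followed by readFile("a.txt") should give back the original string, provided every character fits in wchar_t on the platform.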

Alternatively, you can set the global C++ locale before you create any streams. Every later call to the std::locale default constructor then returns a copy of the global C++ locale, so you don't need to imbue streams with it explicitly:

// std::locale() replaces the MSVC-only std::locale::empty() extension.
std::locale::global(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
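A minimal sketch of the global-locale variant follows. The names useUtf8GlobalLocale and readFileWithGlobalLocale are hypothetical. Returning the previous locale is a design choice added here so that callers can restore it, since changing the global locale affects every stream created afterwards in the program.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: installs a global locale carrying the UTF-8 facet
// and returns the previous global locale so it can be restored later.
std::locale useUtf8GlobalLocale()
{
    return std::locale::global(
        std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
}

// Streams constructed after useUtf8GlobalLocale() pick up the facet
// from the global locale, so no imbue() call is needed here.
std::wstring readFileWithGlobalLocale(const char* filename)
{
    std::wifstream wif(filename);
    std::wstringstream wss;
    wss << wif.rdbuf();
    return wss.str();
}
```

Streams that already existed before the global locale was changed keep their old locale, so the call belongs near the start of the program.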
