Read unicode file with special characters using std::wifstream

Views: 555. Published: 2020/5/1 9:48:24. Tags: c++ linux unicode

Question

In a Linux environment, I have a piece of code for reading Unicode files, similar to the code shown below.

However, special characters (such as the Danish letters æ, ø and å) are not handled correctly. For the line 'abcæøåabc', the output is simply 'abc'. Using a debugger, I can see that the contents of wline are also only a\000b\000c\000.

#include <fstream>
#include <string>

std::wifstream wif("myfile.txt");
if (wif.is_open())
{
    //set proper position compared to byteorder
    wif.seekg(2, std::ios::beg);
    std::wstring wline;

    while (wif.good())
    {
        std::getline(wif, wline);
        if (!wif.eof())
        {
            std::wstring convert;
            for (auto c : wline)
            {
                if (c != '\0')
                    convert += c;
            }
        }
    }
}
wif.close();

Can anyone tell me how to get it to read the whole line?

Thanks and regards

Recommended answer

You have to use the imbue() method to tell wifstream that the file is encoded as UTF-16, and let it consume the BOM for you. You do not have to seekg() past the BOM manually. For example:

#include <fstream>
#include <string>
#include <locale>
#include <codecvt>

// open as a byte stream
std::wifstream wif("myfile.txt", std::ios::binary);
if (wif.is_open())
{
    // apply BOM-sensitive UTF-16 facet
    wif.imbue(std::locale(wif.getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));

    std::wstring wline;
    while (std::getline(wif, wline))
    {
        std::wstring convert;
        for (auto c : wline)
        {
            if (c != L'\0')
                convert += c;
        }
    }

    wif.close();
}
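
To try the approach end to end, the following is a minimal sketch rather than a definitive implementation. It assumes the file is UTF-16 with a BOM and that the toolchain still ships <codecvt> (the header is deprecated since C++17). The helper names write_sample_utf16le and read_utf16_lines are hypothetical and not part of the original answer; the first writes a small little-endian test file, the second reads it back using the same imbue() call as above.

```cpp
#include <codecvt>   // std::codecvt_utf16 (deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: writes a UTF-16LE file with a BOM containing
// the two lines "abcæøåabc" and "second".
void write_sample_utf16le(const std::string& path)
{
    const std::u16string text = u"abc\u00E6\u00F8\u00E5abc\nsecond\n";
    std::ofstream out(path, std::ios::binary);
    out.put('\xFF').put('\xFE');                      // little-endian BOM
    for (char16_t cu : text)
    {
        out.put(static_cast<char>(cu & 0xFF));        // low byte first
        out.put(static_cast<char>((cu >> 8) & 0xFF)); // then high byte
    }
}

// Hypothetical helper: reads every line of a BOM-prefixed UTF-16 file.
std::vector<std::wstring> read_utf16_lines(const std::string& path)
{
    std::vector<std::wstring> lines;

    // Open in binary mode so the raw bytes reach the conversion facet.
    std::wifstream wif(path, std::ios::binary);
    if (!wif.is_open())
        return lines;

    // The locale takes ownership of the facet. consume_header makes the
    // facet read the BOM and pick the byte order from it.
    wif.imbue(std::locale(wif.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));

    std::wstring wline;
    while (std::getline(wif, wline))
    {
        // Drop a trailing carriage return if the file uses CRLF endings.
        if (!wline.empty() && wline.back() == L'\r')
            wline.pop_back();
        lines.push_back(wline);
    }
    return lines;
}
```

Calling write_sample_utf16le("sample.txt") and then read_utf16_lines("sample.txt") should give back both lines with æ, ø and å intact. Because the facet already decodes each UTF-16 code unit into a single wchar_t, the manual filtering of L'\0' characters should no longer be needed.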
