Convert string from UTF-8 to ISO-8859-1


Question

I'm trying to convert a UTF-8 string to an ISO-8859-1 char* for use in legacy code. The only way I can see to do this is with iconv.

I would definitely prefer a completely string-based C++ solution, so that I can just call .c_str() on the resulting string.

How do I do this? A code example would be appreciated, if possible. I'm fine with using iconv if it is the only solution you know.

Recommended answer

I'm going to modify my code from another answer to implement the suggestion from Alf.

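#include <cstddef>
#include <string>

// Converts a NUL-terminated UTF-8 string to ISO-8859-1 (Latin-1).
std::string UTF8toISO8859_1(const char * in)
{
    std::string out;
    if (in == NULL)
        return out;

    // Initialized so that input starting with a stray continuation byte
    // does not read an indeterminate value.
    unsigned int codepoint = 0;
    while (*in != 0)
    {
        unsigned char ch = static_cast<unsigned char>(*in);
        if (ch <= 0x7f)
            codepoint = ch;                              // single byte (ASCII)
        else if (ch <= 0xbf)
            codepoint = (codepoint << 6) | (ch & 0x3f);  // continuation byte: append 6 bits
        else if (ch <= 0xdf)
            codepoint = ch & 0x1f;                       // lead byte of a 2-byte sequence
        else if (ch <= 0xef)
            codepoint = ch & 0x0f;                       // lead byte of a 3-byte sequence
        else
            codepoint = ch & 0x07;                       // lead byte of a 4-byte sequence
        ++in;
        // The code point is complete once the next byte is not a continuation byte.
        if (((*in & 0xc0) != 0x80) && (codepoint <= 0x10ffff))
        {
            if (codepoint <= 255)
            {
                out.append(1, static_cast<char>(codepoint));
            }
            else
            {
                // do whatever you want for out-of-bounds characters
            }
        }
    }
    return out;
}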

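Invalid UTF-8 input results in dropped characters. Code points above U+00FF have no ISO-8859-1 equivalent and are skipped unless you handle them in the empty else branch.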
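If you would rather keep a visible marker than drop characters, one option is to substitute a replacement character. The following is only a sketch of that idea, not the answer's code: the function name `utf8_to_latin1_lossy` and the `replacement` parameter are made up for illustration, and it takes a `std::string` instead of a `const char *`. It writes `replacement` for any code point above U+00FF and for any malformed, truncated, or overlong sequence.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original answer): converts UTF-8 to
// ISO-8859-1, writing `replacement` for anything that cannot be represented.
std::string utf8_to_latin1_lossy(const std::string& in, char replacement = '?')
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        unsigned char lead = static_cast<unsigned char>(in[i++]);
        unsigned int codepoint = 0;
        unsigned int min_value = 0;  // smallest code point allowed for this length
        int trailing = 0;            // number of continuation bytes expected

        if (lead <= 0x7f)                      { codepoint = lead; }
        else if (lead >= 0xc2 && lead <= 0xdf) { codepoint = lead & 0x1f; trailing = 1; min_value = 0x80; }
        else if (lead >= 0xe0 && lead <= 0xef) { codepoint = lead & 0x0f; trailing = 2; min_value = 0x800; }
        else if (lead >= 0xf0 && lead <= 0xf4) { codepoint = lead & 0x07; trailing = 3; min_value = 0x10000; }
        else
        {
            // Stray continuation byte or invalid lead byte (0x80-0xc1, 0xf5-0xff).
            out.push_back(replacement);
            continue;
        }

        bool valid = true;
        for (; trailing > 0; --trailing)
        {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xc0) != 0x80)
            {
                valid = false;  // sequence ended early; do not consume this byte
                break;
            }
            codepoint = (codepoint << 6) | (static_cast<unsigned char>(in[i++]) & 0x3f);
        }

        // Reject overlong encodings and anything outside the Latin-1 range.
        if (valid && codepoint >= min_value && codepoint <= 0xff)
            out.push_back(static_cast<char>(codepoint));
        else
            out.push_back(replacement);
    }
    return out;
}
```

For example, `utf8_to_latin1_lossy("caf\xc3\xa9")` returns the four bytes `c`, `a`, `f`, `0xE9`, while the euro sign (`"\xe2\x82\xac"`, U+20AC) comes back as `"?"`. As in the question, you can then call `.c_str()` on the result to pass it to the legacy code.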
