Unsigned integer as UTF-8 value

Question

Suppose I have

uint32_t a(3084);

I would like to create a string that stores the Unicode character U+3084, which means I should take the value of a and use it as the code point that selects the right character, encoded as UTF-8.

Clearly std::to_string() doesn't work for me. The standard has many functions for converting between numeric values and char, but I can't find anything that offers UTF-8 support and outputs a std::string.

Do I have to write this function from scratch, or is there something in the C++11 standard that can help me with that? Please note that my compiler (gcc/g++ 4.8.1) doesn't offer complete support for codecvt.

Recommended answer

Here's some C++ code that wouldn't be hard to convert to C. Adapted from an older answer.

#include <string>

// Encodes a single Unicode code point as a UTF-8 byte sequence.
// Note: the input is not validated; surrogates and values above
// 0x10ffff are encoded as-is.
std::string UnicodeToUTF8(unsigned int codepoint)
{
    std::string out;

    if (codepoint <= 0x7f)
        // 1 byte: 0xxxxxxx
        out.append(1, static_cast<char>(codepoint));
    else if (codepoint <= 0x7ff)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xc0 | ((codepoint >> 6) & 0x1f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else if (codepoint <= 0xffff)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xe0 | ((codepoint >> 12) & 0x0f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xf0 | ((codepoint >> 18) & 0x07)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 12) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    return out;
}
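
One detail worth checking when calling a function like the one above: U+3084 is hexadecimal notation, so the value to pass is 0x3084, whereas the decimal 3084 from the question equals 0x0C0C and selects a different code point. The following is a minimal, self-contained sketch of the same encoding scheme with input validation added. The name encode_utf8 and the choice to return an empty string for surrogates and for values above U+10FFFF are assumptions made for illustration, not part of the original answer.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (name chosen for illustration): encodes one code point
// as UTF-8. Returns an empty string when the value is not a Unicode scalar
// value, i.e. a surrogate (U+D800..U+DFFF) or anything above U+10FFFF.
std::string encode_utf8(std::uint32_t cp)
{
    std::string out;

    if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
        return out;  // assumption: signal invalid input with an empty result

    if (cp <= 0x7f) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp <= 0x7ff) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xc0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    } else if (cp <= 0xffff) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xe0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3f));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xf0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3f));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3f));
        out += static_cast<char>(0x80 | (cp & 0x3f));
    }
    return out;
}
```

With this sketch, encode_utf8(0x3084) yields the three bytes E3 82 84, while encode_utf8(3084) yields E0 B0 8C, the encoding of U+0C0C.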
