Unsigned integer as UTF-8 value
Question
Suppose I have
uint32_t a(3084);
I would like to create a string that stores the Unicode character U+3084, which means that I should take the value of a and use it as the index of the matching character in the UTF-8 table/charset.
Now, clearly std::to_string() doesn't work for me. There are a lot of functions in the standard that convert between numeric values and char, but I can't find anything that supports UTF-8 and outputs a std::string.
I would like to ask whether I have to write this function from scratch or whether there is something in the C++11 standard that can help me with that. Please note that my compiler (gcc/g++ 4.8.1) doesn't offer complete support for codecvt.
Recommended answer
Here's some C++ code that wouldn't be hard to convert to C. Adapted from an older answer.
#include <string>

// Encodes a single code point as a UTF-8 byte sequence.
// Note: no validation is done; surrogates (U+D800..U+DFFF) and values
// above U+10FFFF are encoded as-is even though they are not valid UTF-8.
std::string UnicodeToUTF8(unsigned int codepoint)
{
    std::string out;

    if (codepoint <= 0x7f)
    {
        // 1 byte: 0xxxxxxx
        out.append(1, static_cast<char>(codepoint));
    }
    else if (codepoint <= 0x7ff)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xc0 | ((codepoint >> 6) & 0x1f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else if (codepoint <= 0xffff)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xe0 | ((codepoint >> 12) & 0x0f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    else
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.append(1, static_cast<char>(0xf0 | ((codepoint >> 18) & 0x07)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 12) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
        out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
    }
    return out;
}
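The function above encodes whatever value it is given. As a rough sketch of how input checking could be layered on top of the same bit layout, the variant below rejects surrogates and values beyond U+10FFFF. The name EncodeUtf8Checked and the choice to throw std::invalid_argument are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original answer): encodes one Unicode
// scalar value as UTF-8 and rejects values that UTF-8 cannot represent.
std::string EncodeUtf8Checked(std::uint32_t codepoint)
{
    // Surrogates (U+D800..U+DFFF) and anything above U+10FFFF are not
    // valid Unicode scalar values, so refuse to encode them.
    if (codepoint > 0x10FFFF || (codepoint >= 0xD800 && codepoint <= 0xDFFF))
        throw std::invalid_argument("not a Unicode scalar value");

    std::string out;
    if (codepoint <= 0x7F) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(codepoint);
    } else if (codepoint <= 0x7FF) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (codepoint >> 6));
        out += static_cast<char>(0x80 | (codepoint & 0x3F));
    } else if (codepoint <= 0xFFFF) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (codepoint >> 12));
        out += static_cast<char>(0x80 | ((codepoint >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (codepoint & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (codepoint >> 18));
        out += static_cast<char>(0x80 | ((codepoint >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((codepoint >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (codepoint & 0x3F));
    }
    return out;
}
```

Calling EncodeUtf8Checked(0x3084) yields the three bytes E3 82 84, which is the UTF-8 encoding of U+3084. Note that code points in the U+XXXX notation are hexadecimal: a variable initialized with the decimal literal 3084, as in the question, holds 0x0C0C and would encode U+0C0C (bytes E0 B0 8C) instead, so the value should be written as 0x3084.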