Convert string from UTF-8 to ISO-8859-1
Question
I'm trying to convert a UTF-8 string to an ISO-8859-1 char* for use in legacy code. The only way I'm seeing to do this is with iconv.
I would definitely prefer a completely string-based C++ solution, and then just call .c_str() on the resulting string.
How do I do this? Code example if possible, please. I'm fine using iconv if it is the only solution you know.
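Since the question allows iconv as a fallback, here is a minimal sketch of that route, assuming a POSIX system that provides <iconv.h> (some platforms also need -liconv when linking). The helper name utf8_to_latin1_iconv and its error handling are illustrative assumptions, not part of the original post.

```cpp
#include <cassert>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts UTF-8 to ISO-8859-1 using POSIX iconv.
// Throws std::runtime_error when iconv reports an error, for example on
// invalid UTF-8. Handling of characters outside ISO-8859-1 differs between
// iconv implementations: some fail, others substitute a placeholder.
std::string utf8_to_latin1_iconv(const std::string& in)
{
    iconv_t cd = iconv_open("ISO-8859-1", "UTF-8");  // (tocode, fromcode)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // ISO-8859-1 uses one byte per character, so the output is never
    // longer than the UTF-8 input.
    std::string out(in.size(), '\0');
    char* src = const_cast<char*>(in.data());  // iconv does not modify the input
    char* dst = &out[0];
    size_t src_left = in.size();
    size_t dst_left = out.size();

    const size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        throw std::runtime_error("iconv failed");

    out.resize(out.size() - dst_left);  // keep only the bytes actually written
    return out;
}

// Usage: convert once, then hand c_str() to the legacy code.
void iconv_usage_example()
{
    const std::string latin1 = utf8_to_latin1_iconv("caf\xc3\xa9");  // "café" in UTF-8
    assert(latin1 == "caf\xe9");
}
```

The result is an ordinary std::string, so calling .c_str() on it yields the char* the legacy code expects.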
Recommended answer
I'm going to modify my code from another answer to implement the suggestion from Alf.
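#include <string>

// Decodes UTF-8 from a NUL-terminated C string and keeps only the code
// points that fit in ISO-8859-1 (U+0000 to U+00FF).
std::string UTF8toISO8859_1(const char * in)
{
    std::string out;
    if (in == NULL)
        return out;

    // Initialized so a stray leading continuation byte does not read garbage.
    unsigned int codepoint = 0;
    while (*in != 0)
    {
        unsigned char ch = static_cast<unsigned char>(*in);
        if (ch <= 0x7f)
            codepoint = ch;                              // ASCII
        else if (ch <= 0xbf)
            codepoint = (codepoint << 6) | (ch & 0x3f);  // continuation byte
        else if (ch <= 0xdf)
            codepoint = ch & 0x1f;                       // lead byte, 2-byte sequence
        else if (ch <= 0xef)
            codepoint = ch & 0x0f;                       // lead byte, 3-byte sequence
        else
            codepoint = ch & 0x07;                       // lead byte, 4-byte sequence
        ++in;
        // The code point is complete when the next byte is not a continuation byte.
        if (((*in & 0xc0) != 0x80) && (codepoint <= 0x10ffff))
        {
            if (codepoint <= 255)
            {
                out.append(1, static_cast<char>(codepoint));
            }
            else
            {
                // do whatever you want for out-of-bounds characters
            }
        }
    }
    return out;
}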
Invalid UTF-8 input results in dropped characters.
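The answer's code leaves the out-of-bounds branch empty. As one possible way to fill it, the sketch below substitutes a placeholder for code points above U+00FF and takes a std::string so the loop does not depend on a NUL terminator. The name utf8_to_latin1_lossy and the fallback parameter are assumptions made for illustration. Like the code above, it does not validate the UTF-8 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical variant of UTF8toISO8859_1: code points above U+00FF are
// replaced with `fallback` instead of being skipped.
std::string utf8_to_latin1_lossy(const std::string& in, char fallback = '?')
{
    std::string out;
    unsigned int codepoint = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        const unsigned char ch = static_cast<unsigned char>(in[i]);
        if (ch <= 0x7f)
            codepoint = ch;                              // ASCII
        else if (ch <= 0xbf)
            codepoint = (codepoint << 6) | (ch & 0x3f);  // continuation byte
        else if (ch <= 0xdf)
            codepoint = ch & 0x1f;                       // lead byte, 2-byte sequence
        else if (ch <= 0xef)
            codepoint = ch & 0x0f;                       // lead byte, 3-byte sequence
        else
            codepoint = ch & 0x07;                       // lead byte, 4-byte sequence

        // The sequence continues only if the next byte is a continuation byte.
        const bool more = (i + 1 < in.size()) &&
            ((static_cast<unsigned char>(in[i + 1]) & 0xc0) == 0x80);
        if (!more && codepoint <= 0x10ffff)
            out.push_back(codepoint <= 0xff ? static_cast<char>(codepoint) : fallback);
    }
    return out;
}

// Usage: the euro sign (U+20AC) has no ISO-8859-1 equivalent and becomes '?'.
void lossy_usage_example()
{
    assert(utf8_to_latin1_lossy("caf\xc3\xa9") == "caf\xe9");
    assert(utf8_to_latin1_lossy("5 \xe2\x82\xac") == "5 ?");
}
```

Whether to substitute, skip, or report an error for unrepresentable characters depends on what the legacy code can tolerate. Substitution keeps the character count stable, which may matter for fixed-width fields.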