Convert ISO-8859-1 strings to UTF-8 in C/C++
Question
You would think this would be readily available, but I'm having a hard time finding a simple library function that converts a C or C++ string from ISO-8859-1 encoding to UTF-8. The data I'm reading is in 8-bit ISO-8859-1 encoding, but I need to convert it to a UTF-8 string for use in an SQLite database and, eventually, an Android app.
I found one commercial product, but it's beyond my budget at this time.
- 道格·摹
Recommended answer
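If your source encoding will always be ISO-8859-1, this is trivial: every ISO-8859-1 byte value is the Unicode code point of the same number, so bytes below 128 are copied unchanged and every other byte becomes a two-byte UTF-8 sequence. Here's a loop: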
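unsigned char *in, *out;  /* in: NUL-terminated ISO-8859-1 text; out: destination buffer (caller must set both) */
while (*in) {
    if (*in < 128) { *out++ = *in++; continue; }  /* ASCII is copied unchanged */
    *out++ = 0xc2 + (*in > 0xbf);                 /* lead byte: 0xC2 for 0x80-0xBF, 0xC3 for 0xC0-0xFF */
    *out++ = (*in++ & 0x3f) + 0x80;               /* continuation byte: 10xxxxxx with the low six bits */
}
*out = 0;                                         /* terminate the output string */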
For safety, you need to ensure that the output buffer is at least twice as large as the input (each input byte produces at most two output bytes, plus room for the terminating NUL), or else include a size limit and check it in the loop condition.
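The size-limited variant mentioned above might look roughly like the following sketch. The function name latin1_to_utf8, its parameters, and the choice of (size_t)-1 as the error return are assumptions made for illustration; they are not part of the original answer.

```c
#include <stddef.h>

/* Hypothetical helper (name and signature are illustrative assumptions).
 * Converts NUL-terminated ISO-8859-1 text in `in` to UTF-8 in `out`,
 * never writing more than `out_size` bytes including the terminator.
 * Returns the number of bytes written, not counting the terminator,
 * or (size_t)-1 if the output buffer is too small. */
size_t latin1_to_utf8(const unsigned char *in, unsigned char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*in) {
        /* ASCII needs one output byte, everything else needs two. */
        size_t need = (*in < 0x80) ? 1 : 2;

        /* Keep one byte in reserve for the terminating NUL. */
        if (n + need >= out_size) {
            out[n] = 0;
            return (size_t)-1;
        }

        if (*in < 0x80) {
            out[n++] = *in++;
        } else {
            out[n++] = 0xc2 + (*in > 0xbf);   /* lead byte: 0xC2 or 0xC3 */
            out[n++] = (*in++ & 0x3f) + 0x80; /* continuation byte */
        }
    }

    out[n] = 0;
    return n;
}
```

On failure the output is still NUL-terminated and holds whatever was converted before space ran out, so a caller can choose between treating it as an error and retrying with a larger buffer.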