Convert ISO-8859-1 strings to UTF-8 in C/C++


Question

You would think this would be readily available, but I'm having a hard time finding a simple library function that will convert a C or C++ string from ISO-8859-1 coding to UTF-8. I'm reading data that is in 8-bit ISO-8859-1 encoding, but need to convert it to a UTF-8 string for use in an SQLite database and eventually an Android app.

I found one commercial product, but it's beyond my budget at this time.


  • Doug G

Recommended answer

If your source encoding will always be ISO-8859-1, this is trivial. Here's a loop:

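unsigned char *in, *out;  /* in: NUL-terminated ISO-8859-1 input; out: output buffer */
while (*in)
    if (*in < 128) *out++ = *in++;        /* ASCII: copy the byte unchanged */
    else *out++ = 0xc2 + (*in > 0xbf),    /* lead byte: 0xC2 for 0x80-0xBF, 0xC3 for 0xC0-0xFF */
         *out++ = (*in++ & 0x3f) + 0x80;  /* continuation byte: 10xxxxxx */
*out = '\0';                              /* terminate the output string */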

For safety, you need to ensure that the output buffer is at least twice as large as the input buffer (each input byte expands to at most two output bytes), or else include a size limit and check it in the loop condition.
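
The size-limit variant mentioned above could be sketched roughly as follows. This is only one possible shape, not the answerer's code: the function name latin1_to_utf8, its signature, and the (size_t)-1 error convention are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (name, signature and error value are illustrative).
 * Converts the NUL-terminated ISO-8859-1 string `in` to UTF-8 in `out`,
 * never writing more than `out_size` bytes, terminating NUL included.
 * Returns the number of bytes written (excluding the NUL), or
 * (size_t)-1 if the output buffer is too small. */
size_t latin1_to_utf8(const unsigned char *in, unsigned char *out,
                      size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*in) {
        /* ASCII needs one output byte, everything else needs two. */
        size_t need = (*in < 128) ? 1 : 2;

        /* Keep room for the bytes of this character plus the NUL. */
        if (n + need >= out_size) {
            out[n] = '\0';
            return (size_t)-1;
        }

        if (*in < 128) {
            out[n++] = *in++;
        } else {
            out[n++] = 0xc2 + (*in > 0xbf);    /* lead byte 0xC2 or 0xC3 */
            out[n++] = (*in++ & 0x3f) + 0x80;  /* continuation byte */
        }
    }

    out[n] = '\0';
    return n;
}
```

With this sketch, the input bytes 'a', 0xE9, 0xA0 produce 'a', 0xC3 0xA9, 0xC2 0xA0. On failure the output still holds a valid, NUL-terminated prefix, because a two-byte sequence is never split. Passing a buffer of 2 * strlen(in) + 1 bytes is always enough.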
