Unicode std::string class replacement


Question

I'm looking for suggestions for Unicode-aware replacements for std::string. I have a bunch of code that uses std::string, its iterators, etc., and would now like to support Unicode strings. Free or open-source implementations are preferred, and regex capabilities would be great!

At this point I'm not sure whether I need a complete rewrite or whether I can get away with dropping in a new string library that supports the whole std::string interface. The Unicode world seems very complex, and I just want to enable it in my applications without having to learn every aspect of it.

By the way, how does the index operator work when it has to return a reference to a 1-, 2-, 3- or 4-byte structure, which could in theory be changed to a structure of a different size (again 1, 2, 3 or 4 bytes)? If a larger or smaller value is assigned, is the internal data representation shifted back and forth in situ?

Recommended answer

You don't need a complete rewrite if you are sure about what your std::string objects contain. For example, you could assume (and convert inputs to make sure) that they contain UTF-8 encoded strings (for those that need localization). Don't forget that std::string is only a container of raw data; it isn't associated with any encoding (even in C++0x, an encoding is only a possibility, not a requirement).
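To illustrate the "container of raw data" point, here is a minimal sketch, assuming the string already holds valid UTF-8. The helper names `utf8_length` and `sample_utf8` are made up for this example and come from no library. It shows that `size()` and `operator[]` work on bytes, not on characters, so indexing never has to resize anything.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a string that is ASSUMED to hold valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx; every other byte starts
// a new code point. No validation is performed.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// "héllo" written out as explicit UTF-8 bytes: 'é' (U+00E9) is 0xC3 0xA9.
std::string sample_utf8() {
    return "h\xC3\xA9llo";
}
```

With this sample, `sample_utf8().size()` is 6 (bytes) while `utf8_length(sample_utf8())` is 5 (code points), and `sample_utf8()[1]` is the single byte 0xC3, the first half of 'é'.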

Then, when you pass text to other libraries that require a different encoding, you can use a library like UTF8CPP to convert to the required encoding (though most of the time such libraries will do the conversion themselves).
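As a rough idea of what such a conversion involves, the sketch below turns a UTF-8 std::string into UTF-16. It is hand-written so that it stays self-contained, and the function name `utf8_to_utf16` is hypothetical. It is not UTF8CPP's API, and a real project would more likely call the library's own conversion routines. It assumes mostly well-formed input and only checks for truncated sequences, bad lead or continuation bytes, and out-of-range values.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of UTF8CPP): converts UTF-8 to UTF-16.
// Overlong encodings and encoded surrogates are NOT rejected.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of Unicode range");
        }
        if (cp < 0x10000) {
            // Fits in a single UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Outside the Basic Multilingual Plane: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The application keeps UTF-8 in std::string everywhere and converts only at the boundary where another library insists on a different encoding.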

That keeps things simple: use UTF-8 in standard std::string throughout your code, and you can pass Unicode strings to everything else (with conversion if necessary).

There have been a lot of discussions about this on the Boost community mailing list. Reading them (if you have enough time...) may help you understand other possible solutions.
