How can I compare UTF-8 strings such as Persian words in C++?
Problem description
I want to compare strings in Persian (UTF-8). I know I must use something like L"گل", and that it must be stored in a wchar_t * or a wstring. The problem is that when I compare the strings with the compare() function, I don't get the right result.
Recommended answer
If the strings that you want to compare are already in a specific, definite encoding, then don't use wchar_t and don't use L"" literals. Those are not for Unicode, but for implementation-defined, opaque encodings only.
If your strings are in UTF-8, use a string of chars. If you want to convert them to raw Unicode code points (UCS-4/UTF-32), or if you already have them in that form, store them in a string of uint32_ts, or of char32_ts if you have a modern compiler.
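As a minimal sketch of the advice above: UTF-8 held in a std::string can be compared directly, and it can be decoded into a std::u32string when individual code points are needed. The helper names utf8_to_utf32 and check_utf8_helpers are hypothetical, and the decoder only rejects truncated or malformed sequences (it does not check for overlong forms or surrogates). The comparison is by code units, so it orders by code point value and not by Persian alphabetical (collation) order, and it does not normalize the text.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into UTF-32 code points.
// Validation is deliberately minimal (no overlong or surrogate checks).
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80) {
            cp = lead;        len = 1;   // ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; len = 2;   // 2-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; len = 3;   // 3-byte sequence
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; len = 4;   // 4-byte sequence
        } else {
            throw std::runtime_error("invalid UTF-8 lead byte");
        }
        if (i + len > in.size()) {
            throw std::runtime_error("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::runtime_error("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Usage sketch: the bytes below are the UTF-8 encoding of U+06AF U+0644,
// the Persian word from the question.
void check_utf8_helpers() {
    const std::string word = "\xDA\xAF\xD9\x84";
    assert(word.compare(word) == 0);                 // identical bytes compare equal
    assert(utf8_to_utf32(word) == U"\u06AF\u0644");  // two code points
}
```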
If you have C++11, your literal can be char str8[] = u8"گل"; or char32_t str32[] = U"گل";. (Since C++20, u8 literals have type char8_t, so the first form would need char8_t str8[] instead.) See this topic for some more on this.
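A small sketch of what these literal prefixes produce, assuming a C++11 or later compiler. The word is written with \u escapes so the example does not depend on the source file's encoding, and the names utf8_units, utf32_units and check_literals are hypothetical. std::memcmp is used for the byte comparison because it accepts the literal whether u8"" yields char (C++11 to C++17) or char8_t (C++20).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// "\u06AF\u0644" spells the Persian word from the question.
// sizeof counts code units including the terminating NUL, hence the "- 1".
constexpr std::size_t utf8_units  = sizeof(u8"\u06AF\u0644") - 1;
constexpr std::size_t utf32_units = sizeof(U"\u06AF\u0644") / sizeof(char32_t) - 1;

static_assert(utf8_units == 4,  "two Persian letters take 2 bytes each in UTF-8");
static_assert(utf32_units == 2, "one UTF-32 code unit per code point");

void check_literals() {
    // The u8 literal holds exactly the UTF-8 bytes DA AF D9 84.
    assert(std::memcmp(u8"\u06AF\u0644", "\xDA\xAF\xD9\x84", utf8_units) == 0);

    // UTF-32 strings compare code point by code point.
    const std::u32string a = U"\u06AF\u0644";
    const std::u32string b = U"\u06AF\u0644";
    const std::u32string c = U"\u06AF";
    assert(a == b);
    assert(a.compare(c) > 0);  // c is a proper prefix of a, so a sorts after it
}
```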
If you want to interact with command line arguments or the environment, use iconv() to convert from the system's wide-character encoding (WCHAR) to UTF-32 or UTF-8.
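The conversion step could look roughly like the sketch below. It assumes a POSIX system whose iconv accepts the encoding name "WCHAR_T" (glibc and GNU libiconv do) and whose iconv() takes a char ** input pointer; some older implementations declare it as const char ** and need a different cast. The function name wide_to_utf8 is hypothetical, and on some platforms the program has to be linked with -liconv.

```cpp
#include <cassert>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a wide string to UTF-8 with POSIX iconv().
// Assumes the encoding name "WCHAR_T" is supported by the local iconv.
std::string wide_to_utf8(const std::wstring& in) {
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");  // (to, from)
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    // 4 bytes per wide character is an upper bound for the UTF-8 output.
    std::string out(in.size() * 4, '\0');

    char* src = reinterpret_cast<char*>(const_cast<wchar_t*>(in.data()));
    std::size_t src_left = in.size() * sizeof(wchar_t);
    char* dst = &out[0];
    std::size_t dst_left = out.size();

    const std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("iconv conversion failed");
    }

    out.resize(out.size() - dst_left);  // keep only the bytes actually written
    return out;
}

// Usage sketch: a wide string built from code points converts to UTF-8 bytes
// that can then be compared like any other std::string.
void check_wide_to_utf8() {
    const std::wstring wide = L"\u06AF\u0644";
    assert(wide_to_utf8(wide) == "\xDA\xAF\xD9\x84");
}
```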