How can I compare UTF-8 strings, such as Persian words, in C++?

Problem description

I want to compare strings in Persian (UTF-8). I know I must use something like L"گل", and that it must be stored in a wchar_t * or a wstring. The problem is that when I compare the strings with the compare() function, I don't get the right result.

Recommended answer

If the strings that you want to compare are in a specific, definite encoding already, then don't use wchar_t and don't use L"" literals -- those are not for Unicode, but for implementation-defined, opaque encodings only.

If your strings are in UTF-8, use a string of chars. If you want to convert them to raw Unicode codepoints (UCS-4/UTF-32), or if you already have them in that form, store them in a string of uint32_ts, or char32_ts if you have a modern compiler.
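
As a rough illustration of the two storage options above, the sketch below compares UTF-8 text held in std::string and decodes it to code points in a std::u32string. The names compare_utf8, decode_utf8 and utf8_usage_example are hypothetical helpers, not part of any library. The decoder is deliberately minimal: it rejects bad lead bytes, bad continuation bytes and truncated input, but it does not reject overlong encodings or surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Byte-wise comparison of two UTF-8 strings (hypothetical helper).
// std::string::compare compares bytes as unsigned values, and for valid
// UTF-8 that ordering matches the ordering by Unicode code point.
int compare_utf8(const std::string& a, const std::string& b) {
    return a.compare(b);
}

// Minimal UTF-8 -> UTF-32 decoder (hypothetical helper, not a full validator).
std::u32string decode_utf8(const std::string& s) {
    std::u32string out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t len;
        if (b < 0x80)                { cp = b;        len = 1; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > s.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

void utf8_usage_example() {
    const std::string gol = "\xDA\xAF\xD9\x84";  // "گل" as UTF-8 bytes
    const std::string lam = "\xD9\x84";          // "ل" as UTF-8 bytes
    assert(compare_utf8(gol, gol) == 0);
    assert(compare_utf8(lam, gol) < 0);          // U+0644 sorts before U+06AF
    assert(decode_utf8(gol) == U"\u06AF\u0644");
}
```

Note that this gives equality and code point order only. Code point order is not the same as Persian dictionary order, so sorting words for display would need a collation library rather than compare().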

If you have C++11, your literal can be char str8[] = u8"گل"; or char32_t str32[] = U"گل";. See this topic for some more on this.
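
A small sketch of comparing U"" literals follows. The names gol, golab, compare_code_points and literal_usage_example are made up for illustration. The words are spelled with universal character names (\uXXXX) so the result does not depend on the encoding of the source file. One caveat: since C++20 a u8"" literal has type const char8_t[], so the char str8[] form shown above only compiles as written in C++11 through C++17; the sketch therefore sticks to char32_t.

```cpp
#include <cassert>
#include <string>

// "گل": U+06AF (gaf), U+0644 (lam)
const std::u32string gol = U"\u06AF\u0644";
// "گلاب": U+06AF (gaf), U+0644 (lam), U+0627 (alef), U+0628 (beh)
const std::u32string golab = U"\u06AF\u0644\u0627\u0628";

// Compares two UTF-32 strings code point by code point (hypothetical helper).
// Returns <0, 0 or >0, like std::basic_string::compare.
int compare_code_points(const std::u32string& a, const std::u32string& b) {
    return a.compare(b);
}

void literal_usage_example() {
    assert(gol.size() == 2);                       // one element per code point
    assert(compare_code_points(gol, gol) == 0);    // identical words
    assert(compare_code_points(gol, golab) < 0);   // a prefix sorts first
}
```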

If you want to interact with command line arguments or the environment, use iconv() to convert from WCHAR to UTF-32 or UTF-8.
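
The conversion step could look roughly like the sketch below. It assumes a POSIX iconv() whose implementation accepts the encoding name "WCHAR_T" (glibc and GNU libiconv do; other implementations may not) and whose inbuf parameter is declared char **. The names wide_to_utf8 and iconv_usage_example are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Converts a wide string to UTF-8 with iconv (hypothetical helper).
// Assumes the iconv implementation knows the "WCHAR_T" encoding name.
std::string wide_to_utf8(const std::wstring& in) {
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == reinterpret_cast<iconv_t>(-1))
        throw std::runtime_error("iconv_open failed");

    // UTF-8 needs at most 4 bytes per input element; add a little slack.
    std::string out(in.size() * 4 + 4, '\0');

    // iconv does not modify the input, but its interface takes char **.
    char* src = reinterpret_cast<char*>(const_cast<wchar_t*>(in.data()));
    std::size_t src_left = in.size() * sizeof(wchar_t);
    char* dst = &out[0];
    std::size_t dst_left = out.size();

    const std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1))
        throw std::runtime_error("iconv conversion failed");

    out.resize(out.size() - dst_left);  // keep only the bytes actually written
    return out;
}

void iconv_usage_example() {
    // "گل" as a wide string; the expected UTF-8 bytes are DA AF D9 84.
    assert(wide_to_utf8(L"\u06AF\u0644") == "\xDA\xAF\xD9\x84");
}
```

On platforms where iconv lives in a separate library, the program may need to be linked with -liconv.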
