How to get the Unicode code point of each character of a UTF-8 string?
Question
With C++11, how can I get the Unicode value of each character of a UTF-8 encoded std::string as a uint32_t? Something like:
void f(const std::string &utf8_str)
{
    for(???) {
        uint32_t code = ???;
        /* Do my stuff with the code... */
    }
}
Does it help to assume that the host system locale is UTF-8? What standard library tools does C++11 offer for this task?
Solution
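You can simply convert the string to UTF-32, using std::wstring_convert from <locale> together with the std::codecvt_utf8 conversion facet from <codecvt>. Each element of the resulting std::u32string is one code point, which fits in a uint32_t. The conversion does not depend on the host locale, so there is no need to assume a UTF-8 locale: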
#include <codecvt>
#include <locale>
#include <string>
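void foo(std::string const & utf8str)
{
    // Converter between UTF-8 bytes and UTF-32 code units (char32_t).
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    // from_bytes throws std::range_error if the conversion fails.
    std::u32string utf32str = conv.from_bytes(utf8str);
    // Each char32_t holds one Unicode code point.
    for (char32_t u : utf32str) { /* ... */ }
}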
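Since std::wstring_convert and the <codecvt> facets are deprecated as of C++17, it may be useful to see what the conversion does by hand. The following is only a sketch of a manual UTF-8 decoder, not the method from the answer above: decode_utf8 is a hypothetical helper name, and throwing std::invalid_argument on malformed input is an assumption about the desired error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the standard library): decodes a UTF-8
// string into Unicode code points. Throws std::invalid_argument on
// malformed input (an assumed error-handling policy).
std::vector<std::uint32_t> decode_utf8(std::string const & utf8str)
{
    std::vector<std::uint32_t> code_points;
    std::size_t i = 0;
    while (i < utf8str.size()) {
        unsigned char const lead = static_cast<unsigned char>(utf8str[i]);
        std::uint32_t code = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        // The lead byte determines the sequence length and the top bits.
        if (lead < 0x80)                { code = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { code = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { code = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { code = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8str.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte (10xxxxxx) contributes six more bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char const c = static_cast<unsigned char>(utf8str[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            code = (code << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, surrogates, and values above U+10FFFF.
        static std::uint32_t const min_value[] = {0x0, 0x80, 0x800, 0x10000};
        if (code < min_value[extra] || code > 0x10FFFF ||
            (code >= 0xD800 && code <= 0xDFFF))
            throw std::invalid_argument("invalid Unicode code point");

        code_points.push_back(code);
        i += extra + 1;
    }
    return code_points;
}
```

With such a helper, the loop from the question could be written as for (std::uint32_t code : decode_utf8(utf8_str)) { /* ... */ }. Like the std::wstring_convert version, this yields code points rather than user-perceived characters, so a base letter followed by a combining mark still produces two values.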