How to get the UNICODE code from each character of a UTF-8 string?

Views: 158  Posted: 2017/8/16 22:15:34  Tags: c++ c++11 unicode encoding utf-8


Question

With C++11, how can I, from a UTF-8 encoded std::string, get the Unicode value of each character of the text into a uint32_t?

Something like:

void f(const std::string &utf8_str)
{
    for(???) {
       uint32_t code = ???;

       /* Do my stuff with the code... */
    }
}

Does assuming the host system locale is UTF-8 help? What standard library tools does C++11 offer for the task?

Solution

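You can convert the string to UTF-32 using the std::codecvt_utf8 conversion facet from <codecvt> together with std::wstring_convert from <locale>. Each char32_t element of the result is then one Unicode code point, and the conversion does not depend on the host system locale: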

#include <codecvt>
#include <locale>
#include <string>

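void foo(std::string const & utf8str)
{
    // Converter between UTF-8 bytes and UTF-32 code units.
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    // Throws std::range_error if utf8str is not valid UTF-8.
    std::u32string utf32str = conv.from_bytes(utf8str);

    // Each u is one Unicode code point.
    for (char32_t u : utf32str) { /* ... */ }
}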
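
To match the uint32_t the question asks for, the same conversion can be wrapped in a small helper. The following is only a sketch under a few assumptions: the name code_points_of and the choice to return a std::vector are hypothetical and not part of the original answer, and invalid input is simply left to surface as the std::range_error that from_bytes throws. Note also that std::wstring_convert and <codecvt> were deprecated in C++17, so this is best read as a C++11/C++14 approach.

```cpp
#include <codecvt>
#include <cstdint>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): decodes a UTF-8
// encoded string into a sequence of Unicode code points.
// Throws std::range_error if utf8_str is not valid UTF-8.
std::vector<std::uint32_t> code_points_of(const std::string& utf8_str)
{
    // Same converter as in the answer: UTF-8 bytes <-> UTF-32 code units.
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    const std::u32string utf32_str = conv.from_bytes(utf8_str);

    // Copy each char32_t code unit into the uint32_t element type.
    return std::vector<std::uint32_t>(utf32_str.begin(), utf32_str.end());
}
```

For example, code_points_of("\xE2\x82\xAC"), where the three bytes are the UTF-8 encoding of U+20AC, returns a vector holding the single value 0x20AC. Returning a vector keeps the caller's loop independent of char32_t; if the extra copy matters, iterating over the std::u32string directly, as in the answer, avoids it.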
