How well is Unicode supported in C++11?


Question

I've read and heard that C++11 supports Unicode. A few questions on that:


  • How well does the C++ standard library support Unicode?
  • Does std::string do what it should?
  • How do I use it?
  • Where are potential problems?

Recommended answer


How well does the C++ standard library support unicode?

Terribly.

A quick scan through the library facilities that might provide Unicode support gives me this list:


  • Strings library
  • Localization library
  • Input/output library
  • Regular expressions library

I think all but the first one provide terrible support. I'll get back to it in more detail after a quick detour through your other questions.


Does std::string do what it should?

Yes. According to the C++ standard, this is what std::string and its siblings should do:


The class template basic_string describes objects that can store a sequence consisting of a varying number of arbitrary char-like objects with the first element of the sequence at position zero.

Well, std::string does that just fine. Does that provide any Unicode-specific functionality? No.

Should it? Probably not. std::string is fine as a sequence of char objects. That's useful; the only annoyance is that it is a very low-level view of text and standard C++ doesn't provide a higher-level one.


How do I use it?

Use it as a sequence of char objects; pretending it is something else is bound to end in pain.


Where are potential problems?

All over the place? Let's see...

Strings library

The strings library provides us basic_string, which is merely a sequence of what the standard calls "char-like objects". I call them code units. If you want a high-level view of text, this is not what you are looking for. This is a view of text suitable for serialization/deserialization/storage.
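To make the "code units" point concrete, here is a minimal sketch, assuming the std::string holds well-formed UTF-8. The helper count_code_points is hypothetical (it is not part of the standard library) and does no validation; it only shows that size() reports code units while the number of code points has to be computed by hand.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: counts Unicode code points in a
// string that is ASSUMED to contain well-formed UTF-8. No validation is done.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        // UTF-8 continuation bytes have the form 10xxxxxx;
        // every other byte starts a new code point.
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For the UTF-8 bytes of "naïve" (the ï takes two bytes), size() is 6 while count_code_points returns 5. Even that number is code points, not user-perceived characters, which would need grapheme cluster segmentation that the standard library does not offer.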

It also provides some tools from the C library that can be used to bridge the gap between the narrow world and the Unicode world: c16rtomb/mbrtoc16 and c32rtomb/mbrtoc32.
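As a rough sketch of how one of these bridging functions might be used, the hypothetical helper to_narrow below feeds each char32_t through std::c32rtomb. It assumes the implementation ships <cuchar> and treats char32_t values as UTF-32; the narrow encoding it produces is whatever the current C locale dictates, so non-ASCII input can fail in the default "C" locale.

```cpp
#include <climits>   // MB_LEN_MAX
#include <cstddef>   // std::size_t
#include <cuchar>    // std::c32rtomb
#include <cwchar>    // std::mbstate_t
#include <string>

// Hypothetical helper for illustration: converts UTF-32 text to the narrow
// multibyte encoding of the CURRENT C locale. Returns false as soon as a
// character cannot be represented in that encoding.
bool to_narrow(const std::u32string& in, std::string& out) {
    std::mbstate_t state{};      // initial conversion state
    char buf[MB_LEN_MAX];        // large enough for any single character
    out.clear();
    for (char32_t c : in) {
        std::size_t n = std::c32rtomb(buf, c, &state);
        if (n == static_cast<std::size_t>(-1)) {
            return false;        // not representable in this locale
        }
        out.append(buf, n);
    }
    return true;
}
```

The design point worth noticing is the dependence on global locale state: the same call can yield UTF-8, a legacy code page, or a failure, depending on what setlocale was last told.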

Localization library

The localization library still believes that one of those "char-like objects" equals one "character". This is of course silly, and makes it impossible to get lots of things working properly beyond some small subset of Unicode like ASCII.

Consider, for example, what the standard calls "convenience interfaces" in the <locale> header:

template <class charT> bool isspace (charT c, const locale& loc);
template <class charT> bool isprint (charT c, const locale& loc);
template <class charT> bool iscntrl (charT c, const locale& loc);
// ...
template <class charT> charT toupper(charT c, const locale& loc);
template <class charT> charT tolower(charT c, const locale& loc);
// ...
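
Because these interfaces map one charT to one charT, they can only be applied code unit by code unit. The sketch below illustrates the consequence, assuming UTF-8 input and the classic "C" locale; upper_per_code_unit is a hypothetical helper, not a standard function.

```cpp
#include <locale>
#include <string>

// Hypothetical helper for illustration: applies the <locale> convenience
// function std::toupper to every code unit independently, which is the only
// way a charT -> charT interface can be used on a string.
std::string upper_per_code_unit(std::string s, const std::locale& loc) {
    for (char& c : s) {
        c = std::toupper(c, loc);
    }
    return s;
}
```

With std::locale::classic(), the UTF-8 bytes of "straße" should come back as "STRAßE": the ASCII letters change and the two bytes of ß pass through untouched. The conventional uppercase form "STRASSE" is longer than the input, so no function with the signature charT toupper(charT, const locale&) can ever produce it.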