Length of a C++ std::string in bytes


Question

I'm having some trouble figuring out the exact semantics of std::string::length(). The documentation explicitly points out that length() returns the number of characters in the string, not the number of bytes. In which cases does this actually make a difference?

In particular, is this only relevant to non-char instantiations of std::basic_string<>, or can I also get into trouble when storing UTF-8 strings with multi-byte characters? Does the standard allow length() to be UTF-8-aware?

Recommended answer

When dealing with non-char instantiations of std::basic_string<>, sure, the length may not equal the number of bytes. This is particularly evident with std::wstring:

std::wstring ws = L"hi";
std::cout << ws.length();     // <-- 2 (wchar_t elements), not the byte count, which is 2 * sizeof(wchar_t)
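
To make the difference concrete, here is a minimal sketch of how the byte count could be derived from the length for any std::basic_string<> instantiation. The helper name byte_size is hypothetical (it is not part of the standard library), and the result excludes the null terminator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the standard library: number of bytes
// occupied by the character data of s, excluding the null terminator.
template <typename CharT, typename Traits, typename Alloc>
std::size_t byte_size(const std::basic_string<CharT, Traits, Alloc>& s) {
    // size() counts CharT elements; each element occupies sizeof(CharT) bytes.
    return s.size() * sizeof(CharT);
}
```

For std::string the result always equals length(), because sizeof(char) is 1. For std::wstring it depends on the platform's sizeof(wchar_t), which is commonly 2 on Windows and 4 on Linux.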

But std::string is about char elements; as far as std::string is concerned, there is no such thing as a multi-byte character, whether you crammed one in at a higher level or not. Since sizeof(char) is 1 by definition, std::string::length() is always the number of bytes in the string, and the standard does not make it UTF-8-aware. Note that if you are cramming multi-byte "characters" into a std::string, your definition of "character" is at odds with that of the container and of the standard.
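
If you do store UTF-8 in a std::string and need the number of code points rather than bytes, you have to count them yourself. The following is only a sketch: utf8_code_points is a hypothetical helper, it assumes the input is valid UTF-8, and it counts code points, which is still not the same as user-perceived characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes, i.e. bytes of the form 10xxxxxx. Assumes s holds valid UTF-8.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        // Every code point starts with exactly one non-continuation byte.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For example, with std::string s = "h\xC3\xA9llo"; (the UTF-8 encoding of "héllo"), s.length() is 6 while utf8_code_points(s) is 5.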
