C++ std::string and UTF-8

Question

I just want to write a few simple lines to a text file in C++, but I want them to be encoded in UTF-8. What is the simplest way to do so?

Recommended answer

The only way UTF-8 affects std::string is that size(), length(), and all the indices are measured in bytes, not characters.
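To make the byte-versus-character distinction concrete, here is a minimal sketch. The helper name `utf8_length` is hypothetical (it is not part of the standard library), and it assumes the string already holds valid UTF-8. The non-ASCII character is written with `\x` escapes so the example does not depend on the source file's encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string that is assumed to
// hold valid UTF-8. Continuation bytes have the bit pattern 10xxxxxx, so
// every byte that does NOT match that pattern starts a new code point.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string `"h\xC3\xA9llo"` (the word "héllo", where "é" takes two bytes), `size()` reports 6 bytes while `utf8_length` reports 5 code points.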

And, as sbi points out, incrementing the iterator provided by std::string will step forward by byte, not by character, so it can actually point into the middle of a multibyte UTF-8 sequence. There's no UTF-8-aware iterator provided in the standard library, but there are a few available on the 'Net.
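If a full iterator library is more than you need, stepping by code point can be sketched by hand. The function `utf8_next` below is a hypothetical helper, not a standard facility, and it assumes the input is valid UTF-8; it does no validation of malformed sequences.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given a byte index i that sits on the first byte of
// a code point, returns the byte index where the next code point starts
// (or s.size() at the end). Assumes s holds valid UTF-8.
std::size_t utf8_next(const std::string& s, std::size_t i) {
    if (i >= s.size()) {
        return s.size();
    }
    ++i;  // step past the lead byte
    // Skip continuation bytes, which all look like 10xxxxxx.
    while (i < s.size() &&
           (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
        ++i;
    }
    return i;
}
```

A loop such as `for (std::size_t i = 0; i < s.size(); i = utf8_next(s, i))` then visits the start of each code point, and `s.substr(i, utf8_next(s, i) - i)` extracts the bytes of one code point without cutting through the middle of it.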

If you remember that, you can put UTF-8 into std::string, write it to a file, etc. all in the usual way (by which I mean the way you'd use a std::string without UTF-8 inside).
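As an illustration of "the usual way", the sketch below writes a std::string to a file without any conversion. The function name `write_utf8_file` is made up for this example. It assumes the bytes in the string are already UTF-8; opening the stream in binary mode is a choice made here so that no newline translation touches the bytes.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: writes the bytes of `text` to `path` unchanged.
// The stream does not know or care that the bytes are UTF-8.
bool write_utf8_file(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;  // could not open the file
    }
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.flush();
    return out.good();
}
```

Whether a plain string literal such as `"grüß dich"` ends up as UTF-8 depends on the compiler and source encoding, so `\x` escapes (for example `"gr\xC3\xBC\xC3\x9F dich\n"`) are the safer way to spell the bytes out explicitly.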

You may want to start your file with a byte order mark (in UTF-8, the three bytes EF BB BF) so that other programs will know it is UTF-8.
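A byte order mark is simply the first three bytes of the file, so it can be written by hand before the text. This is a sketch under the same assumptions as before: `write_utf8_file_with_bom` and `kUtf8Bom` are names invented for the example, and the text is assumed to be UTF-8 already.

```cpp
#include <fstream>
#include <ios>
#include <string>

// U+FEFF encoded as UTF-8: the three bytes EF BB BF.
const char kUtf8Bom[] = "\xEF\xBB\xBF";

// Hypothetical helper: writes the UTF-8 byte order mark, then the bytes of
// `text` unchanged.
bool write_utf8_file_with_bom(const std::string& path,
                              const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;  // could not open the file
    }
    out.write(kUtf8Bom, 3);  // 3 bytes; the literal's trailing '\0' is skipped
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.flush();
    return out.good();
}
```

Whether to include the mark is a judgment call: some programs use it to detect UTF-8, while others treat those three bytes as ordinary content at the start of the file.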
