QString to Unicode std::string

Published: 2021/5/4 19:18:02. Tags: qt, unicode, encoding, qstring

Question

I know there is plenty of information about converting QString to char*, but I still need some clarification on this point.

Qt provides QTextCodec to convert a QString (which internally stores characters in Unicode) to a QByteArray, which lets me retrieve a char* representing the string in some non-Unicode encoding. But what should I do when I want a Unicode QByteArray?

QTextCodec* codec = QTextCodec::codecForName("UTF-8");
QString qstr = codec->toUnicode("Юникод");
std::string stdstr(reinterpret_cast<const char*>(qstr.constData()), qstr.size() * 2 );  // * 2 since unicode character is twice longer than char
qDebug() << QString(reinterpret_cast<const QChar*>(stdstr.c_str()), stdstr.size() / 2); // same

The code above prints "Юникод", as I expected. But I'd like to know whether this is the right way to get at the Unicode char* of a QString. In particular, the reinterpret_casts and the size arithmetic in this technique look pretty ugly.

Recommended answer

The following applies to Qt 5. Qt 4's behavior was different and, in practice, broken.

You need to choose:

Whether you want the 8-bit wide std::string, the wider std::wstring (16 or 32 bits, depending on the platform), or some other type.

Which encoding you want in the target string.

Internally, QString stores UTF-16 encoded data, so any Unicode code point is represented by one or two QChars.
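To make the "one or two QChars" point concrete, here is a minimal sketch in standard C++ (no Qt required) that encodes a single code point as UTF-16. The helper name encodeUtf16 is hypothetical and not part of Qt; char16_t stands in for the 16-bit value held by a QChar.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Qt API): encode one Unicode code point as UTF-16.
// Yields one code unit for the Basic Multilingual Plane and a surrogate pair
// (two code units) for code points above U+FFFF.
std::u16string encodeUtf16(char32_t cp)
{
    // Surrogate values and anything past U+10FFFF are not valid code points.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        const char32_t v = cp - 0x10000;                            // 20 significant bits
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```

With this sketch, U+042E ("Ю") becomes the single unit 0x042E, while U+1F600 becomes the pair 0xD83D 0xDE00. This is why the number of 16-bit units in a string need not equal its number of code points.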

Common cases:

Locally encoded 8-bit std::string (that is, in the system locale's encoding):

std::string(str.toLocal8Bit().constData())
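One caveat worth knowing when constructing from a bare const char*: std::string stops copying at the first NUL byte. The sketch below uses a plain character buffer in place of a QByteArray to show the difference. The function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer-only construction: copying stops at the first '\0'.
std::string fromPointerOnly(const char* data)
{
    assert(data != nullptr);
    return std::string(data);
}

// Pointer + length construction: embedded '\0' bytes are preserved.
std::string fromPointerAndLength(const char* data, std::size_t length)
{
    assert(data != nullptr);
    return std::string(data, length);
}
```

For the five bytes "ab\0cd", the first function returns a 2-character string and the second a 5-character one. With a QByteArray named ba (an assumed variable), the length-aware form would be std::string(ba.constData(), ba.size()).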

UTF-8 encoded 8-bit std::string:

str.toStdString()

This is equivalent to:

std::string(str.toUtf8().constData())
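For readers curious what a UTF-16 to UTF-8 conversion involves, here is a rough standalone sketch. It is not Qt's implementation; utf16ToUtf8 is a hypothetical helper that assumes well-formed input (every high surrogate is followed by a low surrogate).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not Qt's implementation): transcode well-formed
// UTF-16 into UTF-8, emitting 1 to 4 bytes per code point.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        // Combine a high/low surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            assert(i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF);
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {                 // 1 byte: ASCII
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {         // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {       // 3 bytes
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                         // 4 bytes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Under these assumptions, the single unit 0x042E ("Ю") becomes the two bytes 0xD0 0xAE. This is why the byte length of the UTF-8 result generally differs from the QString's size().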

UTF-16 or UCS-4 encoded std::wstring, 16 or 32 bits wide, respectively. Qt selects the 16-bit or 32-bit encoding to match the width of the platform's wchar_t.

str.toStdWString()

C++11 std::u16string or std::u32string, from Qt 5.5 onwards:

str.toStdU16String()
str.toStdU32String()
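The practical difference between the two results is the unit count: a std::u16string may need two units for one code point, whereas a std::u32string always uses one. The sketch below illustrates that with a hypothetical standard-C++ helper, utf16ToUtf32, which assumes well-formed input and is not a Qt API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Qt API): decode well-formed UTF-16 into UTF-32,
// producing exactly one char32_t per code point.
std::u32string utf16ToUtf32(const std::u16string& in)
{
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // A high surrogate must be followed by a low surrogate.
            assert(i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF);
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i; // the low surrogate has been consumed
        }
        out.push_back(cp);
    }
    return out;
}
```

A three-unit UTF-16 input of 0x042E, 0xD83D, 0xDE00 decodes to two UTF-32 units: U+042E and U+1F600.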

UTF-16 encoded 16-bit std::u16string (this hack is only needed up to Qt 5.4):

std::u16string(reinterpret_cast<const char16_t*>(str.constData()))

This encoding does not include a byte order mark (BOM).

It's easy to prepend a BOM to the QString itself before converting it:

QString src = ...;
src.prepend(QChar::ByteOrderMark);
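#if QT_VERSION < QT_VERSION_CHECK(5,5,0)
// Before Qt 5.5: copy the UTF-16 code units manually (pointer + explicit length).
auto dst = std::u16string(reinterpret_cast<const char16_t*>(src.constData()),
                          static_cast<std::size_t>(src.size()));
#else
auto dst = src.toStdU16String();
#endif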

If you expect the strings to be large, you can skip one copy:

const QString src = ...;
std::u16string dst;
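dst.reserve(src.size() + 1); // room for the BOM plus the text
dst.push_back(char16_t(QChar::ByteOrderMark)); // prepend the BOM (U+FEFF)
dst.append(reinterpret_cast<const char16_t*>(src.constData()),
           src.size()); // copy the code units; std::u16string keeps its own terminator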

In both cases, dst is now portable to systems of either endianness, because the leading BOM lets a reader detect the byte order.
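The BOM only pays off once the code units are written out as bytes and read back elsewhere. Here is a minimal sketch of that round trip in standard C++. The helpers toBytes and fromBytesWithBom are hypothetical names, and the reader assumes the data always starts with a BOM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: write UTF-16 code units as bytes in the requested order.
std::vector<unsigned char> toBytes(const std::u16string& s, bool bigEndian)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t u : s) {
        const unsigned char hi = static_cast<unsigned char>(u >> 8);
        const unsigned char lo = static_cast<unsigned char>(u & 0xFF);
        bytes.push_back(bigEndian ? hi : lo);
        bytes.push_back(bigEndian ? lo : hi);
    }
    return bytes;
}

// Hypothetical helper: read bytes that start with a BOM (U+FEFF), use it to
// detect the byte order, and return the text without the BOM.
std::u16string fromBytesWithBom(const std::vector<unsigned char>& bytes)
{
    assert(bytes.size() >= 2 && bytes.size() % 2 == 0);
    const bool bigEndian = (bytes[0] == 0xFE && bytes[1] == 0xFF);
    assert(bigEndian || (bytes[0] == 0xFF && bytes[1] == 0xFE));

    std::u16string out;
    for (std::size_t i = 2; i < bytes.size(); i += 2) {
        const unsigned hi = bigEndian ? bytes[i] : bytes[i + 1];
        const unsigned lo = bigEndian ? bytes[i + 1] : bytes[i];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

Whichever byte order the writer used, the reader recovers the same text: a BOM-prefixed "Qt" serialized as big-endian or little-endian both decode back to "Qt".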
