How do I HTML-/URL-encode a std::wstring containing Unicode characters?

Views: 228 | Posted: 2016/10/17 11:16:54 | Tags: c++ html url unicode utf-8


Question

I have another question. Suppose I have a std::wstring that looks like this:


ドイツ語で検索していてこちらのサイトにたどり着きました。

How can I get it URL-encoded (%nn, n = 0-9, a-f) to:


%E3%83%89%E3%82%A4%E3%83%84%E8%AA%9E%E3%81%A7%E6%A4%9C%E7%B4%A2%E3%81%97%E3%81%A6%E3%81%84%E3%81%A6%E3%81%93%E3%81%A1%E3%82%89%E3%81%AE%E3%82%B5%E3%82%A4%E3%83%88%E3%81%AB%E3%81%9F%E3%81%A9%E3%82%8A%E7%9D%80%E3%81%8D%E3%81%BE%E3%81%97%E3%81%9F%E3%80%82

... and also HTML-encoded (&#nnn(nn);, n = 0-9(?)) to:

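&#12489;&#12452;&#12484;&#35486;&#12391;&#26908;&#32034;&#12375;&#12390;&#12356;&#12390;&#12371;&#12385;&#12425;&#12398;&#12469;&#12452;&#12488;&#12395;&#12383;&#12393;&#12426;&#30528;&#12365;&#12414;&#12375;&#12383;&#12290;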

Please help me, as I am totally lost right now and don't even know where to start. By the way, performance isn't very important to me right now.

Thanks in advance!

Recommended answer

Here is an example that shows two methods, one based on the Qt library and one based on the ICU library. Both should be fairly platform-independent:

```cpp
#include <iostream>
#include <sstream>
#include <iomanip>
#include <stdexcept>

#include <boost/scoped_array.hpp>

#include <QtCore/QString>
#include <QtCore/QUrl>
#include <QtCore/QVector>

#include <unicode/utypes.h>
#include <unicode/ustring.h>
#include <unicode/unistr.h>
#include <unicode/schriter.h>

void encodeQt()
{
    const QString str = QString::fromWCharArray(L"ドイツ語で検索していてこちらのサイトにたどり着きました。");

    // QUrl percent-encodes the UTF-8 bytes of the non-ASCII characters.
    const QUrl url = str;
    std::cout << "URL encoded: " << url.toEncoded().constData() << std::endl;

    // One decimal numeric character reference per Unicode code point.
    typedef QVector<uint> CodePointVector;
    const CodePointVector codePoints = str.toUcs4();
    std::stringstream htmlEncoded;
    for (CodePointVector::const_iterator it = codePoints.constBegin(); it != codePoints.constEnd(); ++it) {
        htmlEncoded << "&#" << *it << ';';
    }
    std::cout << "HTML encoded: " << htmlEncoded.str() << std::endl;
}

void encodeICU()
{
    const std::wstring cppString = L"ドイツ語で検索していてこちらのサイトにたどり着きました。";

    // wchar_t string -> UTF-16 (at most two UChar per wchar_t).
    int bufSize = cppString.length() * 2;
    boost::scoped_array<UChar> strBuffer(new UChar[bufSize]);
    int size = 0;
    UErrorCode error = U_ZERO_ERROR;
    u_strFromWCS(strBuffer.get(), bufSize, &size, cppString.data(), cppString.length(), &error);
    if (U_FAILURE(error)) return;  // U_FAILURE ignores warning codes, unlike a plain `if (error)`
    const icu::UnicodeString str(strBuffer.get(), size);

    // UTF-16 -> UTF-8 (length * 4 is a generous upper bound).
    bufSize = str.length() * 4;
    boost::scoped_array<char> buffer(new char[bufSize]);
    u_strToUTF8(buffer.get(), bufSize, &size, str.getBuffer(), str.length(), &error);
    if (U_FAILURE(error)) return;
    const std::string urlUtf8(buffer.get(), size);

    // Percent-encode every UTF-8 byte as %NN (uppercase hex, as in the question).
    std::stringstream urlEncoded;
    urlEncoded << std::hex << std::uppercase << std::setfill('0');
    for (std::string::const_iterator it = urlUtf8.begin(); it != urlUtf8.end(); ++it) {
        urlEncoded << '%' << std::setw(2) << static_cast<unsigned int>(static_cast<unsigned char>(*it));
    }
    std::cout << "URL encoded: " << urlEncoded.str() << std::endl;

    // One decimal numeric character reference per Unicode code point.
    std::stringstream htmlEncoded;
    icu::StringCharacterIterator it(str);
    while (it.hasNext()) {
        const UChar32 pt = it.next32PostInc();
        htmlEncoded << "&#" << pt << ';';
    }
    std::cout << "HTML encoded: " << htmlEncoded.str() << std::endl;
}

int main()
{
    encodeQt();
    encodeICU();
}
```
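If pulling in Qt or ICU is not an option, the same two steps (convert to UTF-8 and percent-encode the bytes, or emit one numeric reference per code point) can be sketched with only the standard library. This is not part of the answer above, just a minimal illustration under some assumptions: the helper names `toCodePoints`, `toUtf8`, `urlEncode` and `htmlEncode` are made up for this example, C++11 is available, a 16-bit `wchar_t` holds UTF-16 and a 32-bit one holds code points, and lone surrogates are not validated. Unlike the answer's code, it leaves unreserved ASCII characters unescaped, which makes no difference for the Japanese sample string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: split a std::wstring into Unicode code points.
// With a 16-bit wchar_t, surrogate pairs are combined into one code point;
// with a 32-bit wchar_t, each element is taken as a code point as-is.
std::vector<std::uint32_t> toCodePoints(const std::wstring& ws)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < ws.size(); ++i) {
        std::uint32_t c = static_cast<std::uint32_t>(ws[i]);
        if (sizeof(wchar_t) == 2 && c >= 0xD800 && c <= 0xDBFF && i + 1 < ws.size()) {
            const std::uint32_t lo = static_cast<std::uint32_t>(ws[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: encode code points as UTF-8 bytes.
std::string toUtf8(const std::vector<std::uint32_t>& cps)
{
    std::string out;
    for (std::uint32_t cp : cps) {
        assert(cp <= 0x10FFFF);  // outside the Unicode range: not handled in this sketch
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Percent-encode the UTF-8 bytes, keeping RFC 3986 "unreserved" characters.
std::string urlEncode(const std::wstring& ws)
{
    static const char hex[] = "0123456789ABCDEF";
    const std::string utf8 = toUtf8(toCodePoints(ws));
    std::string out;
    for (unsigned char b : utf8) {
        const bool unreserved =
            (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9') ||
            b == '-' || b == '.' || b == '_' || b == '~';
        if (unreserved) {
            out += static_cast<char>(b);
        } else {
            out += '%';
            out += hex[b >> 4];
            out += hex[b & 0x0F];
        }
    }
    return out;
}

// Emit decimal numeric character references for non-ASCII code points and
// for the ASCII characters that are special in HTML.
std::string htmlEncode(const std::wstring& ws)
{
    const std::vector<std::uint32_t> cps = toCodePoints(ws);
    std::string out;
    for (std::uint32_t cp : cps) {
        const bool special = cp == '&' || cp == '<' || cp == '>' || cp == '"' || cp == '\'';
        if (cp < 0x80 && !special) {
            out += static_cast<char>(cp);
        } else {
            out += "&#";
            out += std::to_string(cp);
            out += ';';
        }
    }
    return out;
}
```

Calling `urlEncode` on the sample string should give the percent-encoded form shown in the question, and `htmlEncode` should give decimal references such as `&#12489;` for ド. Going through code points first, rather than raw `wchar_t` values, is what keeps the result the same whether `wchar_t` is 16 or 32 bits wide.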
