How to decode UTF-8 text to a readable char array in C++

Views: 140 Published: 2019/6/7 13:03:15 C++

Problem description

Hi! I have a file that contains English and Cyrillic words:

\u0074\u0065\u0078\u0074\u0442\u0435\u043a\u0441\u0442

What I have tried:

Using ifstream and the read() method, I copy the file contents into a char array:

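#include <fstream>

std::ifstream file("d:/example.txt", std::ios::in | std::ios::binary);
char buffer[128] = "";
file.seekg(0, std::ios::end);
int data_len = (int)file.tellg();  // -1 if the file could not be opened
file.seekg(0, std::ios::beg);

// Clamp the length so the read cannot overflow the buffer or drop its terminating null.
if (data_len < 0) data_len = 0;
if (data_len > (int)sizeof(buffer) - 1) data_len = (int)sizeof(buffer) - 1;
file.read(buffer, data_len);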

When I output the buffer to a MessageBox, it is displayed as is, not decoded.

How do I decode text that contains English and Cyrillic words into a char array?

Recommended answer

Although I have not tried this yet, I believe you need to read the data as wide characters (wchar_t) rather than as ordinary bytes (char). Your data is probably being lost because you read it byte by byte, and Unicode is notorious in this respect.

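// Adapted from https://stackoverflow.com/a/901617/1762944
// Assumes a UTF-16 file with a 2-byte byte order mark and a 16-bit wchar_t (as on Windows).
#include <cwchar>
#include <fstream>

std::ifstream file;
file.open("k:/test.txt", std::ifstream::in | std::ifstream::binary);

wchar_t buffer[2048] = {};  // zero-initialized, so the text stays null-terminated
file.seekg(2);              // skip the byte order mark
// Read at most one element less than the buffer holds, keeping the terminator.
file.read((char*)buffer, sizeof(buffer) - sizeof(wchar_t));
wprintf(L"%ls\n", buffer);
file.close();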

See also:
visual c++ - Read Unicode files C++ - Stack Overflow
Wide character - Wikipedia
Encoding Overview - Globalization | Microsoft Docs
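
The code above assumes the file stores wide (UTF-16) characters. If the file instead literally contains the text shown in the question, that is, ASCII escape sequences of the form \uXXXX, each escape would have to be parsed and re-encoded. The following is only a minimal sketch under that assumption: the function names are hypothetical, only four-digit escapes are recognized, and surrogate pairs are not combined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append one code point (at most U+FFFF in this sketch) to `out` as UTF-8.
void append_utf8(std::string& out, unsigned cp) {
    assert(cp <= 0xFFFF);  // four hex digits cannot encode more
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Value of a hex digit, or -1 if `c` is not one.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replace every well-formed \uXXXX escape with its UTF-8 encoding;
// all other characters are copied through unchanged.
std::string decode_unicode_escapes(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool is_escape = i + 6 <= in.size() && in[i] == '\\' && in[i + 1] == 'u';
        unsigned cp = 0;
        for (int k = 2; is_escape && k < 6; ++k) {
            int h = hex_value(in[i + k]);
            if (h < 0) {
                is_escape = false;
            } else {
                cp = cp * 16 + static_cast<unsigned>(h);
            }
        }
        if (is_escape) {
            append_utf8(out, cp);
            i += 6;
        } else {
            out += in[i];
            ++i;
        }
    }
    return out;
}
```

Passing the sample line from the question to decode_unicode_escapes yields the UTF-8 bytes of "textтекст". On Windows the result would still need converting to UTF-16 before it can be shown with MessageBoxW.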

Use std::wfstream instead.
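
A minimal sketch of what switching to a wide stream might look like, assuming the file is UTF-8 encoded. The helper name read_utf8_file is hypothetical, and std::codecvt_utf8 has been deprecated since C++17, so treat this as an illustration rather than a recommendation:

```cpp
#include <cassert>
#include <codecvt>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Read a whole UTF-8 file into a wide string (hypothetical helper).
// Returns an empty string if the file cannot be opened.
std::wstring read_utf8_file(const char* path) {
    assert(path != nullptr);
    std::wifstream file;
    // Install the UTF-8 facet before opening so it applies from the first byte.
    file.imbue(std::locale(file.getloc(), new std::codecvt_utf8<wchar_t>));
    file.open(path, std::ios::in | std::ios::binary);
    return std::wstring(std::istreambuf_iterator<wchar_t>(file),
                        std::istreambuf_iterator<wchar_t>());
}
```

The returned std::wstring can be passed to wide-character APIs such as MessageBoxW via c_str(). Where wchar_t is 16 bits, as on Windows, this facet only covers the Basic Multilingual Plane; std::codecvt_utf8_utf16 is the variant that produces UTF-16.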

See Handling simple text files in C/C++ for converting multi-byte (UTF-8) text to Unicode.
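
To make the conversion step concrete, here is a hand-rolled, portable sketch that decodes UTF-8 bytes into Unicode code points. It is not taken from the referenced article; the name utf8_to_utf32 is hypothetical, and the decoder is deliberately lenient: it replaces truncated or stray bytes with U+FFFD but does not reject overlong forms or encoded surrogates.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 into code points; malformed sequences yield U+FFFD.
std::u32string utf8_to_utf32(const std::string& in) {
    const char32_t replacement = 0xFFFD;
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07;
            extra = 3;
        } else {
            // Stray continuation byte or invalid lead byte.
            out += replacement;
            ++i;
            continue;
        }
        ++i;
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // sequence ended early
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        out += ok ? cp : replacement;
    }
    return out;
}
```

On Windows itself, MultiByteToWideChar with the CP_UTF8 code page performs the conversion to UTF-16 directly, which is the form MessageBoxW expects.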
