How to convert a UTF-8 char array to a Windows-1252 char array

Views: 69 | Posted: 2021/9/15 19:47:04 | Tags: c++ unicode utf-8


Problem description

I am new to C++, so I apologize if this is a basic question.

I have a piece of text: ÐŸÐ°Ð²Ð»Ð¾

I get it from the console output of a piece of code I am working on. I know that a Cyrillic word is hidden behind it. Its real value is "Петро".

Using an online encoding detector, I found that to read this text properly I have to convert it from UTF-8 to Windows-1252.

How can I do this in code?

I have tried the following. It produces some output, but it prints 5 question marks (at least the length is what I expected):

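#include <windows.h>
#include <conio.h>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;

// Converts a string in the given code page to a newly allocated UTF-16 string (caller must delete[]).
wchar_t *CodePageToUnicode(int codePage, const char *src)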
{
    if (!src) return 0;
    int srcLen = strlen(src);
    if (!srcLen)
    {
        wchar_t *w = new wchar_t[1];
        w[0] = 0;
        return w;
    }

    int requiredSize = MultiByteToWideChar(codePage,
        0,
        src, srcLen, 0, 0);

    if (!requiredSize)
    {
        return 0;
    }

    wchar_t *w = new wchar_t[requiredSize + 1];
    w[requiredSize] = 0;

    int retval = MultiByteToWideChar(codePage,
        0,
        src, srcLen, w, requiredSize);
    if (!retval)
    {
        delete[] w;
        return 0;
    }

    return w;
}

char *UnicodeToCodePage(int codePage, const wchar_t *src)
{
    if (!src) return 0;
    int srcLen = wcslen(src);
    if (!srcLen)
    {
        char *x = new char[1];
        x[0] = '\0';
        return x;
    }

    int requiredSize = WideCharToMultiByte(codePage,
        0,
        src, srcLen, 0, 0, 0, 0);

    if (!requiredSize)
    {
        return 0;
    }

    char *x = new char[requiredSize + 1];
    x[requiredSize] = 0;

    int retval = WideCharToMultiByte(codePage,
        0,
        src, srcLen, x, requiredSize, 0, 0);
    if (!retval)
    {
        delete[] x;
        return 0;
    }

    return x;
}
int main()
{
    const char *text = "ÐŸÐ°Ð²Ð»Ð¾";

    // Now convert utf-8 back to ANSI:
    wchar_t *wText2 = CodePageToUnicode(65001, text);

    char *ansiText = UnicodeToCodePage(1252, wText2);
    cout << ansiText;
    _getch();

}

I also tried this, but it does not work properly:

int main()
{
    const char *orig = "ÐŸÐ°Ð²Ð»Ð¾";
    size_t origsize = strlen(orig) + 1;
    const size_t newsize = 100;
    size_t convertedChars = 0;
    wchar_t wcstring[newsize];
    mbstowcs_s(&convertedChars, wcstring, origsize, orig, _TRUNCATE);
    wcscat_s(wcstring, L" (wchar_t *)");

    std::wstring strUTF(wcstring);

    const wchar_t* szWCHAR = strUTF.c_str();

    std::wcout << szWCHAR << L'\n';  // cout would print the pointer value for a wchar_t*


    // Size the buffer to match the byte count passed to WideCharToMultiByte below.
    char *buffer = new char[256];

    WideCharToMultiByte(CP_ACP, 0, szWCHAR, -1, buffer, 256, NULL, NULL);

    cout << buffer;
    _getch();
}
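
One way to read the "UTF-8 to Windows-1252" conversion described above: each visible character of the garbled text (Ð, Ÿ, °, ...) stands for a single Windows-1252 byte, and those bytes taken together are the UTF-8 encoding of the Cyrillic word. The sketch below is a portable illustration of that idea and does not use the Windows API. The function names `toCp1252` and `undoMojibake` are hypothetical, and the input is assumed to be the garbled text itself encoded as UTF-8. Note that if the bytes in the program are already the raw bytes D0 9F D0 B0 ..., they are already valid UTF-8 and no conversion is needed, only UTF-8-aware output. For what it is worth, the sample shown appears to decode to "Павло".

```cpp
#include <cstddef>
#include <string>

// Unicode code points for Windows-1252 bytes 0x80..0x9F; 0 marks an unassigned byte.
static const char32_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178};

// Returns the Windows-1252 byte for a code point, or -1 if it has none.
int toCp1252(char32_t cp) {
    // ASCII and the Latin-1 range 0xA0..0xFF map to the same byte value.
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)) return static_cast<int>(cp);
    for (int i = 0; i < 32; ++i)
        if (kCp1252High[i] != 0 && kCp1252High[i] == cp) return 0x80 + i;
    return -1;
}

// Decodes `in` as UTF-8 and re-encodes every code point as one Windows-1252
// byte. Returns false on malformed UTF-8 or on a character that Windows-1252
// cannot represent. (Minimal decoder: overlong forms are not rejected.)
bool undoMojibake(const std::string& in, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < in.size();) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0
                  : (b >> 5) == 0x06 ? 1
                  : (b >> 4) == 0x0E ? 2
                  : (b >> 3) == 0x1E ? 3 : -1;
        if (extra < 0 || i + extra >= in.size()) return false;
        char32_t cp = extra == 0 ? b : (b & (0x3F >> extra));
        for (int k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        int byte = toCp1252(cp);
        if (byte < 0) return false;
        out.push_back(static_cast<char>(byte));
        i += static_cast<std::size_t>(extra) + 1;
    }
    return true;
}
```

Calling `undoMojibake` on the garbled sample yields the bytes D0 9F D0 B0 D0 B2 D0 BB D0 BE, which only display as Cyrillic when the output device interprets them as UTF-8. Passing text that is already Cyrillic makes the function return false, because Windows-1252 has no Cyrillic letters; that limitation would also be consistent with the question marks printed by the first attempt.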
