How can I transform a string to UTF-8 in C#?

Question

I have a string that I receive from a third-party app, and I would like to display it correctly in any language using C# on my Windows Surface.

Due to incorrect encoding, a piece of my string looks like this in Spanish:

Acción

Whereas it should look like this:

Acción

According to the answer to this question: How to know string encoding in C#, the data I am receiving should already be UTF-8, but it is being read with Encoding.Default (probably ANSI?).

I am trying to transform this string into real UTF-8, but one of the problems is that I can only see a subset of the Encoding class (the UTF8 and Unicode properties only), probably because I'm limited to the Windows Surface API.

I have tried some snippets I found on the internet, but none of them has worked so far for eastern languages (e.g. Korean). One example is as follows:

var utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(myString);
myString = utf8.GetString(utfBytes, 0, utfBytes.Length);

I also tried extracting the string into a byte array and then using UTF8.GetString:

byte[] myByteArray = new byte[myString.Length];
for (int ix = 0; ix < myString.Length; ++ix)
{
    char ch = myString[ix];
    myByteArray[ix] = (byte) ch;
}

myString = Encoding.UTF8.GetString(myByteArray, 0, myString.Length);

Do you guys have any other ideas that I could try?

Recommended answer

Since you know the string is coming in as Encoding.Default, you could simply use:

byte[] bytes = Encoding.Default.GetBytes(myString);
myString = Encoding.UTF8.GetString(bytes);
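
One caveat with the two lines above: what Encoding.Default means depends on the runtime. On .NET Framework it is the system ANSI code page, but on .NET Core and .NET 5+ it is always UTF-8, so there the round trip would change nothing. The following is only a minimal sketch under the assumption that the UTF-8 bytes were wrongly decoded as ISO-8859-1 (Latin-1). The names MojibakeFixer and FixLatin1Mojibake are hypothetical and not part of any library.

```csharp
using System;
using System.Text;

public static class MojibakeFixer
{
    // Hypothetical helper. Assumption: the UTF-8 bytes were wrongly
    // decoded as ISO-8859-1 before the string reached this code.
    public static string FixLatin1Mojibake(string garbled)
    {
        // ISO-8859-1 maps each char U+0000..U+00FF back to the same byte value,
        // which recovers the bytes that were originally received.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] originalBytes = latin1.GetBytes(garbled);

        // Decode those bytes with the encoding they were actually written in.
        return Encoding.UTF8.GetString(originalBytes, 0, originalBytes.Length);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // "Acci\u00C3\u00B3n" is the garbled "Acción" from the question.
        Console.WriteLine(FixLatin1Mojibake("Acci\u00C3\u00B3n")); // prints "Acción"
    }
}
```

If the wrong decoder was Windows-1252 rather than Latin-1, bytes in the range 0x80 to 0x9F become characters outside Latin-1 (0x80 becomes '€', for example), and this sketch would lose them. That may be one reason a byte-cast approach fails for Korean text, whose UTF-8 sequences use bytes in that range. In that case the same round trip would need the Windows-1252 encoding instead, which on .NET Core requires registering CodePagesEncodingProvider first.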

Another thing to remember: if you use Console.WriteLine to output strings, you should also set Console.OutputEncoding = System.Text.Encoding.UTF8; otherwise the strings may be written in the console's default code page (GBK, for example) and come out garbled.
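
As a small illustration of the console setting, here is a sketch of a console program that switches the output encoding before writing non-ASCII text. The class name ConsoleUtf8Demo is made up for this example, and whether the glyphs actually render also depends on the console font.

```csharp
using System;
using System.Text;

public static class ConsoleUtf8Demo
{
    public static void Main()
    {
        // Ask the console to emit UTF-8 bytes instead of the system code page.
        Console.OutputEncoding = Encoding.UTF8;

        // Escapes are used so the sample does not depend on the source file's encoding.
        Console.WriteLine("Acci\u00F3n");        // Spanish: Acción
        Console.WriteLine("\uD55C\uAD6D\uC5B4"); // Korean: 한국어
    }
}
```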
