How can I transform a string to UTF-8 in C#?


Question

I have a string that I receive from a third-party app, and I would like to display it correctly, whatever its language, using C# on my Windows Surface.

Due to incorrect encoding, a piece of my string looks like this in Spanish:


AcciÃ³n

whereas it should look like this:


Acción


According to the answer to this question: How to know string encoding in C#, the string I am receiving should already be encoded as UTF-8, but it is being read with Encoding.Default (probably ANSI?).

I am trying to transform this string into real UTF-8, but one of the problems is that I can only see a subset of the Encoding class (the UTF8 and Unicode properties only), probably because I'm limited to the Windows Surface API.

I have tried some snippets I found on the internet, but so far none of them has worked for Eastern languages (e.g. Korean). One example is as follows:

var utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(myString);
myString = utf8.GetString(utfBytes, 0, utfBytes.Length);

I also tried extracting the string into a byte array and then using UTF8.GetString:

byte[] myByteArray = new byte[myString.Length];
for (int ix = 0; ix < myString.Length; ++ix)
{
    char ch = myString[ix];
    myByteArray[ix] = (byte) ch;
}

myString = Encoding.UTF8.GetString(myByteArray, 0, myString.Length);

Do you guys have any other ideas that I could try?

Recommended answer

Since you know the string is coming in as Encoding.Default, you could simply use:

byte[] bytes = Encoding.Default.GetBytes(myString);
myString = Encoding.UTF8.GetString(bytes);
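
To see why this round trip works, here is a minimal, self-contained sketch. It is not part of the original answer: the class name MojibakeRepair and the method Repair are hypothetical, and ISO-8859-1 is only an assumed stand-in for whatever single-byte code page Encoding.Default was on the machine that misread the data. Note that on .NET Core and .NET 5+, Encoding.Default is always UTF-8, so there the two lines above would leave the string unchanged and the wrong encoding has to be named explicitly.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: turns the garbled string back into the bytes it was
    // built from, then decodes those bytes as UTF-8.
    public static string Repair(string garbled, Encoding wrongEncoding)
    {
        byte[] originalBytes = wrongEncoding.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes, 0, originalBytes.Length);
    }

    public static void Main()
    {
        // Assumption: the data was misread as ISO-8859-1 (Latin-1).
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // "Acción" and a Korean sample, written with escapes so the source
        // file's own encoding does not matter.
        string[] samples = { "Acci\u00F3n", "\uD55C\uAD6D\uC5B4" };

        foreach (string expected in samples)
        {
            // Simulate the bug: UTF-8 bytes decoded with the wrong encoding.
            byte[] utf8Bytes = Encoding.UTF8.GetBytes(expected);
            string garbled = latin1.GetString(utf8Bytes, 0, utf8Bytes.Length);

            string repaired = Repair(garbled, latin1);
            Console.WriteLine(repaired == expected); // True for both samples
        }
    }
}
```

The repair only succeeds when the encoding passed to Repair is the one that actually produced the garbled text and that encoding maps every byte value to a distinct character. If the bytes are still available, decoding them directly with Encoding.UTF8.GetString avoids the problem altogether.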

Another thing to keep in mind: if you are using Console.WriteLine to output strings, you should also set Console.OutputEncoding = System.Text.Encoding.UTF8; otherwise the strings are written in the console's default code page (GBK on my system) and non-ASCII characters may come out garbled.
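
As a small illustration of the console setting, assuming a plain console application (the class name ConsoleUtf8Demo is made up for this sketch), the encoding is set once before any output is written:

```csharp
using System;
using System.Text;

public static class ConsoleUtf8Demo
{
    public static void Main()
    {
        // Ask the console to encode everything written from here on as UTF-8.
        Console.OutputEncoding = Encoding.UTF8;

        Console.WriteLine("Acci\u00F3n");        // Spanish sample
        Console.WriteLine("\uD55C\uAD6D\uC5B4"); // Korean sample
    }
}
```

Whether the Korean characters actually render still depends on the console font; the setting only controls how the characters are encoded on output.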
