Conversion from UTF8 to ASCII

Question

I have text read from an XML file stored in UTF8 encoding. C# reads it perfectly (I checked with the debugger), but when I try to convert it to ASCII to save it in another file, I get a ? char wherever there was a character that ASCII cannot represent. For instance, this text:

string s = "La introducción masiva de las nuevas tecnologías de la información";

Will be saved as

"La introducci?n masiva de las nuevas tecnolog?as de la informaci?n"

I cannot just replace them with their plain Latin vowels (a, e, i, o, u) because some words in Spanish would lose their meaning. I've already tried this question and this one (http://stackoverflow.com/questions/497782/how-to-convert-a-string-from-utf8-to-ascii-single-byte-in-c) with no success, so I'm hoping someone can help me. The selected answer in the second one didn't even compile...!

In case someone wants to take a look, my code is this one:

private void WriteInput( string input )
{
   byte[] byteArray = Encoding.UTF8.GetBytes(input);
   byte[] asciiArray = Encoding.Convert(Encoding.UTF8, Encoding.ASCII, byteArray);
   string finalString = Encoding.ASCII.GetString(asciiArray);

   string inputFile = _idFile + ".in";
   var batchWriter = new StreamWriter(inputFile, false, Encoding.ASCII);
   batchWriter.Write(finalString);
   batchWriter.Close();
}

Answer

Those characters have no mapping in ASCII, which defines only code points 0 through 127, so the encoder substitutes ? for each of them. Review an ASCII table, like Wikipedia's, to verify this. You might be interested in the Windows-1252 encoding, one of the code pages loosely called "extended ASCII", which has code points for many accented characters, Spanish included.

using System.IO;
using System.Text;

// Every accented letter in this sentence has a code point in Windows-1252.
var input = "La introducción masiva de las nuevas tecnologías de la información";
var utf8bytes = Encoding.UTF8.GetBytes(input);
var win1252Bytes = Encoding.Convert(
                Encoding.UTF8, Encoding.GetEncoding("windows-1252"), utf8bytes);
File.WriteAllBytes(@"foo.txt", win1252Bytes);
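To check up front whether a string will survive a given encoding, one option is to switch the encoder to an exception fallback instead of the default ? substitution. The following is only a sketch under a few assumptions: it targets .NET 5 or later (top-level statements, and Windows-1252 has to be registered through `CodePagesEncodingProvider` before `Encoding.GetEncoding` can find it), and `CanEncode` is a hypothetical helper name, not part of the framework. It also writes the file with `File.WriteAllText`, which skips the intermediate UTF8 byte array.

```csharp
using System;
using System.IO;
using System.Text;

// Assumption: .NET 5+. On .NET Framework, Windows-1252 is available without this call.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

var input = "La introducción masiva de las nuevas tecnologías de la información";
var win1252 = Encoding.GetEncoding("windows-1252");

Console.WriteLine(CanEncode(input, Encoding.ASCII)); // False
Console.WriteLine(CanEncode(input, win1252));        // True

// Write the string directly in the target encoding.
File.WriteAllText("foo.txt", input, win1252);

// Hypothetical helper: true when every character of the text has a mapping
// in the target encoding, so nothing would be replaced with '?'.
static bool CanEncode(string text, Encoding target)
{
    // Clone() returns a writable copy, so the fallback can be changed.
    var strict = (Encoding)target.Clone();
    strict.EncoderFallback = EncoderFallback.ExceptionFallback;
    try
    {
        strict.GetBytes(text);
        return true;
    }
    catch (EncoderFallbackException)
    {
        return false;
    }
}
```

Whatever reads foo.txt afterwards has to open it as Windows-1252 as well; a reader that assumes UTF8 or ASCII will not decode the accented characters correctly.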
