How to convert a Unicode character to its ASCII equivalent

Question

In C# I'm getting information from a legacy ACCESS database. .NET converts the content of the database (in the case of this problem a string) to Unicode before handing the content to me.

How do I convert this Unicode string back to its ASCII equivalent?


Edit
Unicode char 710 is indeed MODIFIER LETTER CIRCUMFLEX ACCENT. Here is the problem stated a bit more precisely:

 -> (Extended) ASCII character ê (Extended ASCII 136) was inserted in the database.
 -> Either Access or the reading component in .NET converted this to U+02C6 U+0065
    (MODIFIER LETTER CIRCUMFLEX ACCENT + LATIN SMALL LETTER E)
 -> I need the (Extended) ASCII character 136 back.


Here's what I've tried (I see now why this did not work...):

string myInput = Convert.ToString(Convert.ToChar(710));
byte[] asBytes = Encoding.ASCII.GetBytes(myInput);

But this does not result in 94; it results in a byte with value 63...
Here's a new try, but it still does not work:

byte[] bytes = Encoding.ASCII.GetBytes("ê");
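Both attempts give 63 for the same reason: Encoding.ASCII only covers code points 0 to 127, and by default it replaces anything outside that range with '?' (byte value 63). A minimal sketch of that behaviour follows; the class name AsciiFallbackDemo and the helper EncodeAscii are made up for illustration and are not part of the original question.

```csharp
using System;
using System.Text;

static class AsciiFallbackDemo
{
    // Hypothetical helper: encode a string with the 7-bit ASCII encoder.
    public static byte[] EncodeAscii(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    static void Main()
    {
        // U+02C6 (decimal 710) is not an ASCII character, so the encoder
        // falls back to its default replacement character '?'.
        string circumflex = Convert.ToString(Convert.ToChar(710));
        Console.WriteLine(EncodeAscii(circumflex)[0]); // 63

        // U+00EA (ê) is also outside 0..127 and gets the same treatment.
        Console.WriteLine(EncodeAscii("\u00EA")[0]); // 63

        // A genuine ASCII character survives unchanged: '^' is 94.
        Console.WriteLine(EncodeAscii("^")[0]); // 94
    }
}
```

So the 63 is not a wrong conversion of the circumflex; it signals that the target encoding has no slot for the character at all.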


Solution
Thanks to both csgero (http://stackoverflow.com/questions/138449/how-to-convert-a-unicode-character-to-its-extended-ascii-equivalent#138579) and bzlm (http://stackoverflow.com/questions/138449/how-to-convert-a-unicode-character-to-its-extended-ascii-equivalent#138583) for pointing in the right direction. I solved the problem here: http://stackoverflow.com/questions/138449/how-to-convert-a-unicode-character-to-its-ascii-equivalent#141816

Recommended answer

Okay, let's elaborate. Both csgero and bzlm pointed in the right direction.

Because of bzlm's reply I looked up the Windows-1252 page on Wikipedia and found that it's called a codepage. The Wikipedia article on code pages states the following:

No formal standard existed for these ‘extended character sets’; IBM merely referred to the variants as code pages, as it had always done for variants of EBCDIC encodings.

This led me to codepage 437:

In ASCII-compatible code pages, the lower 128 characters maintained their standard US-ASCII values, and different pages (or sets of characters) could be made available in the upper 128 characters. DOS computers built for the North American market, for example, used code page 437, which included accented characters needed for French, German, and a few other European languages, as well as some graphical line-drawing characters.

So codepage 437 was the codepage I had been calling 'extended ASCII'. It has ê as character 136, and the other characters I looked up seem right as well.

csgero came up with the Encoding.GetEncoding() hint, which I used to create the following statement that solves my problem:

byte[] bytes = Encoding.GetEncoding(437).GetBytes("ê");
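For anyone who wants to check the result, here is a small self-contained sketch built around that statement. The class name CodePage437Demo and the helper ToCodePage437 are made up for illustration. One assumption worth flagging: on modern .NET (.NET Core and .NET 5+) legacy code pages such as 437 are only available after registering CodePagesEncodingProvider, whereas on the classic .NET Framework the registration line should not be needed.

```csharp
using System;
using System.Text;

static class CodePage437Demo
{
    // Hypothetical helper: encode a string using code page 437.
    public static byte[] ToCodePage437(string text)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(437) can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(437).GetBytes(text);
    }

    static void Main()
    {
        // U+00EA (ê) maps to byte 136 (0x88) in code page 437.
        byte[] bytes = ToCodePage437("\u00EA");
        Console.WriteLine(bytes[0]); // 136

        // Decoding byte 136 with the same code page gives ê back.
        string roundTrip = Encoding.GetEncoding(437).GetString(new byte[] { 136 });
        Console.WriteLine(roundTrip == "\u00EA"); // True
    }
}
```

The "\u00EA" escape is used instead of a literal ê so that the result does not depend on the encoding the source file was saved in.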
