C# Encoding.Converting Latin to Hebrew
Problem description
I'm trying to fetch and parse an online Excel document that is written in Hebrew but, unfortunately, stored in a non-Hebrew encoding.
As an example, I'm trying to convert the string "âìéåï_1", which is the name of the first sheet, to Hebrew using C# code, but I'm unable to do so.
I know the string is convertible: when I open it in Notepad++ and select Encoding/Character Sets/Hebrew/Windows-1255, I see "גליון_1", which is the correct Hebrew representation of the string above.
I'm using the code below:
string str = "âìéåï_1";
Encoding windows = Encoding.GetEncoding("Windows-1255");
Encoding ascii = Encoding.GetEncoding("Windows-1252");
byte[] asciiBytes = ascii.GetBytes(str);
byte[] windowsBytes = Encoding.Convert(ascii, windows, asciiBytes);
char[] windowsChars = new char[windows.GetCharCount(windowsBytes, 0, windowsBytes.Length)];
windows.GetChars(windowsBytes, 0, windowsBytes.Length, windowsChars, 0);
string windowsString = new string(windowsChars);
I assumed that the original string is encoded as Windows-1252, since the string stays the same when I paste it into Notepad++ and change the encoding to Windows-1252.
I'm probably doing something wrong here. Does anyone know how to convert the string correctly?
Thanks,
Mikey
Recommended answer
const string Str = "âìéåï_1";
Encoding latinEncoding = Encoding.GetEncoding("Windows-1252");
Encoding hebrewEncoding = Encoding.GetEncoding("Windows-1255");
byte[] latinBytes = latinEncoding.GetBytes(Str);
string hebrewString = hebrewEncoding.GetString(latinBytes);
hebrewString:
גליון_1
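The same two steps can be wrapped in a small reusable helper. This is only a sketch: the class name MojibakeRepair and the method Reinterpret are hypothetical names chosen for illustration, and the RegisterProvider call assumes a runtime (.NET Core / .NET 5+) where the Windows code pages are not available by default.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: recovers text that was decoded with the wrong code page
    // by turning it back into its original bytes and decoding those bytes again.
    public static string Reinterpret(string garbled, string wrongCodePage, string rightCodePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where Windows code pages
        // must be registered before Encoding.GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding wrong = Encoding.GetEncoding(wrongCodePage);
        Encoding right = Encoding.GetEncoding(rightCodePage);

        // Step 1: get back the raw bytes the text originally came from.
        byte[] originalBytes = wrong.GetBytes(garbled);

        // Step 2: decode those same bytes with the code page they were written in.
        return right.GetString(originalBytes);
    }

    public static void Main()
    {
        // The console may need a font and output encoding that can show Hebrew.
        Console.WriteLine(Reinterpret("âìéåï_1", "windows-1252", "windows-1255"));
    }
}
```

The point is that the bytes are never changed; only the code page used to read them changes.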
In your example, "Windows-1252" is not actually ASCII; it is an extended ASCII code page. For some reason, Encoding.Convert with these two encodings cannot convert the extended range, so every character above 127 comes out as 63 (i.e. '?'). When "converting" from one extended-ASCII byte[] to another, I would expect the bytes to stay the same; only when you decode them into a .NET Unicode string would I expect the results to differ. I'm not sure why Convert turns the characters above 127 into '?'.
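A likely explanation, as far as I understand Encoding.Convert: it does not copy bytes across. It decodes the source bytes into Unicode characters and then encodes those characters with the target encoding. Latin letters such as â have no slot in Windows-1255, so the encoder falls back to a substitute, which matches the '?' reported above. The sketch below compares Convert with an explicit decode-then-encode. The names ConvertDemo, CodePage and DecodeThenEncode are hypothetical, and the RegisterProvider call assumes .NET Core / .NET 5+.

```csharp
using System;
using System.Linq;
using System.Text;

public static class ConvertDemo
{
    // Hypothetical helper that looks up a Windows code page by name.
    public static Encoding CodePage(string name)
    {
        // Assumption: .NET Core / .NET 5+, where code pages must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(name);
    }

    // Decode the bytes to a Unicode string, then re-encode that string.
    public static byte[] DecodeThenEncode(Encoding source, Encoding target, byte[] bytes)
    {
        string text = source.GetString(bytes);
        return target.GetBytes(text);
    }

    public static void Main()
    {
        Encoding latin = CodePage("windows-1252");
        Encoding hebrew = CodePage("windows-1255");

        // Bytes of the garbled sheet name: E2-EC-E9-E5-EF-5F-31.
        byte[] latinBytes = latin.GetBytes("âìéåï_1");
        Console.WriteLine(BitConverter.ToString(latinBytes));

        byte[] viaConvert = Encoding.Convert(latin, hebrew, latinBytes);
        byte[] viaTwoSteps = DecodeThenEncode(latin, hebrew, latinBytes);

        // Shows what Convert produced and whether the two routes agree.
        Console.WriteLine(BitConverter.ToString(viaConvert));
        Console.WriteLine(viaConvert.SequenceEqual(viaTwoSteps));
    }
}
```

Calling hebrewEncoding.GetString on the Latin bytes skips the re-encoding step, which is why it preserves the bytes and yields the Hebrew text.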