C# Encoding.Converting Latin to Hebrew

Views: 269  Published: 2015/11/25 16:04:05  Tags: c# .net encoding hebrew


Question

I'm trying to fetch and parse an online Excel document that is written in Hebrew but, unfortunately, stored in a non-Hebrew encoding.

As an example, I'm trying to convert the following string, "âìéåï_1", which serves as the first sheet name, to Hebrew using C# code, but I'm unable to do so.

I know the above is convertible, since when I open it in Notepad++ and select Encoding/Character Sets/Hebrew/Windows-1255, I see "גליון_1", which is the correct Hebrew representation of the above string.

I'm using the code below:

            string str = "âìéåï_1";

            Encoding windows = Encoding.GetEncoding("Windows-1255");
            Encoding ascii = Encoding.GetEncoding("Windows-1252");
            byte[] asciiBytes = ascii.GetBytes(str);
            byte[] windowsBytes = Encoding.Convert(ascii, windows, asciiBytes);

            char[] windowsChars = new char[windows.GetCharCount(windowsBytes, 0, windowsBytes.Length)];
            windows.GetChars(windowsBytes, 0, windowsBytes.Length, windowsChars, 0);
            string windowsString = new string(windowsChars);

I assumed that the original string is encoded as Windows-1252, since when I paste it into Notepad++ and change the encoding to Windows-1252, the string remains the same.

I'm probably doing something wrong here. Does anyone know how to convert the above correctly?

Thanks,

Mikey

Solution

const string Str = "âìéåï_1";

Encoding latinEncoding = Encoding.GetEncoding("Windows-1252");
Encoding hebrewEncoding = Encoding.GetEncoding("Windows-1255");

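// Re-encode the garbled text as Windows-1252 to recover the original bytes.
byte[] latinBytes = latinEncoding.GetBytes(Str);

// Decode those same bytes as Windows-1255 to get the Hebrew text.
string hebrewString = hebrewEncoding.GetString(latinBytes);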

hebrewString:

גליון_1
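
If the same fix is needed in more than one place, the snippet above could be wrapped in a small helper. This is only a sketch under a few assumptions: MojibakeFix and Redecode are hypothetical names, not part of .NET, and the code targets .NET Core 3.0 or later, where the Windows code pages have to be registered through CodePagesEncodingProvider before they can be looked up (on .NET Framework that registration line is not needed). It also asks for exception fallbacks, so a character that cannot be mapped back to a byte raises an error instead of silently turning into '?'.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of .NET: repairs text that was decoded
// with the wrong single-byte code page.
public static class MojibakeFix
{
    static MojibakeFix()
    {
        // Assumption: running on .NET Core / .NET 5+, where Windows code pages
        // are only available after registering this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    public static string Redecode(string garbled, string wrongName, string rightName)
    {
        // Throw on unmappable characters rather than substituting '?'.
        Encoding wrong = Encoding.GetEncoding(
            wrongName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        Encoding right = Encoding.GetEncoding(rightName);

        // Step 1: recover the raw bytes by undoing the wrong decoding.
        byte[] originalBytes = wrong.GetBytes(garbled);

        // Step 2: decode the unchanged bytes with the intended code page.
        return right.GetString(originalBytes);
    }
}
```

Called as MojibakeFix.Redecode("âìéåï_1", "Windows-1252", "Windows-1255"), this should return the same Hebrew string as the snippet above.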

In your example, Windows-1252 is not actually ASCII; it is an extended-ASCII code page. Encoding.Convert does not relabel bytes: it decodes the source bytes into Unicode characters and then re-encodes those characters in the target encoding. The Latin characters â, ì, é, å and ï have no Windows-1255 equivalent, so every character above 127 is converted to 63 (i.e. '?'). What is needed here is the opposite: the bytes stay the same, and they only turn into different characters when they are decoded into a .NET Unicode string with the Hebrew encoding, as in the code above.
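
The behaviour of Encoding.Convert can be checked directly. The sketch below compares the bytes it returns with the bytes produced by decoding to a string and re-encoding by hand. ConvertDemo and Compare are made-up names, and the provider registration again assumes .NET Core 3.0 or later. If the two results match, Convert is transcoding characters rather than reinterpreting bytes, which would account for the '?' bytes in the question.

```csharp
using System;
using System.Text;

// Illustrative only: the class and method names are not from the original post.
public static class ConvertDemo
{
    // Returns three byte sequences as hex strings so they can be compared.
    public static (string Raw, string Converted, string Manual) Compare(string text)
    {
        // Assumption: .NET Core / .NET 5+, where Windows code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding latin = Encoding.GetEncoding("Windows-1252");
        Encoding hebrew = Encoding.GetEncoding("Windows-1255");

        // The bytes as they were originally stored.
        byte[] raw = latin.GetBytes(text);

        // What the question's code does.
        byte[] converted = Encoding.Convert(latin, hebrew, raw);

        // Explicit decode-then-encode, for comparison with Convert.
        byte[] manual = hebrew.GetBytes(latin.GetString(raw));

        return (BitConverter.ToString(raw),
                BitConverter.ToString(converted),
                BitConverter.ToString(manual));
    }
}
```

For "âìéåï_1" the Raw value is E2-EC-E9-E5-EF-5F-31. Those are exactly the bytes that Windows-1255 decodes to the Hebrew text, which is why passing them unchanged to GetString works.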
