Arabic Presentation Forms-B support in C#


Question

I am trying to convert a file from UTF-8 to the Arabic windows-1256 encoding using the Encoding APIs in C#, but some characters are not converted correctly. For example, "لا" in the text "ﻣﺣﻣد ﺻﻼ ح عادل" comes out as "ﻣﺣﻣد ﺻ? ح عادل". Some friends told me this happens because these characters come from the Arabic Presentation Forms-B block. I created the file in Notepad++ and saved it as UTF-8.

Here is the code I am using:

    StreamReader sr = new StreamReader(@"C:\utf-8.txt", Encoding.UTF8);
    string str = sr.ReadLine();
    StreamWriter sw = new StreamWriter(@"C:\windows-1256.txt", false, Encoding.GetEncoding("windows-1256"));
    sw.Write(str);
    sw.Flush();
    sw.Close();



However, I don't know how to convert the file correctly in C# when it contains these presentation forms.

Recommended answer

Yes, your string contains ligatures that cannot be represented in the 1256 code page, so the encoder replaces them with "?". You have to decompose the string into its plain letters before writing it, like this:

  // Compatibility decomposition turns presentation-form ligatures into their base letters.
  str = str.Normalize(NormalizationForm.FormKD);
  sw.Write(str);
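To make the idea concrete, here is a minimal sketch of a helper built around the same normalization step. The class and method names (`ArabicConverter`, `ToWindows1256`, `HasPresentationFormsB`) are hypothetical and not part of the original answer. It makes two assumptions worth calling out. First, it uses `FormKC` rather than `FormKD`: both decompose the presentation forms, but `FormKD` also splits letters such as "أ" into a base letter plus a combining hamza, and as far as I know that combining mark has no slot in windows-1256, so recomposing afterwards looks safer. Second, it uses an exception fallback so that any character that still cannot be encoded raises an error instead of silently becoming "?".

```csharp
using System;
using System.Text;

// Hypothetical helper, shown only as an illustration.
static class ArabicConverter
{
    static ArabicConverter()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings such as windows-1256 must be registered first.
        // On .NET Framework this line is not needed and can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Letters and ligatures of Arabic Presentation Forms-B sit in U+FE70..U+FEFC
    // (U+FEFF, the byte order mark, is deliberately excluded here).
    public static bool HasPresentationFormsB(string s)
    {
        foreach (char c in s)
        {
            if (c >= '\uFE70' && c <= '\uFEFC') return true;
        }
        return false;
    }

    public static byte[] ToWindows1256(string text)
    {
        // Compatibility decomposition followed by canonical composition:
        // presentation forms become plain letters, composed letters stay composed.
        string normalized = text.Normalize(NormalizationForm.FormKC);

        // Throw EncoderFallbackException instead of writing '?'.
        Encoding strict = Encoding.GetEncoding(
            "windows-1256",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        return strict.GetBytes(normalized);
    }
}
```

The resulting bytes can be written with `File.WriteAllBytes`, or the normalized string can be passed to the `StreamWriter` from the question. If an `EncoderFallbackException` is still thrown, the text contains characters outside windows-1256 that normalization cannot map, and they need separate handling.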
