Replacing characters in C# (ASCII)

Question

I have a file containing characters such as à, è, ì, ò, ù and À. I need to replace them with plain characters, e.g. à = a, è = e, and so on. This is my code so far:

StreamWriter sw = new StreamWriter(@"C:/JoinerOutput.csv");
string path = @"C:/Joiner.csv";
string line = File.ReadAllText(path);

if (line.Contains("à"))
{
    string asAscii = Encoding.ASCII.GetString(Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding(Encoding.ASCII.EncodingName, new EncoderReplacementFallback("a"), new DecoderExceptionFallback()), Encoding.UTF8.GetBytes(line)));
    Console.WriteLine(asAscii);
    Console.ReadLine();

    sw.WriteLine(asAscii);
    sw.Flush();
}

Basically, this searches the file for a specific character and replaces it with another. The problem I am having is that my if statement doesn't work. How do I go about solving this?

This is a sample of the input file:


Dimàkàtso Mokgàlo
Màmà Ràtlàdi
Koos Nèl
Pàsèkà Modisè
Jèrèmiàh Morèmi
Khèthiwè Buthèlèzi
Tiànà Pillày
Viviàn Màswàngànyè
Thirèshàn Rèddy
Wàdè Cornèlius
ènos Nètshimbupfè

This is the output if I use line = line.Replace('à', 'a'); instead:


Ch�rl�n� Kirst�n
M�m� R�tl�di
Koos N�l
P�s�k� Modis�
J�r�mi�h Mor�mi
Kh�thiw� Buth�l�zi
Ti�n� Pill�y
Vivi�n M�sw�ng�ny�
Thir�sh�n R�ddy
W�d� Corn�lius
�nos N�tshimbupf�

With my code, the symbol is removed completely.

Recommended answer

I don't know if this is useful, but in an internal tool that writes messages to an LED screen we use the following replacements (I'm sure there are smarter ways to do this using the Unicode tables, but this is enough for a small internal tool):

        strMessage = Regex.Replace(strMessage, "[éèëêð]", "e");
        strMessage = Regex.Replace(strMessage, "[ÉÈËÊ]", "E");
        strMessage = Regex.Replace(strMessage, "[àâä]", "a");
        strMessage = Regex.Replace(strMessage, "[ÀÁÂÃÄÅ]", "A");
        strMessage = Regex.Replace(strMessage, "[àáâãäå]", "a");
        strMessage = Regex.Replace(strMessage, "[ÙÚÛÜ]", "U");
        strMessage = Regex.Replace(strMessage, "[ùúûüµ]", "u");
        strMessage = Regex.Replace(strMessage, "[òóôõöø]", "o");
        strMessage = Regex.Replace(strMessage, "[ÒÓÔÕÖØ]", "O");
        strMessage = Regex.Replace(strMessage, "[ìíîï]", "i");
        strMessage = Regex.Replace(strMessage, "[ÌÍÎÏ]", "I");
        strMessage = Regex.Replace(strMessage, "[š]", "s");
        strMessage = Regex.Replace(strMessage, "[Š]", "S");
        strMessage = Regex.Replace(strMessage, "[ñ]", "n");
        strMessage = Regex.Replace(strMessage, "[Ñ]", "N");
        strMessage = Regex.Replace(strMessage, "[ç]", "c");
        strMessage = Regex.Replace(strMessage, "[Ç]", "C");
        strMessage = Regex.Replace(strMessage, "[ÿ]", "y");
        strMessage = Regex.Replace(strMessage, "[Ÿ]", "Y");
        strMessage = Regex.Replace(strMessage, "[ž]", "z");
        strMessage = Regex.Replace(strMessage, "[Ž]", "Z");
        strMessage = Regex.Replace(strMessage, "[Ð]", "D");
        strMessage = Regex.Replace(strMessage, "[œ]", "oe");
        strMessage = Regex.Replace(strMessage, "[Œ]", "Oe");
        strMessage = Regex.Replace(strMessage, "[«»\u201C\u201D\u201E\u201F\u2033\u2036]", "\"");
        strMessage = Regex.Replace(strMessage, "[\u2026]", "...");

One thing to note: in most languages the text is still understandable after this treatment, but not always, and the reader will often have to rely on the context of the sentence to make sense of it. That is not something you want if you have the choice.

Note that the correct solution would be to use the Unicode tables: replace each character that has an integrated diacritic with its decomposed form (base character plus combining diacritical marks), then remove the diacritics.
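The decompose-then-strip idea can be sketched with .NET's built-in Unicode normalization. This is only an illustration of the approach described above, not code from the original answer; the class and method names (DiacriticsSketch, RemoveDiacritics) are hypothetical.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper class illustrating the Unicode-table approach.
public static class DiacriticsSketch
{
    public static string RemoveDiacritics(string text)
    {
        // Form D (canonical decomposition) splits a character such as 'à'
        // into its base letter 'a' followed by U+0300 (combining grave accent).
        string decomposed = text.Normalize(NormalizationForm.FormD);

        var result = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks, keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                result.Append(c);
            }
        }

        // Recompose whatever is left into the usual composed form.
        return result.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Characters that have no canonical decomposition, such as ø, Ð or œ, pass through unchanged, so a small replacement table like the one above is still needed for those.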
