Converting "Bizarre" Chars in String to Roman Chars

Question

I need to be able to convert user input to [a-z] roman characters ONLY (not case sensitive). So, there are only 26 characters that I am interested in.

However, the user can type in any "form" of those characters that they wish. The Spanish "n", the French "e", and the German "u" can all have accents from the user input (which are removed by the program).

I've gotten pretty close with these two extension methods:

    // Keeps only the characters that .NET classifies as letters.
    public static string LettersOnly(this string Instring)
    {
        StringBuilder sb = new StringBuilder(Instring.Length);

        foreach (char c in Instring)
        {
            if (char.IsLetter(c))
            {
                sb.Append(c);
            }
        }

        return sb.ToString();
    }

    // Strips accents: decompose each character (FormD), then drop the combining marks.
    // Requires: using System.Text; (for StringBuilder and NormalizationForm)
    public static string RemoveAccentMarks(this string s)
    {
        string normalizedString = s.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder();

        foreach (char c in normalizedString)
        {
            if (System.Globalization.CharUnicodeInfo.GetUnicodeCategory(c) != System.Globalization.UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        return sb.ToString();
    }

Here is an example test:

    string input = "Àlièñ451";
    input = input.LettersOnly().RemoveAccentMarks().ToLower();
    Console.WriteLine(input);

Result: alien (as expected)

This works for 99.9% of the cases. However, a few characters seem to pass all of the checks.

For instance, "ß" (a German double-s, I think). This is considered by .NET to be a letter. This is not considered by the function above to have any accent marks... but it STILL isn't in the range of a-z, like I need it to be. Ideally, I could convert this to a "B" or an "ss" (whichever is appropriate), but I need to convert it to SOMETHING in the range of a-z.

Another example, the diphthong ("æ"). Again, .NET considers this a "letter". The function above doesn't see any accent, but again, it isn't in the Roman 26-character alphabet. In this case, I need to convert to the two letters "ae" (I think).

Is there an easy way to convert ANY worldwide input to the closest roman alphabet equivalent? It is expected that this probably won't be a perfectly clean translation, but I need to trust that the inputs at FlipScript.com are ONLY getting the characters a-z... and nothing else.

Any and all help appreciated.

Recommended answer

If I were you, I'd create a Dictionary which would contain the mappings from foreign letters to Roman letters. I'd use this for two reasons:

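  1. It will make it easier for people reading your code to understand what you are trying to do.
  2. There is a small, finite number of these special letters, so you don't need to worry about maintaining the data structure.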
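
A minimal sketch of the dictionary idea, not a definitive implementation. The class name `RomanizeExtensions`, the method name `ToRomanLetters`, and the table contents are all made up for illustration, and the table is deliberately incomplete. It reuses the FormD normalization from the question, so accented letters need no table entry; only letters with no decomposition (such as "ß" and "æ") do.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RomanizeExtensions
{
    // Hypothetical, deliberately incomplete table: extend it as new characters turn up.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'ß', "ss" }, { 'æ', "ae" }, { 'œ', "oe" }, { 'ø', "o" },
        { 'đ', "d" },  { 'ł', "l" },  { 'þ', "th" }, { 'ð', "d" }
    };

    public static string ToRomanLetters(this string input)
    {
        // Lowercase first so the table only needs lowercase keys,
        // then decompose so accents become separate combining marks.
        string decomposed = input.ToLowerInvariant().Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            if (c >= 'a' && c <= 'z')
            {
                sb.Append(c);               // already in the target range
            }
            else if (Map.TryGetValue(c, out string replacement))
            {
                sb.Append(replacement);     // special letter with an explicit mapping
            }
            // Anything else (digits, combining marks, unmapped letters) is dropped.
        }

        return sb.ToString();
    }
}
```

With this sketch, `"Àlièñ451".ToRomanLetters()` gives `alien` and `"Straße".ToRomanLetters()` gives `strasse`. Because the final filter only ever appends a-z or a table value, the output stays within a-z as long as the table values do; unmapped letters are silently dropped rather than passed through.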

I'd put the mappings into an XML file, then load them into the data structure at run-time. That way, you do not need to modify any code which uses the characters; you only need to specify the mappings themselves.
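
Loading the mappings from XML might look roughly like the following. The answer does not specify a file format, so the `<mappings>`/`<map from="…" to="…">` layout and the `MappingLoader` name are assumptions made for this sketch. It also assumes every `map` element carries both attributes and that `from` is a single character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MappingLoader
{
    // Assumed (hypothetical) layout:
    // <mappings>
    //   <map from="ß" to="ss" />
    //   <map from="æ" to="ae" />
    // </mappings>
    public static Dictionary<char, string> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("map")
            .ToDictionary(
                e => ((string)e.Attribute("from"))[0],  // first character of "from" is the key
                e => (string)e.Attribute("to"));        // replacement text is the value
    }
}
```

At start-up the file contents could be passed in with something like `MappingLoader.Parse(File.ReadAllText("mappings.xml"))`, where the file name is again only a placeholder. Adding support for a new character then means adding one `<map>` line to the file.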
