How can I remove non-alphabet chars from a string[]?


Question

This is the code:

StringBuilder sb = new StringBuilder();
Regex rgx = new Regex("[^a-zA-Z0-9 -]");

var words = Regex.Split(textBox1.Text, @"(?=(?<=[^\s])\s+\w)");
for (int i = 0; i < words.Length; i++)
{
    words[i] = rgx.Replace(words[i], "");
}

When I do the Regex.Split(), the resulting words also contain strings with non-letter characters inside, for example:

Daniel>

Hello:

\r\\\
New

Hello------------------- --------

I need to get only the words, without all the signs.

So I tried to use this loop, but I end up with many entries in words that are "" and some that contain only ------------------------

I can't use these as strings later in my code.

Recommended answer

You don't need a regex to clear non-letters. This will remove every character that is not a Unicode letter.

public string RemoveNonUnicodeLetters(string input)
{
    StringBuilder sb = new StringBuilder();
    foreach(char c in input)
    {
        if(Char.IsLetter(c))
           sb.Append(c);
    }

    return sb.ToString();
}
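To apply this to the words array from the question, one option is to clean each entry and then drop the entries that end up empty, which is what happens to strings such as "------------------------". The following is a minimal sketch under stated assumptions: the class name WordCleaner, the helper CleanWords, and the sample input are hypothetical, and a plain whitespace split stands in for the question's Regex.Split call.

```csharp
using System;
using System.Linq;
using System.Text;

public static class WordCleaner
{
    // Same logic as the answer's method: keep only Unicode letters.
    public static string RemoveNonUnicodeLetters(string input)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in input)
        {
            if (Char.IsLetter(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Hypothetical helper: clean every entry, then drop the ones left empty.
    public static string[] CleanWords(string[] words)
    {
        return words
            .Select(w => RemoveNonUnicodeLetters(w))
            .Where(w => w.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        // Sample input (an assumption), loosely modeled on the question's examples.
        string text = "Daniel> Hello: \r\n New ------------------------";

        // A null separator splits on whitespace; RemoveEmptyEntries skips empty pieces.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        string[] cleaned = CleanWords(words);
        Console.WriteLine(string.Join(",", cleaned)); // prints Daniel,Hello,New
    }
}
```

The entry made only of dashes becomes an empty string after cleaning, so the Where filter removes it and the remaining array can be used as ordinary words later in the code.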



Alternatively, if you only want to allow the basic Latin letters (a-z and A-Z), you can use this:

public string RemoveNonLatinLetters(string input)
{
    StringBuilder sb = new StringBuilder();
    foreach(char c in input)
    {
        if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
           sb.Append(c);
    }

    return sb.ToString();
}
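The two methods differ on accented and non-Latin characters: Char.IsLetter accepts any Unicode letter, while the range check accepts only the unaccented letters a-z and A-Z. Here is a small sketch of that difference. The class name LetterFilters, the shared Filter helper, the predicate IsBasicLatinLetter, and the sample string are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Text;

public static class LetterFilters
{
    // Hypothetical shared helper: keep the characters accepted by the predicate.
    public static string Filter(string input, Func<char, bool> keep)
    {
        StringBuilder sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (keep(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Same range check as the answer's second method.
    public static bool IsBasicLatinLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void Main()
    {
        string input = "Caf\u00e9 42!"; // "Café 42!" with a precomposed e-acute

        // Unicode letters: the accented letter is kept, giving "Café".
        Console.WriteLine(Filter(input, Char.IsLetter));

        // Basic Latin only: the accented letter is dropped, giving "Caf".
        Console.WriteLine(Filter(input, IsBasicLatinLetter));
    }
}
```

Which one fits depends on whether names and words in other scripts should survive the cleanup.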



Benchmark vs. regex

using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static string RemoveNonUnicodeLetters(string input)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in input)
    {
        if (Char.IsLetter(c))
            sb.Append(c);
    }

    return sb.ToString();
}



static readonly Regex nonUnicodeRx = new Regex("\\P{L}");

public static string RemoveNonUnicodeLetters2(string input)
{
     return nonUnicodeRx.Replace(input, "");
}


static void Main(string[] args)
{

    Stopwatch sw = new Stopwatch();

    StringBuilder sb = new StringBuilder();


    //generate guids as input
    for (int j = 0; j < 1000; j++)
    {
        sb.Append(Guid.NewGuid().ToString());
    }

    string input = sb.ToString();

    sw.Start();

    for (int i = 0; i < 1000; i++)
    {
        RemoveNonUnicodeLetters(input);
    }

    sw.Stop();
    Console.WriteLine("SM: " + sw.ElapsedMilliseconds);

    sw.Restart();
    for (int i = 0; i < 1000; i++)
    {
        RemoveNonUnicodeLetters2(input);
    }

    sw.Stop();
    Console.WriteLine("RX: " + sw.ElapsedMilliseconds);


}

Output in milliseconds (SM = String Manipulation, RX = Regex)

SM: 581
RX: 9882

SM: 545
RX: 9557

SM: 664
RX: 10196
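
The runs above use an interpreted Regex (no RegexOptions.Compiled) and no warm-up pass, so the gap may look different under other settings. Below is a hedged sketch of how the comparison could be repeated with a warm-up call, RegexOptions.Compiled, and a pattern that matches whole runs of non-letters. The class name LetterBenchmark, the Measure helper, and the method names WithLoop and WithRegex are assumptions for illustration, and no particular timings are claimed.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class LetterBenchmark
{
    // \P{L}+ matches runs of non-letter characters; Compiled usually trades
    // extra construction time for faster matching.
    static readonly Regex nonLetterRuns = new Regex(@"\P{L}+", RegexOptions.Compiled);

    // Loop-based version, same logic as the answer's method.
    public static string WithLoop(string input)
    {
        StringBuilder sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (Char.IsLetter(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Regex-based version using the compiled pattern.
    public static string WithRegex(string input)
    {
        return nonLetterRuns.Replace(input, "");
    }

    // Hypothetical helper: one untimed warm-up call, then time the iterations.
    public static long Measure(Func<string, string> f, string input, int iterations)
    {
        f(input);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            f(input);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Generate GUIDs as input, as in the original benchmark.
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < 1000; j++)
            sb.Append(Guid.NewGuid().ToString());
        string input = sb.ToString();

        Console.WriteLine("SM: " + Measure(WithLoop, input, 1000));
        Console.WriteLine("RX: " + Measure(WithRegex, input, 1000));
    }
}
```

Both versions return the same string for the same input, so any difference in the printed numbers reflects only the cost of the two approaches on the machine where it runs.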
