Fastest way to remove chars from string

Problem description

I have a string from which I have to remove the following characters: '\r', '\n', and '\t'. I have tried three different ways of removing these characters and benchmarked them so I can find the fastest solution.

Following are the methods and their execution times when I ran each of them 1000000 times:

This should be the fastest solution if I have 1 or 2 chars to remove. But as I add more chars, it starts to take more time:

    str = str.Replace("\r", string.Empty).Replace("\n", string.Empty).Replace("\t", string.Empty);

Execution time = 1695

For 1 or 2 chars, this was slower than String.Replace, but for 3 chars it showed better performance:

    string[] split = str.Split(new char[] { '\t', '\r', '\n' }, StringSplitOptions.None);
    str = split.Aggregate<string>((str1, str2) => str1 + str2);

Execution time = 1030

The slowest of all, even with 1 char. Maybe my regular expression is not the best:

    str = Regex.Replace(str, "[\r\n\t]", string.Empty, RegexOptions.Compiled);

Execution time = 3500
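
One variation that may be worth measuring is building the `Regex` once and reusing the instance, rather than passing the pattern and `RegexOptions.Compiled` to the static `Regex.Replace` on every call. The sketch below is an assumption, not something benchmarked in this thread; the class and field names (`RegexStripper`, `TabsAndNewlines`) are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class RegexStripper
{
    // Hypothetical cached instance: the pattern is parsed and compiled once,
    // when the class is first used, not on every call.
    private static readonly Regex TabsAndNewlines =
        new Regex("[\r\n\t]", RegexOptions.Compiled);

    public static string Strip(string input)
    {
        // Replace every '\r', '\n' and '\t' with the empty string.
        return TabsAndNewlines.Replace(input, string.Empty);
    }
}
```

Whether this closes the gap with the other two approaches would have to be measured on the same input.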

These are the three solutions I came up with. Does anyone here know a better, faster solution, or any improvements I can make to this code?

String that I used for benchmarking:

    StringBuilder builder = new StringBuilder();
    builder.AppendFormat("{0}\r\n{1}\t\t\t\r\n{2}\t\r\n{3}\r\n{4}\t\t\r\n{5}\r\n{6}\r\n{7}\r\n{8}\r\n{9}",
        "SELECT ",
        "[Extent1].[CustomerID] AS [CustomerID], ",
        "[Extent1].[NameStyle] AS [NameStyle], ",
        "[Extent1].[Title] AS [Title], ",
        "[Extent1].[FirstName] AS [FirstName], ",
        "[Extent1].[MiddleName] AS [MiddleName], ",
        "[Extent1].[LastName] AS [LastName], ",
        "[Extent1].[Suffix] AS [Suffix], ",
        "[Extent1].[CompanyName] AS [CompanyName], ",
        "[Extent1].[SalesPerson] AS [SalesPerson], ");
    string str = builder.ToString();
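
The question does not show how the timings were taken, so here is a minimal sketch of a harness that could produce comparable numbers, assuming `Stopwatch` and a single warm-up call. The names `StripBenchmark` and `Time` are hypothetical and not from the original post.

```csharp
using System;
using System.Diagnostics;

public static class StripBenchmark
{
    // Runs `strip` on `input` the given number of times and
    // returns the elapsed wall-clock time in milliseconds.
    public static long Time(Func<string, string> strip, string input, int iterations)
    {
        // Warm-up call so that JIT compilation is not included in the measurement.
        strip(input);

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            strip(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

For example, `StripBenchmark.Time(s => s.Replace("\r", string.Empty).Replace("\n", string.Empty).Replace("\t", string.Empty), str, 1000000)` would time the first approach against the test string above.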

Solution

Here's the uber-fast unsafe version, version 2.

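    // Requires compiling with the /unsafe option.
    public static unsafe string StripTabsAndNewlines(string s)
    {
        int len = s.Length;
        // Scratch buffer on the stack; the result is never longer than the input.
        // Note: a very long input could overflow the stack.
        char* newChars = stackalloc char[len];
        char* currentChar = newChars;

        for (int i = 0; i < len; ++i)
        {
            char c = s[i];
            switch (c)
            {
                case '\r':
                case '\n':
                case '\t':
                    continue; // skip the characters being stripped
                default:
                    *currentChar++ = c; // keep everything else
                    break;
            }
        }
        // Build the result from only the characters that were kept.
        return new string(newChars, 0, (int)(currentChar - newChars));
    }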

And here are the benchmarks (time in ms to strip 1000000 strings):

    cornerback84's String.Replace:         9433
    Andy West's String.Concat:             4756
    AviJ's char array:                     1374
    Matt Howells' char pointers:           1163
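
The char array entry in the table is not shown in this answer. As a rough idea of what a safe (no `unsafe`, no `stackalloc`) variant of the same loop might look like, here is a sketch; it is an assumption for illustration, not AviJ's actual code, and the class name `StringStripper` is made up.

```csharp
using System;

public static class StringStripper
{
    // Hypothetical managed counterpart of the pointer version above.
    public static string StripTabsAndNewlines(string s)
    {
        // The result can never be longer than the input.
        char[] buffer = new char[s.Length];
        int count = 0;

        foreach (char c in s)
        {
            // Copy every character except '\r', '\n' and '\t'.
            if (c != '\r' && c != '\n' && c != '\t')
            {
                buffer[count++] = c;
            }
        }

        // Build the string from only the characters that were kept.
        return new string(buffer, 0, count);
    }
}
```

The buffer lives on the heap here, so long inputs are not a stack-overflow risk, at the cost of one extra allocation per call.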
