Replacing multiple characters in a string, the fastest way?

Problem description

I am importing a number of records with multiple string fields from an old database to a new one. It seems to be very slow, and I suspect it's because I do this:

foreach (var oldObj in oldDB)
{
    NewObject newObj = new NewObject();
    newObj.Name = oldObj.Name.Trim().Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
        .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
        .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    newObj.Surname = oldObj.Surname.Trim().Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
        .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
        .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    newObj.Address = oldObj.Address.Trim().Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
        .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
        .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    newObj.Note = oldObj.Note.Trim().Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
        .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
        .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    /*
    ... some processing ...
    */
}

Now, I have read some posts and articles around the Net and seen many different opinions about this. Some say it would be better to use a regex with a MatchEvaluator; others say it's best to leave it as is.

While it might be easier for me to just write a benchmark myself, I decided to ask here in case someone else has been wondering about the same thing, or in case someone already knows the answer.

So what is the fastest way to do this in C#?
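
Independently of speed, the chain repeated for each field in the loop above could live in one helper. A minimal sketch, assuming a hypothetical class `LegacyTextSketch` and method `Fix` (neither name is in the original code), and assuming that null fields should pass through unchanged:

```csharp
using System;

public static class LegacyTextSketch
{
    // Hypothetical helper: trims the value, then applies the nine
    // single-character replacements from the question.
    public static string Fix(string value)
    {
        if (value == null) return null; // assumption: null fields stay null

        return value.Trim()
            .Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
            .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
            .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    }

    public static void Main()
    {
        Console.WriteLine(Fix("  A^@[BCD ")); // prints AČŽŠBCD
    }
}
```

With this in place, each assignment in the loop would shrink to something like `newObj.Name = LegacyTextSketch.Fix(oldObj.Name);`, and any faster implementation only has to be swapped in at one spot.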

EDIT

I have posted the benchmark here. At first sight it looks like Richard's way might be the fastest. However, neither his way nor Marc's would replace anything because of the wrong Regex pattern. After correcting the pattern from

@"\^@\[\]`\}~\{\\"

to

@"\^|@|\[|\]|`|\}|~|\{|\\"

it appears that the old way with chained .Replace() calls is the fastest after all.
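
To see why a pattern without alternation replaces nothing, here is a small sketch. It assumes both patterns carry the backslash escapes a regex needs for `^`, `[`, `]`, `{`, `}` and `\`; the class and member names are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternSketch
{
    // No alternation: only matches the literal nine-character run ^@[]`}~{\
    public static readonly Regex Sequence = new Regex(@"\^@\[\]`\}~\{\\");

    // With alternation: matches any single one of the nine characters.
    public static readonly Regex AnyOne = new Regex(@"\^|@|\[|\]|`|\}|~|\{|\\");

    // Number of matches the given regex finds in the input.
    public static int Count(Regex regex, string input)
    {
        return regex.Matches(input).Count;
    }

    public static void Main()
    {
        Console.WriteLine(Count(Sequence, "A^@[BCD")); // prints 0
        Console.WriteLine(Count(AnyOne, "A^@[BCD"));   // prints 3
    }
}
```

A regex that never matches does almost no work per call, which would explain why it looks fast in a benchmark that does not check its output.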

Solution

Thanks for your input, guys. I wrote a quick-and-dirty benchmark to test your suggestions. Each pass processes 4 strings over 500,000 iterations, and I ran 4 passes. The results are as follows:

*** Pass 1
Old (Chained String.Replace()) way completed in 814 ms
logicnp (ToCharArray) way completed in 916 ms
oleksii (StringBuilder) way completed in 943 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 2551 ms
Richard (Regex w/ MatchEvaluator) way completed in 215 ms
Marc Gravell (Static Regex) way completed in 1008 ms

*** Pass 2
Old (Chained String.Replace()) way completed in 786 ms
logicnp (ToCharArray) way completed in 920 ms
oleksii (StringBuilder) way completed in 905 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 2515 ms
Richard (Regex w/ MatchEvaluator) way completed in 217 ms
Marc Gravell (Static Regex) way completed in 1025 ms

*** Pass 3
Old (Chained String.Replace()) way completed in 775 ms
logicnp (ToCharArray) way completed in 903 ms
oleksii (StringBuilder) way completed in 931 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 2529 ms
Richard (Regex w/ MatchEvaluator) way completed in 214 ms
Marc Gravell (Static Regex) way completed in 1022 ms

*** Pass 4
Old (Chained String.Replace()) way completed in 799 ms
logicnp (ToCharArray) way completed in 908 ms
oleksii (StringBuilder) way completed in 938 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 2592 ms
Richard (Regex w/ MatchEvaluator) way completed in 225 ms
Marc Gravell (Static Regex) way completed in 1050 ms

The code for this benchmark is below. Please review the code and confirm that @Richard has got the fastest way. Note that I haven't checked whether the outputs were correct; I assumed they were.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Text.RegularExpressions;

namespace StringReplaceTest
{
    class Program
    {
        static string test1 = "A^@[BCD";
        static string test2 = "E]FGH\\";
        static string test3 = "ijk`l}m";
        static string test4 = "nopq~{r";

        static readonly Dictionary<char, string> repl =
            new Dictionary<char, string> 
            { 
                {'^', "Č"}, {'@', "Ž"}, {'[', "Š"}, {']', "Ć"}, {'`', "ž"}, {'}', "ć"}, {'~', "č"}, {'{', "š"}, {'\\', "Đ"} 
            };

        static readonly Regex replaceRegex;

        static Program() // static initializer 
        {
            StringBuilder pattern = new StringBuilder().Append('[');
            foreach (var key in repl.Keys)
                pattern.Append(Regex.Escape(key.ToString()));
            pattern.Append(']');
            replaceRegex = new Regex(pattern.ToString(), RegexOptions.Compiled);
        }

        public static string Sanitize(string input)
        {
            return replaceRegex.Replace(input, match =>
            {
                return repl[match.Value[0]];
            });
        } 

        static string DoGeneralReplace(string input) 
        { 
            var sb = new StringBuilder(input);
            return sb.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ').ToString(); 
        }

        //Method for replacing chars with a mapping 
        static string Replace(string input, IDictionary<char, char> replacementMap)
        {
            return replacementMap.Keys
                .Aggregate(input, (current, oldChar)
                    => current.Replace(oldChar, replacementMap[oldChar]));
        } 

        static void Main(string[] args)
        {
            for (int i = 1; i < 5; i++)
                DoIt(i);
        }

        static void DoIt(int n)
        {
            Stopwatch sw = new Stopwatch();
            int idx = 0;

            Console.WriteLine("*** Pass " + n.ToString());
            // old way
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = test1.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result2 = test2.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result3 = test3.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result4 = test4.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
            }
            sw.Stop();
            Console.WriteLine("Old (Chained String.Replace()) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            Dictionary<char, char> replacements = new Dictionary<char, char>();
            replacements.Add('^', 'Č');
            replacements.Add('@', 'Ž');
            replacements.Add('[', 'Š');
            replacements.Add(']', 'Ć');
            replacements.Add('`', 'ž');
            replacements.Add('}', 'ć');
            replacements.Add('~', 'č');
            replacements.Add('{', 'š');
            replacements.Add('\\', 'Đ');

            // logicnp way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                char[] charArray1 = test1.ToCharArray();
                for (int i = 0; i < charArray1.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test1[i], out newChar))
                        charArray1[i] = newChar;
                }
                string result1 = new string(charArray1);

                char[] charArray2 = test2.ToCharArray();
                for (int i = 0; i < charArray2.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test2[i], out newChar))
                        charArray2[i] = newChar;
                }
                string result2 = new string(charArray2);

                char[] charArray3 = test3.ToCharArray();
                for (int i = 0; i < charArray3.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test3[i], out newChar))
                        charArray3[i] = newChar;
                }
                string result3 = new string(charArray3);

                char[] charArray4 = test4.ToCharArray();
                for (int i = 0; i < charArray4.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test4[i], out newChar))
                        charArray4[i] = newChar;
                }
                string result4 = new string(charArray4);
            }
            sw.Stop();
            Console.WriteLine("logicnp (ToCharArray) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // oleksii way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = DoGeneralReplace(test1);
                string result2 = DoGeneralReplace(test2);
                string result3 = DoGeneralReplace(test3);
                string result4 = DoGeneralReplace(test4);
            }
            sw.Stop();
            Console.WriteLine("oleksii (StringBuilder) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // André Christoffer Andersen way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = Replace(test1, replacements);
                string result2 = Replace(test2, replacements);
                string result3 = Replace(test3, replacements);
                string result4 = Replace(test4, replacements);
            }
            sw.Stop();
            Console.WriteLine("André Christoffer Andersen (Lambda w/ Aggregate) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // Richard way
            sw.Reset();
            sw.Start();
            Regex reg = new Regex(@"\^|@|\[|\]|`|\}|~|\{|\\");
            MatchEvaluator eval = match =>
            {
                switch (match.Value)
                {
                    case "^": return "Č";
                    case "@": return "Ž";
                    case "[": return "Š";
                    case "]": return "Ć";
                    case "`": return "ž";
                    case "}": return "ć";
                    case "~": return "č";
                    case "{": return "š";
                    case "\\": return "Đ";
                    default: throw new Exception("Unexpected match!");
                }
            };
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = reg.Replace(test1, eval);
                string result2 = reg.Replace(test2, eval);
                string result3 = reg.Replace(test3, eval);
                string result4 = reg.Replace(test4, eval);
            }
            sw.Stop();
            Console.WriteLine("Richard (Regex w/ MatchEvaluator) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // Marc Gravell way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = Sanitize(test1);
                string result2 = Sanitize(test2);
                string result3 = Sanitize(test3);
                string result4 = Sanitize(test4);
            }
            sw.Stop();
            Console.WriteLine("Marc Gravell (Static Regex) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms\n");
        }
    }
}
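
Since the outputs were not checked, a small comparison against the chained `String.Replace` baseline may be worth running before trusting any timing. The following is only a sketch: the class `OutputCheckSketch` and its method names are hypothetical, and the regex variant is built as an alternation of escaped keys, which is not how the benchmark above builds its pattern. Alternation is used because `Regex.Escape` leaves `]` unescaped, so a character class assembled from escaped keys can close early at the `]` key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class OutputCheckSketch
{
    static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        {'^', 'Č'}, {'@', 'Ž'}, {'[', 'Š'}, {']', 'Ć'}, {'`', 'ž'},
        {'}', 'ć'}, {'~', 'č'}, {'{', 'š'}, {'\\', 'Đ'}
    };

    // Alternation of escaped keys, e.g. \^|@|\[|]|... (declared after Map,
    // so Map is already initialized when this initializer runs).
    static readonly Regex AnyKey = new Regex(
        string.Join("|", Map.Keys.Select(k => Regex.Escape(k.ToString()))),
        RegexOptions.Compiled);

    // Baseline: the chained String.Replace calls from the question.
    public static string Chained(string s)
    {
        return s.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š')
                .Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć')
                .Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
    }

    // Single pass over a char array with a dictionary lookup per character.
    public static string CharArray(string s)
    {
        char[] chars = s.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            if (Map.TryGetValue(chars[i], out char replacement))
                chars[i] = replacement;
        }
        return new string(chars);
    }

    // Regex with a MatchEvaluator that looks the matched character up in Map.
    public static string ViaRegex(string s)
    {
        return AnyKey.Replace(s, m => Map[m.Value[0]].ToString());
    }

    // True when every variant produces the same string as the baseline.
    public static bool AllAgree(string s)
    {
        string expected = Chained(s);
        return CharArray(s) == expected && ViaRegex(s) == expected;
    }

    public static void Main()
    {
        foreach (string s in new[] { "A^@[BCD", "E]FGH\\", "ijk`l}m", "nopq~{r" })
            Console.WriteLine(s + " -> " + Chained(s) + " (all agree: " + AllAgree(s) + ")");
    }
}
```

Any approach that fails such a check is not doing the same work as the others, so its timing is not comparable.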

EDIT June 2020
Since this Q&A is still getting hits, I wanted to update it with additional input from user1664043, who uses StringBuilder w/ IndexOfAny. This time the benchmark was compiled with .NET Core 3.1, and here are the results:

*** Pass 1
Old (Chained String.Replace()) way completed in 199 ms
logicnp (ToCharArray) way completed in 296 ms
oleksii (StringBuilder) way completed in 416 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 870 ms
Richard (Regex w/ MatchEvaluator) way completed in 1722 ms
Marc Gravell (Static Regex) way completed in 395 ms
user1664043 (StringBuilder w/ IndexOfAny) way completed in 459 ms

*** Pass 2
Old (Chained String.Replace()) way completed in 215 ms
logicnp (ToCharArray) way completed in 239 ms
oleksii (StringBuilder) way completed in 341 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 758 ms
Richard (Regex w/ MatchEvaluator) way completed in 1591 ms
Marc Gravell (Static Regex) way completed in 354 ms
user1664043 (StringBuilder w/ IndexOfAny) way completed in 426 ms

*** Pass 3
Old (Chained String.Replace()) way completed in 199 ms
logicnp (ToCharArray) way completed in 265 ms
oleksii (StringBuilder) way completed in 337 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 817 ms
Richard (Regex w/ MatchEvaluator) way completed in 1666 ms
Marc Gravell (Static Regex) way completed in 373 ms
user1664043 (StringBuilder w/ IndexOfAny) way completed in 412 ms

*** Pass 4
Old (Chained String.Replace()) way completed in 199 ms
logicnp (ToCharArray) way completed in 230 ms
oleksii (StringBuilder) way completed in 324 ms
André Christoffer Andersen (Lambda w/ Aggregate) way completed in 791 ms
Richard (Regex w/ MatchEvaluator) way completed in 1699 ms
Marc Gravell (Static Regex) way completed in 359 ms
user1664043 (StringBuilder w/ IndexOfAny) way completed in 413 ms
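
Every measurement in the benchmark follows the same Reset/Start/loop/Stop pattern around a `Stopwatch`. A minimal sketch of how that pattern could be wrapped in one helper; the class `TimingSketch`, the method `Measure`, and the untimed warm-up call are assumptions for illustration and are not part of the benchmark that produced the numbers above:

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action `iterations` times and returns the elapsed milliseconds.
    public static long Measure(Action action, int iterations)
    {
        action(); // untimed warm-up call, so first-call JIT cost is excluded

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string input = "A^@[BCD";
        string result = null;

        // Only three of the nine replacements, to keep the example short.
        long ms = Measure(
            () => result = input.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š'),
            500000);

        Console.WriteLine("Chained Replace (three calls) completed in " + ms + " ms, last result: " + result);
    }
}
```

Storing the result in a variable that is printed afterwards keeps the measured work observable; absolute numbers will still vary by machine and runtime, as the two result sets in this post show.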

And the updated code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace Test.StringReplace
{
    class Program
    {
        static string test1 = "A^@[BCD";
        static string test2 = "E]FGH\\";
        static string test3 = "ijk`l}m";
        static string test4 = "nopq~{r";

        static readonly Dictionary<char, string> repl =
            new Dictionary<char, string>
            {
                {'^', "Č"}, {'@', "Ž"}, {'[', "Š"}, {']', "Ć"}, {'`', "ž"}, {'}', "ć"}, {'~', "č"}, {'{', "š"}, {'\\', "Đ"}
            };

        static readonly Regex replaceRegex;

        static readonly char[] badChars = new char[] { '^', '@', '[', ']', '`', '}', '~', '{', '\\' };

        static readonly char[] replacementChars = new char[] { 'Č', 'Ž', 'Š', 'Ć', 'ž', 'ć', 'č', 'š', 'Đ' };

        static Program() // static initializer 
        {
            StringBuilder pattern = new StringBuilder().Append('[');
            foreach (var key in repl.Keys)
                pattern.Append(Regex.Escape(key.ToString()));
            pattern.Append(']');
            replaceRegex = new Regex(pattern.ToString(), RegexOptions.Compiled);
        }

        public static string Sanitize(string input)
        {
            return replaceRegex.Replace(input, match =>
            {
                return repl[match.Value[0]];
            });
        }

        static string DoGeneralReplace(string input)
        {
            var sb = new StringBuilder(input);
            return sb.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ').ToString();
        }

        //Method for replacing chars with a mapping 
        static string Replace(string input, IDictionary<char, char> replacementMap)
        {
            return replacementMap.Keys
                .Aggregate(input, (current, oldChar)
                    => current.Replace(oldChar, replacementMap[oldChar]));
        }

        static string ReplaceCharsWithIndexOfAny(string sIn)
        {
            int replChar = sIn.IndexOfAny(badChars);
            if (replChar < 0)
                return sIn;

            // Don't even bother making a copy unless you know you have something to swap
            StringBuilder sb = new StringBuilder(sIn, 0, replChar, sIn.Length + 10);
            while (replChar >= 0 && replChar < sIn.Length)
            {
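                // Look up the replacement by the bad character's position in badChars,
                // not by its position in the input string.
                var c = replacementChars[Array.IndexOf(badChars, sIn[replChar])];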
                sb.Append(c);

                ////// This approach lets you swap a char for a string or to remove some
                ////// If you had a straight char for char swap, you could just have your repl chars in an array with the same ordinals and do it all in 2 lines matching the ordinals.
                ////c = c switch
                ////{
                ////    ////case "^":
                ////    ////    c = "Č";
                ////    ////    ...
                ////    '\ufeff' => null,
                ////    _ => replacementChars[replChar],
                ////};

                ////if (c != null)
                ////{
                ////    sb.Append(c);
                ////}

                replChar++; // skip over what we just replaced
                if (replChar < sIn.Length)
                {
                    int nextRepChar = sIn.IndexOfAny(badChars, replChar);
                    sb.Append(sIn, replChar, (nextRepChar > 0 ? nextRepChar : sIn.Length) - replChar);
                    replChar = nextRepChar;
                }
            }

            return sb.ToString();
        }

        static void Main(string[] args)
        {
            for (int i = 1; i < 5; i++)
                DoIt(i);
        }

        static void DoIt(int n)
        {
            Stopwatch sw = new Stopwatch();
            int idx = 0;

            Console.WriteLine("*** Pass " + n.ToString());
            // old way
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = test1.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result2 = test2.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result3 = test3.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
                string result4 = test4.Replace('^', 'Č').Replace('@', 'Ž').Replace('[', 'Š').Replace(']', 'Ć').Replace('`', 'ž').Replace('}', 'ć').Replace('~', 'č').Replace('{', 'š').Replace('\\', 'Đ');
            }

            sw.Stop();
            Console.WriteLine("Old (Chained String.Replace()) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            Dictionary<char, char> replacements = new Dictionary<char, char>();
            replacements.Add('^', 'Č');
            replacements.Add('@', 'Ž');
            replacements.Add('[', 'Š');
            replacements.Add(']', 'Ć');
            replacements.Add('`', 'ž');
            replacements.Add('}', 'ć');
            replacements.Add('~', 'č');
            replacements.Add('{', 'š');
            replacements.Add('\\', 'Đ');

            // logicnp way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                char[] charArray1 = test1.ToCharArray();
                for (int i = 0; i < charArray1.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test1[i], out newChar))
                        charArray1[i] = newChar;
                }

                string result1 = new string(charArray1);

                char[] charArray2 = test2.ToCharArray();
                for (int i = 0; i < charArray2.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test2[i], out newChar))
                        charArray2[i] = newChar;
                }

                string result2 = new string(charArray2);

                char[] charArray3 = test3.ToCharArray();
                for (int i = 0; i < charArray3.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test3[i], out newChar))
                        charArray3[i] = newChar;
                }

                string result3 = new string(charArray3);

                char[] charArray4 = test4.ToCharArray();
                for (int i = 0; i < charArray4.Length; i++)
                {
                    char newChar;
                    if (replacements.TryGetValue(test4[i], out newChar))
                        charArray4[i] = newChar;
                }

                string result4 = new string(charArray4);
            }

            sw.Stop();
            Console.WriteLine("logicnp (ToCharArray) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // oleksii way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = DoGeneralReplace(test1);
                string result2 = DoGeneralReplace(test2);
                string result3 = DoGeneralReplace(test3);
                string result4 = DoGeneralReplace(test4);
            }

            sw.Stop();
            Console.WriteLine("oleksii (StringBuilder) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // André Christoffer Andersen way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = Replace(test1, replacements);
                string result2 = Replace(test2, replacements);
                string result3 = Replace(test3, replacements);
                string result4 = Replace(test4, replacements);
            }

            sw.Stop();
            Console.WriteLine("André Christoffer Andersen (Lambda w/ Aggregate) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // Richard way
            sw.Reset();
            sw.Start();
            // Regex metacharacters (^ [ ] } { \) must be escaped so they match literally.
            Regex reg = new Regex(@"\^|@|\[|\]|`|\}|~|\{|\\");
            MatchEvaluator eval = match =>
            {
                switch (match.Value)
                {
                    case "^": return "Č";
                    case "@": return "Ž";
                    case "[": return "Š";
                    case "]": return "Ć";
                    case "`": return "ž";
                    case "}": return "ć";
                    case "~": return "č";
                    case "{": return "š";
                    case "\\": return "Đ";
                    default: throw new Exception("Unexpected match!");
                }
            };
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = reg.Replace(test1, eval);
                string result2 = reg.Replace(test2, eval);
                string result3 = reg.Replace(test3, eval);
                string result4 = reg.Replace(test4, eval);
            }

            sw.Stop();
            Console.WriteLine("Richard (Regex w/ MatchEvaluator) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // Marc Gravell way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = Sanitize(test1);
                string result2 = Sanitize(test2);
                string result3 = Sanitize(test3);
                string result4 = Sanitize(test4);
            }

            sw.Stop();
            Console.WriteLine("Marc Gravell (Static Regex) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms");

            // user1664043 way
            sw.Reset();
            sw.Start();
            for (idx = 0; idx < 500000; idx++)
            {
                string result1 = ReplaceCharsWithIndexOfAny(test1);
                string result2 = ReplaceCharsWithIndexOfAny(test2);
                string result3 = ReplaceCharsWithIndexOfAny(test3);
                string result4 = ReplaceCharsWithIndexOfAny(test4);
            }

            sw.Stop();
            Console.WriteLine("user1664043 (StringBuilder w/ IndexOfAny) way completed in " + sw.ElapsedMilliseconds.ToString() + " ms\n");
        }
    }
}
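
The commented-out note in ReplaceCharsWithIndexOfAny mentions that a straight char-for-char swap could keep the replacement characters in an array with the same ordinals as the characters being replaced. Below is a minimal sketch of that idea. It was not part of the benchmark above and has not been timed, and the names CharSwap, BadChars, GoodChars and Swap are hypothetical, chosen only for this illustration.

```csharp
using System;

static class CharSwap
{
    // Parallel arrays (hypothetical names): BadChars[i] is replaced by GoodChars[i].
    private static readonly char[] BadChars  = { '^', '@', '[', ']', '`', '}', '~', '{', '\\' };
    private static readonly char[] GoodChars = { 'Č', 'Ž', 'Š', 'Ć', 'ž', 'ć', 'č', 'š', 'Đ' };

    public static string Swap(string input)
    {
        // Fast path: if nothing needs replacing, return the original string unchanged.
        int first = input.IndexOfAny(BadChars);
        if (first < 0)
            return input;

        // Copy once, then overwrite matching characters in place, starting at the first hit.
        char[] buffer = input.ToCharArray();
        for (int i = first; i < buffer.Length; i++)
        {
            int k = Array.IndexOf(BadChars, buffer[i]);
            if (k >= 0)
                buffer[i] = GoodChars[k];
        }

        return new string(buffer);
    }

    static void Main()
    {
        // Example call; the input string is made up for illustration.
        Console.WriteLine(Swap("a^b@c"));
    }
}
```

Array.IndexOf does a linear scan, which is probably acceptable for nine entries. For a much larger mapping a lookup table indexed by character code might be the better choice, but that would need measuring.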
