Memory Efficiency and Performance of String.Replace in the .NET Framework
Question
string str1 = "12345ABC...\\...ABC100000";
// Hypothetically huge string of 100000 + Unicode Chars
str1 = str1.Replace("1", string.Empty);
str1 = str1.Replace("22", string.Empty);
str1 = str1.Replace("656", string.Empty);
str1 = str1.Replace("77ABC", string.Empty);
// ... this replace anti-pattern might happen with up to 50 consecutive lines of code.
str1 = str1.Replace("ABCDEFGHIJD", string.Empty);
I have inherited some code that does the same as the snippet above. It takes a huge string and repeatedly removes smaller constant strings from it.
I believe this is a very memory-intensive process, given that a new large immutable string is allocated in memory for each replace and then waits for the GC to collect it.
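1. What is the fastest way of replacing these values, ignoring memory concerns?
2. What is the most memory-efficient way of achieving the same result?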
I am hoping that these are the same answer!
Practical solutions that fit somewhere in between these goals are also appreciated.
Assumptions:
- All replacements are constant and known in advance
- The underlying characters do contain some Unicode [non-ASCII] chars
Recommended answer
All characters in a .NET string are "unicode chars". Do you mean they're non-ASCII? That shouldn't make any difference, unless you run into composition issues, e.g. an "e + acute accent" not being replaced when you try to replace an "e acute".
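The composition issue can be illustrated with a minimal sketch. It assumes that normalizing both the input and the search string to Unicode Normalization Form C is acceptable for your data; `CompositionSketch` and `RemoveNormalized` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Text;

public static class CompositionSketch
{
    // Hypothetical helper: normalizes both strings to NFC before removing,
    // so a decomposed "e + combining acute" matches a precomposed "e acute".
    public static string RemoveNormalized(string input, string target)
    {
        string normalizedInput = input.Normalize(NormalizationForm.FormC);
        string normalizedTarget = target.Normalize(NormalizationForm.FormC);
        return normalizedInput.Replace(normalizedTarget, string.Empty);
    }

    public static void Main()
    {
        string decomposed = "cafe\u0301"; // 'e' followed by U+0301 (combining acute accent)
        string precomposed = "\u00e9";    // single code point U+00E9

        // String.Replace(string, string) compares ordinally, so nothing is removed here.
        Console.WriteLine(decomposed.Replace(precomposed, string.Empty).Length); // prints 5

        // After normalization the two forms match.
        Console.WriteLine(RemoveNormalized(decomposed, precomposed)); // prints caf
    }
}
```

Note that `Normalize` may itself allocate another copy of the string, which matters when the input is as large as the one in the question.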
You could try using a regular expression with Regex.Replace (http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace.aspx), or StringBuilder.Replace (http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.replace.aspx). Here's sample code doing the same thing with both:
using System;
using System.Text;
using System.Text.RegularExpressions;

class Test
{
    static void Main(string[] args)
    {
        string original = "abcdefghijkl";

        // One alternation: every listed character is removed in a single pass.
        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);
        string removedByRegex = regex.Replace(original, "");

        // One mutable buffer: each Replace call edits the builder's contents
        // instead of allocating a new string per step.
        string removedByStringBuilder = new StringBuilder(original)
            .Replace("a", "")
            .Replace("c", "")
            .Replace("e", "")
            .Replace("g", "")
            .Replace("i", "")
            .Replace("k", "")
            .ToString();

        // Both lines print "bdfhjl".
        Console.WriteLine(removedByRegex);
        Console.WriteLine(removedByStringBuilder);
    }
}
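Since the question states that all replacements are constant and known in advance, the hard-coded pattern above could be generalized roughly as follows. This is only a sketch: `PatternSketch` and `BuildRemovalRegex` are hypothetical names, and ordering the alternatives longest-first is an assumption about what behaviour is wanted when one constant is a prefix of another.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternSketch
{
    // Hypothetical helper: builds one alternation from the known constants.
    // Regex.Escape keeps characters such as '.' or '\' literal.
    public static Regex BuildRemovalRegex(params string[] constants)
    {
        string pattern = string.Join("|",
            constants.OrderByDescending(s => s.Length)   // prefer longer matches at the same position
                     .Select(s => Regex.Escape(s)));
        return new Regex(pattern, RegexOptions.Compiled);
    }

    public static void Main()
    {
        Regex regex = BuildRemovalRegex("1", "22", "656", "77ABC");
        Console.WriteLine(regex.Replace("12345ABC22x656", string.Empty)); // prints 2345ABCx
    }
}
```

A single pass is not always equivalent to the chained calls in the question. For the input "212", removing "1" and then "22" sequentially yields an empty string, whereas the single-pass regex yields "22", because the second "22" only appears after the "1" has been removed. Whether that difference matters depends on the data.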
I wouldn't like to guess which is more efficient; you'd have to benchmark with your specific application. The regex way may be able to do it all in one pass, but that pass will be relatively CPU-intensive compared with each of the many replaces in StringBuilder.
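A rough way to run such a comparison is sketched below, using `Stopwatch`. The input is synthetic and the iteration count is arbitrary; `BenchmarkSketch`, `Measure`, and the two `RemoveWith...` methods are hypothetical names. It measures elapsed time only, so memory behaviour would still need a profiler, and a representative sample of real data should replace the generated string.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class BenchmarkSketch
{
    private static readonly Regex Removal = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

    public static string RemoveWithRegex(string input)
    {
        return Removal.Replace(input, string.Empty);
    }

    public static string RemoveWithStringBuilder(string input)
    {
        return new StringBuilder(input)
            .Replace("a", "").Replace("c", "").Replace("e", "")
            .Replace("g", "").Replace("i", "").Replace("k", "")
            .ToString();
    }

    // Hypothetical helper: times repeated calls after one warm-up call.
    public static void Measure(string label, Func<string, string> remove, string input, int iterations)
    {
        remove(input); // warm-up so JIT and regex compilation are not timed
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            remove(input);
        }
        stopwatch.Stop();
        Console.WriteLine("{0}: {1} ms for {2} iterations", label, stopwatch.ElapsedMilliseconds, iterations);
    }

    public static void Main()
    {
        // Synthetic input (assumption): 10,000 repetitions of a short sample.
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            builder.Append("abcdefghijkl");
        }
        string input = builder.ToString();

        Measure("Regex", RemoveWithRegex, input, 100);
        Measure("StringBuilder", RemoveWithStringBuilder, input, 100);
    }
}
```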