C#: Strings with same contents

Question

I have heard and read that a string can not be changed (immutable?). That should be correct I guess. But I have also heard that two strings with the same contents share the same memory-space (or what you call it). Is this correct?

And if so, does that mean that if I create a List with thousands of strings, it wouldn't really take up much space at all if most of those strings were equal to each other?

Recommended answer

Edit: In the answer below I've referred to the intern pool as being AppDomain-specific; I'm pretty sure that's what I've observed before, but the MSDN docs for String.Intern suggest that there's a single intern pool for the whole process, making this even more important.

Original answer

(I was going to add this as a comment, but I think it's an important enough point to need an extra answer...)

As others have explained, string interning occurs for all string literals, but not for "dynamically created" strings (e.g. those read from a database or file, or built using StringBuilder or String.Format).
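To make the distinction concrete, here is a minimal sketch that compares references rather than contents. The class and method names (InternDemo, Run) are made up for illustration; the calls used are object.ReferenceEquals, StringBuilder and String.Intern.

```csharp
using System;
using System.Text;

public static class InternDemo
{
    // Returns three reference-identity checks on strings that all contain "hello".
    public static bool[] Run()
    {
        string a = "hello";
        string b = "hello";
        // Built at run time, so this is a separate object with the same contents.
        string c = new StringBuilder().Append("hel").Append("lo").ToString();

        return new[]
        {
            object.ReferenceEquals(a, b),                // two identical literals
            object.ReferenceEquals(a, c),                // literal vs. run-time string
            object.ReferenceEquals(a, string.Intern(c))  // literal vs. interned copy
        };
    }
}
```

The three checks should come out as true, false, true: the literals share one instance, the run-time string does not, and String.Intern hands back the instance already in the intern pool.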

However, I wouldn't suggest calling String.Intern to get round the latter point: it will populate the intern pool for the lifetime of your AppDomain. Instead, use a pool which is local to just your usage. Here's an example of such a pool:

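using System.Collections.Generic;

// A simple, non-thread-safe pool that keeps one instance per distinct string value.
public class StringPool
{
    // Maps each string value to the first instance seen with that value.
    private readonly Dictionary<string, string> contents =
        new Dictionary<string, string>();

    // Returns the pooled instance equal to item, adding item if none exists yet.
    public string Add(string item)
    {
        string ret;
        if (!contents.TryGetValue(item, out ret))
        {
            contents[item] = item;
            ret = item;
        }
        return ret;
    }
}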

You'd then use it like this:

string data = pool.Add(ReadItemFromDatabase());

(Note that the pool isn't thread-safe; normal usage wouldn't need it to be.)

This way you can throw away your pool as soon as you no longer need it, rather than having a potentially large number of strings in memory forever. You could also make it smarter, implementing an LRU cache or something if you really wanted to.

Edit: Just to clarify why this is better than using String.Intern... suppose you read a bunch of strings from a database or log file, process them, and then move on to another task. If you call String.Intern on those strings, they will never be garbage collected as long as your AppDomain is alive - and possibly not even then. If you load several different log files, you'll gradually accumulate strings in your intern pool until you either finish or run out of memory. Instead, I'm suggesting a pattern like this:

void ProcessLogFile(string file)
{
    StringPool pool = new StringPool();
    // Process the log file using strings in the pool
} // The pool can now be garbage collected

Here you get the benefit of multiple strings in the same file only existing once in memory (or at least, only getting past gen0 once) but you don't pollute a "global" resource (the intern pool).
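As a rough illustration of that pattern, the sketch below fills in the processing step under some assumptions: the log lines arrive as an IEnumerable<string>, and the part worth pooling is the first space-separated token (say, a log level). LogDemo and ReadLevels are hypothetical names, and StringPool is repeated from above only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Same pool as in the answer above, repeated so this sketch is self-contained.
public class StringPool
{
    private readonly Dictionary<string, string> contents =
        new Dictionary<string, string>();

    public string Add(string item)
    {
        string ret;
        if (!contents.TryGetValue(item, out ret))
        {
            contents[item] = item;
            ret = item;
        }
        return ret;
    }
}

public static class LogDemo
{
    // Hypothetical helper: collects the first token of every line,
    // keeping a single shared instance per distinct token value.
    public static List<string> ReadLevels(IEnumerable<string> lines)
    {
        StringPool pool = new StringPool();
        List<string> levels = new List<string>();
        foreach (string line in lines)
        {
            int space = line.IndexOf(' ');
            // Substring allocates a new string here, so without the pool
            // equal tokens would be separate objects.
            string level = space < 0 ? line : line.Substring(0, space);
            levels.Add(pool.Add(level));
        }
        // The pool itself becomes unreachable once this method returns.
        return levels;
    }
}
```

With input such as "INFO started", "WARN disk low", "INFO finished", the first and third entries of the returned list should be the same object, so thousands of repeated tokens would cost one string each plus a reference per list slot.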
