C++ string memory management

Question

Last week I wrote a few lines of code in C# to load a large text file (300,000 lines) into a Dictionary. It took ten minutes to write and it executed in less than a second.

Now I'm converting that piece of code into C++ (because I need it in an old C++ COM object). I've spent two days on it so far. :-( Although the productivity difference is shocking on its own, it's the performance that I need some advice on.

It takes seven seconds to load, and even worse: it takes just as long to free all the CStringWs afterwards. This is not acceptable, and I must find a way to improve the performance.

Is there any chance that I can allocate this many strings without seeing this horrible performance degradation?

My guess right now is that I'll have to stuff all the text into a large array and then let my hash table point to the beginning of each string within this array and drop the CStringW stuff.
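
A minimal sketch of that single-buffer idea, not a tested drop-in for the COM object. It assumes a C++17 compiler (for `std::wstring_view`), assumes each line has the form `key<TAB>value`, and uses a hypothetical class name `LineDictionary`; none of these come from the original code. The whole file lives in one `std::wstring`, and the hash table stores only views into it, so there is no per-string allocation or free.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical helper: owns one big text buffer plus an index of views into it.
class LineDictionary {
public:
    // Takes ownership of the full file contents.
    // Assumed line format: key<TAB>value, terminated by "\n" or "\r\n".
    explicit LineDictionary(std::wstring text) : text_(std::move(text)) {
        std::wstring_view all(text_);
        std::size_t pos = 0;
        while (pos < all.size()) {
            std::size_t eol = all.find(L'\n', pos);
            if (eol == std::wstring_view::npos) eol = all.size();

            std::wstring_view line = all.substr(pos, eol - pos);
            if (!line.empty() && line.back() == L'\r') line.remove_suffix(1);

            // Lines without a tab are skipped; on duplicate keys the first wins.
            std::size_t tab = line.find(L'\t');
            if (tab != std::wstring_view::npos) {
                index_.emplace(line.substr(0, tab), line.substr(tab + 1));
            }
            pos = eol + 1;
        }
    }

    // The views point into text_, so copying (and implicitly moving) is disabled.
    LineDictionary(const LineDictionary&) = delete;
    LineDictionary& operator=(const LineDictionary&) = delete;

    // Returns a pointer to the value view, or nullptr if the key is absent.
    const std::wstring_view* find(std::wstring_view key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return index_.size(); }

private:
    std::wstring text_;  // single allocation holding every string
    std::unordered_map<std::wstring_view, std::wstring_view> index_;
};
```

Destroying the object frees the text in one deallocation. `std::unordered_map` still allocates one node per entry, so this removes the per-string cost but not every allocation; calling `reserve` on the map or using an open-addressing table would reduce that further.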

But before that, any advice from you C++ experts out there?

EDIT: My answer to myself is given below. I realized that it is the fastest route for me, and also a step in what I consider the right direction: towards more managed code.

Recommended answer

You are stepping into the shoes of Raymond Chen. He did the exact same thing, writing a Chinese dictionary in unmanaged C++. Rico Mariani did too, writing it in C#. Mr. Mariani made one version. Mr. Chen wrote 6 versions, trying to match the perf of Mariani's version. He pretty much rewrote significant chunks of the C/C++ runtime library to get there.

Managed code got a lot more respect after that. The GC allocator is impossible to beat. Check this blog post for the links. This blog post might interest you too; it is instructive to see how STL value semantics are part of the problem.
