How efficient is std::string compared to null-terminated strings?

Question

I've discovered that std::string is very slow compared to old-fashioned null-terminated strings, so slow that it slows down my overall program by a factor of 2.

I expected the STL to be slower, but I didn't realise it was going to be this much slower.

I'm using Visual Studio 2008 in release mode. It shows assignment of a std::string to be 100-1000 times slower than assignment of a char* (the run time of a char* assignment is very difficult to measure). I know it's not a fair comparison, a pointer assignment versus a string copy, but my program has lots of string assignments and I'm not sure I could use the "const reference" trick in all places. With a reference-counting implementation my program would have been fine, but these implementations don't seem to exist anymore.

My real question is: why don't people use reference-counting implementations anymore, and does this mean we all need to be much more careful about avoiding the common performance pitfalls of std::string?

My complete code is below.

#include <string>
#include <iostream>
#include <time.h>

using std::cout;

void stop()
{
}

int main(int argc, char* argv[])
{
    #define LIMIT 100000000
    clock_t start;
    std::string foo1 = "Hello there buddy";
    std::string foo2 = "Hello there buddy, yeah you too";
    std::string f;
    start = clock();
    for (int i=0; i < LIMIT; i++) {
        stop();
        f = foo1;
        foo1 = foo2;
        foo2 = f;
    }
    double stl = double(clock() - start) / CLOCKS_PER_SEC;

    start = clock();
    for (int i=0; i < LIMIT; i++) {
        stop();
    }
    double emptyLoop = double(clock() - start) / CLOCKS_PER_SEC;

    const char* goo1 = "Hello there buddy";
    const char* goo2 = "Hello there buddy, yeah you too";
    const char* g;
    start = clock();
    for (int i=0; i < LIMIT; i++) {
        stop();
        g = goo1;
        goo1 = goo2;
        goo2 = g;
    }
    double charLoop = double(clock() - start) / CLOCKS_PER_SEC;
    cout << "Empty loop = " << emptyLoop << "\n";
    cout << "char* loop = " << charLoop << "\n";
    cout << "std::string = " << stl << "\n";
    cout << "slowdown = " << (stl - emptyLoop) / (charLoop - emptyLoop) << "\n";
    std::string wait;
    std::cin >> wait;
    return 0;
}


Recommended answer

Well, there are definitely known problems regarding the performance of strings and other containers. Most of them have to do with temporaries and unnecessary copies.

It's not too hard to use std::string right, but it's also quite easy to Do It Wrong. For example, if your code accepts strings by value where it doesn't need a modifiable parameter, you Do It Wrong:

// you do it wrong
void setMember(string a) {
    this->a = a; // better: swap(this->a, a);
}
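
As a minimal sketch of the two alternatives to the by-value setter above (a const reference parameter, or a swap inside the function), something like the following could work. The class name Widget and the member names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical class, used only to give setMember a home.
class Widget {
public:
    // Alternative 1: take a const reference. The only copy is the assignment.
    void setMemberByRef(const std::string& a) { a_ = a; }

    // Alternative 2: keep the by-value parameter, but swap buffers with it
    // instead of copying the characters a second time.
    void setMemberBySwap(std::string a) { a_.swap(a); }

    const std::string& member() const { return a_; }

private:
    std::string a_;
};
```

Either version avoids the second copy that the by-value setter makes; which one is preferable depends on whether callers usually pass temporaries.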

You would have been better off taking that by const reference, or doing a swap operation inside instead of yet another copy. The performance penalty grows if the parameter is a vector or list. However, you are definitely right that there are known problems. For example in this:

// let's add a Foo into the vector
v.push_back(Foo(a, b));

We are creating one temporary Foo just to add a new Foo to our vector. A manual solution might construct the Foo directly inside the vector. And if the vector reaches its capacity limit, it has to allocate a larger memory buffer for its elements. What does it do then? It copies each element separately to its new place using the element's copy constructor. A manual solution might behave more intelligently if it knows the type of the elements beforehand.
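
To make the cost concrete, here is a small sketch that counts copy constructions when a temporary is pushed into a vector that is already full. The type CopyOnlyFoo and the helper push_back_into_full_vector are hypothetical names made up for this illustration. The type deliberately has no move constructor, so it behaves like the pre-move-semantics case described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type with a copy constructor only (no move),
// instrumented to count how often it is copied.
struct CopyOnlyFoo {
    static int copies;
    int a, b;
    CopyOnlyFoo(int a_, int b_) : a(a_), b(b_) {}
    CopyOnlyFoo(const CopyOnlyFoo& o) : a(o.a), b(o.b) { ++copies; }
};
int CopyOnlyFoo::copies = 0;

struct ReallocReport {
    std::size_t size_before;  // elements in the vector before the final push_back
    int copies;               // copy constructions caused by that push_back
};

// Fills a vector to its capacity, then pushes one more temporary so that
// the vector must reallocate, and reports how many copies that cost.
ReallocReport push_back_into_full_vector(int n) {
    std::vector<CopyOnlyFoo> v;
    for (int i = 0; i < n; ++i) v.push_back(CopyOnlyFoo(i, i));
    while (v.size() < v.capacity()) v.push_back(CopyOnlyFoo(0, 0));

    ReallocReport r;
    r.size_before = v.size();
    CopyOnlyFoo::copies = 0;
    v.push_back(CopyOnlyFoo(1, 2));  // one copy of the temporary + one per old element
    r.copies = CopyOnlyFoo::copies;
    return r;
}
```

With this type, the final push_back copies the temporary once and every existing element once, so the count equals the previous size plus one.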

Another common problem is introduced temporaries. Have a look at this:

string a = b + c + e;

Loads of temporaries are created here, which you might avoid in a custom solution that you actually optimize for performance. Back then, the interface of std::string was designed to be copy-on-write friendly. However, with threads becoming more popular, transparent copy-on-write strings have problems keeping their state consistent. Recent implementations tend to avoid copy-on-write strings and instead apply other tricks where appropriate.
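
One way a hand-tuned solution might avoid the intermediate temporaries of the concatenation above is to size the result once and append into it. This is only a sketch; the function name concat3 is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds b + c + e with a single allocation
// (where the result does not fit a small-string buffer) and no
// intermediate std::string temporaries.
std::string concat3(const std::string& b,
                    const std::string& c,
                    const std::string& e) {
    std::string a;
    a.reserve(b.size() + c.size() + e.size());  // allocate once, up front
    a += b;  // each append writes into the reserved buffer
    a += c;
    a += e;
    return a;
}
```

Usage would be `std::string a = concat3(b, c, e);` in place of `string a = b + c + e;`.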

Most of those problems are solved, however, in the next version of the Standard (published as C++11). For example, instead of push_back you can use emplace_back to create a Foo directly in your vector:

v.emplace_back(a, b);

And instead of creating copies in a concatenation like the one above, std::string will recognize when it concatenates temporaries and optimize for those cases. Reallocation will also avoid making copies and will instead move elements to their new places where appropriate.
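
The difference between push_back and emplace_back can be observed with an instrumented element type. The sketch below assumes a C++11 compiler; CountedFoo, Counts and the two helper functions are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how it gets constructed.
struct CountedFoo {
    static int constructed, copied, moved;
    int a, b;
    CountedFoo(int a_, int b_) : a(a_), b(b_) { ++constructed; }
    CountedFoo(const CountedFoo& o) : a(o.a), b(o.b) { ++copied; }
    CountedFoo(CountedFoo&& o) noexcept : a(o.a), b(o.b) { ++moved; }
};
int CountedFoo::constructed = 0;
int CountedFoo::copied = 0;
int CountedFoo::moved = 0;

struct Counts { int constructed, copied, moved; };

void reset_counters() {
    CountedFoo::constructed = CountedFoo::copied = CountedFoo::moved = 0;
}

// push_back of a temporary: the temporary is constructed, then moved in.
Counts count_push_back() {
    reset_counters();
    std::vector<CountedFoo> v;
    v.reserve(1);  // rule out reallocation so only the insertion is counted
    v.push_back(CountedFoo(1, 2));
    return {CountedFoo::constructed, CountedFoo::copied, CountedFoo::moved};
}

// emplace_back: the element is constructed in place from the arguments.
Counts count_emplace_back() {
    reset_counters();
    std::vector<CountedFoo> v;
    v.reserve(1);
    v.emplace_back(1, 2);
    return {CountedFoo::constructed, CountedFoo::copied, CountedFoo::moved};
}
```

Here push_back costs one construction plus one move, while emplace_back costs a single construction and no copy or move.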

For an excellent read, consider Move Constructors by Andrei Alexandrescu.

Sometimes, however, comparisons also tend to be unfair. Standard containers have to support the features the Standard requires of them. For example, if your container does not keep references to map elements valid while other elements are added to or removed from the map, then comparing your "faster" map to the standard map can become unfair, because the standard map has to ensure that those references stay valid. That is just one example, of course, and there are many such cases that you have to keep in mind when stating "my container is faster than the standard ones!!!".
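
The guarantee mentioned for the standard map can be shown with a short sketch: a pointer to one element remains usable while other elements are inserted and erased. The function name reference_survives_changes is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Takes the address of one map element, then inserts and erases many
// other elements, and checks that the address still refers to the same,
// unchanged element. std::map guarantees this; a custom container built
// on a contiguous buffer typically would not.
bool reference_survives_changes() {
    std::map<int, std::string> m;
    m[0] = "zero";
    std::string* p = &m[0];

    for (int i = 1; i <= 1000; ++i) m[i] = "x";  // insert other elements
    for (int i = 1; i <= 1000; ++i) m.erase(i);  // erase them again

    return p == &m[0] && *p == "zero";
}
```

A container that gives up this guarantee has more freedom in how it stores elements, which is part of why a direct speed comparison can be misleading.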
