std::string performance for handling short strings


Question

Bjarne Stroustrup and other experts have said that C++ is faster than C at handling short strings, both in Bjarne Stroustrup's article and in the answers to my previous question.

But in my test, C++ was about 110% slower than C.

The g++ version is 4.4.6 (running on CentOS 6.3). Is this because g++ 4.4.6 supports fewer C++11 features, such as rvalue references (move semantics)?

Test Result

Output of $ time a.out input_file, minus the execution time of the same program with no compose_X() call:

  • cpp version : 0.192 sec
  • C version : 0.091 sec

source code

compiled with -O2

Edit

compose_cpp() and compose_c() come from Bjarne's article. He said that compose_cpp() is faster than compose_c(). I want to check this claim with a real test.

If I have tested this the wrong way, how can I improve the test?

#include <iostream>
#include <fstream>
#include <string>

#include <cstdlib>
#include <cstring>

std::string compose_cpp(const std::string& name, const std::string& domain)
{
    return name + '@' + domain;
}

char* compose_c(const char* name, const char* domain)
{
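    // +2 leaves room for the '@' separator and the terminating '\0'.
    char* res = (char*) malloc(strlen(name)+strlen(domain)+2);
    char* p = strcpy(res,name);

    p += strlen(name);
    *p = '@';
    strcpy(p+1,domain);

    return res; // the caller must free() the result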
}

int main(int argc, char* argv[])
{
    std::ifstream ifs;
    ifs.open(argv[1]);

    std::string email, domain;

    // Test the extraction itself so that a failed read at end of file
    // does not process stale values one extra time.
    while (ifs >> email >> domain)
    {

        // std::string composed = compose_cpp(email, domain);

        char* composed = compose_c(email.c_str(), domain.c_str());
        free(composed);
    }

    ifs.close();
}

Input file

The input file is 1 million lines long. Every line is randomly generated and shorter than 20 bytes.

$ head -n 10 input.txt.1m
9742720 1981857.com
22504 4127435.com
342760 69167.com
53075 26710.com
3837481 1851920.com
98441 278536.com
4503887 9588108.com
193947 90885.com
42603 8166125.com
3587671 297296.com

Solution

I'm just going to put a guess here because I don't have the data file to test with. I think your results may not match Stroustrup's expectation because of what he says here:

Yes, the C++ version, because it does not have to count the argument characters and does not use the free store (dynamic memory) for short argument strings.

However, my understanding is that libstdc++ uses dynamic memory for all strings (except for zero length strings). See this recent SO answer to a question about the small size of the std::string object in libstdc++: http://stackoverflow.com/a/27631366/12711
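The claim about dynamic memory can be checked directly rather than inferred from timings. The following is a minimal sketch, not part of the original answer: it replaces the global operator new with a counting version and reports how many free-store allocations a single compose_cpp() call makes. The names g_allocations and allocations_in_compose are hypothetical, and the count it reports depends on the standard library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical global counter, incremented by the replacement operator new.
static std::size_t g_allocations = 0;

// Replacement global operator new: counts the call, then forwards to malloc.
void* operator new(std::size_t size)
{
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

// Matching replacement operator delete, since memory now comes from malloc.
void operator delete(void* p) noexcept
{
    std::free(p);
}

// Same function as in the question.
std::string compose_cpp(const std::string& name, const std::string& domain)
{
    return name + '@' + domain;
}

// Hypothetical helper: number of operator new calls made while composing
// one address, including any temporaries created by operator+.
std::size_t allocations_in_compose(const std::string& name,
                                   const std::string& domain)
{
    const std::size_t before = g_allocations;
    const std::string composed = compose_cpp(name, domain);
    const std::size_t after = g_allocations;
    assert(composed.size() == name.size() + 1 + domain.size());
    return after - before;
}
```

Calling allocations_in_compose("9742720", "1981857.com") returns 0 only when no call to operator new happened, that is, when every temporary and the result fit in the string's inline storage. A nonzero count means compose_cpp() pays for at least one allocation per call, just as compose_c() pays for its malloc().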

It's possible you would have better results with an implementation that uses the short string optimization (like MSVC - I'm not sure if clang's libc++ uses it or not).
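Whether a given implementation applies the short string optimization to a particular string can be probed at run time. This is a sketch under stated assumptions, not a definitive test: it assumes that an implementation with the optimization keeps the characters inside the std::string object itself, and the helper names stored_inline and self_check are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: true when the character buffer of `s` lies inside the
// std::string object itself, which is how a short string optimization
// typically stores small strings. The argument must be passed by reference,
// because a copy could be stored differently from the original.
bool stored_inline(const std::string& s)
{
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* buf = s.data();
    // std::less gives a well-defined total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(buf, obj_begin) && before(buf, obj_end);
}

// Sanity check that holds on any implementation: a string longer than the
// object itself cannot be stored inside it.
void self_check()
{
    assert(!stored_inline(std::string(sizeof(std::string) + 1, 'x')));
}
```

Inline capacity differs between implementations, so a 19-character result such as 9742720@1981857.com may fit in one library's inline buffer and still require an allocation in another. Running stored_inline() on a few composed addresses from the input file shows which case applies to the toolchain being measured.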
