String performance


Problem description


I have recently done a simple measurement of std::string performance in C++,
and I'm shocked how bad it is. C++ is first and foremost meant to be a
performance language, with near-zero overhead. This holds quite well with
regards to the language, but overheads of the std library are absolutely
abysmal. And string is one of the most often used datatypes.

The test program creates 10 million strings and puts them in a container
(see source code #1, #2 below).

Results:
C++ (VS2005, -O2): 7.265s
C# (VS2005): 0.56s

C# string is 13 times faster than C++ std::string. Horrible.

There's a crucial difference between C++ and C# strings - C# strings are
immutable. Does this make a difference? To test it, I created the simplest
and fastest possible C++ string implementation (see code #3 below). It is
absolutely minimalistic (only handles construction). It does not make any
memory allocations, all the memory is obtained from a static pool by simply
incrementing a pointer. There's no deallocation scheme of any sort
implemented - this would certainly slow things down additionally. I don't
think it is possible to create a faster implementation of string in C++ than
#3. _Any_ real-life implementation will be slower.

The results when std::string was replaced with my String?
C++ (VS2005, -O2): 0.64s - still slower than 0.56s of C#

C# string is still 14% faster than the fastest theoretically imaginable C++
string. What?! How did they do it?
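
One likely contributor to the gap: C# string literals are interned, and `v.Add("foo")` stores a reference to the one existing string object, so the C# loop does not construct ten million strings. A rough C++ counterpart of "store a reference to the literal" is sketched below, next to the owning `std::string` version from the post. The function names are hypothetical and only illustrate the difference in work per element.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stores a pointer to the same literal n times: no per-element allocation
// and no character copy, comparable to storing a reference.
std::vector<const char*> fill_with_pointers(std::size_t n) {
    std::vector<const char*> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("poo");
    return v;
}

// Stores an owning std::string per element: every element is a separate
// object holding its own copy of the characters.
std::vector<std::string> fill_with_strings(std::size_t n) {
    std::vector<std::string> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("poo");
    return v;
}
```

In the first function every element points at the same storage; in the second each element owns distinct storage, which is closer to what test program #1 measures.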

One reason may be that C++ string is constructed from a const char *
pointer. This is a fatal C/C++ flaw - the size of such string is unknown,
even if it's given at compile time as a literal such as "foo". To find out the
size, the string must be scanned at runtime for the terminating 0 character. C# does not
have this limitation. The compiler can place the string directly where it's
needed, and also store its length in a member variable. All during compile
time.
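
For what it is worth, the length of a literal is available at compile time in C++ as long as the literal has not yet decayed to a `const char *`: its type is an array of known size. A minimal sketch, assuming a hypothetical helper `from_literal` (not part of the standard library) that passes the known length to the `std::string(const char*, size_t)` constructor so no runtime scan is needed:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the post): N is deduced from the literal's
// array type, so the length is a compile-time constant.
template <std::size_t N>
std::string from_literal(const char (&literal)[N]) {
    return std::string(literal, N - 1);  // N includes the trailing '\0'
}

// sizeof on a literal also yields its size, terminator included.
static_assert(sizeof("foo") - 1 == 3, "literal length is known at compile time");
```

Usage would look like `std::string s = from_literal("foo");`. Whether this is measurably faster than the `const char *` constructor for a three-character string depends on the compiler, which can often fold the length computation for a literal anyway.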

Any comments?

Test programs for reference:

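// #1: C++
#include <ctime>
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> v;

int main()
{
    // Time 10 million push_backs of a std::string built from a literal
    clock_t t1 = clock();
    for (int i = 0; i < 10000000; ++i)
        v.push_back("poo");
    clock_t t2 = clock();
    std::cout << double(t2 - t1) / CLOCKS_PER_SEC;
}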
// #2: C#
using System;
using System.Collections.Generic;

class Test
{
    public static void Main()
    {
        DateTime t1 = System.DateTime.Now;
        List<string> v = new List<string>();
        for (int i = 0; i < 10000000; ++i)
            v.Add("foo");
        DateTime t2 = System.DateTime.Now;
        Console.WriteLine(t2 - t1);
    }
}

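// #3: std::string replacement used for testing: a fastest theoretically
// possible C++ string implementation using an "infinite" memory pool (with no
// deallocation)
struct String
{
    String(const char *s = "")
    {
        str = ptr;
        // Copy string to the pool, incrementing the pool pointer.
        // As posted: the terminating 0 is not copied, and an empty
        // argument would make the loop read past its terminator.
        do
        {
            *ptr = *s;
            ++ptr;
            ++s;
        } while (*s);
    }
    char *str;                      // Pointer to this string data in the pool
    static char buffer[100000000];  // Global pool
    static char *ptr;               // Global free memory pointer in the pool
};

// Definitions of the static members (not shown in the original post)
char String::buffer[100000000];
char *String::ptr = String::buffer;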

Solution

Marcin Kalicinski wrote:
:: I have recently done a simple measurement of std::string
:: performance in C++, and I'm shocked how bad it is. C++ is first
:: and foremost meant to be a performance language, with near-zero
:: overhead. This holds quite well with regards to the language, but
:: overheads of the std library are absolutely abysmal. And string is
:: one of the most often used datatypes.
::
:: Test programs for reference:
::
:: // #1: C++
:: std::vector<std::string> v;
:: int main()
:: {

What happens if you add

v.reserve(10000000);

here?
:: clock_t t1 = clock();
:: for (int i = 0; i < 10000000; ++i)
:: v.push_back("poo");
:: clock_t t2 = clock();
:: cout << double(t2 - t1) / CLOCKS_PER_SEC;
:: }
::
::
:: // #2: C#
:: class Test
:: {
:: public static void Main()
:: {
:: DateTime t1 = System.DateTime.Now;
:: List<string> v = new List<string>();
:: for (int i = 0; i < 10000000; ++i)
:: v.Add("foo");
:: DateTime t2 = System.DateTime.Now;
:: Console.WriteLine(t2 - t1);
:: }
:: }
::
Are you sure you are testing std::string and not std::vector?

Bo Persson
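
Bo's question can be checked by timing container growth and string construction separately. The following is a minimal sketch, assuming a C++11 compiler; the function names are made up for illustration, and no timings are claimed here since they depend heavily on compiler, library and settings.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Runs f once and returns the elapsed wall-clock time in seconds.
template <typename F>
double seconds(F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Case A: the loop from test program #1; the vector grows on demand.
std::size_t push_without_reserve(std::size_t n) {
    std::vector<std::string> v;
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("poo");
    return v.size();
}

// Case B: the same loop with capacity reserved up front, as Bo suggests.
std::size_t push_with_reserve(std::size_t n) {
    std::vector<std::string> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("poo");
    return v.size();
}

// Case C: string construction only, no container involved. An optimizer
// may simplify this loop heavily, so treat its timing with care.
std::size_t construct_only(std::size_t n) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::string s("poo");
        total += s.size();
    }
    return total;
}
```

A call such as `seconds([] { push_with_reserve(10000000); })` times one case. Comparing A with B isolates the cost of vector growth, and C gives a rough figure for `std::string` construction alone.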


> What happens if you add
>
> v.reserve(10000000);
>
> Are you sure you are testing std::string and not std::vector?

Adding reserve of course makes things way faster:

C++ (VS2005, -O2): 7.265s

C++ (VS2005, -O2 with vector::reserve): 1.42s
C# (VS2005): 0.56s

But still 2.5 times slower than .NET without any "preallocation hints". To
make things more equal, I also altered C# code to preallocate collection
storage as well (see program below).

Result: C# (VS2005 using array): 0.14s

And it's back to over 10x faster than C++...

// C# code with preallocated storage for collection (i.e. roughly an
// equivalent of using reserve on C++ vector)
public static void Main()
{
    DateTime t1 = System.DateTime.Now;
    string[] v = new string[10000000];
    for (int i = 0; i < 10000000; ++i)
        v[i] = "poo";
    DateTime t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);
}
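
The drop from 7.265s to 1.42s with `reserve` is consistent with reallocation being a large part of the original cost: each time the vector outgrows its capacity, every existing string has to be transferred to new storage, and a pre-C++11 library has to copy rather than move them. A small sketch, assuming a hypothetical helper `count_reallocations`, that makes the reallocations visible by watching `capacity()`:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts how often the vector's capacity changes while n strings are
// pushed; each change means all existing elements were relocated.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<std::string> v;
    if (reserve_first)
        v.reserve(n);
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back("poo");
        if (v.capacity() != capacity) {
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With `reserve_first` set, the count is zero, because no reallocation happens until the size exceeds the reserved capacity. Without it, the count depends on the library's growth factor.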



* Marcin Kalicinski:

>> What happens if you add
>> v.reserve(10000000);
>> Are you sure you are testing std::string and not std::vector?
>
> Adding reserve of course makes things way faster:
> [...]
> To make things more equal, I also altered C# code to preallocate collection
> storage as well (see program below).
>
> Result: C# (VS2005 using array): 0.14s
> [...]

You didn't answer Bo's question. First you were testing vector against
string, now you've thrown in a string array.
--
Derek

