Performance: memset

Question

I have simple C code that does this (pseudo code):

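#define N 100000000                      /* buffer size in bytes, not in ints */
int *DataSrc = (int *) malloc(N);
int *DataDest = (int *) malloc(N);
memset(DataSrc, 0, N);                   /* writes to every page of the source */
for (int i = 0 ; i < 4 ; i++) {
    StartTimer();                        /* placeholder timing hook */
    memcpy(DataDest, DataSrc, N);
    StopTimer();                         /* placeholder timing hook */
}
printf("%d\n", DataDest[RandomInteger]); /* RandomInteger: any index below N / sizeof(int) */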

My PC: Intel Core i7-3930, with 4x4GB DDR3 1600 memory running RedHat 6.1 64-bit.

The first memcpy() runs at 1.9 GB/sec, while the next three run at 6.2 GB/sec. The buffer size (N) is too big for this to be caused by cache effects. So, my first question:

  • Why is the first memcpy() so much slower? Maybe malloc() doesn't fully allocate the memory until you use it?

If I eliminate the memset(), then the first memcpy() runs at about 1.5 GB/sec, but the next three run at 11.8 GB/sec. Almost 2x speedup. My second question:

  • Why is memcpy() 2x faster if I don't call memset()?

Recommended answer

As others already pointed out, Linux uses an optimistic memory allocation strategy.
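The effect of this strategy can be observed directly. The following is a minimal sketch, not part of the original answer: it writes one byte into every page of a fresh malloc() buffer twice and times both passes. The names now_seconds, touch_pages and demo_first_touch are hypothetical helpers introduced for illustration. It assumes a POSIX system, and that the allocation is large enough for the allocator to request fresh pages from the kernel rather than reuse heap memory that has already been touched.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Monotonic wall-clock time in seconds. */
double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Write one byte into every page of buf, so that each page must be
   backed by physical memory. Returns the number of pages written.
   The volatile qualifier keeps the compiler from dropping the writes. */
size_t touch_pages(volatile unsigned char *buf, size_t len, size_t page_size) {
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page_size) {
        buf[off] = 1;
        touched++;
    }
    return touched;
}

/* Hypothetical demo: time the first and the second pass over a buffer
   that malloc() has just returned. Returns 0 on success, -1 on failure. */
int demo_first_touch(size_t len) {
    long ps = sysconf(_SC_PAGESIZE);
    size_t page_size = ps > 0 ? (size_t)ps : 4096; /* 4096 is a fallback guess */
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1;

    double t0 = now_seconds();
    size_t pages = touch_pages(buf, len, page_size); /* pages may be faulted in here */
    double t1 = now_seconds();
    touch_pages(buf, len, page_size);                /* pages are already present */
    double t2 = now_seconds();

    /* Sanity check: one write per page, rounding the length up. */
    assert(pages == (len + page_size - 1) / page_size);
    printf("%zu pages: first pass %.6f s, second pass %.6f s\n",
           pages, t1 - t0, t2 - t1);
    free(buf);
    return 0;
}
```

Calling demo_first_touch(100000000) from main() mirrors the buffer size in the question. If the allocation is lazy, the first pass should take noticeably longer than the second, although the exact ratio depends on the kernel and allocator.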

The difference between the first and the following memcpys is the initialization of DataDest.

As you have already seen, when you eliminate memset(DataSrc, 0, N), the first memcpy is even slower, because the pages for the source must be allocated as well. When you initialize both DataSrc and DataDest, e.g.

memset(DataSrc, 0, N);
memset(DataDest, 0, N);

all memcpys will run with roughly the same speed.
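A runnable version of that experiment might look like the sketch below. It is an illustration under stated assumptions rather than the asker's actual benchmark: gb_per_sec and bench_memcpy are hypothetical names, timing uses POSIX clock_gettime in place of the StartTimer()/StopTimer() placeholders, and the buffers are filled with nonzero bytes because some compilers can fold malloc() followed by memset(p, 0, n) into calloc(), which would leave the pages untouched again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define RUNS 4

/* Throughput in GB/s (1 GB = 1e9 bytes); returns 0 for a zero duration. */
double gb_per_sec(size_t bytes, double seconds) {
    return seconds > 0.0 ? (double)bytes / seconds / 1e9 : 0.0;
}

/* Initialize both buffers, then time RUNS copies of n bytes.
   Each duration is stored in seconds[]. Returns 0 on success,
   -1 if an allocation fails. */
int bench_memcpy(size_t n, double seconds[RUNS]) {
    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }

    /* Touch every page of BOTH buffers before timing starts.
       Nonzero fill values are an assumption of this sketch, chosen so
       the compiler cannot replace malloc + memset with calloc. */
    memset(src, 1, n);
    memset(dst, 2, n);

    for (int i = 0; i < RUNS; i++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        memcpy(dst, src, n);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        seconds[i] = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
        printf("run %d: %.2f GB/s\n", i, gb_per_sec(n, seconds[i]));
    }

    /* Reading the result keeps the copies observable and checks them. */
    assert(memcmp(dst, src, n) == 0);
    free(src);
    free(dst);
    return 0;
}
```

Calling bench_memcpy(100000000, seconds) from main() matches the buffer size in the question. Removing one or both memset calls reproduces the variants discussed above, at the cost of copying uninitialized memory.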

For the second question: when you initialize the allocated memory with memset, all pages of a buffer are allocated one after another and will be laid out consecutively. On the other hand, when the memory is allocated as you copy, source and destination pages are allocated in alternation and end up interleaved, which might make the difference.
