Is accessing data in the heap faster than from the stack?


Question


I know this sounds like a general question and I've seen many similar questions (both here and on the web) but none of them are really like my dilemma.

Say I have this code:

void GetSomeData(char* buffer)
{
    // put some data in buffer
}

int main()
{
    char buffer[1024];
    while (1)
    {
        GetSomeData(buffer);
        // do something with the data
    }
    return 0;
}

Would I gain any performance if I declared buffer[1024] globally?

I ran some tests on unix via the time command and there are virtually no differences between the execution times.

But I'm not really convinced...

In theory, should this change make a difference?

Answer

Is accessing data in the heap faster than from the stack?

Not inherently. On every architecture I've ever worked on, all of a process's "memory" can be expected to operate at the same set of speeds, determined by which level of CPU cache, RAM, or swap file currently holds the data, plus any hardware-level synchronisation delays that operations on that memory may trigger to make changes visible to other processes or to incorporate changes made by other processes or CPU cores.

The OS (which is responsible for page faulting and swapping) and the hardware (the CPU, which traps on accesses to swapped-out or not-yet-accessed pages) do not even track which pages are "stack" and which are "heap": a memory page is a memory page. That said, the virtual address of global data may be calculated and hardcoded at compile time, the addresses of stack-based data are typically stack-pointer relative, and memory on the heap must almost always be accessed through pointers, which might be slightly slower on some systems. It depends on the CPU's addressing modes and cycle counts, but the difference is almost always insignificant: not worth a second thought unless you're writing something where millionths of a second are enormously important.
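To make the three addressing cases concrete, here is a minimal C sketch. The names BUF_SIZE, global_buffer, fill_and_sum, use_global, use_stack and use_heap are hypothetical and exist only for this illustration; they are not part of the question's code. The same loop touches a buffer in static storage, on the stack, and on the heap, and only the way the buffer's address is obtained differs.

```c
#include <stddef.h>
#include <stdlib.h>

#define BUF_SIZE 1024

/* Static storage: the address is fixed when the program is linked/loaded. */
static char global_buffer[BUF_SIZE];

/* Hypothetical helper: fills the buffer and returns a checksum of its bytes.
 * It neither knows nor cares where the buffer lives. */
static unsigned fill_and_sum(char *buffer, size_t size)
{
    unsigned sum = 0;
    for (size_t i = 0; i < size; ++i) {
        buffer[i] = (char)(i & 0x7f);
        sum += (unsigned char)buffer[i];
    }
    return sum;
}

unsigned use_global(void)
{
    return fill_and_sum(global_buffer, BUF_SIZE);
}

unsigned use_stack(void)
{
    /* Automatic storage: typically addressed relative to the stack pointer. */
    char local_buffer[BUF_SIZE];
    return fill_and_sum(local_buffer, BUF_SIZE);
}

unsigned use_heap(void)
{
    /* Dynamic storage: reachable only through the pointer malloc returns. */
    char *heap_buffer = malloc(BUF_SIZE);
    if (heap_buffer == NULL) {
        return 0; /* allocation failed; nothing was processed */
    }
    unsigned sum = fill_and_sum(heap_buffer, BUF_SIZE);
    free(heap_buffer);
    return sum;
}
```

All three functions hand fill_and_sum an ordinary char pointer, so the inner loop is identical in each case; any difference comes from how the address is produced, not from the memory itself.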

Anyway, in your example you're contrasting a global variable with a function-local (stack/automatic) variable; there's no heap involved. Heap memory comes from new or malloc/realloc. For heap memory, the performance issue worth noting is that the application itself keeps track of how much memory is in use at which addresses. Those records take some time to update when pointers to memory are handed out by new/malloc/realloc, and some more time when the pointers are deleted or freed.
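The bookkeeping cost described above is paid per allocation, not per access. A rough sketch of that difference follows, assuming a hypothetical counted_malloc wrapper and a hypothetical get_some_data stand-in (neither appears in the question). One function allocates and frees a buffer on every round, the other allocates once and reuses it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 1024

/* Hypothetical counter, used only to make the number of allocations visible. */
static size_t allocation_count = 0;

static void *counted_malloc(size_t size)
{
    ++allocation_count;
    return malloc(size);
}

/* Hypothetical stand-in for the question's GetSomeData. */
static void get_some_data(char *buffer, size_t size, int round)
{
    memset(buffer, round & 0x7f, size);
}

/* Pays the allocator's bookkeeping cost on every round. */
unsigned long process_alloc_per_round(int rounds)
{
    unsigned long checksum = 0;
    for (int r = 0; r < rounds; ++r) {
        char *buffer = counted_malloc(BUF_SIZE);
        if (buffer == NULL) {
            break; /* stop early if the allocation fails */
        }
        get_some_data(buffer, BUF_SIZE, r);
        checksum += (unsigned char)buffer[0];
        free(buffer);
    }
    return checksum;
}

/* Pays the bookkeeping cost once, then only reads and writes the memory. */
unsigned long process_alloc_once(int rounds)
{
    unsigned long checksum = 0;
    char *buffer = counted_malloc(BUF_SIZE);
    if (buffer == NULL) {
        return 0;
    }
    for (int r = 0; r < rounds; ++r) {
        get_some_data(buffer, BUF_SIZE, r);
        checksum += (unsigned char)buffer[0];
    }
    free(buffer);
    return checksum;
}
```

Both functions compute the same checksum; what changes is how many times the allocator has to update its records.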

For global variables, the allocation of memory may effectively be done at compile time, while for stack-based variables there's normally a stack pointer that is adjusted by the compile-time-calculated sum of the sizes of local variables (and some housekeeping data) each time a function is called. So when main() is called there may be some time spent modifying the stack pointer, but it is probably just adjusted by a different amount depending on whether buffer is present, rather than adjusted only when buffer exists, so there's no difference in runtime performance at all.
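If you want to check this on your own machine rather than rely on the time command alone, a small timing harness along these lines might help. This is only a sketch under stated assumptions: get_some_data, run_rounds, run_with_global, run_with_local and seconds_for are hypothetical names invented here, clock() measures CPU time at coarse resolution, and an optimising compiler may transform both variants into the same code.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE 1024

static char global_buffer[BUF_SIZE];

/* Hypothetical stand-in for the question's GetSomeData. */
static void get_some_data(char *buffer, size_t size, int round)
{
    memset(buffer, round & 0x7f, size);
}

/* Runs the loop from the question a fixed number of times and returns a
 * checksum, so the compiler cannot discard the work entirely. */
static unsigned long run_rounds(char *buffer, int rounds)
{
    unsigned long checksum = 0;
    for (int r = 0; r < rounds; ++r) {
        get_some_data(buffer, BUF_SIZE, r);
        checksum += (unsigned char)buffer[r % BUF_SIZE];
    }
    return checksum;
}

unsigned long run_with_global(int rounds)
{
    return run_rounds(global_buffer, rounds);
}

unsigned long run_with_local(int rounds)
{
    char local_buffer[BUF_SIZE]; /* stack space reserved once, on function entry */
    return run_rounds(local_buffer, rounds);
}

/* Returns the CPU seconds spent in fn(rounds); stores fn's result in *checksum_out. */
double seconds_for(unsigned long (*fn)(int), int rounds, unsigned long *checksum_out)
{
    clock_t start = clock();
    *checksum_out = fn(rounds);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling seconds_for(run_with_global, n, &sum) and seconds_for(run_with_local, n, &sum) with a large n, several times each, gives two sets of timings to compare; the stack variant reserves its buffer once per call, not once per loop iteration.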
