Why is the actual runtime for a larger search value smaller than for a lower search value in a sorted array?


Problem Description


I executed a linear search on an array containing all unique elements in the range [1, 10000], sorted in increasing order, for every search value from 1 to 10000, and plotted the runtime vs. search value graph as follows:

Upon closely analysing the zoomed-in version of the plot:

I found that the runtime for some larger search values is smaller than for some lower search values, and vice versa.

My best guess is that this is related to how the CPU processes data using main memory and the cache, but I don't have a firm, quantifiable reason to explain it.

Any hint would be greatly appreciated.

PS: The code was written in C++ and executed on a Linux platform hosted on a virtual machine with 4 vCPUs on Google Cloud. The runtime was measured using the C++ chrono library.
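
The post does not include the code itself; a minimal sketch of what such a benchmark might look like is given below. The container, clock choice, and output format are assumptions rather than the poster's actual code.

```cpp
#include <chrono>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
    // Sorted array of all unique values in [1, 10000].
    std::vector<int> data(10000);
    std::iota(data.begin(), data.end(), 1);

    for (int key = 1; key <= 10000; ++key) {
        auto start = std::chrono::steady_clock::now();

        // Plain linear search: scan until the key is found.
        volatile int found_at = -1;  // volatile keeps the loop from being optimised away
        for (int i = 0; i < static_cast<int>(data.size()); ++i) {
            if (data[i] == key) { found_at = i; break; }
        }

        auto stop = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        std::printf("%d %lld\n", key, static_cast<long long>(us));
        (void)found_at;
    }
}
```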

Solution

CPU cache size depends on the CPU model, and there are several cache levels, so your experiment should take all of those factors into account. An L1 cache is usually around 8 KiB, roughly four times smaller than your 10000-element array. But I don't think cache misses are the cause here: L2 latency is about 100 ns, which is much smaller than the gap between the lowest line and the second one, roughly 5 µs. I suspect that second line-cloud is contributed by context switching. The longer a task runs, the more likely a context switch is to occur during it, which is why the cloud on the right side is thicker.
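
One way to probe the context-switching hypothesis (an addition here, not something the answer prescribes) is to repeat each measurement several times and keep only the minimum; samples inflated by a context switch are then discarded. A rough sketch:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <limits>
#include <numeric>
#include <vector>

// Time `search(key)` in microseconds `reps` times and keep the minimum.
// Runs that happened to include a context switch are far slower than the
// rest, so taking the minimum filters them out.
template <typename SearchFn>
long long min_runtime_us(SearchFn&& search, int key, int reps = 11) {
    long long best = std::numeric_limits<long long>::max();
    for (int r = 0; r < reps; ++r) {
        auto start = std::chrono::steady_clock::now();
        search(key);
        auto stop = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        best = std::min<long long>(best, us);
    }
    return best;
}

int main() {
    std::vector<int> data(10000);
    std::iota(data.begin(), data.end(), 1);

    auto linear_search = [&](int key) {
        volatile int found_at = -1;
        for (int i = 0; i < static_cast<int>(data.size()); ++i)
            if (data[i] == key) { found_at = i; break; }
        (void)found_at;
    };

    for (int key = 1; key <= 10000; ++key)
        std::printf("%d %lld\n", key, min_runtime_us(linear_search, key));
}
```

If the upper band of points largely disappears when plotting these minimum times, that supports the context-switching explanation.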

Now for the zoomed-in figure. Since Linux is not a real-time OS, its time measurement is not very reliable; IIRC its minimal reporting unit is the microsecond. Now, if a certain task takes exactly 15.45 microseconds, the value reported for it depends on when it started. If the task started exactly on a clock tick, the time reported would be 15 microseconds. If it started when the internal clock was 0.1 microseconds past a tick, you would get 16 microseconds. What you see on the graph is the analogue straight line approximated onto a discrete-valued axis. So the task duration you get is not the actual duration, but the real value plus the fractional-microsecond offset of the task's start time (which is uniformly distributed, ~U[0,1]), all rounded to the nearest integer value.
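
To see how this quantisation alone can produce out-of-order points, here is a small simulation of the model the answer describes; the 1.5 ns per-element cost and the specific keys are assumed for illustration, not measured from the original experiment.

```cpp
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(42);
    // Fractional start-time offset within the current microsecond, ~U[0, 1).
    std::uniform_real_distribution<double> offset(0.0, 1.0);

    const double ns_per_element = 1.5;  // assumed cost of one comparison in the scan
    for (int key = 9990; key <= 10000; ++key) {
        double true_us = key * ns_per_element / 1000.0;       // the "analogue" runtime
        long reported  = std::lround(true_us + offset(rng));  // quantised to whole microseconds
        std::printf("key=%5d  true=%.3f us  reported=%ld us\n", key, true_us, reported);
    }
}
```

With these numbers, a larger key can easily be reported as 15 µs while a slightly smaller key is reported as 16 µs, which is exactly the kind of local inversion visible in the zoomed-in plot.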
