是什么原因导致std :: sort()访问地址超出范围 [英] What causes std::sort() to access address out of range

查看:197
本文介绍了是什么原因导致std :: sort()访问地址超出范围的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道要使用std :: sort(),比较函数必须严格按弱顺序执行,否则会由于超出范围访问地址而崩溃. ( https://gcc.gnu.org/ml/gcc- bugs/2013-12/msg00333.html )

I understand that to use std::sort(), the compare function must be strict weak order, otherwise it will crash due to accessing address out of bound. (https://gcc.gnu.org/ml/gcc-bugs/2013-12/msg00333.html)

但是,当比较功能不是严格的弱顺序时,为什么std :: sort()访问越界地址?试图比较什么?

However, why would std::sort() access out-of-bound address when the compare function is not strict weak order? What is it trying to compare?

我还想知道STL是否还有其他陷阱,我应该意识到.

Also I wonder if there are other pitfalls in STL that I should be aware of.

推荐答案

第一件事是,使用不符合要求的比较器调用算法是未定义的行为,一切都会发生...

The first thing is that calling the algorithm with a comparator that does not comply with the requirements is undefined behavior and anything goes...

但是除此之外,我假设您有兴趣知道如果比较器损坏,哪种类型的实现可能最终超出范围. 在访问元素之前,实现是否应该不检查边界?即在调用比较器之前

But other than that, I assume that you are interested in knowing what type of implementation might end up accessing out of bounds if the comparator is bad. Should the implementation not check the bounds before accessing the elements in the first place? i.e. before calling the comparator

答案是性能,这仅仅是可能导致此类问题的可能原因之一.排序算法有不同的实现方式,但是std::sort通常是在quicksort的变体之上构建的,该变种将基于其他排序算法(如mergesort)退化,以避免quicksort在最坏情况下的性能.

The answer is performance, and this is just one of the possible things that could lead to this type of issues. There are different implementations of sorting algorithms, but more often than not, std::sort is built on top of a variant of quicksort that will degenerate on a different sorting algorithm like mergesort to avoid the quicksort worst case performance.

quicksort的实现选择一个枢轴,然后围绕枢轴对输入进行分区,然后对两侧进行独立排序.选择枢轴的策略有不同,但是一个常见的选择是三个值的中位数:该算法获取第一个,最后一个和中间元素的值,选择三个元素的中值并将其用作枢轴值.

The implementation of quicksort selects a pivot and then partitions the input around the pivot, then independently sorts both sides. There are different strategies for selection of the pivot, but a common one is the median of three: the algorithm gets the values of the first, last and middle element, selects the median of the three and uses that as the pivot value.

从概念上讲,分区从左侧开始走,直到找到一个不小于枢轴的元素为止,然后分区从右侧开始走,试图找到一个小于枢轴的元素.如果两个光标相遇,则分区完成.如果找到不适当的元素,则将交换值,并且过程将在两个游标确定的范围内继续.从左侧走来查找要交换的元素的循环看起来像:

Conceptually partition walks from the left until it finds an element that is not smaller than the pivot, it then walks from the right trying to find an element that is smaller than the pivot. If the two cursors meet, partition completed. If the out of place elements are found, the values are swapped and the process continues in the range determined by both cursors. The loop walking from the left to find the element to swap would look like:

while (pos < end && value(pos) < pivot) { ++pos; }

虽然在常规分区中不能假定数据透视表的值将在该范围内,但快速排序知道,毕竟,它已从该范围内的元素中选择了数据透视表.在这种情况下,一种常见的优化方法是将中位数的值交换为循环的最后一个元素.这样可以保证value(pos) < pivot之前 pos == end是正确的(最坏的情况是:pos == end - 1).这意味着我们可以删除范围末尾的检查,并且可以使用unchecked_partition(选择您选择的名称),条件更简单:

While in general partition cannot assume that the value of pivot will be in the range, quicksort knows that it is, after all it selected the pivot out of the elements in the range. A common optimization in this case is to swap the value of the median to be in the last element of the loop. This guarantees that value(pos) < pivot will be true before pos == end (worst case: pos == end - 1). The implication here is that we can drop the check for the end of the range and we can use a unchecked_partition (pick your choice of name) with a simpler faster condition:

while (/*pos < end &&*/ value(pos) < pivot) ++pos;

一切都很好,除了<被拼写为comparator(value(pos), pivot).现在,如果comparator的实现不正确,您可能会以comparator(pivot,pivot) == true结尾,并且光标将超出范围.

All perfectly good, except that < is spelled comparator(value(pos), pivot). Now if the comparator is incorrectly implemented you might end up with comparator(pivot,pivot) == true and the cursor will run out of bounds.

请注意,这只是优化算法的一个示例,该算法可以消除性能的边界检查:假设有效顺序,如果使用quicksort,则在上面的循环中走出数组是不可能的将枢轴设置为调用此已修改分区的最后一个元素 .

Note that this is just one example of optimization of the algorithm that can remove bounds check for performance: assuming a valid order, it is impossible to walk out of the array in the above loop if quicksort set the pivot to the last element before calling this modified partition.

回到问题:

在访问元素之前,实现是否应该不检查边界?即在调用比较器之前

Should the implementation not check the bounds before accessing the elements in the first place? i.e. before calling the comparator

不,如果它通过证明不会走出数组来消除边界检查,那不是,但这证明是在比较器有效的前提下建立的.

No, not if it removed bounds checking by proving that it won't walk out of the array, but that prove is built on the premise that the comparator is valid.

这篇关于是什么原因导致std :: sort()访问地址超出范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆