TimSort什么时候抱怨破坏的比较器? [英] When does TimSort complain about broken comparator?

查看:118
本文介绍了TimSort什么时候抱怨破坏的比较器?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Java 7 更改了排序算法,以便它抛出一个

Java 7 changed the sorting algorithm such that it throws an


java.lang.IllegalArgumentException:比较方法违反了它的一般合同!

java.lang.IllegalArgumentException: "Comparison method violates its general contract!"

在某些情况下使用的比较器有问题。是否可以判断比较器中的哪种错误导致这种情况?在我的实验中,如果x!= x无关紧要,如果x

in some cases when the used comparator is buggy. Is it possible to tell what kind of bug in the comparator causes this? In my experiments it did not matter if x != x , it also did not matter if x < y and y < z but z < x , but it did matter if x = y and y = z but x < z for some values x, y, z. Is this generally so?

(如果对此有一般规则,可能更容易在比较器中查找错误。但当然最好是修复所有错误。:-))

(If there were a general rule to this, it might be easier to look for the bug in the comparator. But of course it is better to fix all bugs. :-) )

特别是,以下两个比较器没有让TimSort抱怨:

In particular, the following two comparators did not make TimSort complain:

    final Random rnd = new Random(52);

    Comparator<Integer> brokenButNoProblem1 = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (o1 < o2) {
                return Compare.LESSER;
            } else if (o1 > o2) {
                return Compare.GREATER;
            }
            return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
        }
    };

    Comparator<Integer> brokenButNoProblem2 = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (o1 == o2) {
                return Compare.EQUAL;
            }
            return rnd.nextBoolean() ? Compare.LESSER : Compare.GREATER;
        }
    };

但是下面的比较器确实让它呕吐:

but the following comparator did make it throw up:

    Comparator<Integer> brokenAndThrowsUp = new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            if (Math.abs(o1 - o2) < 10) {
                return Compare.EQUAL; // WRONG and does matter
            }
            return Ordering.natural().compare(o1, o2);
        }
    };

更新:在一些现实生活中,我们遇到了一个失败的地方,那里没有x,y,z x = y且y = z但x < z。所以看起来我的猜测是错误的,而且似乎并不是这种特殊的失败。还有更好的想法吗?

UPDATE: in some real life data we had a failure where there were no x,y,z with x = y and y = z but x < z . So It seems my guess was wrong, and it doesn't seem this specific kind failure only. Any better ideas?

推荐答案

查看 ComparableTimSort 的代码之后我我不太确定。我们来分析吧。这是抛出它的唯一方法(有一种类似的方法只对交换的角色做同样的事情,因此分析其中一个就足够了。)

After looking at the code of ComparableTimSort I am not quite sure. Let's analyze it. Here is the only method that throws it (there is a similar method that does the same only with exchanged roles, so analyzing one of them is enough).

private void mergeLo(int base1, int len1, int base2, int len2) {
        assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

        // Copy first run into temp array
        Object[] a = this.a; // For performance
        Object[] tmp = ensureCapacity(len1);

        int cursor1 = tmpBase; // Indexes into tmp array
        int cursor2 = base2;   // Indexes int a
        int dest = base1;      // Indexes int a
        System.arraycopy(a, base1, tmp, cursor1, len1);

        // Move first element of second run and deal with degenerate cases
        a[dest++] = a[cursor2++];
        if (--len2 == 0) {
            System.arraycopy(tmp, cursor1, a, dest, len1);
            return;
        }
        if (len1 == 1) {
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
            return;
        }

        int minGallop = this.minGallop;  // Use local variable for performance
    outer:
        while (true) {
            int count1 = 0; // Number of times in a row that first run won
            int count2 = 0; // Number of times in a row that second run won

            /*
             * Do the straightforward thing until (if ever) one run starts
             * winning consistently.
             */
// ------------------ USUAL MERGE
            do {
                assert len1 > 1 && len2 > 0;
                if (((Comparable) a[cursor2]).compareTo(tmp[cursor1]) < 0) {
                    a[dest++] = a[cursor2++];
                    count2++;
                    count1 = 0;
                    if (--len2 == 0)
                        break outer;
                } else {
                    a[dest++] = tmp[cursor1++];
                    count1++;
                    count2 = 0;
                    if (--len1 == 1)
                        break outer;
                }
            } while ((count1 | count2) < minGallop);

// ------------------ GALLOP
            /*
             * One run is winning so consistently that galloping may be a
             * huge win. So try that, and continue galloping until (if ever)
             * neither run appears to be winning consistently anymore.
             */
            do {
                assert len1 > 1 && len2 > 0;
                count1 = gallopRight((Comparable) a[cursor2], tmp, cursor1, len1, 0);
                if (count1 != 0) {
                    System.arraycopy(tmp, cursor1, a, dest, count1);
                    dest += count1;
                    cursor1 += count1;
                    len1 -= count1;
// -->>>>>>>> HERE IS WHERE GALLOPPING TOO FAR WILL TRIGGER THE EXCEPTION
                    if (len1 <= 1)  // len1 == 1 || len1 == 0
                        break outer;
                }
                a[dest++] = a[cursor2++];
                if (--len2 == 0)
                    break outer;

                count2 = gallopLeft((Comparable) tmp[cursor1], a, cursor2, len2, 0);
                if (count2 != 0) {
                    System.arraycopy(a, cursor2, a, dest, count2);
                    dest += count2;
                    cursor2 += count2;
                    len2 -= count2;
                    if (len2 == 0)
                        break outer;
                }
                a[dest++] = tmp[cursor1++];
                if (--len1 == 1)
                    break outer;
                minGallop--;
            } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
            if (minGallop < 0)
                minGallop = 0;
            minGallop += 2;  // Penalize for leaving gallop mode
        }  // End of "outer" loop
        this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

        if (len1 == 1) {
            assert len2 > 0;
            System.arraycopy(a, cursor2, a, dest, len2);
            a[dest + len2] = tmp[cursor1]; //  Last elt of run 1 to end of merge
        } else if (len1 == 0) {
            throw new IllegalArgumentException(
                "Comparison method violates its general contract!");
        } else {
            assert len2 == 0;
            assert len1 > 1;
            System.arraycopy(tmp, cursor1, a, dest, len1);
        }
    }

该方法执行两个已排序运行的合并。它通常合并,但一旦遇到一方开始赢(即,总是小于另一方),就开始驰骋。 Gallopping试图通过向前看更多元素而不是一次比较一个元素来加快速度。由于运行应该排序,因此展望未来。

The method performs a merging of two sorted runs. It does a usual merge but starts "gallopping" once it encounters that one side starts "winning" (I.e., being always less than the other) all the time. Gallopping tries to make things faster by looking ahead more elements instead of comparing one element at a time. Since the runs should be sorted, looking ahead is fine.

您会看到只有在 len1时抛出异常最后是 0
第一个观察结果如下:在通常的合并期间,异常可以从不被抛出,因为循环直接中止一次 len 这个 1 因此,异常只能由于疾驰而抛出。

You see that the exception is only throw when len1 is 0 at the end. The first observation is the following: During the usual merge, the exception can never be thrown since the loop aborts directly once len this 1. Thus, the exception can only be thrown as result of a gallop.

这已经强烈暗示异常行为是不可靠的:As只要你有小数据集(这么小,生成的运行可能永远不会驰骋,因为 MIN_GALLOP 7 )或生成的运行始终巧合生成一个永不磨损的合并,您将永远不会收到异常。因此,在不进一步检查 gallopRight 方法的情况下,我们可以得出结论:您不能依赖异常:它可能永远不会抛出无论您的比较器有多么错误是

This already gives a strong hint that the exception behaviour is unreliable: As long as you have small data sets (so small that a generated run may never gallop, as MIN_GALLOP is 7) or the generated runs always coincidentally generate a merge that never gallops, you will never receive the exception. Thus, without further reviewing the gallopRight method, we can come to the conclusion that you cannot rely on the exception: It may never be thrown no matter how wrong your comparator is.

这篇关于TimSort什么时候抱怨破坏的比较器?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆