Why do I get a guaranteed sorted order if I put all of [1, 2, 3, ..., n] into a HashSet in any shuffled order and then iterate over the HashSet?


Question

PS: The post "How is this HashSet producing sorted output?" doesn't answer my question. I know that if I put arbitrary numbers into a HashSet, I will not get a sorted order.

However, I found that if I put all of [1, 2, 3, ..., n] into a HashSet in any shuffled order and iterate over it, I get a guaranteed sorted order. I cannot understand why this always happens. I've tested every n < 10000 many times and it is always true, so it should not be a coincidence; there must be a reason. Even though I should not rely on this implementation detail, please tell me why it always happens.

PS: I know that if I insert [0, 1, 2, ..., n-1] or [1+k, 2+k, ..., n+k] (k != 0) into a HashSet, the iteration order is unsorted; I've tested that. It's normal for the iteration order of a HashSet to be unsorted. However, why does any insertion order of [1, 2, 3, 4, ..., n] happen to always come out sorted? I've checked the implementation details. If I trace the path, the whole process includes resizing the bucket array and converting linked lists to red-black trees. If I insert the whole of [1..n] in shuffled order, the intermediate state of the HashSet is unsorted. However, once I complete all the insertions, it happens to be in sorted order.

I used JDK 1.8 to run the following test.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Test {

    public static void main(String[] args) {
        List<Integer> res = printUnsortedCase(10000);
        System.out.println(res);
    }

    // Collects every size i in [2, n) for which the HashSet iteration was NOT sorted.
    private static List<Integer> printUnsortedCase(int n) {
        List<Integer> res = new ArrayList<>();
        for (int i = 2; i < n; i++) {
            if (!checkSize(i)) {
                res.add(i);
            }
        }
        return res;
    }

    // Returns true if a HashSet filled in shuffled order iterates in ascending order.
    private static boolean checkSize(int n) {
        List<Integer> list = new ArrayList<>();
        // Note: as written, this loop adds 0..n-1 rather than 1..n.
        for (int i = 0; i < n; i++) {
            list.add(i);
        }

        // Shuffle the values so that they are inserted in random order.
        Collections.shuffle(list);

        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(list.get(i)); // insert into the set in shuffled order
        }

        list = new ArrayList<>(set); // iterate over the HashSet and copy into an ArrayList
        return isSorted(list);
    }

    // Returns true if the list is in non-decreasing order.
    private static boolean isSorted(List<Integer> list) {
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i - 1) > list.get(i)) return false;
        }
        return true;
    }
}

I wrote the checking code above, and the observation seems to hold.

Recommended answer

You are conflating two related concepts:

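  1. Guaranteed order: the specification states that you will get the elements back in a specific order, and every implementation that conforms to the specification does so.
  2. Reproducible order: a specific implementation returns all elements in a specific order.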

Guaranteed order necessarily implies reproducible order (otherwise you'd have a bug).

Reproducible order doesn't imply guaranteed order. The reproducible order may be just a side effect of implementation details that happen to align so that you get the elements in the same order under some circumstances, but this isn't guaranteed.
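To make the distinction concrete, here is a minimal sketch that contrasts collections whose iteration order is part of their documented contract with the merely reproducible order discussed here. It uses only standard `java.util` classes; the class and method names (`GuaranteedOrderSketch`, `sortedIteration`, `insertionIteration`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

public class GuaranteedOrderSketch {

    // TreeSet documents ascending iteration order, so this result is guaranteed.
    static List<Integer> sortedIteration(Collection<Integer> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    // LinkedHashSet documents insertion-order iteration, so this is guaranteed too.
    static List<Integer> insertionIteration(Collection<Integer> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2);
        System.out.println("TreeSet:       " + sortedIteration(input));
        System.out.println("LinkedHashSet: " + insertionIteration(input));
    }
}
```

HashSet makes no such promise, so any order observed with it falls into the "reproducible" category at best.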

In this specific case, several factors together result in a reproducible order:

  • Integer has a highly reproducible and predictable hashCode (it's just the number itself).
  • HashMap (which backs HashSet) does some minor manipulation of that hash code to reduce the chance of collisions caused by simplistic hash code implementations. This doesn't matter in this case, because it just computes hash ^ (hash >>> 16), which leaves numbers below 2^16 unchanged and therefore in the same relative order.
  • You use a very consistent and reproducible way to construct your HashSets. The resulting hash tables will always have gone through the same growth stages.
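The factors above can be sketched in a few lines. This is an illustration, not the JDK source: it assumes the OpenJDK 8 behaviour of spreading the hash with `hash ^ (hash >>> 16)` and picking the bucket with `hash & (capacity - 1)` on a power-of-two table. The names `BucketIndexSketch`, `spread` and `bucketIndex` are hypothetical, and the capacity of 16 is chosen only for the demonstration.

```java
public class BucketIndexSketch {

    // The spreading step described above: XOR the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Assumed bucket selection for a power-of-two table: mask off the low bits.
    static int bucketIndex(int value, int capacity) {
        return spread(Integer.hashCode(value)) & (capacity - 1);
    }

    public static void main(String[] args) {
        int capacity = 16; // illustrative table size
        for (int v : new int[] {1, 5, 12, 16, 17, 65536}) {
            System.out.println(v + " -> bucket " + bucketIndex(v, capacity));
        }
    }
}
```

As long as every value is smaller than both 2^16 and the table capacity, each value lands in the bucket with its own number, so walking the buckets in index order yields ascending values. Once a value reaches the capacity it wraps around to a low bucket, and the sorted appearance is lost.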

If instead of

        list.add(i);

you did

        list.add(i + 65000);

(i.e. used the numbers 65000 to 65000+n instead of 0 to n), then you'd see unsorted results emerge.

In fact, the "reproducible order" that you get is so fragile that just adding 10 already causes some of the lists to be unsorted.
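That fragility can be checked with a small experiment. The sketch below inserts `offset, offset+1, ..., offset+n-1` in shuffled order and reports whether iteration comes out ascending. The class name `OffsetOrderSketch`, the method `iteratesSorted` and the fixed seed are assumptions made for illustration, and the outcome depends on the HashMap implementation of the JVM in use (the behaviour described in this answer is that of JDK 1.8).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class OffsetOrderSketch {

    // Returns true if a HashSet filled with offset..offset+n-1 (shuffled) iterates in ascending order.
    static boolean iteratesSorted(int n, int offset, long seed) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(i + offset);
        }
        Collections.shuffle(values, new Random(seed));

        // Add one element at a time so the table grows the same way as in the question.
        Set<Integer> set = new HashSet<>();
        for (int v : values) {
            set.add(v);
        }

        Integer previous = null;
        for (int v : set) {
            if (previous != null && previous > v) {
                return false;
            }
            previous = v;
        }
        return true;
    }

    public static void main(String[] args) {
        long seed = 42L; // arbitrary seed, only to make runs repeatable
        System.out.println("n=7, offset=0:        " + iteratesSorted(7, 0, seed));
        System.out.println("n=7, offset=10:       " + iteratesSorted(7, 10, seed));
        System.out.println("n=1000, offset=65000: " + iteratesSorted(1000, 65000, seed));
    }
}
```

With seven elements and an offset of 10, the largest value is 16, which no longer fits below the default table size of 16 and wraps to bucket 0, so it is iterated first.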
