Why is shuffling list(range(n)) slower than shuffling [0]*n?


Question


Using random.shuffle, I noticed that shuffling list(range(n)) takes about 25% more time than shuffling [0] * n, consistently across sizes n from 1 million to 2 million.

Why is shuffling list(range(n)) slower? Unlike for sorting a list (which needs to look at the objects) or copying a list (which increases reference counters inside the objects), the objects shouldn't matter here. This should just rearrange pointers inside the list.
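
For reference, the refcount effect of copying is easy to observe. A minimal demonstration with sys.getrefcount, separate from the benchmark below:

import sys

x = object()
mylist = [x] * 3
before = sys.getrefcount(x)
copied = list(mylist)  # each copied slot adds one reference to x
print(sys.getrefcount(x) - before)  # 3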

I also tried numpy.random.shuffle, where shuffling list(range(n)) is three times (!) slower than shuffling [0] * n.

I also tried a third way to rearrange the elements in the list, namely list.reverse, which, as expected, took equally long for both lists.

Just in case the shuffled order matters, I also tried list.reverse after shuffling the lists. Again, as expected, it took equally long for both lists, and just as long as without the prior shuffling.

So what's the difference? Both shuffling and reversing only need to rearrange pointers inside the list, so why do the objects matter for shuffling but not for reversing?

My benchmark code producing the times:

import random
import numpy
from timeit import repeat, timeit
from collections import defaultdict

shufflers = {
    'random.shuffle(mylist)': random.shuffle,
    'numpy.random.shuffle(mylist)': numpy.random.shuffle,
    'list.reverse(mylist)': list.reverse,
    }

creators = {
    'list(range(n))': lambda n: list(range(n)),
    '[0] * n': lambda n: [0] * n,
    }

for shuffler in shufflers:
    print(shuffler)
    for creator in creators:
        print(creator)
        times = defaultdict(list)
        for _ in range(10):
            for i in range(10, 21):
                n = i * 100_000
                mylist = creators[creator](n)
                # Uncomment next line for pre-shuffling
                # numpy.random.shuffle(mylist)
                time = timeit(lambda: shufflers[shuffler](mylist), number=1)
                times[n].append(time)
                s = '%.6f ' * len(times[n])
        # Indent next line further to see intermediate results
        print([round(min(times[n]), 9) for n in sorted(times)])

Solution

The difference is that list.reverse, as a list function, has access to the list's underlying array of object pointers. So it can indeed rearrange the pointers without looking at the objects in any way (source):

/* Reverse a slice of a list in place, from lo up to (exclusive) hi. */
static void
reverse_slice(PyObject **lo, PyObject **hi)
{
    assert(lo && hi);

    --hi;
    while (lo < hi) {
        PyObject *t = *lo;
        *lo = *hi;
        *hi = t;
        ++lo;
        --hi;
    }
}

The random.shuffle and numpy.random.shuffle functions on the other hand only have an outsider view and go through the list's interface, which involves briefly loading the objects to swap them:

random.shuffle:

    def shuffle(self, x, random=None):
        ...
            for i in reversed(range(1, len(x))):
                # pick an element in x[:i+1] with which to exchange x[i]
                j = randbelow(i+1)
                x[i], x[j] = x[j], x[i]

numpy.random.shuffle:

    def shuffle(self, object x, axis=0):
          ...
                for i in reversed(range(1, n)):
                    j = random_interval(&self._bitgen, i)
                    x[i], x[j] = x[j], x[i]

So there's at least the potential for a lot of cache misses. But as a test, let's try writing the reversal in Python:

    def my_reverse(x):
        lo = 0
        hi = len(x) - 1
        while lo < hi:
            x[lo], x[hi] = x[hi], x[lo]
            lo += 1
            hi -= 1

Benchmarking that:
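
A minimal sketch of such a benchmark, reusing my_reverse and the two list creators from above (number=1 since each run mutates the list in place; exact timings vary by machine):

from timeit import timeit

for label, creator in [('list(range(n))', lambda n: list(range(n))),
                       ('[0] * n', lambda n: [0] * n)]:
    mylist = creator(1_000_000)
    # my_reverse is the pure-Python reversal defined above
    print(label, timeit(lambda: my_reverse(mylist), number=1))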

Reversing list(range(n)) was just as fast as reversing [0] * n, despite loading the objects. The reason is that Python creates the objects pretty much sequentially in memory. Here's a test with a million objects. Almost all were located 16 bytes after the previous one:

>>> mylist = list(range(10**6))
>>> from collections import Counter
>>> ctr = Counter(id(b) - id(a) for a, b in zip(mylist, mylist[1:]))
>>> for distance, how_often in ctr.most_common():
...     print(distance, how_often)

16 996056
48 3933
-1584548240 1
-3024 1
2416 1
-2240 1
2832 1
-304 1
-96 1
-45005904 1
6160432 1
38862896 1

So no wonder it's fast, as that's very cache-friendly.

But now let's use our Python reversal on a shuffled list (like in the question with list.reverse, where it didn't make a difference):
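
A sketch of that experiment, shuffling before timing (only list(range(n)) is really affected, since every slot of [0] * n points to the same single object):

import random
from timeit import timeit

for label, creator in [('list(range(n))', lambda n: list(range(n))),
                       ('[0] * n', lambda n: [0] * n)]:
    mylist = creator(1_000_000)
    random.shuffle(mylist)  # scatter which object each slot points to
    # time the pure-Python my_reverse from above
    print(label, timeit(lambda: my_reverse(mylist), number=1))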

Big difference: now my_reverse loads objects from random locations all over memory, which is the opposite of cache-friendly.

And of course that's what happens with the shuffle functions as well. While list(range(n)) initially is cache-friendly, the shuffling picks random indices j to swap with, which is very cache-unfriendly. And while i just moves sequentially, it's going to encounter a lot of already randomly swapped objects, so that's cache-unfriendly as well.
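
The id-distance test from above makes this visible: after shuffling list(range(n)), neighboring slots point to objects that are far apart in memory. A sketch (exact distances depend on the allocator and the run):

import random
from collections import Counter

mylist = list(range(10**6))
random.shuffle(mylist)
ctr = Counter(id(b) - id(a) for a, b in zip(mylist, mylist[1:]))
# Instead of being mostly 16 bytes, the distances are now spread out
# over the whole range of the million objects' addresses.
print(ctr.most_common(5))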
