Discontinuous slice in a Python list

Question

I'm looking for an efficient way to achieve the following, which I think is a slicing-like operation:

>>> mylist = range(100)
>>> magicslicer(mylist, 10, 20)
[0,1,2,3,4,5,6,7,8,9,30,31,32,33,34,35,36,37,38,39,60,61,62,63......,97,98,99]

The idea is that the slicing takes 10 elements, then skips 20 elements, then takes the next 10, then skips the next 20, and so on.

I think I should avoid loops if possible, since the whole point of using a slice is (I guess) to do the "extraction" efficiently in a single operation.

Thanks for reading.

Recommended answer

itertools.compress (new in 2.7/3.1) nicely supports use cases like this one, especially when combined with itertools.cycle:

>>> from itertools import cycle, compress
>>> seq = range(100)
>>> criteria = cycle([True]*10 + [False]*20)  # Use whatever pattern you like
>>> list(compress(seq, criteria))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
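For convenience, the same idea can be wrapped in a function with the signature the question asks for. This is only a minimal sketch: magicslicer is the hypothetical name from the question, and the parameter names take and skip are made up here for illustration.

```python
from itertools import compress, cycle

def magicslicer(seq, take, skip):
    """Keep `take` items, drop `skip` items, and repeat until `seq` ends.

    Hypothetical helper named after the function in the question.
    """
    # One period of the selection pattern: True keeps an element, False drops it.
    pattern = [True] * take + [False] * skip
    # cycle() repeats the pattern for as long as compress() keeps asking.
    return list(compress(seq, cycle(pattern)))

result = magicslicer(range(100), 10, 20)
print(result[:12])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```

Because compress only iterates over its input, this should work for any iterable, not just lists.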

Python 2.7 timing (relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 4.96 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.76 usec per loop

Python 3.2 timing (also relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 7.41 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.78 usec per loop

As can be seen, it makes little difference relative to the in-line list comprehension in 2.7 (4.76 vs. 4.96 usec), but it helps significantly in 3.2 (4.78 vs. 7.41 usec) by avoiding the overhead of the implicit nested scope that a list comprehension runs in on Python 3.
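For readers who want to see the two approaches from the timings side by side, here is a rough sketch that wraps each in a function and checks that they agree. The function names and the take/skip parameters are invented for this illustration. The comprehension variant assumes the input supports len() and slicing, and that take + skip is greater than zero.

```python
from itertools import compress, cycle

def slice_by_comprehension(a, take, skip):
    # Slice-based variant: jump to the start of each kept run, then slice it out.
    # Requires a sequence (len() and slicing) and take + skip > 0.
    return [x for start in range(0, len(a), take + skip)
            for x in a[start:start + take]]

def slice_by_compress(a, take, skip):
    # Mask-based variant: pair every element with a repeating True/False flag.
    return list(compress(a, cycle([True] * take + [False] * skip)))

data = list(range(100))
same = slice_by_comprehension(data, 10, 20) == slice_by_compress(data, 10, 20)
print(same)  # True
```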

A similar difference can also be seen in 2.7 if the aim is to iterate over the resulting sequence rather than turn it into a fully realised list (3.61 vs. 6.82 usec in the timings below):

$ ./python -m timeit -s "a = range(100)" "for x in (x for start in range(0, len(a), 30) for x in a[start:start+10]): pass"
100000 loops, best of 3: 6.82 usec per loop
$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "for x in compress(a, criteria): pass"
100000 loops, best of 3: 3.61 usec per loop

For especially long patterns, it is possible to replace the list in the pattern expression with an expression like chain(repeat(True, 10), repeat(False, 20)) so that it never has to be fully created in memory.
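Spelled out, the chain/repeat variant might look like the sketch below. The helper name lazy_criteria and its parameters are assumptions made for this example.

```python
from itertools import chain, compress, cycle, repeat

def lazy_criteria(take, skip):
    # Build one period of the pattern lazily instead of as a list literal.
    one_period = chain(repeat(True, take), repeat(False, skip))
    return cycle(one_period)

selected = list(compress(range(100), lazy_criteria(10, 20)))
print(selected[:12])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```

One caveat: itertools.cycle saves a copy of each element it consumes so that it can replay them, so one period of the pattern still ends up stored inside the cycle object after the first pass. What this variant avoids is building the intermediate list up front.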
