找到列表中第n项的索引 [英] Find the index of the n'th item in a list
问题描述
我想找到列表中项目第n次出现的索引。例如,
I want to find the index of the n'th occurrence of an item in a list. e.g.,
x=[False,True,True,False,True,False,True,False,False,False,True,False,True]
n的真实指数是多少?如果我想要第五次出现(第四次,如果零索引),答案是10。
What is the index of the n'th true? If I wanted the fifth occurrence (4th if zero-indexed), the answer is 10.
我想出了:
indargs = [ i for i,a in enumerate(x) if a ]
indargs[n]
请注意 x.index
返回第一次出现或第一次出现后的某个点,因此据我所知,这不是一个解决方案。
Note that x.index
returns the first occurrence or the first occurrence after some point, and therefore as far as I can tell is not a solution.
对于与上述情况类似的情况,numpy还有一个解决方案,例如:使用 cumsum
和其中
,但我想知道是否有一种无懈可击的方法来解决问题。
There is also a solution in numpy for cases similar to the above, e.g. using cumsum
and where
, but I'd like to know if there's a numpy-free way to solve the problem.
我很关心自从我第一次遇到这个时的性能,同时为项目Euler 问题,但这是我在其他情况下遇到的一个更普遍的问题。
I'm concerned about performance since I first encountered this while implemented a Sieve of Eratosthenes for a Project Euler problem, but this is a more general question that I have encountered in other situations.
编辑:我已经得到了很多很好的答案,所以我决定做一些性能测试。以下是 timeit
以 len
列表搜索4000'th / 1000'th True的列表的执行时间(以秒为单位)。列表是随机的True / False。源代码链接如下;这是一个混乱的触摸。我使用海报名称的短/修改版本来描述除 listcomp
之外的函数,这是上面的简单列表推导。
I've gotten a lot of great answers, so I decided to do some performance tests. Below are timeit
execution times in seconds for lists with len
nelements searching for the 4000'th/1000'th True. The lists are random True/False. Source code linked below; it's a touch messy. I used short / modified versions of the posters' names to describe the functions except listcomp
, which is the simple list comprehension above.
True Test (100'th True in a list containing True/False)
nelements eyquem_occur eyquem_occurrence graddy taymon listcomp hettinger26 hettinger
3000: 0.007824 0.031117 0.002144 0.007694 0.026908 0.003563 0.003563
10000: 0.018424 0.103049 0.002233 0.018063 0.088245 0.003610 0.003769
50000: 0.078383 0.515265 0.002140 0.078074 0.442630 0.003719 0.003608
100000: 0.152804 1.054196 0.002129 0.152691 0.903827 0.003741 0.003769
200000: 0.303084 2.123534 0.002212 0.301918 1.837870 0.003522 0.003601
True Test (1000'th True in a list containing True/False)
nelements eyquem_occur eyquem_occurrence graddy taymon listcomp hettinger26 hettinger
3000: 0.038461 0.031358 0.024167 0.039277 0.026640 0.035283 0.034482
10000: 0.049063 0.103241 0.024120 0.049383 0.088688 0.035515 0.034700
50000: 0.108860 0.516037 0.023956 0.109546 0.442078 0.035269 0.035373
100000: 0.183568 1.049817 0.024228 0.184406 0.906709 0.035135 0.036027
200000: 0.333501 2.141629 0.024239 0.333908 1.826397 0.034879 0.036551
True Test (20000'th True in a list containing True/False)
nelements eyquem_occur eyquem_occurrence graddy taymon listcomp hettinger26 hettinger
3000: 0.004520 0.004439 0.036853 0.004458 0.026900 0.053460 0.053734
10000: 0.014925 0.014715 0.126084 0.014864 0.088470 0.177792 0.177716
50000: 0.766154 0.515107 0.499068 0.781289 0.443654 0.707134 0.711072
100000: 0.837363 1.051426 0.501842 0.862350 0.903189 0.707552 0.706808
200000: 0.991740 2.124445 0.498408 1.008187 1.839797 0.715844 0.709063
Number Test (750'th 0 in a list containing 0-9)
nelements eyquem_occur eyquem_occurrence graddy taymon listcomp hettinger26 hettinger
3000: 0.026996 0.026887 0.015494 0.030343 0.022417 0.026557 0.026236
10000: 0.037887 0.089267 0.015839 0.040519 0.074941 0.026525 0.027057
50000: 0.097777 0.445236 0.015396 0.101242 0.371496 0.025945 0.026156
100000: 0.173794 0.905993 0.015409 0.176317 0.762155 0.026215 0.026871
200000: 0.324930 1.847375 0.015506 0.327957 1.536012 0.027390 0.026657
Hettinger的itertools解决方案几乎总是最好的。 taymon和graddy的解决方案在大多数情况下都是最好的,但是当你想要第n个实例使得n为高或者列出少于n次的列表时,列表理解方法对于短数组可能更好。如果出现少于n次的可能性,则初始计数
检查可以节省时间。此外,在搜索数字而不是真/假时,graddy的效率更高......不清楚为什么会这样。 eyquem的解决方案基本上与其他解决方案相当,但开销略有增加或减少; eyquem_occur与taymon的解决方案大致相同,而eyquem_occurrence类似于listcomp。
Hettinger's itertools solution is almost always the best. taymon's and graddy's solutions are next best for most situations, though the list comprehension approach can be better for short arrays when you want the n'th instance such that n is high or lists in which there are fewer than n occurrences. If there's a chance that there are fewer than n occurrences, the initial count
check saves time. Also, graddy's is more efficient when searching for numbers instead of True/False... not clear why that is. eyquem's solutions are essentially equivalent to others with slightly more or less overhead; eyquem_occur is approximately the same as taymon's solution, while eyquem_occurrence is similar to listcomp.
推荐答案
@Taymon使用的答案list.index 很棒。
The answer from @Taymon using list.index was great.
FWIW,这是一个使用 itertools模块。它适用于任何可迭代的输入,而不仅仅是列表:
FWIW, here's a functional approach using the itertools module. It works with any iterable input, not just lists:
>>> from itertools import compress, count, imap, islice
>>> from functools import partial
>>> from operator import eq
>>> def nth_item(n, item, iterable):
indicies = compress(count(), imap(partial(eq, item), iterable))
return next(islice(indicies, n, None), -1)
这个例子很好,因为它展示了如何有效地结合Python的功能工具集。注意,一旦管道建立起来,就不会出现围绕Python的eval循环 - 所有内容都以C速度完成,内存占用空间很小,评估范围很小,没有变量赋值,还有可单独测试的组件。 IOW,这是程序员梦寐以求的一切: - )
The example is nice because it shows off how to effectively combine Python's functional toolset. Note, that once the pipeline is set-up, there are no trips around Python's eval loop -- everything gets done at C speed, with a tiny memory footprint, with lazy evaluation, with no variable assignments, and with separately testable components. IOW, it is everything functional programmers dream about :-)
样本运行:
>>> x = [False,True,True,False,True,False,True,False,False,False,True,False,True]
>>> nth_item(50, True, x)
-1
>>> nth_item(0, True, x)
1
>>> nth_item(1, True, x)
2
>>> nth_item(2, True, x)
4
>>> nth_item(3, True, x)
6
这篇关于找到列表中第n项的索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!