使用python从列表中随机提取x项 [英] Randomly extract x items from a list using python

查看:97
本文介绍了使用python从列表中随机提取x项的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从两个列表开始,例如:

Starting with two lists such as:

lstOne = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
lstTwo = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

我想让用户输入要提取的项目数(占列表总长度的百分比),以及要从每个列表中随机提取的相同索引.例如说我想要50%的输出是

I want to have the user input how many items they want to extract, as a percentage of the overall list length, and the same indices from each list to be randomly extracted. For example say I wanted 50% the output would be

newLstOne = ['8', '1', '3', '7', '5']
newLstTwo = ['8', '1', '3', '7', '5']

我使用以下代码实现了这一点:

I have achieved this using the following code:

from random import randrange

lstOne = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
lstTwo = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

LengthOfList = len(lstOne)
print LengthOfList

PercentageToUse = input("What Percentage Of Reads Do you want to extract? ")
RangeOfListIndices = []

HowManyIndicesToMake = (float(PercentageToUse)/100)*float(LengthOfList)
print HowManyIndicesToMake

for x in lstOne:
    if len(RangeOfListIndices)==int(HowManyIndicesToMake):
        break
    else:
        random_index = randrange(0,LengthOfList)
        RangeOfListIndices.append(random_index)

print RangeOfListIndices


newlstOne = []
newlstTwo = []

for x in RangeOfListIndices:
    newlstOne.append(lstOne[int(x)])
for x in RangeOfListIndices:
    newlstTwo.append(lstTwo[int(x)])

print newlstOne
print newlstTwo

但是我想知道是否有更有效的方法,在我的实际用例中,这是从145,000个项目中进行子采样的.此外,兰德兰奇是否在这种规模上没有偏见?

But I was wondering if there was a more efficient way of doing this, in my actual use case this is subsampling from 145,000 items. Furthermore, is randrange sufficiently free of bias at this scale?

谢谢

推荐答案

问. I want to have the user input how many items they want to extract, as a percentage of the overall list length, and the same indices from each list to be randomly extracted.

A.最直接的方法直接符合您的规范:

A. The most straight-forward approach directly matches your specification:

 percentage = float(raw_input('What percentage? '))
 k = len(data) * percentage // 100
 indicies = random.sample(xrange(len(data)), k)
 new_list1 = [list1[i] for i in indicies]
 new_list2 = [list2[i] for i in indicies]

.in my actual use case this is subsampling from 145,000 items. Furthermore, is randrange sufficiently free of bias at this scale?

A..在Python 2和Python 3中, random.randrange()函数完全消除了偏差(它使用内部的 _randbelow()可以做出多个随机选择,直到找到无偏差结果的方法.

A. In Python 2 and Python 3, the random.randrange() function completely eliminates bias (it uses the internal _randbelow() method that makes multiple random choices until a bias-free result is found).

在Python 2中, random.sample()函数略有偏差,但仅在四舍五入的最后53位中.在Python 3中, random.sample()函数使用内部的 _randbelow()方法,并且没有偏差.

In Python 2, the random.sample() function is slightly biased but only in the round-off in the last of 53 bits. In Python 3, the random.sample() function uses the internal _randbelow() method and is bias-free.

这篇关于使用python从列表中随机提取x项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆