How do I read a random line from a file?
Question
Is there a built-in method to do this? If not, how can I do it without too much overhead?
Recommended answer
There is no built-in method, but Algorithm R (section 3.4.2, Waterman's "Reservoir Algorithm") from Knuth's "The Art of Computer Programming" works well. Here it is in a very simplified version:
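import random

def random_line(afile):
    # Start with the first line as the current choice.
    line = next(afile)
    # Number the remaining lines 2, 3, 4, ...
    for num, aline in enumerate(afile, 2):
        # randrange(num) is nonzero with probability (num - 1)/num: keep the current choice.
        if random.randrange(num):
            continue
        # Otherwise (probability 1/num) replace it with this line.
        line = aline
    return line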
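A minimal usage sketch, assuming the input is a UTF-8 text file on disk. The wrapper name random_line_from_path and its default parameter are hypothetical and not part of the original answer. It also covers one case the function leaves open: on an empty file, next(afile) raises StopIteration.

```python
import random

def random_line(afile):
    # Same single-pass selection as above, repeated so this sketch runs on its own.
    line = next(afile)
    for num, aline in enumerate(afile, 2):
        if random.randrange(num):
            continue
        line = aline
    return line

def random_line_from_path(path, default=None):
    """Hypothetical helper: return one random line of `path`, or `default` if the file is empty."""
    with open(path, encoding="utf-8") as f:
        try:
            # Strip the trailing newline that file iteration keeps on each line.
            return random_line(f).rstrip("\n")
        except StopIteration:
            # next() found no first line, so the file is empty.
            return default
```

The file is read once from start to finish and only one line is held at a time, so memory use does not grow with the number of lines.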
The num, ... in enumerate(..., 2) iterator produces the sequence 2, 3, 4, ... so randrange(num) returns 0 with probability 1.0/num. That is exactly the probability with which we must replace the currently selected line. This is the special case of sample size 1 of the referenced algorithm; see Knuth's book for the proof of correctness. With a sample of one line, the "reservoir" is of course small enough to fit in memory ;-)
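To see why a replacement probability of 1/num gives every line the same chance, here is a small check, offered as a sketch rather than a proof. Line k of n is chosen with probability 1/k and must then survive every later line j, each time with probability 1 - 1/j; the product works out to 1/n. The helper names selection_probability and empirical_counts are made up for this illustration, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter
from fractions import Fraction

def random_line(afile):
    # Same function as in the answer, repeated so this sketch runs on its own.
    line = next(afile)
    for num, aline in enumerate(afile, 2):
        if random.randrange(num):
            continue
        line = aline
    return line

def selection_probability(k, n):
    """Exact chance that line k (1-based) out of n is the one returned."""
    # Line k becomes the current choice with probability 1/k (line 1 always does).
    p = Fraction(1, k)
    # It must then survive each later line j, which replaces it with probability 1/j.
    for j in range(k + 1, n + 1):
        p *= 1 - Fraction(1, j)
    return p

def empirical_counts(lines, trials, seed=0):
    """Run random_line many times and count how often each line is returned."""
    random.seed(seed)  # fixed seed only so the experiment is repeatable
    # random_line calls next(), so it needs an iterator, not a plain list.
    return Counter(random_line(iter(lines)) for _ in range(trials))
```

For example, selection_probability(k, 7) equals Fraction(1, 7) for every k from 1 to 7, and the counts from empirical_counts(["a", "b", "c", "d"], 40000) should each land near 10000.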