How can I get exactly n random lines from a file with Perl?
Problem description
Following up on this question, I need to get exactly n lines at random out of a file (or stdin). This would be similar to head or tail, except I want some from the middle.
Now, other than looping over the file with the solutions to the linked question, what's the best way to get exactly n lines in one run?
For reference, I tried:
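#!/usr/bin/perl -w
use strict;

# First argument: keep roughly 1 line in every $ratio
my $ratio = shift;
print $ratio, "\n";

# Read the remaining files (or stdin) line by line
while (<>) {
    # Each line is printed independently with probability 1/$ratio
    print if ((int rand $ratio) == 1);
}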
where $ratio sets the rough fraction of lines I want: each line is printed with probability 1/$ratio. For instance, if I want 1 in 10 lines:
random_select 10 a.list
However, this doesn't give me an exact amount:
aaa> foreach i ( 0 1 2 3 4 5 6 7 8 9 )
foreach? random_select 10 a.list | wc -l
foreach? end
4739
4865
4739
4889
4934
4809
4712
4842
4814
4817
The other thought I had was slurping the input file and then choosing n at random from the array, but that's a problem if I have a really big file.
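For illustration, the slurp-and-choose idea could look roughly like the sketch below. The helper name pick_random_lines is hypothetical, and the demo uses generated in-memory lines rather than a real file; it relies on shuffle from the core List::Util module. It does return exactly n lines, but it holds the whole input in memory, which is the drawback mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: return exactly $n randomly chosen lines from @lines,
# kept in their original order (all of them if there are fewer than $n).
sub pick_random_lines {
    my ($n, @lines) = @_;
    $n = @lines if $n > @lines;
    return () if $n <= 0;

    # Shuffle the indices, keep the first $n, then restore file order
    my @idx = (shuffle 0 .. $#lines)[0 .. $n - 1];
    @idx = sort { $a <=> $b } @idx;
    return @lines[@idx];
}

# Demo on generated data; for real input one would slurp with: my @lines = <>;
my @lines  = map { "line $_\n" } 1 .. 100;
my @picked = pick_random_lines(10, @lines);
print @picked;
```

Sorting the chosen indices is a design choice so the output reads like an excerpt of the file; drop that step if the order does not matter.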
Any ideas?
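One technique often suggested for this kind of problem is reservoir sampling, which keeps only n lines in memory and reads the input once. The sketch below is a minimal version of that idea, not necessarily what the linked question's answers do; the helper name reservoir_sample is hypothetical, and the demo reads from an in-memory filehandle instead of a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read lines from filehandle $fh in a single pass and
# return exactly $n of them chosen uniformly at random (fewer if the input
# has fewer than $n lines). Only $n lines are held in memory at a time.
sub reservoir_sample {
    my ($fh, $n) = @_;
    my @sample;
    return @sample if $n <= 0;

    my $seen = 0;
    while (my $line = <$fh>) {
        $seen++;
        if (@sample < $n) {
            # Fill the reservoir with the first $n lines
            push @sample, $line;
        } else {
            # Replace a random slot with probability $n / $seen
            my $j = int rand $seen;    # uniform integer in 0 .. $seen - 1
            $sample[$j] = $line if $j < $n;
        }
    }
    return @sample;
}

# Demo on an in-memory filehandle; for real use, pass \*STDIN or a file handle
my $data = join '', map { "line $_\n" } 1 .. 1000;
open my $in, '<', \$data or die "cannot open in-memory file: $!";
my @chosen = reservoir_sample($in, 10);
close $in;
print @chosen;
```

Note that the lines come back in reservoir order rather than file order; tracking line numbers alongside each line and sorting at the end would restore the original order if that matters.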