Alternatives to PHP in_array for large arrays, to avoid duplicate entries
Question
I need to generate a large list of random numbers, from 600k to 2000k entries, but the list cannot contain duplicates.
My current 'implementation' looks like this:
<?php
header('Content-type: text/plain');
$startTime = microtime(true);
$used = array();
for ($i = 0; $i < 600000; ) {
    $random = mt_rand();
    //if (!in_array($random, $used)) {
        $used[] = $random;
        $i++;
    //}
}
$endTime = microtime(true);
$runningTime = $endTime - $startTime;
echo 'Running Time: ' . $runningTime;
//print_r($used);
?>
If I keep the in_array test commented out, the processing time is around 1 second, so the mt_rand calls and filling the $used array are relatively 'cheap'. But when I uncomment the in_array test, bad things happen! (I'm still waiting for the script to terminate; it has been more than 10 minutes...)
So I'm looking for alternatives, either on the duplicate-detection side or on the generation side (how could I generate random numbers without the risk of getting duplicates?).
I'm open to any suggestions.
Recommended answer
For a quick and dirty solution, does using and checking array keys improve your speed at all? An isset check on a key is a hash lookup, whereas in_array scans the whole array on every call.
$used = array();
for ($i = 0; $i < 600000; ) {
    $random = mt_rand();
    // Key lookup instead of a linear in_array scan
    if (!isset($used[$random])) {
        $used[$random] = $random;
        $i++;
    }
}
// Re-index the result as a plain 0-based list
$used = array_values($used);
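The same idea can be wrapped in a reusable function. What follows is only a sketch, assuming PHP 7 or later for the type declarations; the name uniqueRandomInts and its parameters are made up for illustration and are not part of the original answer. It relies on the fact that assigning to an existing array key overwrites it, so the array behaves like a set and count() tells how many distinct values have been collected so far.

```php
<?php
// Hypothetical helper (not from the original answer): returns $count distinct
// integers drawn from mt_rand($min, $max), using array keys as a hash set.
function uniqueRandomInts(int $count, int $min, int $max): array
{
    // Guard against requests that could never be satisfied.
    if ($count < 0 || $min > $max || $count > $max - $min + 1) {
        throw new InvalidArgumentException('Range too small for requested count');
    }

    $seen = [];
    while (count($seen) < $count) {
        // A duplicate value hits the same key, so the array does not grow.
        $seen[mt_rand($min, $max)] = true;
    }

    // The keys are the unique random numbers, returned as a 0-based list.
    return array_keys($seen);
}

$numbers = uniqueRandomInts(600000, 0, mt_getrandmax());
echo count($numbers), "\n"; // prints 600000
```

This works well when the requested count is small compared to the range, as it is here. If the count gets close to the size of the range, most draws will be duplicates and the loop slows down; in that case building the range with range() and calling shuffle() on it may be a better fit.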