PHP: Fastest way to handle undefined array key

Question

In a very tight loop I need to access tens of thousands of values in an array containing millions of elements. A key may be undefined; in that case it is acceptable to return NULL without any error message:

  • Array key exists: return the value of the element.
  • Array key does not exist: return null.

I know of several solutions:

    if (isset($lookup_table[$key])) {
        return $lookup_table[$key];
    } else {
        return;
    }

    return @$lookup_table[$key];

    error_reporting(0);
    $return = $lookup_table[$key];
    error_reporting(E_ALL);
    return $return;

All solutions are far from optimal:

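  • The first requires two lookups in the B-tree: one to check existence and one to retrieve the value. That effectively doubles the runtime.
  • The second uses the error suppression operator and thus creates a massive overhead on that line.
  • The third calls the error handler (which checks the error_reporting setting and then displays nothing) and thereby creates overhead.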

My question is whether I am missing a way to avoid error handling and still get by with a single B-tree lookup.

The array caches the results of a calculation that is too complex to be done in real time. Out of billions of possible values, only millions yield a valid result. The array looks like 1234567 => 23457, 1234999 => 74361, .... It is saved to a PHP file of several megabytes and include_once-d at the beginning of execution. Initial load time does not matter. If the key is not found, it simply means that this specific value will not return a valid result. The trouble is getting this done 50k+ times per second.

As no way was found to get the value with a single lookup and without error handling, I have trouble accepting a single answer. Instead I upvoted all the great contributions.

The most valuable inputs were:

  • Use array_key_exists, as it is faster than the alternatives.
  • Check out PHP's QuickHash.

There was a lot of confusion about how PHP handles arrays. If you check the source code, you will see that all arrays are balanced trees. Building your own lookup methods is common in C and C++, but is not performant in higher-level scripting languages like PHP.

Recommended answer

Update

Since PHP 7 you can accomplish this with the null coalescing operator:

    return $table[$key] ?? null;
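As a minimal sketch of the operator in use, the snippet below wraps the lookup in a helper. The function name `lookup` and the sample data (taken from the question's example values) are assumptions for illustration only.

```php
<?php
// Hypothetical helper: returns the cached value, or null when the key is absent.
function lookup(array $table, int $key): ?int
{
    // ?? checks for the key and reads the value in one expression,
    // and raises no notice or warning when the key is missing.
    return $table[$key] ?? null;
}

$table = [1234567 => 23457, 1234999 => 74361];

var_dump(lookup($table, 1234567)); // int(23457)
var_dump(lookup($table, 42));      // NULL
```

Note that `??` also yields the right-hand side when the stored value itself is null, which does not matter for a cache of integers like the one in the question.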

Old answer

First of all, arrays are not implemented as a B-tree but as a hash table: an array of buckets (indexed via a hash function), each with a linked list of actual values (in case of hash collisions). This means that lookup times depend on how well the hash function has "spread" the values across the buckets, i.e. the number of hash collisions is an important factor.
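To make the bucket-and-chain idea concrete, here is a toy chained hash table for integer keys. It is an illustration only, not the engine's actual implementation: the class `ToyHashTable`, its methods, and the modulo hash are all hypothetical, and the typed properties assume PHP 7.4 or later.

```php
<?php
// Toy chained hash table for integer keys (illustration only; all names
// are hypothetical and this is not how the PHP engine is written).
final class ToyHashTable
{
    private array $buckets;   // each bucket is a list of [key, value] pairs
    private int $bucketCount;

    public function __construct(int $bucketCount = 8)
    {
        $this->bucketCount = $bucketCount;
        $this->buckets = array_fill(0, $bucketCount, []);
    }

    // Stand-in hash function: key modulo bucket count, kept non-negative.
    private function index(int $key): int
    {
        return (($key % $this->bucketCount) + $this->bucketCount) % $this->bucketCount;
    }

    public function set(int $key, $value): void
    {
        $i = $this->index($key);
        foreach ($this->buckets[$i] as $pos => $entry) {
            if ($entry[0] === $key) {           // key already stored: overwrite
                $this->buckets[$i][$pos][1] = $value;
                return;
            }
        }
        $this->buckets[$i][] = [$key, $value];  // new key: append to the chain
    }

    public function get(int $key)
    {
        // Walk the chain of the one bucket the key hashes to.
        foreach ($this->buckets[$this->index($key)] as $entry) {
            if ($entry[0] === $key) {
                return $entry[1];
            }
        }
        return null;                            // missing key: no error
    }

    // Number of entries that must be scanned, at worst, to find this key.
    public function chainLength(int $key): int
    {
        return count($this->buckets[$this->index($key)]);
    }
}

$t = new ToyHashTable(8);
$t->set(1, 'a');
$t->set(9, 'b');   // 9 % 8 === 1, so this collides with key 1
$t->set(2, 'c');

var_dump($t->get(9));          // string(1) "b"
var_dump($t->chainLength(1));  // int(2)
var_dump($t->get(17));         // NULL
```

Keys 1 and 9 land in the same bucket, so a lookup of either has to scan a chain of two entries; that scan is the collision cost the paragraph above refers to.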

Technically, this statement is the most correct:

    return array_key_exists($key, $table) ? $table[$key] : null;

This introduces a function call and is therefore much slower than the optimized isset(). How much? ~2e3 times slower.
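The reason array_key_exists() can be called the most correct check is that isset() reports false for a key whose stored value is null. The short sketch below shows the difference; the array contents are made up for illustration.

```php
<?php
// Sample data (hypothetical): one ordinary value and one stored null.
$table = ['present' => 1, 'null_value' => null];

var_dump(isset($table['null_value']));            // bool(false)
var_dump(array_key_exists('null_value', $table)); // bool(true)

var_dump(isset($table['missing']));               // bool(false)
var_dump(array_key_exists('missing', $table));    // bool(false)
```

For a lookup table that never stores null, such as the integer cache in the question, the two checks give the same answer.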

Next up is using a reference to avoid the second lookup:

    $tmp = &$lookup_table[$key];
    return isset($tmp) ? $tmp : null;

Unfortunately, this modifies the original $lookup_table array if the item does not exist, because references are always made valid by PHP.
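The side effect is easy to reproduce. In this small sketch (the sample data is assumed), taking a reference to a missing key silently adds that key to the array with a null value:

```php
<?php
$lookup_table = [1234567 => 23457];

$tmp = &$lookup_table[42];   // key 42 does not exist yet
unset($tmp);                 // drop the reference; the new element stays

var_dump(count($lookup_table));                // int(2)
var_dump(array_key_exists(42, $lookup_table)); // bool(true)
var_dump($lookup_table[42]);                   // NULL
```

In a tight loop over many missing keys, the table would keep growing with null entries.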

That leaves the following method, which is much like your own:

    return isset($lookup_table[$key]) ? $lookup_table[$key] : null;

Besides not having the side effect of references, it's also faster in runtime, even when performing the lookup twice.
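Relative speed depends on the PHP version and on the data, so it is worth measuring on your own setup. The following is a rough micro-benchmark sketch, not a rigorous one: the function names, table size, and key range are arbitrary assumptions, hrtime() requires PHP 7.3 or later, and the per-call function overhead is included in every timing.

```php
<?php
// Hypothetical micro-benchmark; names and sizes are arbitrary.
function viaIsset(array $t, int $k)     { return isset($t[$k]) ? $t[$k] : null; }
function viaCoalesce(array $t, int $k)  { return $t[$k] ?? null; }
function viaKeyExists(array $t, int $k) { return array_key_exists($k, $t) ? $t[$k] : null; }

// Runs $f once per key and returns the elapsed wall time in milliseconds.
function timeIt(callable $f, array $table, array $keys): float
{
    $start = hrtime(true);              // nanoseconds
    foreach ($keys as $k) {
        $f($table, $k);
    }
    return (hrtime(true) - $start) / 1e6;
}

// Only even keys exist, so roughly half of the lookups below miss.
$table = [];
for ($i = 0; $i < 100000; $i += 2) {
    $table[$i] = $i * 3;
}
$keys = range(0, 49999);

foreach (['viaIsset', 'viaCoalesce', 'viaKeyExists'] as $name) {
    printf("%-13s %.2f ms\n", $name, timeIt($name, $table, $keys));
}
```

Run it several times and compare medians; a single run is easily skewed by warm-up and system noise.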

You could look into dividing your arrays into smaller pieces as one way to mitigate long lookup times.
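One possible way to split the table, sketched under assumptions: group keys into shards by integer range. The constant SHARD_SIZE and the helper names are hypothetical, and because every access now involves a shard lookup plus a key lookup, whether this helps at all has to be measured.

```php
<?php
const SHARD_SIZE = 1000000; // hypothetical shard width

// Stores a value in the shard that covers the key's range.
function shardedSet(array &$shards, int $key, int $value): void
{
    $shards[intdiv($key, SHARD_SIZE)][$key] = $value;
}

// Returns the value, or null if the shard or the key is missing.
function shardedGet(array $shards, int $key): ?int
{
    return $shards[intdiv($key, SHARD_SIZE)][$key] ?? null;
}

$shards = [];
shardedSet($shards, 1234567, 23457);
shardedSet($shards, 1234999, 74361);
shardedSet($shards, 7000001, 5);

var_dump(shardedGet($shards, 1234999)); // int(74361)
var_dump(shardedGet($shards, 9999999)); // NULL
```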
