PHP 函数的 Big-O 列表 [英] List of Big-O for PHP functions

查看:50
本文介绍了PHP 函数的 Big-O 列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

使用 PHP 一段时间后,我注意到并非所有内置的 PHP 函数都像预期的那样快.考虑使用缓存的素数数组来确定一个数是否为素数的函数的这两种可能实现.

After using PHP for a while now, I've noticed that not all built-in PHP functions are as fast as expected. Consider these two possible implementations of a function that finds if a number is prime using a cached array of primes.

//very slow for large $prime_array
$prime_array = array( 2, 3, 5, 7, 11, 13, .... 104729, ... );
$result_array = array();
foreach( $prime_array => $number ) {
    $result_array[$number] = in_array( $number, $large_prime_array );
}

//speed is much less dependent on size of $prime_array, and runs much faster.
$prime_array => array( 2 => NULL, 3 => NULL, 5 => NULL, 7 => NULL,
                       11 => NULL, 13 => NULL, .... 104729 => NULL, ... );
foreach( $prime_array => $number ) {
    $result_array[$number] = array_key_exists( $number, $large_prime_array );
}

这是因为 in_array 是通过线性搜索 O(n) 实现的,随着 $prime_array 的增长,它会线性减慢.array_key_exists 函数是通过哈希查找 O(1) 实现的,除非哈希表非常填充(在这种情况下它只是 O(n)),否则不会减慢速度.

This is because in_array is implemented with a linear search O(n) which will linearly slow down as $prime_array grows. Where the array_key_exists function is implemented with a hash lookup O(1) which will not slow down unless the hash table gets extremely populated (in which case it's only O(n)).

到目前为止,我不得不通过反复试验发现大O,偶尔查看源代码.现在的问题...

So far I've had to discover the big-O's via trial and error, and occasionally looking at the source code. Now for the question...

是否有所有*内置 PHP 函数的理论(或实践)大 O 时间列表?

*或者至少是有趣的

例如,我发现很难预测列出的函数的大 O,因为可能的实现取决于 PHP 未知的核心数据结构:array_mergearray_merge_recursivearray_reversearray_intersectarray_combinestr_replace(带数组输入)等

For example, I find it very hard to predict the big O of functions listed because the possible implementation depends on unknown core data structures of PHP: array_merge, array_merge_recursive, array_reverse, array_intersect, array_combine, str_replace (with array inputs), etc.

推荐答案

因为在我认为最好将它放在某处以供参考之前似乎没有人这样做过.我已经通过基准测试或代码略读来描述 array_* 函数的特征.我试图将更有趣的 Big-O 放在靠近顶部的位置.此列表不完整.

Since it doesn't seem like anyone has done this before I thought it'd be good idea to have it for reference somewhere. I've gone though and either via benchmark or code-skimming to characterize the array_* functions. I've tried to put the more interesting Big-O near the top. This list is not complete.

注意:假设哈希查找计算的所有 Big-O 都是 O(1),即使它实际上是 O(n).n 的系数如此之低,在查找 Big-O 的特性开始生效之前,存储足够大数组的 ram 开销会伤害您.例如,在 N=1 和 N=1,000,000 时调用 array_key_exists 之间的差异是大约 50% 的时间增加.

Note: All the Big-O where calculated assuming a hash lookup is O(1) even though it's really O(n). The coefficient of the n is so low, the ram overhead of storing a large enough array would hurt you before the characteristics of lookup Big-O would start taking effect. For example the difference between a call to array_key_exists at N=1 and N=1,000,000 is ~50% time increase.

有趣的点:

  1. isset/array_key_existsin_arrayarray_search
  2. 快得多
  3. +(union) 比 array_merge 快一点(而且看起来更好).但它的工作方式有所不同,所以请记住这一点.
  4. shufflearray_rand
  5. 位于同一个 Big-O 层
  6. array_pop/array_pusharray_shift/array_unshift 快,因为重新索引惩罚
  1. isset/array_key_exists is much faster than in_array and array_search
  2. +(union) is a bit faster than array_merge (and looks nicer). But it does work differently so keep that in mind.
  3. shuffle is on the same Big-O tier as array_rand
  4. array_pop/array_push is faster than array_shift/array_unshift due to re-index penalty

查找:

array_key_exists O(n) 但真的很接近 O(1) - 这是因为碰撞中的线性轮询,但是因为碰撞的机会很小,系数也很小.我发现您将哈希查找视为 O(1) 以提供更现实的大 O.例如,N=1000 和 N=100000 之间的差异仅降低了 50% 左右.

array_key_exists O(n) but really close to O(1) - this is because of linear polling in collisions, but because the chance of collisions is very small, the coefficient is also very small. I find you treat hash lookups as O(1) to give a more realistic big-O. For example the different between N=1000 and N=100000 is only about 50% slow down.

isset( $array[$index] ) O(n) 但非常接近 O(1) - 它使用与 array_key_exists 相同的查找.由于它是语言结构,如果键是硬编码的,将缓存查找,从而在重复使用相同键的情况下加快速度.

isset( $array[$index] ) O(n) but really close to O(1) - it uses the same lookup as array_key_exists. Since it's language construct, will cache the lookup if the key is hardcoded, resulting in speed up in cases where the same key is used repeatedly.

in_array O(n) - 这是因为它在数组中进行线性搜索,直到找到值.

in_array O(n) - this is because it does a linear search though the array until it finds the value.

array_search O(n) - 它使用与 in_array 相同的核心函数,但返回值.

array_search O(n) - it uses the same core function as in_array but returns value.

队列函数:

array_push O(∑ var_i, for all i)

array_push O(∑ var_i, for all i)

array_pop O(1)

array_shift O(n) - 它必须重新索引所有的键

array_shift O(n) - it has to reindex all the keys

array_unshift O(n + ∑ var_i, for all i) - 它必须重新索引所有的键

array_unshift O(n + ∑ var_i, for all i) - it has to reindex all the keys

数组交、并、减:

array_intersect_key 如果相交 100% 做 O(Max(param_i_size)*∑param_i_count, for all i),如果相交 0% 相交 O(∑param_i_size, for all i)

array_intersect_key if intersection 100% do O(Max(param_i_size)*∑param_i_count, for all i), if intersection 0% intersect O(∑param_i_size, for all i)

array_intersect 如果相交 100% 做 O(n^2*∑param_i_count, for all i),如果相交 0% 相交 O(n^2)

array_intersect if intersection 100% do O(n^2*∑param_i_count, for all i), if intersection 0% intersect O(n^2)

array_intersect_assoc 如果相交 100% 做 O(Max(param_i_size)*∑param_i_count, for all i),如果相交 0% 相交 O(∑param_i_size, for all i)

array_intersect_assoc if intersection 100% do O(Max(param_i_size)*∑param_i_count, for all i), if intersection 0% intersect O(∑param_i_size, for all i)

array_diff O(π param_i_size, for all i) - 这是所有 param_sizes 的乘积

array_diff O(π param_i_size, for all i) - That's product of all the param_sizes

array_diff_key O(∑ param_i_size, for i != 1) - 这是因为我们不需要迭代第一个数组.

array_diff_key O(∑ param_i_size, for i != 1) - this is because we don't need to iterate over the first array.

array_merge O( ∑ array_i, i != 1 ) - 不需要迭代第一个数组

array_merge O( ∑ array_i, i != 1 ) - doesn't need to iterate over the first array

+ (union) O(n),其中 n 是第二个数组的大小(即 array_first + array_second)——开销比 array_merge 少,因为它不必重新编号

+ (union) O(n), where n is size of the 2nd array (ie array_first + array_second) - less overhead than array_merge since it doesn't have to renumber

array_replace O( ∑ array_i, for all i )

array_replace O( ∑ array_i, for all i )

随机:

shuffle O(n)

array_rand O(n) - 需要线性轮询.

array_rand O(n) - Requires a linear poll.

明显的大O:

array_fill O(n)

array_fill_keys O(n)

range O(n)

array_splice O(offset + length)

array_splice O(offset + length)

array_slice O(offset + length) or O(n) if length = NULL

array_slice O(offset + length) or O(n) if length = NULL

array_keys O(n)

array_values O(n)

array_reverse O(n)

array_pad O(pad_size)

array_pad O(pad_size)

array_flip O(n)

array_sum O(n)

array_product O(n)

array_reduce O(n)

array_filter O(n)

array_map O(n)

array_chunk O(n)

array_combine O(n)

我要感谢 Eureqa 让我们可以轻松找到功能.这是一个了不起的免费程序,可以为任意数据找到最佳拟合函数.

I'd like to thank Eureqa for making it easy to find the Big-O of the functions. It's an amazing free program that can find the best fitting function for arbitrary data.

对于那些怀疑 PHP 数组查找是 O(N) 的人,我编写了一个基准来测试(它们仍然有效地O(1)最现实的价值观).

For those who doubt that PHP array lookups are O(N), I've written a benchmark to test that (they are still effectively O(1) for most realistic values).

$tests = 1000000;
$max = 5000001;


for( $i = 1; $i <= $max; $i += 10000 ) {
    //create lookup array
    $array = array_fill( 0, $i, NULL );

    //build test indexes
    $test_indexes = array();
    for( $j = 0; $j < $tests; $j++ ) {
        $test_indexes[] = rand( 0, $i-1 );
    }

    //benchmark array lookups
    $start = microtime( TRUE );
    foreach( $test_indexes as $test_index ) {
        $value = $array[ $test_index ];
        unset( $value );
    }
    $stop = microtime( TRUE );
    unset( $array, $test_indexes, $test_index );

    printf( "%d,%1.15f\n", $i, $stop - $start ); //time per 1mil lookups
    unset( $stop, $start );
}

这篇关于PHP 函数的 Big-O 列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆