Accessing Big Arrays in PHP

I've been doing some profiling on different methods of accessing large(ish) arrays of data in PHP. The use case is pretty simple: some of our tools output data into PHP files as associative arrays and these files are considered static data by the application. We make games, so some examples of data files would include items in a catalog, tasks that a user must complete, or definitions for maps:

<?php
$some_data = array(
    ...lots and lots of stuff in here...
);
?>

Since these arrays are large-ish (400K), and much of our code is interested in this data, it becomes necessary to access this data as efficiently as possible. I settled on timing 3 different patterns for doing this. After presenting the methods I will share my results below.

What I'm looking for is some experience based validation on these methods and their timing as well as any other methods to try out.

Method #1: getter function

In this method, the exporter actually creates a file that looks like:

<?php
function getSomeData()
{
    $some_data = array(
        ...lots and lots of stuff here...
    );
    return $some_data;
}
?>

Client code can then get the data by simply calling getSomeData() when they want it.
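
As a concrete illustration (the item names here are invented placeholders, and the `static` cache is an optional tweak, not part of the original method as described), a Method #1 data file and its call site might look like:

```php
<?php
// Sketch of an exported data file (placeholder contents).
function getSomeData()
{
    // Optional: cache in a static so repeated calls don't rebuild the array.
    static $some_data = null;
    if ($some_data === null) {
        $some_data = array(
            'sword_01'  => array('cost' => 100, 'rarity' => 'common'),
            'shield_02' => array('cost' => 250, 'rarity' => 'rare'),
        );
    }
    return $some_data;
}

// Client code:
$items = getSomeData();
echo $items['sword_01']['cost'], "\n"; // prints 100
```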

Method #2: global + include

In this method the data file looks identical to the original code block above, however the client code must jump through a few hoops to get the data into a local scope. This assumes the array is in a file called 'some_data.php':

global $some_data; //must be the same name as the variable in the data file...
include 'some_data.php';

This will bring the $some_data array into scope, though it is a bit cumbersome for client code (my opinion).
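
A related variant that sidesteps the global entirely (not one of the three methods timed below, just a sketch): `include` returns whatever the included file `return`s, so the data file can end in a `return` statement and the client captures the value directly. Here a temp file stands in for `some_data.php`:

```php
<?php
// Sketch: write a stand-in data file that simply returns the array.
$data_file = tempnam(sys_get_temp_dir(), 'some_data');
file_put_contents($data_file, "<?php\nreturn array('gold' => 500, 'gems' => 20);\n");

// The client captures include's return value -- no global needed.
$some_data = include $data_file;
echo $some_data['gold'], "\n"; // prints 500

unlink($data_file);
```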

Method #3: getter by reference

This method is nearly identical to Method #1, however the getter function does not return a value; instead it fills an argument that the caller passes in by reference:

<?php
function getSomeDataByRef(&$some_data)
{
    $some_data = array(
        ...lots and lots of stuff here...
    );
}
?>

Client code then retrieves the data by declaring a local variable (called anything) and passing it to the getter; the reference is declared in the function signature, since call-time pass-by-reference is deprecated:

$some_data_anyname = array();
getSomeDataByRef($some_data_anyname);

Results

So I ran a little script that runs each of these methods of retrieving data 1000 times and averages the run time (computed via microtime(true) at the beginning and end). The following are my results (in ms, running on a 2 GHz MacBook Pro, 8GB RAM, PHP version 5.3.4):
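
The harness was along these lines (a reconstruction, not the original script; `getSomeData()` here is a placeholder for whichever access method is being timed):

```php
<?php
// Placeholder for the accessor under test.
function getSomeData()
{
    return array_fill(0, 1000, 'stuff');
}

// Time 1000 retrievals and report average, max, and min.
$times = array();
for ($run = 0; $run < 1000; $run++) {
    $start = microtime(true);
    $data = getSomeData();
    $times[] = microtime(true) - $start;
}

printf("AVG: %.13f\n", array_sum($times) / count($times));
printf("MAX: %.13f\n", max($times));
printf("MIN: %.13f\n", min($times));
```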

METHOD #1:

AVG: 0.0031637034416199
MAX: 0.0043289661407471
MIN: 0.0025908946990967

METHOD #2:

AVG: 0.01434082698822
MAX: 0.018275022506714
MIN: 0.012722969055176

METHOD #3:

AVG: 0.00335768699646
MAX: 0.0043489933013916
MIN: 0.0029017925262451

It seems pretty clear, from this data anyway, that the global+include method is inferior to the other two, whose difference is negligible.

Thoughts? Am I completely missing anything? (probably...)

Thanks in advance!

Solution

Not sure if this is exactly what you're looking for, but it should help with speed and memory issues. You can use the SPL fixed array:

$startMemory = memory_get_usage();
$array = new SplFixedArray(100000);
for ($i = 0; $i < 100000; ++$i) {
    $array[$i] = $i;
}
echo memory_get_usage() - $startMemory, ' bytes';

Read more on big PHP arrays here: http://nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html

Also, have you thought about storing the data in a cache/in memory? For example, you could use SQLite with an in-memory database on the first execution, then access the data from there:

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// .. Use PDO as normal
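
A sketch of that idea (table and column names are invented for illustration; requires the pdo_sqlite extension; note that a `:memory:` database lives only as long as the PDO connection, so in a typical web setup it would be rebuilt per request unless something keeps the connection alive):

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Load the static array into a table once...
$pdo->exec('CREATE TABLE items (name TEXT PRIMARY KEY, cost INTEGER)');
$insert = $pdo->prepare('INSERT INTO items (name, cost) VALUES (?, ?)');
foreach (array('sword_01' => 100, 'shield_02' => 250) as $name => $cost) {
    $insert->execute(array($name, $cost));
}

// ...then look up individual items instead of holding the whole PHP array.
$lookup = $pdo->prepare('SELECT cost FROM items WHERE name = ?');
$lookup->execute(array('shield_02'));
echo $lookup->fetchColumn(), "\n"; // prints 250
```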
