增加CPU密集型PHP脚本演出 [英] increase performances of CPU-intensive PHP scripts

查看：196 发布时间：2016/8/23 11:40:09 php c performance

本文介绍了增加CPU密集型PHP脚本演出的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有一个PHP脚本，这需要时间（也许天）来执行。这是很简单，但非常CPU密集型，大部分的执行时间都花在为（具有成型的剧本后，我可以告诉）：

I have a PHP script which takes hours (maybe days) to execute. It is quite simple but very CPU-intensive, most of the execution time is spent into (I can tell after having profiled the script):

$阵列=爆炸（''，$ A [$ i]）;
其中， $ a [$ i] 是重新$ P $一个很长的字符串psents用逗号隔开30K元素的矢量

$array = explode(',', $a[$i]); where $a[$i] is a very long string which represents a vector of 30k elements separated by comma

的foreach（$数组$关键=＆GT; $值）循环;其中，每个回路的一些in_array（）和比较和赋值操作被执行

foreach($array as $key => $value) loops; where for each loop some in_array() and comparison and assignment operations are performed

$ A 实际上是一个非常大的稀疏矩阵（30K * 30K），但我不能保持它在内存（8GB似乎没有足够的RAM），所以我一直只是一个疏重新presentation（基本上每行是一个字符串），并使用爆炸（）任何时候，我需要在一排工作。

$a is actually a very large and sparse matrix (30k * 30k) but I can't keep it in memory (8GB seems to be not enough RAM) so I keep just a "sparse representation" (basically each row is a string) and use explode() any time I need to work on a row.

我知道，改写C（或其他语言）都将提高性能（多少钱？），但是，这样做，我想知道如果我可以做任何事情，以改善在PHP执行时间之前。

I know that rewriting everything in C (or other languages) would improve performances (how much?) but, before doing that I would like to know if I can do anything to improve the execution time in PHP.

答案后编辑。

我试了你的建议，这里是我的报告：

I tried several of your advises and here is my report:

1）str_getcsv是在最慢于爆炸的情况下

1) str_getcsv is in most of the case slower than explode

2）SPLFixedArray降低要求来存储矩阵记忆，但仍然，8GB是不够的，一个30K点¯x30K矩阵，所以我不认为它可以帮助很多;这里真正的问题是缺乏在PHP稀疏重新presentation矩阵我觉得

2) SPLFixedArray decrease the memory requested to store the matrix but still, 8GB was not enough for a 30k x 30k matrix so I don't think it can help much; the real problem here is the lack of a sparse representation for matrix in PHP I think

3）我不能存储的爆炸行动的所有结果，因为还在，这将意味着在保持内存中的整个矩阵（没有足够的RAM）

3) I can't store all the results of the explode operations because, still, that would mean keeping the whole matrix in memory (not enough RAM)

4）我试过，即使我确信这将是慢的数据库的方法：我已存储三元（I，J值）重新present每个矩阵元素;即使删除不太重要的值（我可以小于一个阈牺牲值，并获得以下precise的结果，但仍然是有用的），并存储刚刚18百万元组，与MySQL的myisam的方法比在存储器我的阵列的方法慢得多

4) I've tried the database approach even if I was sure it would have been slower: I've stored triples (i,j,value) to represent each matrix element; even deleting the less important values (I can sacrifice values less than a threshold and get a less precise result, but still useful) and storing just 18 millions tuples, the approach with mysql myisam is much slower than my array approach in memory.

5）我试着使用MEMORY引擎（RAM中的MySQL表）和存储除了具有零值的那些所有的矩阵元素数据库的方法;有42百万记录这一次...它的速度更快，不是一个数量级，但2-4倍的速度......我想我可以在5天内，而不是15-20完成任务......它仍然是太多了（我想在24小时内完成），如果您有任何其他建议，你都非常欢迎。

5) I've tried the database approach using the MEMORY engine (a mysql table in RAM) and storing all the matrix elements except the ones having value zero; having 42 millions records this time...it is faster, not an order of magnitude but 2-4 times faster...I think I can finish the job in 5 days instead of 15-20...it is still too much (I would like to finish in 24 hours), if you have any other suggestions you are very welcome

编辑2：我解释一下这个问题。

EDIT 2: I explain the problem

我给有关该问题的一些细节，我真的需要简化一切，否则这将是太多时间去解释，但我认为它是足够更好地了解有关情况。

I'll give some details about the problem, I really need to simplify everything otherwise it would be too long to explain but I think it is enough to understand better the situation.

我有节点之间的矩阵重新presenting距离;在一个整数的距离也可以是无限的。

I have a matrix representing distances among nodes; the distance in an integer and could also be infinite.

我有一个内存表重新presenting与三元每个距离：node_1，node_2，距离（只非无限距离重新presented）

I have a memory table representing each distance with triples: node_1, node_2, distance (just the non infinite distances are represented).

我有这种贪心算法的，我没有写，我应该优化一个可行的时间来执行它（比方说，不到一天的时间），在具有8GB内存的笔记本电脑。

I have this sort of greedy algorithm that I did not write and I should optimize to execute it in a feasible time (let's say less than one day) on a laptop having 8GB of RAM.

的算法中基本上得到在输入两个节点，并设计一个起始节点，并根据必须在每一步进行校验以下两个属性一个结束节点步步之间的路径：

The algo basically gets in input two nodes and designs a path between a starting node and an ending node step by step according to the following two properties that must be verified at each step:

新的中间节点，必须在组更接近结束节点相对于当前节点

这些节点之间，一个是到当前节点越近被选择

the new intermediate node must be chosen among the set of nodes that are closer to the ending node respect to the current node
among those nodes, the one which is the closer to the current node is chosen

请考虑
1）三角不等式不满意。
2）这不是一个最短路径问题

Please consider that 1) The triangle inequality is NOT satisfied. 2) It is NOT a shortest path problem

下面是一些伪code为我叫了几次的函数，直到我足够接近结束节点：

Here is some pseudo code for the function I call several times until I'm close enough to the ending node:

get_next_node($node_1, $node_2){

    $dist = select distance from distances_table where node_2 = $node_2 and node_1 = $node_1

    $candidates_ar = select node_1 from distances_table where node_2 = $node_2 and distance < $dist

    $distances_ar = select distance from distances_table where node_1 = $node_1 and node_2 in ($candidates_ar) // e.g. $distances_ar[12] contains distance between node 12 and $node_1

    $min = 1000;
    foreach ($candidates_ar as $value){
        if ($distances_ar[$value] < $min){
            $min = $distances_ar[$value]
            $next_node = $value
        }
    }

}

我省略了很多检查和额外的复杂性，但这是基本而这正是算法中最花费时间的。

I have omitted a lot of checks and additional complexity, but this is the basic and this is where the algo spends most of time.

我想它可以与A *的实现来解决，但我想避免它，如果它可以提高性能，这样我可以以小时（而不是几天）执行它。

I guess it can be solved with an implementation of A* but I would like to avoid it if it is possible to increase the performances so that I can execute it in hours (not days).

感谢。

增加CPU密集型PHP脚本演出 [英] increase performances of CPU-intensive PHP scripts

问题描述

推荐答案

相关文章

PHP最新文章

热门教程

热门工具

登录关闭

增加CPU密集型PHP脚本演出 [英] increase performances of CPU-intensive PHP scripts

问题描述

推荐答案

相关文章

PHP最新文章

热门教程

热门工具

登录 关闭

登录关闭