XML XPath search and array looping with PHP, memory issue
Question
I'm dealing with large XML files (several megabytes) on which I have to run various kinds of checks. However, memory and time usage grow very quickly. I tested it like this:
$xml = new SimpleXMLElement($string);
$sum_of_elements = (double) 0.0;
foreach ($xml->xpath('//Amt') as $amt) {
    $sum_of_elements += (double) $amt;
}
Measuring with the microtime() and memory_get_usage() functions, I get the following results from running this code:
- 5 MB file (7480 Amt elements):
- execution time 0.69 s
- memory usage grows from 10.25 MB to 29.75 MB
That's still quite OK. But with a slightly bigger file, memory and time usage grow dramatically:
- 6 MB file (8976 Amt elements):
- execution time 8.53 s
- memory usage grows from 10.25 MB to 99.25 MB
The problem seems to be in looping over the result set. I've also tried a for loop instead of foreach, but it made no difference. Without the loop, memory usage does not grow nearly as much.
Any idea where the problem could be?
Recommended answer
SimpleXML is tree-based and will load the entire document into memory. Using unset inside the loop to mark resources that are no longer needed, so that PHP's GC can clean them up, might yield less memory usage (see http://stackoverflow.com/questions/2617672/how-important-is-it-to-unset-variables-in-php/2617786#2617786). If that doesn't solve the issue, consider using XMLReader for a pull-based approach. Though you won't be able to use XPath, memory consumption should be significantly lower.
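
As a rough illustration of the pull-based approach, the summation from the question could be sketched with XMLReader as follows. This is only a minimal sketch under some assumptions: the helper name sumAmtElements() is hypothetical, the Amt elements are assumed to contain plain numeric text, and they are assumed not to be nested inside one another.

```php
<?php
// Sketch: sum all <Amt> elements with XMLReader instead of SimpleXML + XPath.
// sumAmtElements() is a hypothetical helper name, not part of the original post.
function sumAmtElements(string $xmlString): float
{
    $reader = new XMLReader();

    // XML() parses from a string; open($path) would read from a file instead.
    if (!$reader->XML($xmlString)) {
        throw new RuntimeException('Could not initialise XMLReader');
    }

    $sum = 0.0;

    // read() advances one node at a time, so no full tree is built in memory.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'Amt') {
            // readString() returns the text content of the current element.
            $sum += (float) $reader->readString();
        }
    }

    $reader->close();

    return $sum;
}
```

For large inputs it is probably better to call $reader->open() with the file path than to pass a string, since the document then never has to be held in a PHP variable at all.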