PHP SimpleXML large file no extra memory usage

Question

Every article about SimpleXML performance and memory usage mentions that all parsed content is stored in memory, so processing large files leads to large memory usage. But recently I found that processing large files with SimpleXML does not cause large memory usage; in fact, it appears to cause almost none. Here is my test script:

<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
print "OS: " . php_uname() . "\n";
print "PHP version: " . phpversion() . "\n";

print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
$large_xml = '<?xml version="1.0" encoding="UTF-8"?><catalog><products>';
for ($i = 0; $i < 500000; $i++) {
    $large_xml .= "<product><id>{$i}</id><name>Product Name {$i}</name><description>Some Description {$i}</description><price>{$i}</price></product>\n";
}
$large_xml .= "</products></catalog>";
print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
$products_sxml = simplexml_load_string($large_xml);
print round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
?>

I was testing this script on a Linux server with PHP 5.3.8, and the output was:

OS: Linux 2.6.32-5-amd64 #1 SMP Mon Feb 25 00:26:11 UTC 2013 x86_64
PHP version: 5.3.8
0.6 Mb
65.98 Mb
65.98 Mb

So my question is: has anyone else noticed this, and what could explain it? I could not find an explanation anywhere on the web, or even a confirmation of it.

Recommended answer

PHP's memory management is quite sophisticated, and accurately measuring the impact of a particular piece of high-level code is difficult. Julien Pauli gave a good (very technical) talk on this at the PHP UK Conference, a video of which is available online.

There are a few possible reasons why memory_get_usage might be lying to you:

  • Firstly, memory_get_usage takes an optional parameter of $real_usage, which distinguishes between the amount of memory allocated and the amount in use - the memory manager allocates memory a block at a time, so it will often have claimed more from the OS than is actually in use. As more is needed, the already-claimed memory is used up, meaning no more needs to be allocated. Testing in this case suggests that this is not relevant here.
  • More generally, there are different ways of allocating memory in the underlying C code that runs PHP. Since most of the work of SimpleXML is done not in the Zend Engine, but in a third-party library called libxml2, the memory allocation will be done there, not in the PHP-specific allocation routines which would be used when, say, appending to a PHP string.
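
The first point can be illustrated with a short sketch that compares the two counters around a plain PHP string allocation. The helper name memory_report() and the 10 MB buffer size are assumptions chosen here for illustration; the exact figures will vary with PHP version and build.

```php
<?php
// Hypothetical helper (not from the original answer): returns both
// counters of PHP's own memory manager, in megabytes.
function memory_report()
{
    return array(
        // memory currently in use by PHP's allocator
        'used'      => round(memory_get_usage() / 1024 / 1024, 2),
        // memory the allocator has claimed from the OS ($real_usage = true)
        'allocated' => round(memory_get_usage(true) / 1024 / 1024, 2),
    );
}

$before = memory_report();
$buffer = str_repeat('x', 10 * 1024 * 1024); // roughly 10 MB held in a PHP string
$after  = memory_report();

printf("before: used %.2f Mb, allocated %.2f Mb\n", $before['used'], $before['allocated']);
printf("after:  used %.2f Mb, allocated %.2f Mb\n", $after['used'], $after['allocated']);
```

Both counters should grow here because a PHP string is allocated through PHP's own routines. Memory that libxml2 obtains by other means may not show up in either figure, which is the second point above.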

I took the following function from Julien Pauli's slides, which looks at the Linux kernel's view of the running PHP process and finds the line which represents the "Resident Set Size" - the amount of physical memory which has actually been allocated, rather than the amount the process has asked to be reserved:

function heap() {
    return shell_exec(sprintf('grep "VmRSS:" /proc/%s/status', getmypid()));
}

After adding a call to this (as well as to memory_get_usage(true)) to your sample code, I got the following output, showing a significant allocation of "heap" memory when you parse the XML:

OS: Linux pink-marmalade 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64
PHP version: 5.3.10-1ubuntu3.8
memory_get_usage(): 0.61 Mb
memory_get_usage(true): 0.75 Mb
Heap: VmRSS:        6956 kB

memory_get_usage(): 65.99 Mb
memory_get_usage(true): 66.25 Mb
Heap: VmRSS:       74348 kB

memory_get_usage(): 65.99 Mb
memory_get_usage(true): 66.25 Mb
Heap: VmRSS:      761836 kB
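
For reference, an instrumented version of the question's script that produces output of this shape might look like the sketch below. The report() helper and the reduced $product_count are assumptions made here so the example runs quickly; the question used 500000 products. It relies on Linux's /proc filesystem, the grep command, and the SimpleXML extension.

```php
<?php
// Resident set size as the Linux kernel sees it (function quoted in the answer).
function heap() {
    return shell_exec(sprintf('grep "VmRSS:" /proc/%s/status', getmypid()));
}

// Hypothetical helper: prints the three measurements as one group.
function report() {
    print "memory_get_usage(): " . round(memory_get_usage() / 1024 / 1024, 2) . " Mb\n";
    print "memory_get_usage(true): " . round(memory_get_usage(true) / 1024 / 1024, 2) . " Mb\n";
    print "Heap: " . heap() . "\n";
}

// Assumption: a smaller document than the question's 500000 products.
$product_count = 50000;

report(); // baseline

$large_xml = '<?xml version="1.0" encoding="UTF-8"?><catalog><products>';
for ($i = 0; $i < $product_count; $i++) {
    $large_xml .= "<product><id>{$i}</id><name>Product Name {$i}</name><description>Some Description {$i}</description><price>{$i}</price></product>\n";
}
$large_xml .= "</products></catalog>";

report(); // after building the XML string in PHP memory

$products_sxml = simplexml_load_string($large_xml);

report(); // after libxml2 has built the document tree
```

If the explanation above holds, the last group should show the two memory_get_usage() figures almost unchanged while VmRSS rises sharply.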
