XML DomDocument optimization
Question
I have a 5 MB XML file.
I'm using the following code to get the nodeValue of every node:
    $dom = new DomDocument('1.0', 'UTF-8');
    if (!$dom->load($url))
        return;
    $games = $dom->getElementsByTagName("game");
    foreach ($games as $game)
    {
    }
This takes 76 seconds, and there are around 2000 game tags. Is there any optimization, or another way to get the data?
Recommended answer
You shouldn't use the Document Object Model on large XML files. It loads the whole document into memory as a tree, which suits human-readable documents, not big datasets!
If you want fast access, you should use XMLReader or SimpleXML.
XMLReader is ideal for parsing whole documents, and SimpleXML has a nice XPath function for retrieving data quickly.
For XMLReader you can use the following code:
    <?php
    // Parse a large document with XMLReader, expanding each <game> into DOM/DOMXPath
    $reader = new XMLReader();
    if (!$reader->open("tooBig.xml")) {
        exit("Could not open tooBig.xml\n");
    }
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === "game") {
            // Copy only the current <game> subtree into a small DOM document
            $dom = new DOMDocument();
            $dom->appendChild($dom->importNode($reader->expand(), true));
            $xp = new DOMXPath($dom);
            $res = $xp->query("/game/title"); // this is an example
            if ($res->length > 0) {
                echo $res->item(0)->nodeValue, "\n";
            }
        }
    }
    $reader->close();
    ?>
The above will output all game titles (assuming each game element has a title child, i.e. a /game/title structure).
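If building a DOMDocument and a DOMXPath object for every element feels heavy, one possible variation is to hand each game element to SimpleXML instead. The following is a minimal sketch, not the answerer's code: the sample XML, the games root element, and the readGameTitles helper are assumptions made for illustration, since the real file layout is not shown in the question.

```php
<?php
// Hypothetical sample data standing in for the real 5 MB file.
$sampleXml = <<<XML
<games>
  <game><title>Chess</title></game>
  <game><title>Go</title></game>
</games>
XML;

// Hypothetical helper: stream the document and collect every <game>/<title>.
function readGameTitles(string $xml): array
{
    $reader = new XMLReader();
    if (!$reader->XML($xml)) {
        return [];
    }

    $titles = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'game') {
            // Serialize only the current <game> subtree and parse it with SimpleXML.
            $game = simplexml_load_string($reader->readOuterXml());
            if ($game !== false && isset($game->title)) {
                $titles[] = (string) $game->title;
            }
        }
    }
    $reader->close();

    return $titles;
}

print_r(readGameTitles($sampleXml));
```

For a file on disk, the same loop should work with $reader->open($path) in place of $reader->XML($xml). Only one game subtree is parsed at a time, so memory use stays roughly constant as the file grows.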
For SimpleXML you can use:
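    $xml = file_get_contents($url);
    $sxml = new SimpleXMLElement($xml); // the class is SimpleXMLElement
    $games = $sxml->xpath('//game'); // returns an array of SimpleXMLElement nodes
    foreach ($games as $game)
    {
        // SimpleXML nodes have no nodeValue; cast to a string to read the
        // element's own text, or read a child such as $game->title
        print (string) $game;
    }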
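SimpleXML exposes element text through string casts and child elements through property access, so XPath is optional when the game elements sit directly under the root. The following is a rough sketch under that assumption; the sample XML, the games root element, and the listGameTitles helper are hypothetical and not taken from the question.

```php
<?php
// Hypothetical sample data; the real document structure is not shown in the question.
$sampleXml = <<<XML
<games>
  <game><title>Chess</title></game>
  <game><title>Go</title></game>
</games>
XML;

// Hypothetical helper: load the whole document with SimpleXML and list the titles.
function listGameTitles(string $xml): array
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $sxml = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($sxml === false) {
        return []; // malformed XML
    }

    $titles = [];
    // $sxml->game iterates over every <game> child of the root element.
    foreach ($sxml->game as $game) {
        $titles[] = (string) $game->title;
    }

    return $titles;
}

print_r(listGameTitles($sampleXml));
```

Unlike XMLReader, SimpleXML still loads the entire document into memory, so this approach trades memory for simpler code.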