XMLReader & SimpleXML Combo, with Conditions

Views: 30 | Published: 2021/10/2 18:49:21 | Tags: php, xml, simplexml, xmlreader


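Problem Description

I'm using a combination of XMLReader and SimpleXML to parse posts from a WordPress export file. I realize this is a bit unusual, but it's more of a backup project, so that we can easily pull one of the articles if we need it in the future. The WP site they lived on needs to be shut down.

The problem I'm having is that some of the nodes in the XML file are empty or contain useless values (i.e. not a full post). I need to add some string-length conditions, but I'm not sure how to check each one.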

<?php
$path_to_xml_file = 'compress.zlib://wordpress.2011.xml.gz';
$reader = new XMLReader();
$reader->open($path_to_xml_file);
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'item') {
        $doc = new DOMDocument('1.0', 'UTF-8');
        $xml = simplexml_import_dom($doc->importNode($reader->expand(), true));
        //echo $xml->title; //or whatever
        // Take care of the articles
        $newcontent = $xml->children('http://purl.org/rss/1.0/modules/content/');
        $contentString = $newcontent->encoded;
        $titleString = $xml->title;
        echo '
            <div class="article-container" id="article-' . $xml->title . '">
                <a href="#top" class="top-link">Back to the Top</a>
                <h2>' . $xml->title . '</h2>
                <div class="articles">' . $newcontent->encoded . '</div>
            </div>';
    }
}
?>

I was able to do this check successfully using SimpleXML alone, but on its own it uses too much memory. Here is my SimpleXML code:

<?php
$url = 'wordpress.2011.xml.gz';
$xml = new SimpleXMLElement("compress.zlib://$url", NULL, TRUE);
foreach ($xml->item as $item) :
    $newcontent = $item->children('http://purl.org/rss/1.0/modules/content/');
?>
<?php
    $contentString = $newcontent->encoded;
    $titleString = $item->title;
    if ((strlen($contentString) < 13) || (strlen($titleString) < 5)) {
        echo '';
    } else {
        echo '
            <div class="article-container" id="article-' . $item->title . '">
                <a href="#top" class="top-link">Back to the Top</a>
                <h2>' . $item->title . '</h2>
                <div class="articles">' . $newcontent->encoded . '</div>
            </div>';
    }
?>
<?php endforeach; ?>

Update

With the help of Francis, it's now working. Here is the code:

<?php
$path_to_xml_file = 'compress.zlib://wordpress.2011.xml.gz';
$reader = new XMLReader();
$reader->open($path_to_xml_file);
$contentNS = 'http://purl.org/rss/1.0/modules/content/';
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT and $reader->name == 'item') {
        $doc = new DOMDocument('1.0', 'UTF-8');
        $xml = simplexml_import_dom($doc->importNode($reader->expand(), true));
        $titleString = (string) $xml->title;
        $contentString = (string) $xml->children($contentNS)->encoded;
        if (strlen($contentString) > 12 and strlen($titleString) > 4) {
            // Be careful with your output escaping!
            // This below looks like it might be wrong:
            // - $titleString for an ID (use slug)
            // - $titleString not escaped
            // - $contentString should be escaped? not sure here.
            // Have you considered using XMLWriter()?
            echo '
                <div class="article-container" id="article-' . $titleString . '">
                    <a href="#top" class="top-link">Back to the Top</a>
                    <h2>' . $titleString . '</h2>
                    <div class="articles">' . $contentString . '</div>
                </div>';
        } else {
            echo '';
        }
        $reader->next(); // skip the subtree, go to the next item sibling
        // we already expand()ed this, so we don't need to walk it.
    }
}
?>

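Solution

When you say $contentString = $newcontent->encoded, the type of $contentString is not string but SimpleXMLElement. As a result, strlen() is not measuring a plain string and can return something nonsensical.

You need to cast SimpleXMLElement objects explicitly to string to get the element's text value: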

$contentString = (string) $newcontent->encoded;
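
To make the cast concrete, here is a minimal sketch that applies it to both fields before the length check. The two-item feed and the helper name itemStrings() are assumptions made up for illustration; only the namespace URI and the length thresholds come from the question.

```php
<?php
// Hypothetical two-item feed; only its structure matters for this sketch.
$feed = <<<XML
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>A full post</title>
      <content:encoded>Some real article text.</content:encoded>
    </item>
    <item>
      <title></title>
      <content:encoded></content:encoded>
    </item>
  </channel>
</rss>
XML;

$contentNS = 'http://purl.org/rss/1.0/modules/content/';
$xml = simplexml_load_string($feed);

// itemStrings() is an illustrative helper, not part of SimpleXML.
function itemStrings(SimpleXMLElement $item, string $contentNS): array
{
    // Property access returns SimpleXMLElement objects, not strings,
    // so cast before measuring or comparing.
    return [
        'title'   => (string) $item->title,
        'content' => (string) $item->children($contentNS)->encoded,
    ];
}

foreach ($xml->channel->item as $item) {
    $s = itemStrings($item, $contentNS);
    // Same thresholds as in the question: content > 12, title > 4 characters.
    $keep = strlen($s['content']) > 12 && strlen($s['title']) > 4;
    echo ($keep ? 'keep' : 'skip'), ': "', $s['title'], "\"\n";
}
```

Casting once, right where the value is read, means every later use (strlen(), string concatenation, comparisons) works on a real string.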

By the way, you can simplify the DOM expansion and the conversion to SimpleXMLElement by using the optional argument of XMLReader::expand():

$sxe = simplexml_import_dom($reader->expand(new DOMDocument('1.0','UTF-8')));
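
As a rough sketch of how the one-step expansion fits into a streaming loop, the generator below yields each item as a SimpleXMLElement. The function name iterItems() is hypothetical, and the example reads from an in-memory string via XMLReader::XML() only so that it is self-contained; for a real export you would call $reader->open() with the file path or stream URI instead.

```php
<?php
/**
 * Yield each <item> element as a SimpleXMLElement, one at a time.
 * iterItems() is an illustrative name, not a library function.
 */
function iterItems(string $xmlString): Generator
{
    $reader = new XMLReader();
    $reader->XML($xmlString);

    // Advance to the first <item> element, if there is one.
    while ($reader->read()
        && !($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item')) {
        // keep reading
    }

    while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        // Passing a DOMDocument to expand() returns a node that already
        // belongs to that document, so importNode() is not needed.
        $node = $reader->expand(new DOMDocument('1.0', 'UTF-8'));
        yield simplexml_import_dom($node);

        // Skip the subtree we just expanded and jump to the next <item>.
        if (!$reader->next('item')) {
            break;
        }
    }

    $reader->close();
}

// Hypothetical sample input with adjacent items and no whitespace between them.
$sample = '<rss><channel>'
    . '<item><title>One</title></item>'
    . '<item><title>Two</title></item>'
    . '<item><title>Three</title></item>'
    . '</channel></rss>';

foreach (iterItems($sample) as $item) {
    echo (string) $item->title, "\n";
}
```

This sketch advances with next('item') instead of combining read() with a bare next(). That is a design choice rather than something the answer requires: when two items sit directly next to each other with no whitespace node in between, a bare next() followed by read() can step past the second item.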

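EDIT: Here is a complete example, based on your first code block, that does what you want (I think?). As you can see, all I did was take the inner loop from your second code sample and put it inside the inner loop of your first code sample.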

$reader = new XMLReader();
$reader->open($path_to_xml_file);
$contentNS = 'http://purl.org/rss/1.0/modules/content/';
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT and $reader->name == 'item') {
        $xml = simplexml_import_dom($reader->expand(new DOMDocument('1.0', 'UTF-8')));
        $titleString = (string) $xml->title;
        $contentString = (string) $xml->children($contentNS)->encoded;
        if (strlen($contentString) > 12 and strlen($titleString) > 4) {
            // Be careful with your output escaping!
            // This below looks like it might be wrong:
            // - $titleString for an ID (use slug)
            // - $titleString not escaped
            // - $contentString should be escaped? not sure here.
            // Have you considered using XMLWriter()?
            echo '
                <div class="article-container" id="article-' . $titleString . '">
                    <a href="#top" class="top-link">Back to the Top</a>
                    <h2>' . $titleString . '</h2>
                    <div class="articles">' . $contentString . '</div>
                </div>';
        }
        $reader->next(); // skip the subtree, go to the next item sibling
        // we already expand()ed this, so we don't need to walk it.
    }
}
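
The comments in that example warn about output escaping. One possible way to address them is sketched below: the title is escaped with htmlspecialchars() for the heading and reduced to a slug for the id attribute. The helper names slugify() and renderArticle() are hypothetical, the slug rule is deliberately simplistic (ASCII only), and leaving the content unescaped assumes it is trusted HTML from your own export.

```php
<?php
// slugify() and renderArticle() are hypothetical helper names.

// Reduce a title to lowercase ASCII letters, digits and hyphens.
// Titles with no ASCII letters or digits produce an empty slug.
function slugify(string $text): string
{
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($text));
    return trim($slug, '-');
}

// Return the article markup, or an empty string for stub entries.
function renderArticle(string $title, string $content): string
{
    // Same thresholds as the example above: content > 12, title > 4.
    if (strlen($content) <= 12 || strlen($title) <= 4) {
        return '';
    }

    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    // Assumption: $content is trusted HTML from your own WordPress export,
    // so it is emitted as-is. Escape or sanitize it if that does not hold.
    return '<div class="article-container" id="article-' . slugify($title) . '">'
        . '<a href="#top" class="top-link">Back to the Top</a>'
        . '<h2>' . $safeTitle . '</h2>'
        . '<div class="articles">' . $content . '</div>'
        . '</div>';
}

echo renderArticle('Hello & World', '<p>A long enough body.</p>'), "\n";
```

Inside the loop you could then replace the echo statement with echo renderArticle($titleString, $contentString);.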

