PHP simplexml_load_file fails

Problem description

I was able to fetch a PubMed results page in XML format and write its contents to a local file, "Publications.xml". The problem is that when I call simplexml_load_file("Publications.xml"), it fails. I can't figure out why.

<?php
$feed = 'http://www.ncbi.nlm.nih.gov/pubmed?term=carl&sort=pubdate&report=xml';
$local = 'Publications.xml';
$curtime = time();

// Refresh the local copy if it is missing or more than a day (86400 s) old.
if (!file_exists($local) || ($curtime - filemtime($local)) > 86400)
{
    $contents = file_get_contents($feed);
    file_put_contents($local, $contents);
}

// Parse the cached file; stop with a message if the result is falsy.
$xml = simplexml_load_file($local) or die("Can't");
?>

On the second-to-last line the parser fails and I get the message "Can't". I have double-checked the XML file and it appears to be in good shape.

If anyone can let me know about any workarounds for this one, I will be very grateful. Here's a copy of the xml file the PHP script above tries to read (http://pastebin.com/U0fEKmZL):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<pre>
&lt;PubmedArticle&gt;
    &lt;MedlineCitation Status="Publisher" Owner="NLM"&gt;
        &lt;PMID Version="1"&gt;23314841&lt;/PMID&gt;
        &lt;DateCreated&gt;
            &lt;Year&gt;2013&lt;/Year&gt;
            &lt;Month&gt;1&lt;/Month&gt;
            &lt;Day&gt;14&lt;/Day&gt;
        &lt;/DateCreated&gt;
        &lt;Article PubModel="Print-Electronic"&gt;
            &lt;Journal&gt;
                &lt;ISSN IssnType="Electronic"&gt;1432-0932&lt;/ISSN&gt;
                &lt;JournalIssue CitedMedium="Internet"&gt;
                    &lt;PubDate&gt;
                        &lt;Year&gt;2013&lt;/Year&gt;
                        &lt;Month&gt;Jan&lt;/Month&gt;
                        &lt;Day&gt;12&lt;/Day&gt;
                    &lt;/PubDate&gt;

 ... (too long, see link)

Solution

For some reason, the PubMed server is returning that entire XML file as an HTML file with a single <pre> tag containing the XML as escaped text. The content also consists of multiple XML fragments (there are several <PubmedArticle> elements and no container around them), so it is not a well-formed XML document on its own. Clearly this is intended to be processed by some wacky custom code.
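To see why the fragments cannot be parsed on their own, here is a small sketch that feeds two sibling elements with no container to SimpleXML and collects the parser's complaints. The sample string is made up for illustration; only libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors() from PHP's libxml extension are used.

```php
<?php
// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical sample: two top-level elements and no single root.
$fragments = '<PubmedArticle><PMID>1</PMID></PubmedArticle>'
           . '<PubmedArticle><PMID>2</PMID></PubmedArticle>';

$result = simplexml_load_string($fragments);

// Gather the parser's messages so they can be inspected or logged.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}
libxml_clear_errors();

var_dump($result === false); // the parse is expected to fail here
print_r($messages);
```

An XML document must have exactly one root element, which is why the fragments need a wrapper before they can be loaded.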

You could "unwrap" the XML by calling SimpleXML twice, like so:

$outer_xml = simplexml_load_file($local);
$inner_xml = simplexml_load_string('<dummyContainer>' . (string)$outer_xml . '</dummyContainer>');
foreach ( $inner_xml->PubmedArticle as $article )
{
    // etc
}

To explain:

  • the outer "XML document" is the HTML, which has a single outer element, <pre>
  • casting that to string (which I've done explicitly with (string) for clarity and good habit) will give you the text content of that <pre> tag with its entities decoded, i.e. all the <PubmedArticle> elements
  • wrapping that content in a <dummyContainer> tag will give you a valid XML document, with each of the <PubmedArticle> elements as a top-level child in the document
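
The steps above can be sketched as one self-contained script. This is a minimal illustration, not a definitive implementation: the function name unwrapFragments() and the sample string are hypothetical, and the sample only mimics the shape of the response described here. Both parses are compared against false with ===, because the PHP manual warns that simplexml_load_file() and simplexml_load_string() can return a non-Boolean value that still evaluates to false.

```php
<?php
// Hypothetical helper: parse the outer <pre> document, then re-parse its
// text content wrapped in a single container element.
function unwrapFragments(string $wrapped): SimpleXMLElement
{
    libxml_use_internal_errors(true);

    $outer = simplexml_load_string($wrapped);
    if ($outer === false) {
        libxml_clear_errors();
        throw new RuntimeException('Could not parse the outer document');
    }

    // Casting the <pre> element to string yields its decoded text content.
    $inner = simplexml_load_string(
        '<dummyContainer>' . (string)$outer . '</dummyContainer>'
    );
    if ($inner === false) {
        libxml_clear_errors();
        throw new RuntimeException('Could not parse the unwrapped fragments');
    }

    return $inner;
}

// Made-up sample: escaped XML fragments inside a single <pre> element.
$sample = '<?xml version="1.0" encoding="utf-8"?>'
    . '<pre>'
    . '&lt;PubmedArticle&gt;&lt;PMID&gt;1&lt;/PMID&gt;&lt;/PubmedArticle&gt;'
    . '&lt;PubmedArticle&gt;&lt;PMID&gt;2&lt;/PMID&gt;&lt;/PubmedArticle&gt;'
    . '</pre>';

$inner = unwrapFragments($sample);
foreach ($inner->PubmedArticle as $article) {
    echo (string)$article->PMID, "\n";
}
```

With a real file, the same helper could be given the result of file_get_contents($local) instead of the sample string.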
