PHP - SimpleXML parse error

Problem description

SEE EDITS AT BOTTOM TO SHOW MORE ACCURATE ERROR OUTPUT

I'm parsing fairly large (~15 MB) XML files with PHP for the first time, using SimpleXML. The files are flight search results, so they have long attributes (links back to Kayak), for example:
"/book/flightcode=1238917408.NxJI6G.0.F.ORBITZAIR,ORBITZAIR.0.f36f1ea92513977249aa695112410052&sid=26-Vu01v7ilzhSAjPVLZ3Ul"

SimpleXML throws this error when parsing:

"Entity: line 10: parser error : EntityRef: expecting ';' in" and then;

"38917408.NxJI6G.0.F.ORBITZAIR,ORBITZAIR.0.f36f1ea92513977249aa695112410052&sid in" and then;

"simplexml_load_string() [function.simplexml-load-string]: ^ in,"

and so forth for each line where there are these URLs.

I found a mention of SimpleXML not liking long attributes on php.net with no solution. I would rather just use and learn SimpleXML for now and work past this error if there is a non-janky, somewhat easy workaround.

Does anyone have a solution? Thanks in advance!

I tried entering the first 13 lines of the XML, but it only outputs the info without the XML. I can do that if it will help. I'm not sure whether using another parser/extension would reduce the functionality or ease of use, but please feel free to suggest another if there's no workaround (I'm thinking of DOM or XMLReader, perhaps).

EDITS BELOW TO INCLUDE LESS ADULTERATED ERROR OUTPUT:

http://dl.dropbox.com/u/10206237/stack_overflow_xml.xml

ERROR 1:

simplexml_load_string() [<a href='function.simplexml-load-string'>function.simplexml-load-string</a>]: Entity: line 10: parser error : EntityRef: expecting ';' in 

ERROR 2: (I think the XML is fine because it works with a Python script using DOM; I'm translating that script to PHP because I don't know Python. I didn't know that the output in the browser would be different. Thanks for being patient.)

<a href='function.simplexml-load-string'>function.simplexml-load-string</a>]: 38917408.Pt8rW8.0.F.ORBITZAIR,ORBITZAIR.0.f36f1ea92513977249aa695112410052&amp;_sid_ in 

ERROR 3:

function.simplexml-load-string</a>]:                                                                                ^ in     

(all of those spaces are in there)

Recommended answer

As mentioned in other answers and comments, your source XML is broken, and XML parsers are supposed to reject invalid input. libxml has a "recover" mode which would let you load this broken XML, but you would lose the "&sid" part, so it wouldn't help.
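To see for yourself why the parser rejects the file, here is a minimal sketch. It assumes a made-up one-element document (the `trip` element, `url` attribute and shortened URL are hypothetical, not taken from the real feed) with an unescaped ampersand in an attribute, and it collects libxml's errors in memory rather than letting them print as warnings:

```php
<?php
// Hypothetical sample: a raw '&' inside an attribute value is not well-formed XML.
$xml = '<trip url="/book/flightcode=123&sid=26-Vu01"/>';

// Collect libxml errors in memory instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$sxe = simplexml_load_string($xml);

$messages = [];
if ($sxe === false) {
    foreach (libxml_get_errors() as $error) {
        // Each LibXMLError carries the line number and the parser message.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}
print_r($messages);
```

Run against the real file, this should report the same "EntityRef" complaint for every line that contains such a URL.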

If you're lucky and you like taking chances, you can try to somehow make it work by kind-of-fixing the input. You can use some string replacement to escape the ampersands that look like they're in the query part of a URL.

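$xml = file_get_contents('broken.xml');
// Escape each '&' that is followed by a run of lowercase letters,
// digits or underscores and then '=' (it looks like a query parameter).
// Entities such as '&amp;' end in ';', so they are left untouched.
$xml = preg_replace('#&(?=[a-z_0-9]+=)#', '&amp;', $xml);
// simplexml_load_string() returns false if the result is still not well-formed.
$sxe = simplexml_load_string($xml);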

This is, of course, nothing but a hack and the only good way to fix your situation is to ask your XML provider to fix their generator. Because if it generates broken XML, who knows what other errors slip by unnoticed?
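To check what the replacement does before pointing it at a 15 MB file, the sketch below runs the same pattern on a small inline sample. The `trip` element, `url` attribute and shortened URL are hypothetical. The sample contains one raw `&` and one ampersand that is already escaped, to show that the second is not escaped twice:

```php
<?php
// Hypothetical sample with one raw '&' and one already-escaped '&amp;'.
$broken = '<trip url="/book/flightcode=123&sid=26-Vu01&amp;x=1"/>';

// Same pattern as the replacement above: only '&' followed by "name=" is escaped.
$fixed = preg_replace('#&(?=[a-z_0-9]+=)#', '&amp;', $broken);

libxml_use_internal_errors(true);
$sxe = simplexml_load_string($fixed);

if ($sxe === false) {
    echo "still not well-formed\n";
} else {
    // SimpleXML decodes the entity, so the attribute reads back with a plain '&'.
    echo (string) $sxe['url'], "\n";
}
```

Note that the pattern only matches lowercase parameter names, and it cannot tell a query parameter from any other text that happens to look like `&name=`, which is part of why this remains a hack.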
