What would cause DOMNode::nodeValue to be empty?

Question

I'm currently trying to parse a document with DOMDocument, and I'm having some serious problems. I wrote a script that runs fine on PHP 5.2.9, pulling out content using DOMNode::nodeValue. The same script fails to get any content on PHP 5.3.3, even though it correctly navigates to the proper nodes to extract content from.

Basically, the code looks like this:

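$dom = new DOMDocument();
// Parser options only matter if they are set before the document is loaded.
$dom->preserveWhiteSpace = false;
$dom->loadHTML($data);                    // $data holds the raw HTML string
$xpath = new DOMXPath($dom);
$nodelist = $xpath->query($query);        // $query is the XPath expression
$value = $nodelist->item(0)->nodeValue;   // comes back empty on PHP 5.3.3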

I've checked that item(0) is in fact a node: it's there, and even of the right type, but its nodeValue is empty.

On 5.3.3 the script works on some documents but not others. On 5.2.9 it works on all documents, returning the proper nodeValue.

Recommended answer

I seem to have missed something basic and/or hit a bug (though whether the bug is in PHP or in libxml, I don't know). Basically, the issue is fixed by making sure the data loaded with loadHTML is UTF-8 encoded. Mind you, it's not the entire document that needed to be UTF-8 encoded: the problem here was that there was a character in the element which wasn't in UTF-8. That then threw off everything else in the document handling.
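As an illustration of that fix, here is a minimal sketch of normalising the input to UTF-8 before handing it to loadHTML. It is not the code from the original script. The helper name loadHtmlAsUtf8 and the ISO-8859-1 fallback encoding are assumptions made for the example, it relies on the mbstring extension being available, and it uses PHP 7+ type declarations. Prefixing the markup with an XML encoding declaration is a commonly used workaround for telling libxml which encoding to expect; whether you need it may depend on your libxml version.

```php
<?php
// Hypothetical helper: make sure the markup is UTF-8 before parsing it.
// $sourceEncoding is a guess at what the non-UTF-8 input really is.
function loadHtmlAsUtf8(string $html, string $sourceEncoding = 'ISO-8859-1'): DOMDocument
{
    // Convert only when the bytes are not already valid UTF-8.
    if (!mb_check_encoding($html, 'UTF-8')) {
        $html = mb_convert_encoding($html, 'UTF-8', $sourceEncoding);
    }

    // Collect libxml diagnostics internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The encoding declaration hints to libxml that the input is UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}
```

With this in place, the XPath part of the script stays the same: build a DOMXPath from the returned document and read nodeValue as before. If the real source encoding is known (for example from an HTTP header), passing it explicitly is safer than relying on the fallback.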

What gets me is that this basically meant all of the document's content was thrown out, while the structure was in place and working normally. There were no errors or anything else to suggest the content was seen as invalid.
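Because the failure is silent, it can help to check the input explicitly. The following is a rough diagnostic sketch, not something from the original answer: the function name diagnoseHtml and its return shape are made up for the example, and it again assumes the mbstring extension and PHP 7+. It reports whether the raw bytes are valid UTF-8 and gathers whatever messages libxml recorded while parsing; which messages appear, if any, depends on the PHP and libxml versions in use.

```php
<?php
// Hypothetical diagnostic helper; the name and return shape are illustrative.
function diagnoseHtml(string $html): array
{
    // Route libxml diagnostics into an internal buffer so they can be inspected.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Copy any recorded parser messages into plain strings.
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [
        'validUtf8'      => mb_check_encoding($html, 'UTF-8'),
        'parserMessages' => $messages,
    ];
}
```

Running this against a document that yields empty nodeValue results should at least show whether the input contains bytes that are not valid UTF-8, which is the condition the answer identifies as the cause.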
