Dom and XPath scraping - What's wrong here?
I need to scrape a piece of text from a web page. I am using DOM and XPath to find the data, but I can't seem to select the exact information I need. Here is my code so far. The problem is with the item(0)->nodeValue part: the same approach works for my other scrapes of another page, but not for this one.
$argos_html = file_get_html('http://www.argos.co.uk/static/Product/partNumber/9282197/Trail/searchtext%3EIPOD+TOUCH.htm');
$dom_argos= new DOMDocument();
$dom_argos->loadHTML($argos_html);
$xpath_argos = new DOMXpath($dom_argos);
$expr_currys = "/html/body/div[4]/div[3]/form/div[2]/div/div[5]/ul/li[3]/span";
$nodes_argos = $xpath_argos->query($expr_argos);
$argos_stock_data = $nodes_argos->item(0)->nodeValue;
Could anyone show me where I am going wrong? I always get an error that relates to the ->item(0)->nodeValue; part. If I comment that out, there is no error, but no data is collected at all.
Should it perhaps be just ->nodeValue; ?
I understand this may be down to the page structure, but I am new to all of this! Thx
Running your code, I first get:
Notice: Undefined variable: expr_argos
Warning: DOMXPath::query() [domxpath.query]: Invalid expression
So, first of all, make sure the variable you pass to query() is the one that actually holds your XPath expression. Your code defines $expr_currys but passes the undefined $expr_argos, so you should have this:
$nodes_argos = $xpath_argos->query($expr_currys);
instead of what you currently have:
$nodes_argos = $xpath_argos->query($expr_argos);
Once that is fixed, you get the following notice:
Notice: Trying to get property of non-object
on the following line:
$argos_stock_data = $nodes_argos->item(0)->nodeValue;
Basically, this means you are trying to read a property, nodeValue, on something that is not an object: $nodes_argos->item(0);
I'm guessing your XPath query does not match anything in the page, so the call to the query() method returns an empty node list, and item(0) on an empty list returns null.
You should check your XPath query (it is long enough to be hard to follow), making sure it matches something in your HTML page.
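One way to avoid the notice while you debug is to check the result of query() before touching item(0). The following is only a minimal sketch: the HTML string, the helper name firstNodeValue() and the //span[@class="stock"] expression are made up for illustration and are not taken from the Argos page.

```php
<?php
// Hypothetical stand-in markup; the real Argos page structure is not known here.
$html = '<html><body><ul><li><span class="stock">In stock</span></li></ul></body></html>';

// Return the text of the first node matched by $expr, or null if nothing matches.
// firstNodeValue() is an illustrative helper, not part of the DOM extension.
function firstNodeValue(DOMXPath $xpath, string $expr): ?string
{
    $nodes = $xpath->query($expr);
    // query() returns false for a malformed expression and an
    // empty DOMNodeList when the expression matches nothing.
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

libxml_use_internal_errors(true); // collect HTML parse warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// A short expression that matches the sample markup.
$found = firstNodeValue($xpath, '//span[@class="stock"]');
// A long positional path that does not match the sample markup.
$missing = firstNodeValue($xpath, '/html/body/div[4]/div[3]/span');

var_dump($found);   // string(8) "In stock"
var_dump($missing); // NULL
```

A shorter expression anchored on an id or class attribute, if the target page has one, is usually easier to check against the markup than a long positional path, and it is less likely to break when the page layout shifts.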