XPath could not recognize a tag
I am trying to use XPath to scrape reddit posts from a forum. One of the things I want the spider to do is go to the next page automatically as soon as it finishes scraping the current page. The page HTML looks like this:
<span class="next-button"><a href="https://www.reddit.com/r/InteriorDesign/?count=975&after=t3_8ol7yp" rel="nofollow next" >next ›</a></span>
I used the following XPath selector, but it returned nothing:
response.xpath("//a[@class = 'next-button']")
Can someone help me figure out why?
Thanks! Hao
The @class attribute is on the span element, not on the a link element. So change your XPath to
response.xpath("//span[@class = 'next-button']/a")
to select the a element, or to
response.xpath("//span[@class = 'next-button']/a/@href")
to get the link address.
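To see the difference between the two expressions outside of a spider, here is a minimal sketch that runs them against the snippet from the question. It is only an illustration under a few assumptions: it uses Python's standard-library xml.etree.ElementTree (which supports a subset of XPath) as a stand-in for Scrapy's response.xpath, the wrapping div is added just to give the snippet a root element, and the "&" in the URL is written as "&amp;" because the XML parser is stricter than the HTML parser Scrapy uses.

```python
import xml.etree.ElementTree as ET

# The markup from the question. Assumptions for this sketch: a wrapping
# <div> is added as a root element, and "&" is escaped as "&amp;" so the
# strict XML parser accepts it.
SNIPPET = (
    '<div>'
    '<span class="next-button">'
    '<a href="https://www.reddit.com/r/InteriorDesign/'
    '?count=975&amp;after=t3_8ol7yp" rel="nofollow next">next ›</a>'
    '</span>'
    '</div>'
)

root = ET.fromstring(SNIPPET)

# Original attempt: looks for an <a> that itself has class="next-button".
# The <a> has no class attribute, so nothing matches.
wrong = root.findall(".//a[@class='next-button']")

# Corrected: match the <span> by its class, then step down to its <a> child.
right = root.findall(".//span[@class='next-button']/a")

# ElementTree cannot end a path with /@href, so read the attribute instead.
next_url = right[0].get("href") if right else None

print(len(wrong))  # 0
print(next_url)    # https://www.reddit.com/r/InteriorDesign/?count=975&after=t3_8ol7yp
```

In a Scrapy callback, the equivalent would presumably be to call .get() on the /@href selector from the answer and, if it is not None, pass the result to response.follow to request the next page.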