Scrapy, XPath, extracting h3 content?
I need to extract everything after the AIRFRAME h3 heading but before the ENGINES h3 heading:
What I need extracted:
"Entry Into Service: December 2010 Total Time Since New: 3,580 Hours" etc.
HTML code photo - not sure how to embed it directly instead of having a link
Below is what I've tried, but it doesn't return anything. I'm new to Scrapy and to programming in general, so I would appreciate some help. I've searched other posts and Google without any luck.
input = response.xpath("//div[@class='large-6 cell selectorgadget_rejected']/h3/text()").extract()
Output: []
The code you are using references a different class, one that does not contain the text you mentioned.
input = response.xpath("//div[@class='large-6 cell selectorgadget_rejected']/h3/text()").extract()
The name of the class in the picture is large-6 cell selectorgadget_selected, not large-6 cell selectorgadget_rejected.
Also, if you use .../h3/text(), you are going to scrape the text inside the h3 tag itself. As I understand it, you want the text that comes after the h3, directly inside the <div>. So try something like this:
input = response.xpath("//div[@class='large-6 cell selectorgadget_selected']/text()").extract()