Scrapy, Xpath, extracting h3 content?

Views: 92 Posted: 2021/7/16 22:24:16 Tags: html python-3.x xpath web-scraping scrapy

Problem description

I need to extract everything after the <h3> heading AIRFRAME but before the <h3> heading ENGINES:

What I need extracted:

"Entry Into Service: December 2010 Total Time Since New: 3,580 Hours" etc.

HTML code photo - not sure how to embed it directly instead of having a link

Below is what I've tried but it doesn't return anything. I'm new to Scrapy and programming in general so I would appreciate some help. I've tried searching through other posts and google in general without any luck.

input = response.xpath("//div[@class='large-6 cell selectorgadget_rejected']/h3/text()").extract()

output = []

Solution

The code you are using references a different class, one that does not contain the text you mentioned.

input = response.xpath("//div[@class='large-6 cell selectorgadget_rejected']/h3/text()").extract()

The class name in the picture is large-6 cell selectorgadget_selected, not large-6 cell selectorgadget_rejected.

Also, if you use .../h3/text() you will scrape the text inside the H3 tag itself. As I understand it, you want the text that comes after the H3, directly inside the <div>. So try something like this:

input = response.xpath("//div[@class='large-6 cell selectorgadget_selected']/text()").extract()
