How to get title from class attribute in XPath (Python/Scrapy)

Views: 67. Published: 2021/7/17 18:31:54. Tags: python, xpath, web-scraping, scrapy

Question

I'm scraping review data from TripAdvisor. Most of the first reviews show a relative date and the rest show a normal MM/DD/YYYY date, but on closer inspection I see that the relative-date elements look like this:

<span class="ratingDate relativeDate" title="20 June 2015">Reviewed 4 weeks ago
</span>

I am using this XPath to get the data:

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate" or @class="ratingDate"]/text()').extract()

My question is: how do I add @title so that I get the title attribute, which holds the date in the normal format?

I tried:

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate"/@title or @class="ratingDate"]/text()').extract()

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate" or @class="ratingDate"]/@title/text()').extract()

Recommended answer

I figured it out. In the spider you have to handle the two cases separately, checking whether each XPath actually returns values: take @title from the relative-date spans and text() from the plain ones.

Here is my version:

item['date'] = sel.xpath('//*[@class="ratingDate relativeDate"]/@title').extract()
item['date'] += sel.xpath('//div[@class="col2of2"]//span[@class="ratingDate"]/text()').extract()
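
Note that concatenating the two lists puts every @title date ahead of every text() date, so the result no longer follows the order of the reviews on the page. One way to keep the order is to visit each span once and choose between the title attribute and the text per node. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree so that it runs without Scrapy, and the sample markup (including the second span's text) and the helper name review_dates are hypothetical. In a spider the same loop could be written over the selected spans, using span.xpath('@title').get() and falling back to span.xpath('text()').get().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup modelled on the snippet in the question;
# the real page structure may differ.
SAMPLE = """
<div class="col2of2">
  <span class="ratingDate relativeDate" title="20 June 2015">Reviewed 4 weeks ago
</span>
  <span class="ratingDate">Reviewed 05/12/2015</span>
</div>
"""


def review_dates(root):
    """Return one date string per ratingDate span, in document order."""
    dates = []
    for span in root.iter("span"):
        classes = (span.get("class") or "").split()
        if "ratingDate" not in classes:
            continue
        # Prefer the absolute date stored in the title attribute;
        # fall back to the visible text when there is no title.
        title = span.get("title")
        dates.append(title if title else (span.text or "").strip())
    return dates


dates = review_dates(ET.fromstring(SAMPLE))
print(dates)
```

As for the two attempts in the question: an attribute node has no text children, so @title/text() selects nothing, and a path step such as /@title cannot follow a string comparison inside a predicate.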
