Extract class name in Scrapy

Problem description

I am trying to scrape ratings from trustpilot.com.

Is it possible to extract a class name using Scrapy? The rating I want to scrape is rendered as five individual images, but the images sit inside an element whose class name encodes the rating. For example, if the rating is 2 stars:

<div class="star-rating count-2 size-medium clearfix">...

If it is 3 stars:

<div class="star-rating count-3 size-medium clearfix">...

So is there a way I can scrape the class count-2 or count-3, assuming a selector like .css('.star-rating')?

Solution

You could combine a CSS selector with an XPath expression to read the class attribute, then use a regular expression to pick out the count-N part:

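import re

# Read the full class attribute of every element matching .star-rating
classes = response.css('.star-rating').xpath("@class").extract()
for cls in classes:
    # Pick out the count-N token, e.g. "count-2"
    match = re.search(r'count-\d+', cls)
    if match:
        print("Class = {}".format(match.group(0)))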
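
If the number itself is what you need rather than the class name, the regex step can be wrapped in a small helper. The following is only a sketch: rating_from_class is a hypothetical name, not part of Scrapy, and it assumes the class attribute looks like the ones shown in the question. It uses only the standard library, so it can be tried without a running spider.

```python
import re
from typing import Optional

# Match "count-N" only as a whole class token, so that an unrelated class
# such as "discount-5" is not mistaken for a rating.
_COUNT_RE = re.compile(r'(?:^|\s)count-(\d+)(?:\s|$)')


def rating_from_class(class_attr: str) -> Optional[int]:
    """Hypothetical helper: return N from a "count-N" class token, or None.

    class_attr is the raw value of an element's class attribute, for example
    "star-rating count-2 size-medium clearfix".
    """
    match = _COUNT_RE.search(class_attr)
    return int(match.group(1)) if match else None
```

Inside a spider callback, the class strings could be obtained with response.css('.star-rating::attr(class)').getall(), which should return the same values as the .xpath("@class").extract() call above, and each string can then be passed to rating_from_class.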
