Extract class name in scrapy


Problem description

I am trying to scrape ratings from trustpilot.com.

Is it possible to extract a class name using scrapy? The rating I want is rendered as five individual images, but the images sit inside an element whose class names the rating. For example, if the rating is 2 stars:

<div class="star-rating count-2 size-medium clearfix">...

If the rating is 3 stars:

<div class="star-rating count-3 size-medium clearfix">...

So, assuming a selector like .css('.star-rating'), is there a way to extract the count-2 or count-3 class?

Recommended answer

You could combine a CSS selector with an XPath attribute lookup, then pull the count-N class out of the class attribute with a regular expression:

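import re

# Select the rating elements by CSS, then read their full class attribute via XPath.
classes = response.css('.star-rating').xpath("@class").extract()
for cls in classes:
    # Find the "count-N" token inside the space-separated class list.
    match = re.search(r'\bcount-\d+\b', cls)
    if match:
        print("Class = {}".format(match.group(0)))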
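
If the numeric rating is what you ultimately need, the same idea can be wrapped in a small helper. The following is only a sketch: the function name extract_star_count and the sample strings are made up for illustration, and it assumes the class attribute always encodes the rating as a count-N token, as in the markup shown in the question. It works on plain strings, so it can be tried without a running spider.

```python
import re

def extract_star_count(class_attr):
    """Return the N from a 'count-N' class token, or None if absent.

    Hypothetical helper: assumes the rating is encoded as 'count-N'.
    """
    # Split the attribute into individual class names and test each one
    # as a whole token, so names like 'discount-5' are not matched.
    for token in class_attr.split():
        match = re.fullmatch(r'count-(\d+)', token)
        if match:
            return int(match.group(1))
    return None

# Sample values shaped like the strings returned by
# response.css('.star-rating').xpath('@class').extract()
samples = [
    "star-rating count-2 size-medium clearfix",
    "star-rating count-3 size-medium clearfix",
    "star-rating size-medium clearfix",
]
ratings = [extract_star_count(s) for s in samples]
print(ratings)
```

Inside a spider, the samples list would be replaced by the extracted class attributes. Matching whole tokens instead of using \b avoids a false hit on a class such as discount-5, since the hyphen counts as a word boundary.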
