Scrapy - Grab all product details

Views: 37 Published: 2021/7/16 22:24:22 Tags: scrapy scrapy-spider

Question

I need to grab all the Product Details (the items with green tick marks) from this page: https://sourceforge.net/software/product/Budget-Maestro/

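    # Each child <div> of the details section holds one group: an <h3> heading plus a list.
    divs = response.xpath("//section[@class='row psp-section m-section-comm-details m-section-emphasized grey']/div[@class='list-outer column']/div")
    product_details = ""  # must exist before it is appended to below
    for div in divs:
        # Fall back to "" so a missing <h3> does not raise AttributeError on None.
        detail = (div.xpath("./h3/text()").extract_first() or "").strip() + ":"
        if detail != "Company Information:":
            divs2 = div.xpath(".//div[@class='list']/div")
            for div2 in divs2:
                # Whitespace-only text nodes are dropped, but the kept values are not
                # stripped, so their "\n" and padding survive in the result.
                dd = [val for val in div2.xpath("./text()").extract() if val.strip()]
                for d in dd:
                    detail = detail + d + ","
            detail = detail.strip(",")
            product_details = product_details + detail + "|"
    product_details = product_details.strip("|")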

But it also returns some features with \n in them. I'm sure there must be a smarter and shorter way to do this.

Recommended answer

If you only need the data from "Product Details", check this:

In [6]: response.css("section.m-section-comm-details div.list svg").xpath('.//following-sibling::text()').extract()
Out[6]: 
[u' SaaS\n                        ',
 u' Windows\n                        ',
 u' Live Online ',
 u' In Person ',
 u' Online ',
 u' Business Hours ']
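The raw strings above still carry the leading spaces and trailing "\n" padding the question complains about. A minimal sketch of one way to clean them and rebuild the "Heading:a,b|Heading:c" string from the question follows. The helper names `clean_texts` and `format_details` are hypothetical, and the heading names in `groups` are assumed for illustration only; they are not taken from the page.

```python
def clean_texts(raw_texts):
    """Strip surrounding whitespace/newlines and drop empty strings."""
    return [text.strip() for text in raw_texts if text.strip()]


def format_details(groups):
    """Join {heading: raw texts} into 'Heading:a,b|Heading:c'."""
    parts = []
    for heading, raw_texts in groups.items():
        values = clean_texts(raw_texts)
        if values:  # skip groups that contain no visible text
            parts.append(heading.strip() + ":" + ",".join(values))
    return "|".join(parts)


# Raw strings as returned by the selector in the answer above.
raw = [
    " SaaS\n                        ",
    " Windows\n                        ",
    " Live Online ",
    " In Person ",
    " Online ",
    " Business Hours ",
]
features = clean_texts(raw)
print(features)

# Hypothetical grouping: these heading names are assumed, not read from the page.
groups = {"Platforms": raw[:2], "Training": raw[2:4], "Support": raw[4:]}
product_details = format_details(groups)
print(product_details)
```

In a spider, the list passed to `clean_texts` would be the result of the `response.css(...).xpath(...).extract()` call shown in the answer, so the cleaning step stays separate from the selector logic.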
