Scrapy grab div with multiple classes?
Question
I am trying to grab divs with the class 'product'. The problem is that some of the divs with class 'product' also have the class 'product-small'. So when I use xpath("//div[@class='product']"), it only captures the divs whose class attribute is exactly 'product', not the ones with multiple classes. How can I do this with Scrapy?
Example:
- Catches:
<div class='product'>
- Doesn't catch:
<div class='product product-small'>
Recommended answer
You should consider using a CSS selector for this part of your query.
http://doc.scrapy.org/en/latest/topics/selectors.html#when-querying-by-class-consider-using-css
from scrapy import Selector

# The div carries two classes; .product matches any element whose class list includes "product"
sel = Selector(text='<div class="product product-small">I am a product!</div>')
print(sel.css('.product').extract())
If you need to, you can chain CSS and XPath selectors, as in the example on that page.