Scrapy get result in shell but not in script
Problem description
Another question on the same topic ^^ Based on the recommendations here, I've implemented my bot as follows and tested it all in the shell:
name_list = response.css("h2.label.title::text").extract()
packaging_list = response.css("div.label.packaging::text").extract()
ean = response.css("h1.page-title::text").extract_first()
product_price = ''.join(response.css('.product-pricing__main-price ::text').extract())
company = "carrefour"
for name, packaging, price in zip(name_list, packaging_list, product_price):
    item = ScrapybotItem()
    item['ean'] = ean
    item['desc'] = name.replace("\n","").strip() + " " + packaging
    item['price'] = price
    item['company'] = company
    yield item
The problem is with the price field.
For the price in the shell, I get for instance:
In [2]: product_price
Out[2]: '\n 5,65€\n\n \n '
Output from the script for the same product:
{'company': 'carrefour',
'desc': "Gel nettoyant anti-imperfections 5 en 1 L'Oréal Paris Men Expert
le "
'tube de 150ml',
'ean': '\n 1 résultat pour « 3600522418634 »\n',
'price': '\n'}
Do you know why I don't get the price when I run the script?
product_price is a string, given that you are joining the results of the selector in:
product_price = ''.join(response.css('.product-pricing__main-price ::text').extract())
Then, when you use zip, it iterates over that string one character at a time, so the first item gets the \n, as it's probably the first character in product_price.
Check this example:
>>> for i, j, k in zip([1, 2, 3, 4], [5, 6, 7, 8], 'abcd'):
...     print(i, j, k)
Output:
1 5 a
2 6 b
3 7 c
4 8 d
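One possible way around this, as a minimal sketch: strip the joined price once and pair the whole value with every item instead of letting zip walk through its characters. The sample values below are assumptions shaped like the ones in the question, and plain dicts stand in for ScrapybotItem so the snippet runs without Scrapy.

```python
from itertools import repeat

# Sample values shaped like the ones in the question (assumed for illustration).
name_list = ["\nGel nettoyant\n"]
packaging_list = ["le tube de 150ml"]
product_price = "\n 5,65€\n\n \n "

# Pattern from the question: zip iterates over the price string character by
# character, so the first item only receives the leading "\n".
buggy = [price for _, _, price in zip(name_list, packaging_list, product_price)]

# Possible fix: clean the string once, then repeat the whole value for each item.
clean_price = product_price.strip()
fixed = [
    {"desc": name.replace("\n", "").strip() + " " + packaging, "price": price}
    for name, packaging, price in zip(name_list, packaging_list, repeat(clean_price))
]

print(buggy)
print(fixed)
```

This assumes the page shows a single price that applies to every item. If each product has its own price, it is probably better to keep the selector results as a list (skip the ''.join) and zip that list instead.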