XPath returns null
I need to scrape the price of this page: https://www.asos.com/monki/monki-lisa-cropped-vest-top-with-ruched-side-in-black/prd/23590636?colourwayid=60495910&cid=2623
However it is always returning null:
My code:
    'price': response.xpath('//*[contains(@class, "current-price")]').get()
Can someone help please?
Thanks!
When extracted using XHR:
How do I retrieve the price?
Your problem is not the xpath, it's that the price is being retrieved with XHR.
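To see why a correct query can still come back empty, here is a minimal sketch using only the standard library. Both markup strings are hypothetical, simplified stand-ins (the real ASOS page is far larger), and `find_price` is an illustrative helper, not part of Scrapy. The point is that the same lookup succeeds on the browser-rendered markup and returns nothing on the markup the server actually sends.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: what the server sends before any JavaScript runs.
STATIC_HTML = "<html><body><div id='product-price'></div></body></html>"

# Hypothetical markup after the browser has fetched the price via XHR and filled it in.
RENDERED_HTML = (
    "<html><body><div id='product-price'>"
    "<span class='current-price'>£10.00</span>"
    "</div></body></html>"
)


def find_price(markup):
    """Return the text of the first element with class 'current-price', or None."""
    root = ET.fromstring(markup)
    node = root.find(".//*[@class='current-price']")
    return None if node is None else node.text
```

With these inputs, `find_price(STATIC_HTML)` returns `None` while `find_price(RENDERED_HTML)` returns the price text. Scrapy only ever sees the first kind of markup, which is why `.get()` yields `None` in the spider.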
If you use scrapy shell and type view(response), you can see that the price is not being generated:
Look at the source of the original webpage and search for the price:
Then use this URL to scrape the price:
def parse(self, response):
    import re
    # The price API path is set in an inline script on the product page.
    price_url = 'https://www.asos.com' + re.search(r'window\.asos\.pdp\.config\.stockPriceApiUrl = \'(.+?)\'', response.text).group(1)
    yield scrapy.Request(url=price_url,
                         method='GET',
                         callback=self.parse_price,
                         headers=self.headers)

def parse_price(self, response):
    import json
    jsonresponse = json.loads(response.text)
    ...............
    ...............
    ...............
I couldn't get around the 403 error with the headers I provided, but maybe you'll have more luck.
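The URL-extraction step can be tried on its own, without hitting the site. The sketch below assumes the page source contains an assignment of the form shown in the regex above; the sample source and the `/example/stockprice` path are made up for illustration, and `extract_price_url` is a hypothetical helper name. It also returns `None` when the pattern is missing instead of raising `AttributeError` on `.group(1)`.

```python
import re
from urllib.parse import urljoin

# Hypothetical page source; the real script contents and API path will differ.
SAMPLE_SOURCE = """
<script>
window.asos.pdp.config.stockPriceApiUrl = '/example/stockprice?productIds=23590636';
</script>
"""

# Dots are escaped so they match literally; (.+?) stops at the first closing quote.
PRICE_URL_PATTERN = re.compile(
    r"window\.asos\.pdp\.config\.stockPriceApiUrl = '(.+?)'"
)


def extract_price_url(page_source, base_url="https://www.asos.com"):
    """Return the absolute price API URL, or None if the script variable is absent."""
    match = PRICE_URL_PATTERN.search(page_source)
    if match is None:
        return None
    return urljoin(base_url, match.group(1))
```

In a spider, checking for `None` before yielding the request makes a layout change on the site show up as a clear log message rather than a traceback.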
Edit:
To get the price from the JSON response, there's actually no need for json.loads; response.json() parses the body directly:
def parse_price(self, response):
    jsonresponse = response.json()[0]
    price = jsonresponse['productPrice']['current']['text']
    # You can also use jsonresponse.get() if you prefer
    print(price)
Output:
£10.00
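As a rough illustration of the `.get()` variant mentioned in the comment, the sketch below parses a sample body offline. The payload is hypothetical: it only mirrors the `productPrice` / `current` / `text` keys used above, and the other fields are invented for the example. `extract_price_text` is an illustrative helper name.

```python
import json

# Hypothetical payload shaped after the keys used in parse_price; the real
# response is assumed to carry more fields than shown here.
SAMPLE_BODY = (
    '[{"productId": 23590636, '
    '"productPrice": {"current": {"value": 10.0, "text": "£10.00"}}}]'
)


def extract_price_text(body):
    """Return the display price of the first product, or None if it is missing."""
    data = json.loads(body)
    if not data:
        return None
    first = data[0]
    # Chained .get() calls avoid a KeyError when a level is absent.
    return first.get("productPrice", {}).get("current", {}).get("text")
```

Here `extract_price_text(SAMPLE_BODY)` returns `'£10.00'`, and an empty list or a product without `productPrice` gives `None`, which is easier to handle in an item pipeline than an exception.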