Unable to capture name, price, rating and image of a record with Python Requests


Problem description

An exception occurs when printing the product name, product size, price and rating. Here is the link from which I want to extract the details.

import requests
import time

from requests.models import Response


params = (
    ('url', '/continental-80-shoes/G27707.html'),
    ('sitePath', 'us'),
)


headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
}

response = requests.get('https://www.adidas.com/api/metadata/pdp',params=params,headers=headers)

for item in response.json()['metadata']:

    itemRes = requests.get('https://www.adidas.com/api/search/product/' + item['productId'], headers=headers)
    print(item['productId'], item['name'], item['price'], item['rating'])
   

Recommended answer

You have to scrape the adidas product page and extract the fields with regular expressions:

import requests
import re

endpoint = "https://www.adidas.com.au/continental-80-shoes/G27707.html"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
}
response = requests.get(endpoint, headers = headers)
data = response.text

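# Lookbehind/lookahead patterns for the values embedded in the page source.
# Non-greedy (.*?) stops at the first delimiter instead of the last one on the line.
pricereg = r"(?<=\"price\":)(.*?)(?=,\")"
namereg = r"(?<=\"name\":)(.*?)(?=,\"co)"
ratingreg = r"(?<=\"ratingValue\":)(.*?)(?=,\"reviewCou)"

# re.search returns None when a pattern is absent, in which case .group() raises AttributeError.
price = re.search(pricereg, data).group()
name = re.search(namereg, data).group()
rating = re.search(ratingreg, data).group()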

print(f"name {name}, rating {rating}, price {price}")
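
The script above fails with an AttributeError as soon as one pattern does not match. As a minimal sketch of the same regex approach with that case handled, the helper below returns None for any missing field. The function name `extract_fields`, the `PATTERNS` table and the sample string are hypothetical. The key names `"name"`, `"price"` and `"ratingValue"` are taken from the patterns in the answer, and whether the live page still embeds them in this form is an assumption.

```python
import re

# Non-greedy patterns, one capture group each. The key names mirror the ones
# used in the answer; their presence in the real page source is an assumption.
PATTERNS = {
    "name": r'"name":\s*"(.*?)"',
    "price": r'"price":\s*"?([\d.]+)"?',
    "rating": r'"ratingValue":\s*"?([\d.]+)"?',
}


def extract_fields(html: str) -> dict:
    """Return the first match for each pattern, or None when a key is absent."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, html)
        result[field] = match.group(1) if match else None
    return result


# Hypothetical snippet standing in for the downloaded page text.
sample = (
    '{"name":"Continental 80 Shoes","color":"White","price":100,'
    '"aggregateRating":{"ratingValue":4.7,"reviewCount":120}}'
)
print(extract_fields(sample))
```

In the script above, the same helper could be called as `extract_fields(response.text)`. Any field that comes back as None indicates that the page layout differs from what the patterns expect.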
