Python - Web Scraping for Walmart

Question

I'm trying to get some data from Walmart using Python and BeautifulSoup (bs4).

I wrote some code to get all the category names, and it works:

import requests
from bs4 import BeautifulSoup

baseurl = 'https://www.walmart.com/'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}

r = requests.get('https://www.walmart.com/all-departments')

soup = BeautifulSoup(r.content, 'lxml')

sub_list = soup.find_all('div', class_='alldeps-DepartmentNav-link-wrapper display-inline-block u-size-1-3')

print(sub_list)

The problem is that when I try to get the values from this link using the code below, I get empty results:

import requests
from bs4 import BeautifulSoup

baseurl = 'https://www.walmart.com/'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}

r = requests.get('https://www.walmart.com/browse/snacks-cookies-chips/cookies/976759_976787_1001391')

soup = BeautifulSoup(r.content, 'lxml')

general_list = soup.find_all('a', class_='product-title-link line-clamp line-clamp-2 truncate-title')

print(general_list)

When I searched older posts, I only found the SerpApi solution, but it is a paid service. Is there any way to get these values? Or am I doing something wrong?

Recommended answer

Here is a good tutorial for Selenium: https://selenium-python.readthedocs.io/getting-started.html#simple-usage.
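The answer does not say why the requests version comes back empty. A common cause with pages like this is that the product links are added by JavaScript after the page loads (or that the site returns a different page to scripted clients), so they never appear in the HTML that requests downloads. The sketch below is one way to check that, assuming you pass it the raw response text. The names ClassCounter and count_matches are made up for illustration, and it uses only the standard library's html.parser rather than BeautifulSoup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count tags of one kind whose class attribute contains a given token."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_matches(html, tag="a", css_class="product-title-link"):
    """Return how many <tag> elements in the HTML carry the given class."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    return parser.count
```

If count_matches(r.text) returns 0 for the response from the question, the links are not present in the static HTML, and a browser-driven tool such as Selenium is a reasonable next step.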

I've written a short script to get you started. All you need to do is download chromedriver (Chromium) and put it on your PATH. On Windows, chromedriver will have an .exe extension.

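from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Product titles are <span> elements inside the title links
TITLE_SELECTOR = ".product-title-link.line-clamp.line-clamp-2.truncate-title>span"

# Selenium 4 takes the driver path through a Service object; adjust the path for your system
driver = webdriver.Chrome(service=Service(executable_path='/snap/bin/chromium.chromedriver'))
driver.get("https://www.walmart.com/browse/snacks-cookies-chips/cookies/976759_976787_1001391")
assert "Walmart.com" in driver.title
# Wait up to 20 seconds for the product titles to be rendered and visible
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, TITLE_SELECTOR)))

# find_elements(By.CSS_SELECTOR, ...) replaces the removed find_elements_by_css_selector
elems = driver.find_elements(By.CSS_SELECTOR, TITLE_SELECTOR)
for el in elems:
    print(el.text)
# quit() closes the browser and also stops the chromedriver process
driver.quit()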

My output:

Lance Sandwich Cookies, Nekot Lemon Creme, 8 Ct Box
Nature Valley Biscuits, Almond Butter Breakfast Biscuits w/ Nut Filling, 13.5 oz
Pepperidge Farm Soft Baked Strawberry Cheesecake Cookies, 8.6 oz. Bag
Nutter Butter Family Size Peanut Butter Sandwich Cookies, 16 oz
SnackWell's Devil's Food Cookie Cakes 6.75 oz. Box
Munk Pack Protein Cookies, Variety Pack, Vegan, Gluten Free, Dairy Free Snacks, 6 Count
Great Value Twist & Shout Chocolate Sandwich Cookies, 15.5 Oz.
CHIPS AHOY! Chewy Brownie Filled Chocolate Chip Cookies, 9.5 oz
Nutter Butter Peanut Butter Wafer Cookies, 10.5 oz
Nabisco Sweet Treats Cookie Variety Pack OREO, OREO Golden & CHIPS AHOY!, 30 Snack Packs (2 Cookies Per Pack)
Archway Cookies, Soft Dutch Cocoa, 8.75 oz
OREO Double Stuf Chocolate Sandwich Cookies, Family Size, 20 oz
OREO Chocolate Sandwich Cookies, Party Size, 25.5 oz
Fiber One Soft-Baked Cookies, Chocolate Chunk, 6.6 oz
Nature Valley Toasted Coconut Biscuits with Coconut Filling, 10 ct, 13.5 oz
Great Value Duplex Sandwich Creme Cookies Family Size, 25 Oz
Great Value Assorted Sandwich creme Cookies Family Size, 25 oz
CHIPS AHOY! Original Chocolate Chip Cookies, Family Size, 18.2 oz
Archway Cookies, Crispy Windmill, 9 oz
Nabisco Classic Mix Variety Pack, OREO Mini, CHIPS AHOY! Mini, Nutter Butter Bites, RITZ Bits Cheese, Easter Snacks, 20 Snack Packs
Mother's Original Circus Animal Cookies 11 oz
Lotus Biscoff Cookies, 8.8 Oz.
Archway Cookies, Crispy Gingersnap, 12 oz
Great Value Vanilla Creme Wafer Cookies, 8 oz
Pepperidge Farm Verona Strawberry Thumbprint Cookies, 6.75 oz. Bag
Absolutely Gluten Free Coconut Macaroons
Sheila G's Brownie Brittle GLUTEN-FREE Chocolate Chip Cookie Snack Thins, 4.5oz
CHIPS AHOY! Peanut Butter Cup Chocolate Cookies, Family Size, 14.25 oz
Great Value Lemon Sandwich Creme Cookies Family Size, 25 oz
Keebler Sandies Classic Shortbread Cookies 11.2 oz
Nabisco Cookie Variety Pack, OREO, Nutter Butter, CHIPS AHOY!, 12 Snack Packs
OREO Chocolate Sandwich Cookies, Family Size, 19.1 oz
Lu Petit Ecolier European Dark Chocolate Biscuit Cookies, 45% Cocoa, 5.3 oz
Keebler Sandies Pecan Shortbread Cookies 17.2 oz
CHIPS AHOY! Reeses Peanut Butter Cup Chocolate Chip Cookies, 9.5 oz
Fiber One Soft-Baked Cookies, Oatmeal Raisin, 6 ct, 6.6 oz
OREO Dark Chocolate Crme Chocolate Sandwich Cookies, Family Size, 17 oz
Pinwheels Pure Chocolate & Marshmallow Cookies, 12 oz
Keebler Fudge Stripes Original Cookies 17.3 oz
Pepperidge Farm Classic Collection Cookies, 13.25 oz. Box
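
If the script fails to start the browser, one thing worth checking is whether chromedriver is actually discoverable on the PATH. The sketch below uses the standard library's shutil.which for that. The function name find_driver is made up for illustration, and the two candidate names are assumptions: the usual binary name and the snap name that appears in the script's driver path.

```python
import shutil


def find_driver(names=("chromedriver", "chromium.chromedriver")):
    """Return the full path of the first driver found on PATH, or None."""
    for name in names:
        # On Windows, shutil.which also tries the extensions listed in
        # PATHEXT, so "chromedriver" can match "chromedriver.exe".
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_driver() or "chromedriver not found on PATH")
```

If this prints a path, that path can be used in place of the hard-coded one when creating the driver.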
