Python 3: using requests does not get the full content of a web page

Question

I am testing the requests module to get the content of a web page, but when I inspect the result I see that it does not contain the full content of the page.

Here is my code:

import requests
from bs4 import BeautifulSoup

url = "https://shop.nordstrom.com/c/womens-dresses-shop?origin=topnav&cm_sp=Top%20Navigation-_-Women-_-Dresses&offset=11&page=3&top=72"
page = requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

Also, in the Chrome browser, if I look at the page source I do not see the full content.

Is there a way to get the full content of the example page that I have provided?

Recommended answer

The page is rendered with JavaScript, which makes additional requests to fetch more data. requests only downloads the initial HTML and does not execute JavaScript, so that extra content never appears in the response. You can fetch the complete page with Selenium.

from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://shop.nordstrom.com/c/womens-dresses-shop?origin=topnav&cm_sp=Top%20Navigation-_-Women-_-Dresses&offset=11&page=3&top=72"

# Launch a real Chrome browser so the page's JavaScript actually runs
driver = webdriver.Chrome()
try:
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, 'html.parser')
finally:
    # Always close the browser, even if loading or parsing fails
    driver.quit()
print(soup.prettify())
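
The snippet above reads page_source right after driver.get returns, which may be before the page's later JavaScript requests have finished. A minimal sketch of one way to wait for late content follows. It assumes you know a piece of text or markup (the `marker` argument, a hypothetical name) that only appears once the data has loaded. The helper `wait_for_page_source` is not part of Selenium; Selenium's own `WebDriverWait` offers explicit waits and is likely the better choice in real code. This version polls manually so that it depends only on an object with a `page_source` attribute.

```python
import time


def wait_for_page_source(driver, marker, timeout=10.0, poll=0.5):
    """Poll driver.page_source until `marker` appears or `timeout` seconds pass.

    `driver` is any object exposing a `page_source` attribute, such as a
    Selenium WebDriver. `marker` is a string you expect in the loaded page
    (an assumption you have to supply yourself).
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found within {timeout} s")
        time.sleep(poll)
```

With Selenium this would be called after driver.get(url), for example html = wait_for_page_source(driver, "some text you expect"), and the returned html passed to BeautifulSoup as before.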

For other solutions, see my answer to Scraping Google Finance (BeautifulSoup).
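
One commonly used alternative to driving a browser, not necessarily the one described in the linked answer, is to look for data that the server already embeds in the initial HTML as JSON inside a script tag. The sketch below assumes the page assigns a JSON object to a global such as `window.__INITIAL_STATE__`. That variable name is hypothetical and the example page may not embed its data this way at all, so check the real page source first.

```python
import json
import re


def extract_embedded_json(html, var_name="__INITIAL_STATE__"):
    """Return the JSON value assigned to window.<var_name> in `html`, or None.

    `var_name` is a hypothetical default; inspect the page source to find
    the name a given site actually uses, if it uses one.
    """
    pattern = re.compile(r"window\." + re.escape(var_name) + r"\s*=\s*")
    match = pattern.search(html)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (for example the trailing ';').
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value
```

If the function returns a dictionary, the data can be read with plain requests and no browser, for example data = extract_embedded_json(requests.get(url).text).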
