Requests.content not matching with Chrome inspect element

Problem description

I'm using BeautifulSoup and Requests to scrape Allrecipes user data.

When inspecting the HTML, I find that the data I want is contained within:

<article class="profile-review-card">

However, when I use the following code:

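import requests
from bs4 import BeautifulSoup

URL = 'http://allrecipes.com/cook/2010/reviews/'
response = requests.get(URL).content  # raw bytes of the initial HTML
soup = BeautifulSoup(response, 'html.parser')
# Look for the review cards seen in Chrome's inspector
X = soup.find_all('article', class_='profile-review-card')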

While soup and response are full of HTML, X is empty. I've looked through the output, and there are inconsistencies between what I see with inspect element and what requests.get(URL).content returns. What is going on?

[Screenshot: what Chrome's inspector shows]

Recommended answer

That's because the content is loaded using Ajax/JavaScript after the initial page load. The Requests library only fetches the raw HTML and doesn't execute scripts, so you'll need to use something that can run those scripts and give you the resulting DOM. There are various options; I'll list a couple to get you started.
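
To illustrate the difference, here is a minimal offline sketch using only the standard library's html.parser. The two HTML strings and the names ClassCounter and count_elements are made up for this example; they are not the real Allrecipes markup. STATIC_HTML stands in for what requests.get(URL).content would return, and RENDERED_HTML stands in for the DOM that Chrome's inspector shows after the script has run.

```python
from html.parser import HTMLParser

# Hypothetical initial response: the card is not in the markup,
# a script creates it after the page loads.
STATIC_HTML = """
<html><body>
  <div id="reviews"></div>
  <script>
    var card = document.createElement("article");
    card.className = "profile-review-card";
    document.getElementById("reviews").appendChild(card);
  </script>
</body></html>
"""

# Hypothetical DOM after the browser has executed the script.
RENDERED_HTML = """
<html><body>
  <div id="reviews">
    <article class="profile-review-card"></article>
  </div>
</body></html>
"""


class ClassCounter(HTMLParser):
    """Counts start tags that have a given name and CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.count += 1


def count_elements(html, tag, css_class):
    """Return how many matching elements appear in the given markup."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.count


if __name__ == "__main__":
    print(count_elements(STATIC_HTML, "article", "profile-review-card"))
    print(count_elements(RENDERED_HTML, "article", "profile-review-card"))
```

The parser never runs the script, so the element only counts once it is actually present in the markup. The same check against the real response (searching it for profile-review-card) is a quick way to confirm whether a page builds its content client-side.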
