Requests.content not matching with Chrome inspect element

Problem description

I'm using BeautifulSoup and Requests to scrape allrecipes user data.

When inspecting the HTML code I find that the data I want is contained within

<article class="profile-review-card">

However, when I use the following code:

import requests
from bs4 import BeautifulSoup

URL = 'http://allrecipes.com/cook/2010/reviews/'
response = requests.get(URL).content
soup = BeautifulSoup(response, 'html.parser')
X = soup.find_all('article', class_='profile-review-card')

While soup and response are full of HTML, X is empty. I've looked through, and there are inconsistencies between what I see with Inspect Element and what requests.get(URL).content returns. What is going on?
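One quick way to see the mismatch (a sketch, not part of the original post; the URL and class name are taken from the question above): check whether the class name appears anywhere in the raw HTML that Requests downloads.

import requests

URL = 'http://allrecipes.com/cook/2010/reviews/'
raw_html = requests.get(URL).text
# If the review cards are injected by JavaScript after page load,
# the class name will not be present in the downloaded HTML at all.
print('profile-review-card' in raw_html)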

What my browser's Inspect Element shows me:

Recommended answer

That's because the content is loaded with Ajax/JavaScript. The Requests library doesn't execute scripts, so you'll need something that can run them and hand you the resulting DOM. There are various options; I'll list a couple to get you started (a minimal Selenium sketch follows the list).

  • Selenium
  • ghost.py
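A minimal sketch of the Selenium route (assumptions on my part: selenium is installed along with a matching Chrome driver; the URL and class name come from the question). The browser executes the page's scripts, so the rendered DOM can then be parsed with BeautifulSoup exactly as before.

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

URL = 'http://allrecipes.com/cook/2010/reviews/'

driver = webdriver.Chrome()  # any driver works, e.g. webdriver.Firefox()
try:
    driver.get(URL)
    # Wait until at least one review card has been rendered by the page's scripts.
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, 'article.profile-review-card'))
    )
    # driver.page_source holds the DOM *after* JavaScript has run.
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    cards = soup.find_all('article', class_='profile-review-card')
    print(len(cards))
finally:
    driver.quit()

ghost.py follows the same idea: let something execute the JavaScript first, then parse the resulting HTML.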
