Why isn't the HTML code inside the div being parsed?

Problem description

HTML code (screenshot)

In this image, the element in question is div id="root".

Here is the code:

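import requests
from bs4 import BeautifulSoup

URL = 'https://www.daraz.com.bd/catalog/?spm=a2a0e.home.search.3.73524591owXnnM&q=mobile'

# requests downloads only the HTML the server sends; it does not run JavaScript
page = requests.get(URL)

# Parse the downloaded markup and look up the container element
soup = BeautifulSoup(page.content, 'html.parser')
result = soup.find("div", id="root")
print(result)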

The output is:

<div id="root"></div>

Why isn't the HTML code inside the div being parsed?

Recommended answer

The content inside <div id="root"></div> is most likely loaded dynamically. You can verify this yourself by visiting the page with JavaScript disabled. requests only downloads the HTML that the server sends and does not execute scripts, so with your approach BeautifulSoup never sees content that is added afterwards through JavaScript.
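
The check described above can also be approximated in code. The following is a minimal sketch, not a definitive test: it assumes a page is probably rendered client-side when the container exists, is empty, and the page loads at least one script. The helper name looks_client_rendered and the two sample HTML strings are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for what a server might send for a JavaScript-rendered page
STATIC_SHELL = """
<html><body>
  <div id="root"></div>
  <script src="/static/app.js"></script>
</body></html>
"""

# Hypothetical stand-in for a page whose content is already present in the HTML
SERVER_RENDERED = """
<html><body>
  <div id="root"><p>Phone A</p><p>Phone B</p></div>
</body></html>
"""


def looks_client_rendered(html, container_id="root"):
    """Return True if the container exists but is empty and the page loads scripts."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id=container_id)
    if container is None:
        return False
    # Empty means: no child tags and no visible text inside the container
    is_empty = container.find(True) is None and not container.get_text(strip=True)
    has_scripts = soup.find("script") is not None
    return is_empty and has_scripts


print(looks_client_rendered(STATIC_SHELL))     # True
print(looks_client_rendered(SERVER_RENDERED))  # False
```

In the question's script, the same function could be called with page.text to confirm that the server response contains only the empty shell.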

More details here: "BeautifulSoup not grabbing dynamic content".

I would recommend using a headless browser in your case to fetch content that is generated with JavaScript. A headless browser executes JavaScript, which makes the dynamic content available for parsing.
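
The answer does not name a specific tool, so the following is only one possible sketch of the headless-browser approach. It assumes Selenium 4 with a locally installed Chrome. The function name fetch_rendered_html, the CSS selector "#root > *" and the 15-second timeout are illustrative choices, not part of the original answer.

```python
def fetch_rendered_html(url, wait_css="#root > *", timeout=15):
    """Load url in headless Chrome and return the HTML after JavaScript has run."""
    # Imported inside the function so the sketch can be defined without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # run Chrome without opening a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the container has at least one child element;
        # raises TimeoutException if nothing appears within `timeout` seconds
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out
        driver.quit()
```

The returned string can then be passed to BeautifulSoup(html, 'html.parser') and queried with soup.find("div", id="root") exactly as in the question. The explicit wait matters because the page source may still be the empty shell right after driver.get returns.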
