Python BeautifulSoup - Scrape Web Content Inside Iframes

Question

We have this URL: https://www.aliexpress.com/store/feedback-score/1665279.html

The content we need is the "Feedback History" table, which sits inside an iframe:

Feedback                 1 Month   3 Months   6 Months
Positive (4-5 Stars)     154       562        1,550
Neutral (3 Stars)        8         19         65
Negative (1-2 Stars)     8         20         57
Positive feedback rate   95.1%     96.6%      96.5%

How do we extract it?

Solution

You just need to obtain the src attribute of the iframe, and then request and parse its content:

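import requests
from bs4 import BeautifulSoup

# Use one session for both requests so cookies from the first response are reused.
s = requests.Session()
r = s.get("https://www.aliexpress.com/store/feedback-score/1665279.html")

# Locate the iframe on the outer page and read its src attribute.
soup = BeautifulSoup(r.content, "html.parser")
iframe_src = soup.select_one("#detail-displayer").attrs["src"]

# Prepend the scheme; this assumes the src is protocol-relative (starts with "//").
r = s.get(f"https:{iframe_src}")

# Parse the iframe document and print each table row with tab-separated cells.
soup = BeautifulSoup(r.content, "html.parser")
for row in soup.select(".history-tb tr"):
    print("\t".join([e.text for e in row.select("th, td")]))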

Result:

Feedback        1 Month         3 Months        6 Months
Positive (4-5 Stars)    154     562     1,550
Neutral (3 Stars)       8       19      65
Negative (1-2 Stars)    8       20      57
Positive feedback rate  95.1%   96.6%   96.5%
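
The two steps can also be tried offline. The sketch below is only an illustration under stated assumptions: `PAGE_HTML`, `IFRAME_HTML`, the host `feedback.example.com`, and the helper names `find_iframe_url` and `extract_rows` are hypothetical and not taken from the real site. It also swaps the f-string for `urllib.parse.urljoin`, which should cope with protocol-relative, relative, and absolute `src` values, and it returns `None` instead of raising when the iframe is missing.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the two documents; the live pages may differ.
PAGE_HTML = '<iframe id="detail-displayer" src="//feedback.example.com/display.htm"></iframe>'
IFRAME_HTML = """
<table class="history-tb">
  <tr><th>Feedback</th><th>1 Month</th><th>3 Months</th><th>6 Months</th></tr>
  <tr><td>Positive (4-5 Stars)</td><td>154</td><td>562</td><td>1,550</td></tr>
  <tr><td>Neutral (3 Stars)</td><td>8</td><td>19</td><td>65</td></tr>
</table>
"""


def find_iframe_url(page_html, page_url, selector="#detail-displayer"):
    """Return the iframe's absolute URL, or None if no matching iframe has a src."""
    soup = BeautifulSoup(page_html, "html.parser")
    iframe = soup.select_one(selector)
    if iframe is None or not iframe.get("src"):
        return None
    # urljoin resolves the src against the URL of the page that embeds the iframe.
    return urljoin(page_url, iframe["src"])


def extract_rows(iframe_html, row_selector=".history-tb tr"):
    """Return every matching table row as a list of stripped cell strings."""
    soup = BeautifulSoup(iframe_html, "html.parser")
    return [
        [cell.get_text(strip=True) for cell in row.select("th, td")]
        for row in soup.select(row_selector)
    ]


iframe_url = find_iframe_url(
    PAGE_HTML, "https://www.aliexpress.com/store/feedback-score/1665279.html"
)
header, *body = extract_rows(IFRAME_HTML)
# Pair each data row with the header row so cells can be looked up by column name.
records = [dict(zip(header, row)) for row in body]
print(iframe_url)
print(records[0])
```

In a live run, the same helpers could be fed `r.text` and `r.url` from the first `requests` response, and the text of the second response for `extract_rows`.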
