How to extract JSON from a script tag using Beautiful Soup?


Question

I want to extract the reviewCount values from the script tag below using Beautiful Soup. I tried several approaches, but none of them worked.

<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>

Recommended answer

This should work, though I'm sure there is a more elegant approach:

import json
from bs4 import BeautifulSoup

html = '''
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
'''

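# Parse the document and grab the first <script> tag
soup = BeautifulSoup(html, 'html.parser')
res = soup.find('script')

# The tag's only child is its text content, which is the JSON payload
json_object = json.loads(res.contents[0])

# Each entry in "languages" carries its own reviewCount (as a string)
for language in json_object['languages']:
    print('{}: {}'.format(language['displayName'], language['reviewCount']))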

Output:

Toutes les langues: 573
français: 567
English: 6
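
On a real page there are usually several script tags, so soup.find('script') may return the wrong one. A slightly more defensive sketch is below. It assumes the data-initial-state="review-filter" attribute from the question is stable enough to select on, and extract_review_counts is a hypothetical helper name, not part of Beautiful Soup. It also converts the counts to integers, since the JSON stores them as strings.

```python
import json
from bs4 import BeautifulSoup

# Sample page: the first <script> is unrelated, to show why a bare
# soup.find('script') would not be enough here.
html = '''
<script>var unrelated = 1;</script>
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
'''


def extract_review_counts(page_html):
    """Hypothetical helper: map each isoCode to its reviewCount as an int."""
    soup = BeautifulSoup(page_html, 'html.parser')
    # Select the tag by its attributes instead of taking the first <script>
    tag = soup.find('script', attrs={
        'type': 'application/json',
        'data-initial-state': 'review-filter',
    })
    if tag is None or tag.string is None:
        raise ValueError('review-filter script tag not found or empty')
    data = json.loads(tag.string)
    return {lang['isoCode']: int(lang['reviewCount'])
            for lang in data['languages']}


review_counts = extract_review_counts(html)
print(review_counts['all'])  # 573
```

The "all" entry holds the total across languages, so review_counts['all'] is probably the value the question is after.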
