How to extract JSON from a script tag using Beautiful Soup?
Question
I want to extract the reviewCount values from the script tag below using Beautiful Soup. I have tried several approaches, but none of them worked.
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
Recommended answer
This should work, though there is probably a more elegant approach:
import json
from bs4 import BeautifulSoup
html = '''
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
'''
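soup = BeautifulSoup(html, 'html.parser')
# Grab the first <script> tag in the document
res = soup.find('script')
# The tag's only child is the raw JSON text, so parse it directly
json_object = json.loads(res.contents[0])
# Print the display name and review count for each language entry
for language in json_object['languages']:
    print('{}: {}'.format(language['displayName'], language['reviewCount']))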
Output:
Toutes les langues: 573
français: 567
English: 6
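On a real page there are usually several script tags, so `soup.find('script')` may return the wrong one. A minimal sketch of one way to handle that, assuming the target tag keeps the `type` and `data-initial-state` attributes shown in the question, is to match on those attributes and read the text through `.string`. The function name `extract_review_counts` and the `sample_html` string are hypothetical, made up for this illustration:

```python
import json
from bs4 import BeautifulSoup


def extract_review_counts(html):
    """Return {displayName: reviewCount as int} from the review-filter script tag.

    Hypothetical helper: returns an empty dict if the tag is missing or empty.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Match on the tag's attributes so unrelated <script> tags are skipped.
    tag = soup.find(
        "script",
        attrs={"type": "application/json", "data-initial-state": "review-filter"},
    )
    if tag is None or not tag.string:
        return {}
    data = json.loads(tag.string)
    # reviewCount is stored as a string in the JSON, so convert it to int.
    return {
        lang["displayName"]: int(lang["reviewCount"])
        for lang in data.get("languages", [])
    }


# Made-up sample page: an unrelated script tag precedes the one we want.
sample_html = """
<script>var unrelated = 1;</script>
<script type="application/json" data-initial-state="review-filter">
{"languages":[{"isoCode":"all","displayName":"Toutes les langues","reviewCount":"573"},{"isoCode":"fr","displayName":"français","reviewCount":"567"},{"isoCode":"en","displayName":"English","reviewCount":"6"}],"selectedLanguages":["all"],"selectedStars":null,"selectedLocationId":null}
</script>
"""

counts = extract_review_counts(sample_html)
print(counts)
```

Converting the counts to `int` makes them easier to sum or compare later. If the site changes the attribute values, the helper returns an empty dict rather than raising.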