Python - Looping through HTML Tags and using IF


Question

I am using Python to extract data from a webpage. The page has a recurring HTML div tag with class = "result", each containing further data (location, organisation, etc.). I can loop through the HTML successfully with Beautiful Soup, but when I add a condition, such as checking whether a certain word (e.g. 'NHS') exists in the segment, nothing is returned, even though I know some segments contain it. This is the code:

soup = BeautifulSoup(content)
details = soup.findAll('div', {'class': 'result'})

for detail in details:
    if 'NHS' in detail:
        print detail

Hope my question makes sense...

Answer

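findAll returns a list of Tag objects, not strings, so 'NHS' in detail is tested against the tag's child nodes rather than searched for in its text. Perhaps convert them to strings?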

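from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, as the class name in the output below suggests

s = "<p>golly</p><p>NHS</p><p>foo</p>"
soup = BeautifulSoup(s)
details = soup.findAll('p')
type(details[0])    # prints: <class 'BeautifulSoup.Tag'>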
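
To see why the membership test comes back empty, it can help to compare the checks side by side. This is a small sketch that assumes the `bs4` package (BeautifulSoup 4) on Python 3 rather than the older module used in this thread, and the HTML snippet is made up for illustration. On a `Tag`, `in` tests the tag's direct children, so a word nested inside a child element is not found.

```python
from bs4 import BeautifulSoup

# Made-up markup: the word sits inside a nested <span>, as it might in a result block.
html = '<div class="result"><span>NHS Trust</span><span>Leeds</span></div>'
tag = BeautifulSoup(html, "html.parser").find("div", class_="result")

in_tag = "NHS" in tag               # membership against the Tag's direct children
in_markup = "NHS" in str(tag)       # substring search in the serialized HTML
in_text = "NHS" in tag.get_text()   # substring search in the visible text only

print(type(tag).__name__, in_tag, in_markup, in_text)
```

Only the first check fails here, which matches the behaviour described in the question.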

You are looking for a string amongst tags. Better to look for a string amongst strings...

for detail in details:
    # str(detail) is the tag's HTML, so this is a plain substring search
    if 'NHS' in str(detail):
        print(detail)
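
With a current install, the same idea might be written against the `bs4` package (BeautifulSoup 4) on Python 3. The sketch below is an illustration under assumptions, not the original poster's code: the sample HTML, the `find_matching` helper name, and the choice of `get_text()` over `str()` are all introduced here. Searching `get_text()` looks only at the visible text, so a match inside an attribute value or tag name is ignored.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; the real markup is not shown in the question.
SAMPLE_HTML = """
<div class="result"><span class="org">NHS Trust</span><span class="loc">Leeds</span></div>
<div class="result"><span class="org">Acme Ltd</span><span class="loc">York</span></div>
<div class="other">NHS (not a result)</div>
"""


def find_matching(html, word):
    """Return the div.result tags whose visible text contains `word`."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all is the bs4 spelling of findAll; class_ avoids the reserved word.
    results = soup.find_all("div", class_="result")
    # Compare against the tag's text, not against the Tag object itself.
    return [tag for tag in results if word in tag.get_text()]


matches = find_matching(SAMPLE_HTML, "NHS")
for tag in matches:
    print(tag.get_text(" ", strip=True))
```

Returning the `Tag` objects rather than strings keeps them available for further lookups, such as pulling the location out of each matching block.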
