Python - Looping through HTML Tags and using IF
Problem description
I am using Python to extract data from a webpage. The page has a recurring HTML div tag with class = "result", which contains other data (such as location, organisation, etc.). I can loop through the HTML successfully using Beautiful Soup, but when I add a condition, such as checking whether a certain word (for example, 'NHS') exists in the segment, nothing is returned, even though I know some segments contain it. This is the code:
soup = BeautifulSoup(content)
details = soup.findAll('div', {'class': 'result'})
for detail in details:
    if 'NHS' in detail:
        print(detail)
Hope my question makes sense...
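findAll returns a list of tags, not strings, and the in operator on a tag checks whether the value is one of the tag's direct children rather than a substring of its markup. Perhaps convert them to strings?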
s = "<p>golly</p><p>NHS</p><p>foo</p>"
soup = BeautifulSoup(s)
details = soup.findAll('p')
type(details[0]) # prints: <class 'BeautifulSoup.Tag'>
You are looking for a string amongst tags. Better to look for a string amongst strings...
for detail in details:
    if 'NHS' in str(detail):
        print(detail)
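For anyone on Python 3 with the newer bs4 package, the same idea might look like the sketch below. The sample HTML is invented for illustration, and the use of bs4 with the built-in "html.parser" is an assumption, since the code above uses the older BeautifulSoup 3 module. It compares the membership test on the tag itself with a substring test on the tag's text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page content.
html = """
<div class="result"><span>NHS Trust</span> <span>London</span></div>
<div class="result"><span>Acme Ltd</span> <span>Leeds</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
results = soup.find_all("div", class_="result")

# 'NHS' in div looks for a direct child equal to 'NHS', so nothing matches.
direct_hits = [div for div in results if "NHS" in div]

# Testing against the extracted text performs a substring search instead.
text_hits = [div for div in results if "NHS" in div.get_text()]

print(len(direct_hits))  # 0
print(len(text_hits))    # 1
```

Using get_text() rather than str(detail) limits the search to the visible text, so a match inside an attribute value or tag name would not count. Whether that is preferable depends on what the page looks like.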