Using BeautifulSoup to extract text without tags

Views: 32  Published: 2021/12/17 13:14:36  Tags: python web-scraping beautifulsoup

Question

My webpage looks like this:

<p>
  <strong class="offender">YOB:</strong> 1987<br/>
  <strong class="offender">RACE:</strong> WHITE<br/>
  <strong class="offender">GENDER:</strong> FEMALE<br/>
  <strong class="offender">HEIGHT:</strong> 5'05''<br/>
  <strong class="offender">WEIGHT:</strong> 118<br/>
  <strong class="offender">EYE COLOR:</strong> GREEN<br/>
  <strong class="offender">HAIR COLOR:</strong> BROWN<br/>
</p>

I want to extract the info for each individual and get YOB:1987, RACE:WHITE, etc.

What I tried is:

subc = soup.find_all('p')
subc1 = subc[1]
subc2 = subc1.find_all('strong')

But this gives me only the labels YOB:, RACE:, etc., without the values that follow them.

Is there a way that I can get the data in YOB:1987, RACE:WHITE format?

Recommended answer

Just loop through all the <strong> tags and use next_sibling to get the text that follows each one. Like this:
for strong_tag in soup.find_all('strong'):
    print(strong_tag.text, strong_tag.next_sibling)
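
The reason find_all('strong') alone returns only the labels is that each value sits next to its <strong> tag as a separate text node, not inside it. A minimal sketch of that relationship, assuming a one-field snippet and the built-in html.parser (the names snippet and value_node are illustrative, not from the original post):

```python
from bs4 import BeautifulSoup, NavigableString

# One field from the page, reduced to a single line for illustration.
snippet = '<p><strong class="offender">YOB:</strong> 1987<br/></p>'
soup = BeautifulSoup(snippet, "html.parser")

strong_tag = soup.find("strong")
value_node = strong_tag.next_sibling

# The label is the text inside <strong>.
print(repr(strong_tag.get_text()))

# The value is a text node (NavigableString) that follows the tag.
print(isinstance(value_node, NavigableString))

# The raw text node carries surrounding whitespace, so strip it before use.
print(repr(value_node.strip()))
```

Because next_sibling can also be another tag, or None when nothing follows, checking its type before using it is a reasonable precaution on less regular pages.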

Demo:
from bs4 import BeautifulSoup

html = '''
<p>
  <strong class="offender">YOB:</strong> 1987<br />
  <strong class="offender">RACE:</strong> WHITE<br />
  <strong class="offender">GENDER:</strong> FEMALE<br />
  <strong class="offender">HEIGHT:</strong> 5'05''<br />
  <strong class="offender">WEIGHT:</strong> 118<br />
  <strong class="offender">EYE COLOR:</strong> GREEN<br />
  <strong class="offender">HAIR COLOR:</strong> BROWN<br />
</p>
'''

# Name the parser explicitly; omitting it makes bs4 emit a warning and guess one.
soup = BeautifulSoup(html, 'html.parser')

for strong_tag in soup.find_all('strong'):
    print(strong_tag.text, strong_tag.next_sibling)

This gives you:
YOB:  1987
RACE:  WHITE
GENDER:  FEMALE
HEIGHT:  5'05''
WEIGHT:  118
EYE COLOR:  GREEN
HAIR COLOR:  BROWN
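
The output above has two spaces after each colon, because the text node itself begins with a space and print adds its own separator. To get the exact YOB:1987 form asked for in the question, one option is to strip each piece and collect the pairs in a dict. This is a sketch under stated assumptions, not part of the original answer: extract_fields is a hypothetical helper name, and the sample HTML is shortened to three fields.

```python
from bs4 import BeautifulSoup, NavigableString

html = """
<p>
  <strong class="offender">YOB:</strong> 1987<br/>
  <strong class="offender">RACE:</strong> WHITE<br/>
  <strong class="offender">HEIGHT:</strong> 5'05''<br/>
</p>
"""


def extract_fields(markup):
    """Return {label: value} for each <strong> label followed by a text node."""
    soup = BeautifulSoup(markup, "html.parser")
    fields = {}
    for strong_tag in soup.find_all("strong"):
        sibling = strong_tag.next_sibling
        # Skip labels followed by another tag or by nothing at all.
        if not isinstance(sibling, NavigableString):
            continue
        # Drop the trailing colon from the label and the padding from the value.
        label = strong_tag.get_text(strip=True).rstrip(":")
        fields[label] = sibling.strip()
    return fields


fields = extract_fields(html)
for label, value in fields.items():
    print(f"{label}:{value}")
```

Keeping the result as a dict also makes it easy to look up a single field, for example fields["RACE"], instead of parsing printed lines again.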

