Use BeautifulSoup to get a value after a specific tag

Problem description

I'm having a hard time getting BeautifulSoup to scrape some data for me. What's the best way to access the date (the actual number, 2008) from this code sample? It's my first time using BeautifulSoup. I've figured out how to scrape URLs off a page, but I can't narrow the search down to the word Date and then return only the numeric date that follows it (inside the dd tags). Is what I'm asking even possible?

<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>

Recommended answer

Find the dt tag by its text, then find the next dd sibling:

soup.find('div', class_='detail_date').find('dt', text='Date').find_next_sibling('dd').text

Complete code:

from bs4 import BeautifulSoup

data = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
    2008
    </dd>
</div>
"""

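# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, 'html.parser')
date_field = soup.find('div', class_='detail_date').find('dt', text='Date')
# .text keeps the surrounding whitespace of the dd tag, so strip it
print(date_field.find_next_sibling('dd').text.strip())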

This prints 2008.
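
If the page may lack the Date entry, the chained calls above raise AttributeError, because find returns None when nothing matches. A minimal sketch of a more defensive variant follows. The helper name value_after_label is hypothetical and not part of the original answer or of BeautifulSoup; it assumes the label sits in a dt tag and its value in the following dd sibling, as in the sample.

```python
from bs4 import BeautifulSoup


def value_after_label(soup, label):
    """Return the stripped text of the <dd> that follows <dt>label</dt>, or None.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    for dt in soup.find_all('dt'):
        # Compare on stripped text so surrounding whitespace does not matter.
        if dt.get_text(strip=True) == label:
            dd = dt.find_next_sibling('dd')
            if dd is not None:
                return dd.get_text(strip=True)
    # No matching <dt>, or no <dd> sibling after it.
    return None


html = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
print(value_after_label(soup, 'Date'))    # 2008
print(value_after_label(soup, 'Author'))  # None
```

Returning None instead of raising lets the caller decide how to handle pages where the field is absent.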
