Python 3 - Get text from tag in beautifulSoup

Question

I am using BeautifulSoup to extract data from a website. The text on that website changes every time the page is reloaded, so I want to target the class name, which is static, because the text itself is dynamic.

import requests
from bs4 import BeautifulSoup

url = 'xxxxxxxxxxx'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
# Match any tag (True) whose class is "template_title"
class2 = soup.find_all(True, class_="template_title")
print(class2)

which prints out:
<td align="left" class="template_title" height="50" valign="bottom" width="535"><div style="padding-bottom:9px;">4</div></td>
When the page reloads, I will still be targeting the same area, but I do not know how to print only the text (which in this case is 4).

Once this is figured out, I have another question: if several tags share the class, is there a way to match on more static attributes so that it prints only the text I am searching for and nothing more? (I have the class, but could I use height="50" valign="bottom" width="535" as well?)

Recommended answer

  1. You can use the text or string attribute of the element.

elems = soup.find_all(True, class_='template_title')
print([elem.string for elem in elems])
# prints `['4']` for the given html snippet
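
To illustrate how the two attributes differ, here is a minimal sketch that inlines the HTML from the question so it runs without a network call. The second cell (`mixed`) is a hypothetical example that is not on the original page; it is there only to show that `.string` gives `None` when a tag has more than one child, while `get_text()` still joins all the text.

```python
from bs4 import BeautifulSoup

# The <td> from the question, wrapped in a table so the markup is complete.
html = (
    '<table><tr>'
    '<td align="left" class="template_title" height="50" valign="bottom" width="535">'
    '<div style="padding-bottom:9px;">4</div></td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, 'html.parser')
cell = soup.find('td', class_='template_title')

# .string works here because the <td> has exactly one child (the <div>),
# which in turn contains exactly one string.
string_value = cell.string
# get_text() concatenates every string under the tag; strip=True trims whitespace.
text_value = cell.get_text(strip=True)

# Hypothetical cell with two children: .string cannot pick one, so it is None.
mixed = BeautifulSoup(
    '<td class="template_title"><div>4</div><div>5</div></td>', 'html.parser'
).td
mixed_string = mixed.string
mixed_text = mixed.get_text(' ', strip=True)

print(string_value, text_value, mixed_string, mixed_text)
```

If the matched tag might contain more than one child, `get_text()` is usually the safer choice, since a `None` from `.string` is easy to miss in a list comprehension.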

  2. Specify more attributes as needed:

    elems = soup.find_all(True, class_='template_title',
                          height='50', valign='bottom', width='535')
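
As a rough sketch of how the extra attributes narrow the match, the example below uses a hypothetical page in which two cells share the class but only one has the layout attributes from the question. The second cell and its values are invented for illustration. It shows the same filter written as an `attrs` dictionary and, as an alternative, as a CSS selector passed to `select()`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: both cells have class "template_title",
# but only the first has height=50, valign=bottom, width=535.
html = (
    '<table><tr>'
    '<td class="template_title" height="50" valign="bottom" width="535"><div>4</div></td>'
    '<td class="template_title" height="20" valign="top" width="100"><div>other</div></td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, 'html.parser')

# Filtering on the class alone matches both cells.
by_class = soup.find_all('td', class_='template_title')

# Every key in attrs must match, so only the first cell is returned.
by_attrs = soup.find_all('td', attrs={
    'class': 'template_title',
    'height': '50',
    'valign': 'bottom',
    'width': '535',
})

# The same filter as a CSS selector.
by_css = soup.select('td.template_title[height="50"][valign="bottom"][width="535"]')

texts = [td.get_text(strip=True) for td in by_attrs]
css_texts = [td.get_text(strip=True) for td in by_css]
print(len(by_class), texts, css_texts)
```

Attribute values are compared as strings, so the numbers are quoted exactly as they appear in the HTML.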
    
