Python 3 - Get text from tag in BeautifulSoup
Problem description
I am using BeautifulSoup to extract data from a website. The text on that website changes every time the page is reloaded, so I want to locate the element by its class name, which stays static, because the text itself is dynamic.
import requests
from bs4 import BeautifulSoup
url = 'xxxxxxxxxxx'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
class2 = soup.find_all(True, class_="template_title")
print(class2)
which prints out
<td align="left" class="template_title" height="50" valign="bottom" width="535"><div style="padding-bottom:9px;">4</div></td>
When the page reloads, I can still locate that element, but I do not know how to print only its text (in this case, 4).
Once this is solved, I have a follow-up question: if several tags share the same class, is there a way to match on more static attributes so that only the text I am looking for is printed? (I have the class, but could I also use height="50" valign="bottom" width="535"?)
You can use the text or string attribute of the element:
elems = soup.find_all(True, class_='template_title')
print([elem.string for elem in elems])
# prints `['4']` for the given html snippet
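The two attributes behave differently when a tag has more than one child, which may matter if the page layout changes. The sketch below is illustrative only: the `multi` snippet and the helper names `cell_strings` and `cell_texts` are assumptions, not part of the original question. It suggests that `string` returns `None` when a tag has several children, while `get_text()` joins all the nested text.

```python
from bs4 import BeautifulSoup

# The cell from the question, plus a hypothetical cell with two child tags.
single = ('<td class="template_title" height="50" valign="bottom" width="535">'
          '<div style="padding-bottom:9px;">4</div></td>')
multi = '<td class="template_title"><div>4</div><div>5</div></td>'


def cell_strings(html):
    """Return .string for every tag carrying the template_title class."""
    soup = BeautifulSoup(html, 'html.parser')
    return [elem.string for elem in soup.find_all(True, class_='template_title')]


def cell_texts(html):
    """Return the stripped, space-joined text of every matching tag."""
    soup = BeautifulSoup(html, 'html.parser')
    return [elem.get_text(' ', strip=True)
            for elem in soup.find_all(True, class_='template_title')]


print(cell_strings(single))  # one nested string, so .string finds it
print(cell_strings(multi))   # two children, so .string is None
print(cell_texts(multi))     # get_text() collects both pieces
```

If the cell is only ever expected to hold one value, `string` is enough; `get_text(strip=True)` is the more forgiving choice when the markup around the value might vary.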
To narrow the match, pass additional attributes as keyword arguments:
elems = soup.find_all(True, class_='template_title', height='50', valign='bottom', width='535')
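As a rough illustration of how the extra attributes narrow the match, the sketch below uses a made-up table in which two cells share the class but differ in their other attributes. The HTML and the helper name `find_title_cells` are assumptions for demonstration; the `attrs` dictionary is an equivalent way to pass the same filters as the keyword arguments above.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment: two cells share the class, only one has the
# height/valign/width values from the question.
html = (
    '<table><tr>'
    '<td class="template_title" height="50" valign="bottom" width="535">'
    '<div style="padding-bottom:9px;">4</div></td>'
    '<td class="template_title" height="20" valign="top" width="100">other</td>'
    '</tr></table>'
)


def find_title_cells(markup):
    """Match td tags by class and by the extra static attributes."""
    soup = BeautifulSoup(markup, 'html.parser')
    return soup.find_all(
        'td',
        class_='template_title',
        attrs={'height': '50', 'valign': 'bottom', 'width': '535'},
    )


cells = find_title_cells(html)
print([cell.get_text(strip=True) for cell in cells])
```

Attribute values are compared as strings, so `'50'` rather than `50` is used here. Matching on layout attributes such as width is only as stable as the page's markup, so the class plus the tag name may be the safer anchor if the site is redesigned often.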