How do I use BeautifulSoup4 to get all text before a <br> tag?

Views: 88  Published: 2020/9/20 7:45:05  Tags: python html beautifulsoup scrapy

Question

I'm trying to scrape some data for my app. Here is the HTML code:

<tr>
  <td>
    This
    <a class="tip info" href="blablablablabla">is a first</a>
    sentence.
    <br>
    This
    <a class="tip info" href="blablablablabla">is a second</a>
    sentence.
    <br>This
    <a class="tip info" href="blablablablabla">is a third</a>
    sentence.
    <br>
  </td>
</tr>

I want the output to look like this:

This is a first sentence.
This is a second sentence.
This is a third sentence.

Is it possible to do this?

Recommended answer

Try this. It should give you the desired output. Treat the content variable in the script below as the holder of the HTML pasted above.

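from bs4 import BeautifulSoup

# content holds the HTML pasted above
soup = BeautifulSoup(content, "lxml")
lines = []
for item in soup.select(".tip.info"):
    # Join the text before the link, the link text, and the text after it
    parts = [item.previous_sibling, item.text, item.next_sibling]
    # Collapse runs of whitespace so each sentence sits on one line
    lines.append(' '.join(''.join(parts).split()))
data = '\n'.join(lines)
print(data)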

Output:

This is a first sentence.
This is a second sentence.
This is a third sentence.
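
The answer above relies on every sentence containing exactly one .tip.info link with a text node on each side. If that does not hold for your markup, a possible alternative is to split on the <br> tags themselves. The following is a minimal sketch under stated assumptions, not the answerer's method: the function name text_before_each_br is hypothetical, it only looks at the first <td> in the markup, and it uses Python's built-in "html.parser" instead of "lxml" so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup, Tag

# Sample markup copied from the question; `content` is the variable name the answer uses.
content = """
<tr>
  <td>
    This
    <a class="tip info" href="blablablablabla">is a first</a>
    sentence.
    <br>
    This
    <a class="tip info" href="blablablablabla">is a second</a>
    sentence.
    <br>This
    <a class="tip info" href="blablablablabla">is a third</a>
    sentence.
    <br>
  </td>
</tr>
"""


def text_before_each_br(html):
    """Return the text that precedes each <br> in the first <td> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find("td")  # assumption: the sentences live in the first <td>
    sentences, chunk = [], []
    for node in cell.children:
        if isinstance(node, Tag) and node.name == "br":
            # A <br> closes the current sentence: collapse whitespace and store it.
            text = " ".join("".join(chunk).split())
            if text:
                sentences.append(text)
            chunk = []
        elif isinstance(node, Tag):
            # Any other tag (here the <a> links) contributes its visible text.
            chunk.append(node.get_text())
        else:
            # Plain text node between tags.
            chunk.append(str(node))
    # Text after the last <br> is deliberately ignored.
    return sentences


print("\n".join(text_before_each_br(content)))
```

Because the split happens at each <br>, this variant also handles sentences with no link or with several links, at the cost of being tied to one table cell.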
