How to tell BeautifulSoup to extract the content of a specific tag as text? (without touching it)

Views: 822 Published: 2016/8/5 18:58:55 Tags: python syntax-highlighting beautifulsoup

Question

I need to parse an HTML document that contains "code" tags.

I'm getting the code blocks like this:

soup = BeautifulSoup(str(content))
code_blocks = soup.findAll('code')

The problem is that if I have a code tag like this:

<code class="csharp">
    List<Person> persons = new List<Person>();
</code>

BeautifulSoup forces the closing of what it takes to be nested tags and transforms the code block into:

<code class="csharp">
    List<person> persons = new List</person><person>();
    </person>
</code>
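The mangling happens because a tolerant HTML parser reads `<Person>` as the start of an element rather than as text. The sketch below shows this with Python's standard-library `html.parser`, used here only as a stand-in; it is not BeautifulSoup's own parser, and `TagLogger` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every start tag the parser believes it has seen."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased, which is why "Person" becomes "person".
        self.start_tags.append(tag)


content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"

logger = TagLogger()
logger.feed(content)
logger.close()

# The generic type arguments are reported as start tags alongside <code>.
print(logger.start_tags)
```

Once the parser has decided `Person` is a tag, any tree builder on top of it has to invent closing tags for it, which matches the rewritten block shown above.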

Is there any way to extract the content of the code tags as text with BeautifulSoup, without letting it fix what it thinks are HTML markup errors?

Recommended answer

Add the code tag to the QUOTE_TAGS dictionary. In BeautifulSoup 3, the contents of any tag listed there are kept as literal text instead of being parsed as markup.

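# BeautifulSoup 3 import style
from BeautifulSoup import BeautifulSoup

content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"

# Register <code> as a quote tag so its contents are not parsed as markup.
# This must be set before the document is parsed.
BeautifulSoup.QUOTE_TAGS['code'] = None
soup = BeautifulSoup(str(content))
code_blocks = soup.findAll('code')
print(code_blocks)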

Output:

[<code class="csharp"> List<Person> persons = new List<Person>(); </code>]
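The answer above depends on the BeautifulSoup 3 import (`from BeautifulSoup import BeautifulSoup`). Where that version is not available, one alternative, which is not part of the original answer, is to escape the contents of every `code` element before handing the markup to any parser. The following is a minimal standard-library sketch; `CODE_BLOCK` and `escape_code_blocks` are hypothetical names, and the regular expression assumes `code` elements are not nested and their contents are not already escaped.

```python
import html
import re

# Captures the opening <code ...> tag, the raw body, and the closing tag.
# Assumption: <code> elements are not nested inside each other.
CODE_BLOCK = re.compile(r"(<code\b[^>]*>)(.*?)(</code>)", re.DOTALL | re.IGNORECASE)


def escape_code_blocks(markup):
    """Escape &, < and > inside each <code> element so a parser sees plain text."""

    def _escape(match):
        opening, body, closing = match.groups()
        return opening + html.escape(body, quote=False) + closing

    return CODE_BLOCK.sub(_escape, markup)


content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"
escaped = escape_code_blocks(content)
print(escaped)
```

The escaped string can then be passed to the parser of your choice, and the text of each `code` element should come back with the original angle brackets once the parser decodes the entities. Bodies that already contain entities such as `&lt;` would be escaped a second time, so this only suits raw, unescaped input.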
