BeautifulSoup innerhtml?

Views: 14 | Published: 2021/12/23 19:46:27 | Tags: python, html, beautifulsoup, innerhtml

Question

Let's say I have a page with a div. I can easily get that div with soup.find().

Now that I have the result, I'd like to print the whole inner HTML of that div: I need a single string containing all the HTML tags and text together, exactly like the string I'd get in JavaScript with obj.innerHTML. Is this possible?

Recommended answer

TL;DR

With BeautifulSoup 4, use element.encode_contents() if you want a UTF-8 encoded bytestring, or element.decode_contents() if you want a Python Unicode string. For example, an equivalent of the DOM's innerHTML property might look something like this:

def innerHTML(element):
    """Returns the inner HTML of an element as a UTF-8 encoded bytestring"""
    return element.encode_contents()
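
As a rough usage sketch, the two methods can be compared on a small document. The sample markup, the id "main", and the variable names are made up for illustration, and the built-in "html.parser" backend is assumed so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = '<div id="main"><p>Hello <b>world</b></p><span>!</span></div>'
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", id="main")

# Bytes, UTF-8 encoded by default.
inner_bytes = div.encode_contents()

# str (Unicode), the closest match to JavaScript's obj.innerHTML.
inner_text = div.decode_contents()

print(inner_text)  # <p>Hello <b>world</b></p><span>!</span>
```

Note that the outer div tag itself is not part of the result; str(div) would give the equivalent of outerHTML instead.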

At the time of writing, these functions were not covered in the online documentation, so I'll quote the function definitions and docstrings from the code.

def encode_contents(
    self, indent_level=None, encoding=DEFAULT_OUTPUT_ENCODING,
    formatter="minimal"):
    """Renders the contents of this tag as a bytestring.

    :param indent_level: Each line of the rendering will be
       indented this many spaces.

    :param encoding: The bytestring will be in this encoding.

    :param formatter: The output formatter responsible for converting
       entities to Unicode characters.
    """

See also the documentation on formatters. You'll most likely use either formatter="minimal" (the default) or formatter="html" (to emit HTML entities), unless you want to process the text manually in some way.
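
To illustrate the difference between the two formatters, here is a small sketch. The sample text is invented for the example and "html.parser" is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: accented characters plus an escaped ampersand.
soup = BeautifulSoup("<div><p>caf\u00e9 &amp; cr\u00e8me</p></div>", "html.parser")
div = soup.find("div")

# "minimal" escapes only what is needed for valid markup (such as & and <)
# and leaves other Unicode characters untouched.
minimal = div.decode_contents(formatter="minimal")

# "html" additionally converts characters to named HTML entities where possible.
as_entities = div.decode_contents(formatter="html")

print(minimal)
print(as_entities)
```

With "minimal" the accented letters stay as literal characters, while with "html" they come back as entities such as &eacute;.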

encode_contents returns an encoded bytestring. If you want a Python Unicode string, use decode_contents instead.

decode_contents does the same thing as encode_contents, but returns a Python Unicode string rather than an encoded bytestring.

def decode_contents(self, indent_level=None,
                   eventual_encoding=DEFAULT_OUTPUT_ENCODING,
                   formatter="minimal"):
    """Renders the contents of this tag as a Unicode string.

    :param indent_level: Each line of the rendering will be
       indented this many spaces.

    :param eventual_encoding: The tag is destined to be
       encoded into this encoding. This method is _not_
       responsible for performing that encoding. This information
       is passed in so that it can be substituted in if the
       document contains a <META> tag that mentions the document's
       encoding.

    :param formatter: The output formatter responsible for converting
       entities to Unicode characters.
    """
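
The relationship between the two methods can be sketched as follows. The sample markup is invented and "html.parser" is assumed. The example only exercises the encoding parameter of encode_contents.

```python
from bs4 import BeautifulSoup

# Hypothetical sample containing a non-ASCII character.
soup = BeautifulSoup("<div><p>caf\u00e9</p></div>", "html.parser")
div = soup.find("div")

as_text = div.decode_contents()                     # str
as_utf8 = div.encode_contents()                     # bytes, UTF-8 by default
as_latin1 = div.encode_contents(encoding="latin-1") # bytes in another encoding

# Decoding the bytes with the matching codec gives back the same string.
print(as_utf8.decode("utf-8") == as_text)
print(as_latin1.decode("latin-1") == as_text)
```

In most Python 3 code decode_contents is the more convenient choice, since the result can be printed or searched without a further decode step.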

BeautifulSoup 3

BeautifulSoup 3 doesn't have the functions above. Instead, it has renderContents:

def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
                   prettyPrint=False, indentLevel=0):
    """Renders the contents of this tag as a string in the given
    encoding. If encoding is None, returns a Unicode string.."""

This function was added back to BeautifulSoup 4 (in 4.0.4) for compatibility with BS3.
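
A brief sketch of the compatibility method under BeautifulSoup 4, assuming the installed release still ships renderContents. Newer releases may flag it as deprecated, so the example silences warnings around the call. The sample markup is invented.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample markup.
soup = BeautifulSoup("<div><p>Hello <b>world</b></p></div>", "html.parser")
div = soup.find("div")

with warnings.catch_warnings():
    # The BS3-style name may emit a deprecation warning in newer bs4 releases.
    warnings.simplefilter("ignore")
    legacy = div.renderContents()

# In bs4 the legacy method yields the same bytes as encode_contents().
print(legacy == div.encode_contents())
```

For new code, encode_contents or decode_contents is the safer choice; renderContents is mainly useful when porting existing BS3 scripts.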
