Why is BeautifulSoup modifying my self-closing elements?


Problem description

This is the script I have:

import BeautifulSoup

if __name__ == "__main__":
    data = """
    <root>
        <obj id="3"/>
        <obj id="5"/>
        <obj id="3"/>
    </root>
    """
    soup = BeautifulSoup.BeautifulStoneSoup(data)
    print soup

When run, it prints this:

<root>
  <obj id="3"></obj>
  <obj id="5"></obj>
  <obj id="3"></obj>
</root>

I'd like it to keep the same structure. How can I do that?

Recommended answer

From the Beautiful Soup documentation:

The most common shortcoming of BeautifulStoneSoup is that it doesn't know about self-closing tags. HTML has a fixed set of self-closing tags, but with XML it depends on what the DTD says. You can tell BeautifulStoneSoup that certain tags are self-closing by passing in their names as the selfClosingTags argument to the constructor.
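
Applied to the script from the question, that advice might look like the sketch below. It assumes BeautifulSoup 3 (the `BeautifulSoup` module, which runs on Python 2 only) and that `obj` is the only tag that needs to stay self-closing. The helper name `parse_xml` and the constants `DATA` and `SELF_CLOSING_TAGS` are illustrative names, not part of the library.

```python
# Same document as in the question.
DATA = """
<root>
    <obj id="3"/>
    <obj id="5"/>
    <obj id="3"/>
</root>
"""

# Tags the parser should treat as self-closing (assumption: only <obj>).
SELF_CLOSING_TAGS = ["obj"]


def parse_xml(data, self_closing_tags):
    """Parse XML with BeautifulSoup 3, declaring the self-closing tags."""
    # BeautifulSoup 3 is Python 2 only, so the import is kept local:
    # on a setup without it, the ImportError is raised at call time.
    from BeautifulSoup import BeautifulStoneSoup

    # selfClosingTags is the constructor argument named in the
    # documentation excerpt quoted above.
    return BeautifulStoneSoup(data, selfClosingTags=self_closing_tags)
```

With BeautifulSoup 3 installed, `print(parse_xml(DATA, SELF_CLOSING_TAGS))` should then serialize each `obj` as a single self-closing element rather than as an open/close pair. Any other tag that must stay self-closing would need to be added to the list as well, since the parser has no DTD to consult.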
