python - Why does lxml.etree automatically add a closing </i>?

Views: 89 Published: 2017/9/5 22:46:11


Problem description

Question

I am learning lxml, and my code is as follows:

from lxml import etree
text = '''
<i class="cell maincell">
    <p class="title">
        <a target="_blank" href="https://itjuzi.com/company/60321">
            <span>洋鼹鼠</span>
        </a>
    </p>
    <p>
        <span class="tags t-small c-gray-aset">
            <a href="https://itjuzi.com/investevents?scope=145">电子商务</a>
        </span>
        <span class="loca c-gray-aset t-small">
            <a href="https://itjuzi.com/investevents?prov=天津">天津</a>
        </span>
    </p>
</i>
'''
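# Parse with lxml's HTML parser, which repairs markup it considers invalid
html = etree.HTML(text)
# Serialize the repaired tree back to a string to inspect the result
print(etree.tostring(html, encoding='utf-8').decode('utf-8'))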

The output is:

<html><body><i class="cell maincell">
    </i><p class="title">
        <a target="_blank" href="https://itjuzi.com/company/60321">
            <span>洋鼹鼠</span>
        </a>
    </p>
    <p>
        <span class="tags t-small c-gray-aset">
            <a href="https://itjuzi.com/investevents?scope=145">电子商务</a>
        </span>
        <span class="loca c-gray-aset t-small">
            <a href="https://itjuzi.com/investevents?prov=天津">天津</a>
        </span>
    </p>

</body></html>

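What I mainly don't understand is why the <i> tag ends up broken: it is closed right after it opens, and its original closing tag disappears. How can I solve this? Thanks!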

Solution

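The main reason is the HTML content model of the two elements:

p element
Content categories: Flow content, palpable content.
Permitted content: Phrasing content.
Permitted parents: Any element that accepts flow content.

i element
Content categories: Flow content, phrasing content, palpable content.
Permitted content: Phrasing content.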

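The parent of a p element must accept flow content, but i permits only phrasing content, so a p nested inside an i does not conform to the HTML specification. As the output above shows, the parser repairs the markup by closing the i before the first p and dropping the original closing </i>.
The fix is to replace the i with a div.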
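
The suggested fix can be checked with a minimal sketch like the one below. It assumes a shortened version of the markup from the question, and the helper name parent_tag_of_p is made up for illustration. It parses both variants with etree.HTML and reports which element ends up as the parent of the first p.

```python
from lxml import etree


def parent_tag_of_p(markup):
    """Parse markup with lxml's HTML parser and return the tag name of
    the element that ends up as the parent of the first <p>.
    (Hypothetical helper, for illustration only.)"""
    root = etree.HTML(markup)
    p = root.find('.//p')
    return p.getparent().tag


# Shortened stand-ins for the markup in the question (assumed, not the full snippet)
invalid = '<i class="cell maincell"><p class="title">text</p></i>'
valid = '<div class="cell maincell"><p class="title">text</p></div>'

# Judging by the output in the question, the <p> is moved out of the <i>,
# so its parent should be the enclosing <body> rather than <i>.
print(parent_tag_of_p(invalid))

# A <div> accepts flow content, so the <p> stays nested inside it.
print(parent_tag_of_p(valid))
```

If the source HTML cannot be changed, the same check can at least confirm where the parser placed each element before you write XPath expressions against the tree.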
