How can one replace an element in lxml?

Views: 106 · Published: 2020/5/4 8:39:13 · Tags: python, lxml, elementtree, lxml.html

Problem description

I receive text (data entered by CRM users) from a web service, and it comes back in a "terrifying format". I filter it with Python before using the data, but when I remove the line breaks (br), the text that follows them is removed as well. The code is as follows:

description = '''
<div id="highlight" class="section">
    <p>
        text...............
    </p>
    <br>
    <h1>TITLE</h1>
    <p>Multiple text
        <br>&nbsp;
    </p>
    <ul>
        <li>bad layer....</li>
    </ul>
    <p>
        <br>subTitle
    </p>
    <p>&nbsp;</p>
    <p style="text-align: center;">
        <br>Text1
        <br>Text2
        <br>Text3
        <br>Text4
        <br>Text5
        <br>Text6
    </p>
    <p style="text-align: center;">
        <strong>small title</strong>
        <br>Text small</p>
    <p style="text-align: center;">
        <strong>highlighted text</strong>
        <br>
        <br><strong>Text1</strong>
        <br>Text2
        <br>Text3
        <br>Text4
    </p>
    <p style="text-align: center;">
        <strong>small text</strong>
        <br>Text1
        <br>Text2
    </p>
    <p style="text-align: center;">
        <strong>small text</strong>
        <br>description
    </p>
    <p style="text-align: center;">
        <br>&nbsp;</p>
    <p><strong>description two</strong></p>
    <p>
        <br>&nbsp;</p>
</div>
'''

from lxml import html

# Parse the fragment; the outer <div> becomes the root element
tree = html.fragment_fromstring(description)

for element in tree.xpath('//br'):
    #element.getparent().remove(element)
    print(element.text)  # None for every <br>: it has no inner text
    print(element.getparent().getchildren())
    #print(element)
    #print(element.getparent())
    #print(element.getchildren())
    #print(element.getnext())
    #print('--------------------------------')

I tried removing each br with element.getparent().remove(element), but that also deletes the text. I ran tests to see whether the text belongs to any node, but it does not seem to.
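
For reference, lxml stores the text that follows an element on that element's .tail attribute, which would explain why remove() takes the text with it. The sketch below assumes the tree was parsed with lxml.html, whose elements provide drop_tree(); the one-line snippet and the variable names are illustrative only and are not part of the original data.

```python
from lxml import html

# Illustrative one-paragraph snippet (not the full CRM payload)
snippet = '<p><strong>small title</strong><br>Text small</p>'
p = html.fragment_fromstring(snippet)

br = p.find('br')
# The text after <br> is stored on the <br> itself as .tail,
# so it is neither br.text nor a child node of <p>.
tail_before = br.tail

# drop_tree() removes the element but merges its tail into the
# preceding sibling (or the parent's text) instead of discarding it.
br.drop_tree()

result = html.tostring(p, encoding='unicode')
print(tail_before)
print(result)
```

Dropping the br this way joins the surrounding text directly, so a separator such as a space may need to be added to the tail first if the words should not run together.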

I have thought about replacing each br with an li and turning every p that has a style attribute into a ul, but I can't work out how to do it. The result would look something like this (the earlier part is omitted):

..........
..........
<ul>
    <li>Text1</li>
    <li>Text2</li>
    <li>Text3</li>
    <li>Text4</li>
    <li>Text5</li>
    <li>Text6</li>
</ul>
<ul>
    <li><strong>small title</strong></li>
    <li>Text small</li></ul>
<ul>
    <li><strong>highlighted text</strong></li>
    <li><strong>Text1</strong></li>
    <li>Text2</li>
    <li>Text3</li>
    <li>Text4</li>
</ul>
<ul>
    <li><strong>small text</strong></li>
    <li>Text1</li>
    <li>Text2</li>
</ul>
<ul>
    <li><strong>small text</strong></li>
    <li>description</li>
</ul>
<ul>
    <li>&nbsp;</li></ul>
........

I can't work out how to capture the texts. My idea was that selecting the styled p nodes with XPath, creating li children under a new ul parent, and then removing the p would be enough.

Is this possible? Thanks and regards.
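
One possible way to get the ul/li layout described above is sketched here, under stated assumptions: every p carrying a style attribute is converted, each br starts a new li, and items that end up empty are discarded. The helper name p_to_ul, the WS constant, and the short test snippet are hypothetical and chosen for illustration only.

```python
from lxml import etree, html

WS = ' \t\r\n'  # ASCII whitespace only, so &nbsp; items are kept


def p_to_ul(p):
    """Replace <p> with a <ul> whose items are its <br>-separated chunks."""
    ul = p.makeelement('ul')
    li = etree.SubElement(ul, 'li')
    li.text = (p.text or '').strip(WS) or None
    for child in list(p):  # snapshot, because children are moved below
        if child.tag == 'br':
            # Each <br> starts a new item; its tail holds the item text
            li = etree.SubElement(ul, 'li')
            li.text = (child.tail or '').strip(WS) or None
        else:
            li.append(child)  # moves the child together with its tail
    # Discard items that have neither text nor child elements
    for item in list(ul):
        if len(item) == 0 and not (item.text or '').strip(WS):
            ul.remove(item)
    ul.tail = p.tail  # replace() does not carry the old tail over
    p.getparent().replace(p, ul)
    return ul


snippet = ('<div><p style="text-align: center;"><strong>small title</strong>'
           '<br>Text small</p></div>')
root = html.fragment_fromstring(snippet)
for p in root.xpath('.//p[@style]'):
    p_to_ul(p)

out = html.tostring(root, encoding='unicode')
print(out)
```

Copying p.tail onto the new ul before calling replace() keeps any text or whitespace that followed the original paragraph. Text that follows an inline element such as strong without an intervening br stays in the same li as that element.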
