Python: strip XML tags from a document

Views: 182 · Published: 2021/7/6 19:42:36 · Tags: python, xml, regex

Question

I am trying to strip XML tags from a document using Python, a language I am a novice in. Here is my first attempt using regex, which was really a hope-for-the-best idea.

mfile = file("somefile.xml","w")

for line in mfile:
    re.sub('<./>',"",line) #trying to match elements between < and />

That failed miserably. I would like to know how it should be done with regex.

Secondly, I googled and found: http://code.activestate.com/recipes/440481-strips-xmlhtml-tags-from-string/

That recipe seems to work, but I would like to know whether there is a simpler way to get rid of all XML tags. Maybe using ElementTree?

Recommended answer

Please note that regular expressions are usually not the right tool for this. See Jeremiah's answer.

Try this:

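import re

# Read the whole file and remove every "<...>" run that contains no nested "<".
with open("/path/to/file") as f:
    text = re.sub(r'<[^<]+>', "", f.read())
# Overwrite the same file with the tag-free text.
with open("/path/to/file", "w") as f:
    f.write(text)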
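
Since the question also asks about ElementTree, here is a minimal sketch of that route, offered as an illustration and not as part of the original answer. It assumes the input is well-formed XML that fits in memory. The helper name strip_tags and the sample string are hypothetical. Unlike the regex, the parser also decodes entities such as &amp;, and it raises an error on malformed input instead of silently producing odd output.

```python
import xml.etree.ElementTree as ET


def strip_tags(xml_text):
    """Return the text content of a well-formed XML string, without tags."""
    root = ET.fromstring(xml_text)
    # itertext() walks the tree in document order and yields each piece of text.
    return "".join(root.itertext())


# Hypothetical sample document, used only for illustration.
sample = "<doc><title>Hello</title> <body>wor<b>ld</b>!</body></doc>"
print(strip_tags(sample))
```

To process a file instead of a string, ET.parse("/path/to/file").getroot() could replace the ET.fromstring call, with the same caveat that the file must be well-formed XML.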
