Escaping characters in an XML file with Python

Question

I need to escape special characters in an ugly XML file (5000 lines or so long). Here's an example of XML I have to deal with:

<root>
 <element>
  <name>name & surname</name>
  <mail>name@name.org</mail>
 </element>
</root>

The problem here is the "&" character in the name. How would you escape special characters like this with a Python library? I couldn't find a way to do it with BeautifulSoup.

Recommended answer

If you don't care about the invalid characters in the XML, you could use the recover option of lxml's XML parser (see Parsing broken XML with lxml.etree.iterparse). Note that this drops the stray "&" rather than escaping it:

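from lxml import etree

# broken_xml holds the malformed document from the question, as a string.
parser = etree.XMLParser(recover=True)  # recover from bad characters
root = etree.fromstring(broken_xml, parser=parser)
print(etree.tostring(root, encoding="unicode"))  # Python 3: print as text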

Output

<root>
<element>
<name>name  surname</name>
<mail>name@name.org</mail>
</element>
</root>
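
If the "&" should be kept rather than dropped, one possible approach (not part of the original answer) is to escape bare ampersands before parsing. The sketch below is a minimal illustration that uses only the standard library. The helper name escape_bare_ampersands and the regular expression are assumptions made for this example. It handles only stray "&" characters, not a stray "<", and it would also rewrite ampersands inside CDATA sections.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does not start a named or numeric entity reference
# such as "&amp;", "&#38;" or "&#x26;". (Illustrative pattern, an assumption.)
BARE_AMPERSAND = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text):
    """Replace stray "&" with "&amp;", leaving existing entities untouched."""
    return BARE_AMPERSAND.sub("&amp;", xml_text)


# The malformed document from the question.
broken_xml = """<root>
 <element>
  <name>name & surname</name>
  <mail>name@name.org</mail>
 </element>
</root>"""

fixed_xml = escape_bare_ampersands(broken_xml)

# The repaired text is now well-formed, so a strict parser accepts it.
root = ET.fromstring(fixed_xml)
print(root.find("element/name").text)  # prints: name & surname
```

Named entities other than the five predefined by XML (for example "&nbsp;") are left untouched by this pattern and would still need a DTD or separate handling to parse.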
