Is there a way to recover iterparse on invalid Char values?

Views: 89 Published: 2020/5/4 8:26:44 Tags: python lxml

Problem description

I'm using lxml's iterparse to parse some large XML files (3-5 GB). Some of these files contain invalid characters, so an lxml.etree.XMLSyntaxError is thrown.

When using lxml.etree.parse, I can supply a parser that recovers from invalid characters:

import lxml.etree
# recover=True makes the parser try to continue past invalid characters
parser = lxml.etree.XMLParser(recover=True)
root = lxml.etree.parse(open("myMalformed.xml", "rb"), parser)

Is there a way to get the same behavior with iterparse?

Edit: Encoding is not the issue here. These XML files contain invalid characters, which can be sanitized by defining an XMLParser with recover=True. Since I need to use iterparse, I can't pass a custom parser. So I'm looking for the functionality of the snippet above in this call:

context = lxml.etree.iterparse(open("myMalformed.xml", "rb"), events=('end',), tag="Foo")  # <-- can't recover

Recommended answer

When you say invalid characters, do you mean Unicode characters? If so, you can try:

lxml.etree.XMLParser(encoding='UTF-8', recover=True)

If you mean malformed XML, then this obviously won't work. If you can post your traceback, we can see the nature of the XMLSyntaxError, which will provide more information.
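If the problem really is stray control characters rather than broken markup, one possible workaround is to sanitize the byte stream before it reaches the incremental parser. The sketch below is an assumption-laden illustration, not lxml's recover mechanism: `SanitizingReader` and `iter_foo_texts` are hypothetical names, it assumes a UTF-8 or other ASCII-compatible encoding, and it only drops the C0 control bytes that XML 1.0 forbids. It uses the standard library's `xml.etree.ElementTree.iterparse` so that it runs without lxml; lxml's iterparse also accepts file-like objects, so the same wrapper could presumably be passed to it.

```python
import io
import xml.etree.ElementTree as ET

# Bytes that are never legal in XML 1.0: C0 controls except tab, LF and CR.
_INVALID_BYTES = bytes(b for b in range(0x20) if b not in (0x09, 0x0A, 0x0D))


class SanitizingReader:
    """Hypothetical helper: wraps a binary file object and drops invalid control bytes."""

    def __init__(self, raw):
        self._raw = raw

    def read(self, size=-1):
        while True:
            chunk = self._raw.read(size)
            if not chunk:
                return chunk  # real end of file
            cleaned = chunk.translate(None, _INVALID_BYTES)
            if cleaned:
                return cleaned
            # The chunk held only invalid bytes; read on so the parser
            # does not mistake an empty result for end of file.


def iter_foo_texts(raw):
    """Yield the text of each <Foo> element, freeing elements as we go."""
    for _event, elem in ET.iterparse(SanitizingReader(raw), events=("end",)):
        if elem.tag == "Foo":
            yield elem.text
            elem.clear()


# Stand-in for the large file: \x01 is an invalid XML character.
data = b"<root><Foo>a\x01b</Foo><Foo>ok</Foo></root>"
texts = list(iter_foo_texts(io.BytesIO(data)))
print(texts)  # ['ab', 'ok']
```

Filtering at the byte level keeps memory use flat for multi-gigabyte files, but it silently deletes data, so it is only appropriate when losing those characters is acceptable.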
