lxml unicode entity parse problems

Views: 65 · Published: 2020/5/4 8:26:26 · Tags: python xml unicode lxml

I'm using lxml as follows to parse an XML file exported from another system:

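from lxml import etree

xmldoc = open(filename, "rb")  # binary mode, so the parser applies the encoding declared in the file
etree.parse(xmldoc)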

But I get:

lxml.etree.XMLSyntaxError: Entity 'eacute' not defined, line 4495, column 46
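For context, XML itself predefines only five named entities (&amp;lt;, &amp;gt;, &amp;amp;, &amp;apos;, &amp;quot;), so a name like eacute has to come from a DTD. A minimal sketch that should reproduce the same class of error on a tiny document; the `snippet` content is a hypothetical stand-in for the exported file, not taken from it:

```python
from lxml import etree

# Hypothetical stand-in for the exported file: it uses &eacute; but declares no DTD.
snippet = b"<doc>caf&eacute;</doc>"

try:
    etree.fromstring(snippet)
    error_message = None
except etree.XMLSyntaxError as exc:
    # The parser has no definition for the entity, so parsing fails.
    error_message = str(exc)

print(error_message)
```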

Obviously it's having problems with Unicode entity names, but how would I get around this? Via open() or parse()?

Edit: I had forgotten to include my DTD in the same folder. It's there now and has the following declaration:

<!ENTITY eacute "&#233;">

and it is referenced (and always was) in xmldoc like so:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE DScribeDatabase SYSTEM "foo.dtd">

Yet I still get the same problem... Does the DTD need to be declared in Python too?
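One likely explanation, offered as an assumption rather than a confirmed diagnosis: lxml's default parser does not load an external DTD, so the declaration in foo.dtd is never read. The sketch below rebuilds the setup in a temporary directory and parses with `etree.XMLParser(load_dtd=True)`. The file name `export.xml` and the one-line document body are hypothetical; only foo.dtd, the entity declaration, and the two header lines come from the post.

```python
import os
import tempfile
from lxml import etree

workdir = tempfile.mkdtemp()

# DTD containing the declaration quoted in the post.
with open(os.path.join(workdir, "foo.dtd"), "w", encoding="ascii") as f:
    f.write('<!ENTITY eacute "&#233;">\n')

# Hypothetical minimal document with the same prolog as the exported file.
xml_path = os.path.join(workdir, "export.xml")
with open(xml_path, "w", encoding="iso-8859-1") as f:
    f.write('<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
            '<!DOCTYPE DScribeDatabase SYSTEM "foo.dtd">\n'
            '<DScribeDatabase>caf&eacute;</DScribeDatabase>\n')

# load_dtd=True asks the parser to read the external DTD named in the DOCTYPE.
parser = etree.XMLParser(load_dtd=True)

# Passing the path (not an anonymous stream) lets the relative
# "foo.dtd" reference resolve next to the XML file.
tree = etree.parse(xml_path, parser)
text = tree.getroot().text
print(repr(text))
```

Loading DTDs only makes sense for files from a trusted source, since it lets the document pull in other local files during parsing.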