lxml encoding error when parsing UTF-8 XML

Views: 87. Published: 2021/5/4 19:19:24. Tags: xml, encoding, utf-8, lxml

I'm trying to iterate through an XML file (UTF-8 encoded, starting with an XML declaration) with lxml, but I get the following error on the character 丂:

UnicodeEncodeError: 'cp932' codec can't encode character u'\u4e02' in position 0: illegal multibyte sequence

Other characters before this are printed out correctly. The code is:

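from lxml import etree

# Parse the file with an explicit UTF-8 parser
parser = etree.XMLParser(encoding='utf-8')
tree = etree.parse("filename.xml", parser)
root = tree.getroot()
for elem in root:
    # Print the text of each element's first child (Python 2 print statement)
    print elem[0].text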

Does the error mean that the file was parsed as Shift JIS rather than UTF-8?
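
One way to narrow this down is to separate the decoding step (parsing) from the encoding step (printing). The error is a UnicodeEncodeError, which is raised when text is converted to bytes for output, not when bytes are read in. The sketch below is only an illustration under a few assumptions: it is written for Python 3 rather than the Python 2 used in the question, it uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, and the sample document and the helper can_encode are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the bytes \xe4\xb8\x82 are U+4E02 in UTF-8.
xml_bytes = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b"<root><item><name>\xe4\xb8\x82</name></item></root>"
)

# Parsing (decoding) step: same loop shape as in the question.
root = ET.fromstring(xml_bytes)
texts = [elem[0].text for elem in root]


def can_encode(text, codec):
    """Return True if every character of text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Output (encoding) step: check which codecs can represent the parsed text.
parsed_ok = texts[0] == "\u4e02"
utf8_ok = can_encode(texts[0], "utf-8")
cp932_ok = can_encode(texts[0], "cp932")

# A lossy fallback that stays printable on a cp932 console.
safe_output = texts[0].encode("cp932", errors="backslashreplace").decode("cp932")

print(parsed_ok, utf8_ok, cp932_ok)
print(safe_output)
```

If the parsed text equals U+4E02 and only the cp932 check fails, the file was decoded as UTF-8 and the problem lies in the codec used for console output, which would be consistent with the message quoted in the question.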