XML Unicode strings with encoding declaration are not supported


Problem description

Trying to do the following...

from lxml import etree
from lxml.etree import fromstring

if request.POST:
    parser = etree.XMLParser(ns_clean=True, recover=True)
    h = fromstring(request.POST['xml'], parser=parser)
    return HttpResponse(h.cssselect('itagg_delivery_receipt status').text_content())

but it gives this error:

[Fri Apr 05 10:27:54 2013] [error] Internal Server Error: /sms/status_postback/
[Fri Apr 05 10:27:54 2013] [error] Traceback (most recent call last):
[Fri Apr 05 10:27:54 2013] [error]   File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 115, in get_response
[Fri Apr 05 10:27:54 2013] [error]     response = callback(request, *callback_args, **callback_kwargs)
[Fri Apr 05 10:27:54 2013] [error]   File "/usr/local/lib/python2.7/dist-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
[Fri Apr 05 10:27:54 2013] [error]     return view_func(*args, **kwargs)
[Fri Apr 05 10:27:54 2013] [error]   File "/srv/project/livewireSMS/sms/views.py", line 42, in update_delivery_status
[Fri Apr 05 10:27:54 2013] [error]     h = fromstring(request.POST['xml'], parser=parser)
[Fri Apr 05 10:27:54 2013] [error]   File "lxml.etree.pyx", line 2754, in lxml.etree.fromstring (src/lxml/lxml.etree.c:54631)
[Fri Apr 05 10:27:54 2013] [error]   File "parser.pxi", line 1569, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:82659)
[Fri Apr 05 10:27:54 2013] [error] ValueError: Unicode strings with encoding declaration are not supported.

This is the XML:

 <?xml version="1.1" encoding="ISO-8859-1"?>
<itagg_delivery_receipt>
<version>1.0</version>
<msisdn>447889000000</msisdn>
<submission_ref>
845tgrgsehg394g3hdfhhh56445y7ts6</
submission_ref>
<status>Delivered</status>
<reason>4</reason>
<timestamp>20050709120945</timestamp>
<retry>0</retry>
</itagg_delivery_receipt> 

I don't have control over the XML document; it comes from the SMS company.

Solution

You'll have to encode it and then force the same encoding in the parser:

from lxml import etree
from lxml.etree import fromstring

if request.POST:
    xml = request.POST['xml'].encode('utf-8')
    parser = etree.XMLParser(ns_clean=True, recover=True, encoding='utf-8')
    h = fromstring(xml, parser=parser)

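    # cssselect() returns a list of matching elements, so take the first one;
    # plain etree elements expose their text via .text
    return HttpResponse(h.cssselect('itagg_delivery_receipt status')[0].text)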
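
To try the same idea outside of Django, here is a minimal sketch. It assumes a shortened, hypothetical copy of the receipt (`xml_text`, with the version set to 1.0) standing in for `request.POST['xml']`, and a helper named `parse_receipt` that is not part of lxml. It only illustrates the encode-then-force-encoding step described above.

```python
from lxml import etree

# Hypothetical stand-in for request.POST['xml']: a text (unicode) string
# that still carries an encoding declaration.
xml_text = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<itagg_delivery_receipt>'
    '<status>Delivered</status>'
    '</itagg_delivery_receipt>'
)


def parse_receipt(text):
    """Encode the text to bytes, then parse with a parser forced to that encoding."""
    data = text.encode('utf-8')
    parser = etree.XMLParser(ns_clean=True, recover=True, encoding='utf-8')
    return etree.fromstring(data, parser=parser)


# Passing the text string directly is what triggers the ValueError in the question.
try:
    etree.fromstring(xml_text)
except ValueError as exc:
    print('direct parse failed:', exc)

root = parse_receipt(xml_text)
status = root.findtext('status')  # text of the direct <status> child
print(status)
```

The parser's `encoding` argument overrides the declaration inside the document, which is why the bytes passed in must really be encoded the same way.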
