XML Unicode strings with encoding declaration are not supported
Problem description
I'm trying to do the following:
from lxml import etree
from lxml.etree import fromstring

if request.POST:
    parser = etree.XMLParser(ns_clean=True, recover=True)
    h = fromstring(request.POST['xml'], parser=parser)
    return HttpResponse(h.cssselect('itagg_delivery_receipt status').text_content())
but it gives this error:
[Fri Apr 05 10:27:54 2013] [error] Internal Server Error: /sms/status_postback/
[Fri Apr 05 10:27:54 2013] [error] Traceback (most recent call last):
[Fri Apr 05 10:27:54 2013] [error] File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 115, in get_response
[Fri Apr 05 10:27:54 2013] [error] response = callback(request, *callback_args, **callback_kwargs)
[Fri Apr 05 10:27:54 2013] [error] File "/usr/local/lib/python2.7/dist-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
[Fri Apr 05 10:27:54 2013] [error] return view_func(*args, **kwargs)
[Fri Apr 05 10:27:54 2013] [error] File "/srv/project/livewireSMS/sms/views.py", line 42, in update_delivery_status
[Fri Apr 05 10:27:54 2013] [error] h = fromstring(request.POST['xml'], parser=parser)
[Fri Apr 05 10:27:54 2013] [error] File "lxml.etree.pyx", line 2754, in lxml.etree.fromstring (src/lxml/lxml.etree.c:54631)
[Fri Apr 05 10:27:54 2013] [error] File "parser.pxi", line 1569, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:82659)
[Fri Apr 05 10:27:54 2013] [error] ValueError: Unicode strings with encoding declaration are not supported.
This is the XML:
<?xml version="1.1" encoding="ISO-8859-1"?>
<itagg_delivery_receipt>
<version>1.0</version>
<msisdn>447889000000</msisdn>
<submission_ref>
845tgrgsehg394g3hdfhhh56445y7ts6</
submission_ref>
<status>Delivered</status>
<reason>4</reason>
<timestamp>20050709120945</timestamp>
<retry>0</retry>
</itagg_delivery_receipt>
I don't have control over the XML document; it comes from the SMS company.
Solution
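lxml refuses a Unicode string that carries an encoding declaration, because the declaration describes a byte encoding that no longer applies once the text has been decoded. You'll have to encode it to bytes and then force the same encoding in the parser: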
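from lxml import etree
from lxml.etree import fromstring

if request.POST:
    # Encode the Unicode string to bytes so lxml accepts the declaration
    xml = request.POST['xml'].encode('utf-8')
    # Override the declared ISO-8859-1 with the encoding actually used
    parser = etree.XMLParser(ns_clean=True, recover=True, encoding='utf-8')
    h = fromstring(xml, parser=parser)
    # cssselect() returns a list, and etree elements expose .text, not text_content()
    return HttpResponse(h.cssselect('itagg_delivery_receipt status')[0].text)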
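For anyone who wants to try this outside Django, here is a minimal standalone sketch of the same idea. The sample payload is a shortened stand-in for the receipt in the question (with `version="1.0"` to keep it simple), and `parse_receipt` is a hypothetical helper name, not part of lxml. It assumes Python 3, where the posted value would be a `str`.

```python
from lxml import etree

# Shortened sample modelled on the receipt above (assumption: arrives as str).
xml_text = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<itagg_delivery_receipt>'
    '<msisdn>447889000000</msisdn>'
    '<status>Delivered</status>'
    '</itagg_delivery_receipt>'
)


def parse_receipt(text):
    """Hypothetical helper: encode the text, then force that encoding in the parser."""
    parser = etree.XMLParser(ns_clean=True, recover=True, encoding='utf-8')
    return etree.fromstring(text.encode('utf-8'), parser=parser)


# Passing the str directly is what triggers the ValueError in the question.
try:
    etree.fromstring(xml_text)
    raised = False
except ValueError:
    raised = True

root = parse_receipt(xml_text)
status = root.findtext('status')  # text of the first <status> child
print(raised, status)
```

The parser's `encoding` argument takes precedence over the declaration in the document, which is why the mismatch with `ISO-8859-1` does not matter once the bytes really are UTF-8.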