POST binary data using httplib causes Unicode exceptions
When I try to send an image with urllib2, a UnicodeDecodeError exception is raised.
HTTP POST body:
f = open(imagepath, "rb")
binary = f.read()
mimetype, devnull = mimetypes.guess_type(urllib.pathname2url(imagepath))
body = """Content-Length: {size}
Content-Type: {mimetype}
{binary}
""".format(size=os.path.getsize(imagepath),
mimetype=mimetype,
binary=binary)
request = urllib2.Request(url, body, headers)
opener = urllib2.build_opener(urllib2.HTTPSHandler(debuglevel=1))
response = opener.open(request)
print response.read()
Traceback:
response = opener.open(request)
File "/usr/local/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/usr/local/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/usr/local/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/usr/local/lib/python2.7/urllib2.py", line 1222, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/local/lib/python2.7/urllib2.py", line 1181, in do_open
h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/usr/local/lib/python2.7/httplib.py", line 973, in request
self._send_request(method, url, body, headers)
File "/usr/local/lib/python2.7/httplib.py", line 1007, in _send_request
self.endheaders(body)
File "/usr/local/lib/python2.7/httplib.py", line 969, in endheaders
self._send_output(message_body)
File "/usr/local/lib/python2.7/httplib.py", line 827, in _send_output
msg += message_body
File "/home/usertmp/biogeek/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 49: invalid start byte
Python version 2.7.5
Does anyone know a solution to this?
You're trying to send a body containing headers and content. If you want to send content type and content length, you need to do it in the headers, not in the body:
headers = {'Content-Type': mimetype, 'Content-Length': str(size)}
request = urllib2.Request(url, data=binary, headers=headers)
If you don't set the Content-Length header, it will be automatically set to the size of data.
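To make the fix concrete, here is a minimal sketch that sends only the raw file bytes as the body and puts the metadata in the headers. The helper name build_image_request is hypothetical, the fallback type application/octet-stream is an assumption, and the try/except import is only there so the sketch also runs on Python 3, where urllib2 became urllib.request.

```python
import mimetypes

try:
    import urllib2 as request_lib          # Python 2.7, as in the question
except ImportError:
    import urllib.request as request_lib   # Python 3 equivalent


def build_image_request(url, imagepath):
    """Return a Request whose body is only the raw file bytes (hypothetical helper)."""
    with open(imagepath, "rb") as f:
        binary = f.read()

    # guess_type returns (type, encoding); fall back to a generic type (assumption).
    mimetype, _ = mimetypes.guess_type(imagepath)
    headers = {
        "Content-Type": mimetype or "application/octet-stream",
        "Content-Length": str(len(binary)),
    }
    # The headers travel separately; nothing is formatted into the body.
    return request_lib.Request(url, data=binary, headers=headers)
```

The returned object can then be passed to opener.open(...) exactly as in the question.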
As to your error: it's happening on the line
msg += message_body
This error can only happen if one of these two strings is unicode and the other is a str containing \xff, because in that case the latter is automatically coerced to unicode using sys.getdefaultencoding().
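The coercion can be reproduced in isolation. The sketch below spells out the decode step that Python 2 performs implicitly when a unicode string and a str are concatenated. The function name coerce_like_python2 and the sample header text are hypothetical, and the body is just the first four bytes of a typical JPEG file.

```python
import sys


def coerce_like_python2(raw_bytes):
    """Decode bytes with the default encoding, as Python 2 does implicitly
    when evaluating u'...' + '...' (hypothetical helper for illustration)."""
    return raw_bytes.decode(sys.getdefaultencoding())


headers_text = u"POST /upload HTTP/1.1\r\n\r\n"  # stands in for msg (made-up value)
body = b"\xff\xd8\xff\xe0"                       # stands in for message_body

try:
    msg = headers_text + coerce_like_python2(body)
    failed = False
except UnicodeDecodeError as exc:
    # Neither 'ascii' nor 'utf-8' can decode the byte 0xff.
    failed = True
    error = exc
```

If both operands are byte strings, no decode step takes place and the concatenation succeeds.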
My final guess would be: message_body here is your data, which is a str and contains \xff somewhere. msg is what has been passed to the HTTPConnection earlier, namely the headers, and they are unicode because you either used unicode for at least one key in your headers (the values are converted to str earlier), or you have imported unicode_literals from __future__.
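One way to test that guess is to look for text-type keys in the headers dict before building the request. This is a small diagnostic sketch. The helper name text_keys is hypothetical, and it is only meaningful on Python 2, since on Python 3 every ordinary str key is text.

```python
def text_keys(headers):
    """Return the header keys that are unicode text rather than byte strings
    (hypothetical diagnostic helper)."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return [key for key in headers if isinstance(key, text_type)]


# Made-up headers: one unicode key, one byte-string key.
suspect = text_keys({u"Content-Type": "image/jpeg", b"X-Raw": "1"})
```

On Python 2, any key reported here could be converted with str(key) before the dict is passed to urllib2.Request.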