Create and parse multipart HTTP requests in Python


Question

I'm trying to write some Python code that can create multipart MIME HTTP requests on the client and then interpret them appropriately on the server. I have, I think, partially succeeded on the client end with this:

from email.mime.multipart import MIMEMultipart, MIMEBase
import httplib
h1 = httplib.HTTPConnection('localhost:8080')
msg = MIMEMultipart()
fp = open('myfile.zip', 'rb')
base = MIMEBase("application", "octet-stream")
base.set_payload(fp.read())
msg.attach(base)
h1.request("POST", "http://localhost:8080/server", msg.as_string())

The only problem with this is that the email library also writes the Content-Type and MIME-Version headers into its output, and I'm not sure how they relate to the HTTP headers that httplib adds:

Content-Type: multipart/mixed; boundary="===============2050792481=="
MIME-Version: 1.0

--===============2050792481==
Content-Type: application/octet-stream
MIME-Version: 1.0

This may be the reason that, when this request is received by my web.py application, I just get an error message. The web.py POST handler:

class MultipartServer:
    def POST(self, collection):
        print web.input()

raises this error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 242, in process
    return self.handle()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 233, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 415, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 390, in handle_class
    return tocall(*args)
  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/webapi.py", line 279, in input
    return storify(out, *requireds, **defaults)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 150, in storify
    value = getvalue(value)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 139, in getvalue
    return unicodify(x)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 130, in unicodify
    if _unicode and isinstance(s, str): return safeunicode(s)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 326, in safeunicode
    return obj.decode(encoding)
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 137-138: invalid data

My line of code is the entry about halfway down the traceback:

  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()

It's coming along, but I'm not sure where to go from here. Is this a problem with my client code, or a limitation of web.py (perhaps it just can't support multipart requests)? Any hints or suggestions of alternative code libraries would be gratefully received.

Edit

The error above was caused by the data not being automatically base64 encoded. Adding

encoders.encode_base64(base)

gets rid of this error, and now the problem is clear. The HTTP request isn't being interpreted correctly on the server, presumably because the email library is putting what should be the HTTP headers into the body instead:

<Storage {'Content-Type: multipart/mixed': u'', 
          ' boundary': u'"===============1342637378=="\n'
          'MIME-Version: 1.0\n\n--===============1342637378==\n'
          'Content-Type: application/octet-stream\n'
          'MIME-Version: 1.0\n' 
          'Content-Transfer-Encoding: base64\n'
          '\n0fINCs PBk1jAAAAAAAAA.... etc

So something isn't right there.

Thanks

Richard

Recommended answer

After a bit of exploration, the answer to this question has become clear. The short answer is that although the Content-Disposition header is optional in a MIME-encoded message, web.py requires it on each MIME part in order to parse the HTTP request correctly.
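To make the role of that header concrete, here is a minimal Python 3 sketch of reading a multipart body on the receiving side with the standard library's email parser. It is not web.py's own parsing code; the function name `parse_multipart`, the dictionary keys, and the sample body are invented for illustration. It shows that the part's name and filename come from Content-Disposition, so a part without that header has nothing to be keyed on.

```python
from email.parser import BytesParser


def parse_multipart(content_type, body):
    """Split a multipart body into parts (hypothetical helper, not web.py code).

    content_type is the value of the HTTP Content-Type header, boundary included;
    body is the raw request body as bytes.
    """
    # The email parser expects headers first, so put the HTTP Content-Type back
    # in front of the body before parsing.
    prefix = ("Content-Type: %s\r\nMIME-Version: 1.0\r\n\r\n" % content_type).encode("ascii")
    msg = BytesParser().parsebytes(prefix + body)
    if not msg.is_multipart():
        raise ValueError("not a multipart body")
    parts = []
    for part in msg.get_payload():
        parts.append({
            # Both values are None when the part has no Content-Disposition header.
            "name": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # decode=True undoes the base64 Content-Transfer-Encoding.
            "data": part.get_payload(decode=True),
        })
    return parts


# Usage with a hand-written sample body (made up for this example).
sample_body = (
    b"--demo-boundary\r\n"
    b"Content-Type: application/zip\r\n"
    b'Content-Disposition: file; name="package"; filename="file.zip"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"aGVsbG8=\r\n"
    b"--demo-boundary--\r\n"
)
parts = parse_multipart('multipart/related; boundary="demo-boundary"', sample_body)
```

A handler that receives the raw body could use something like this instead of a form parser, at the cost of holding the whole request in memory.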

Contrary to other comments on this question, the difference between HTTP and email is irrelevant here, as they are simply transport mechanisms for the MIME message and nothing more. Multipart/related (not multipart/form-data) messages are common in content-exchanging web services, which is the use case here. The code snippets provided are accurate, though, and led me to a slightly briefer solution to the problem.

import httplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart

# open an HTTP connection
h1 = httplib.HTTPConnection('localhost:8080')

# create a mime multipart message of type multipart/related
msg = MIMEMultipart("related")

# create a mime-part containing a zip file, with a Content-Disposition header
# on the section
base = MIMEBase("application", "zip")
base['Content-Disposition'] = 'file; name="package"; filename="file.zip"'
with open('file.zip', 'rb') as fp:
    base.set_payload(fp.read())
encoders.encode_base64(base)
msg.attach(base)

# Here's a rubbish bit: chomp through the header rows until hitting a newline
# on its own, reading each row on the way as an HTTP header, then collect the
# rest of the message into a new variable
header_mode = True
headers = {}
body = []
for line in msg.as_string().splitlines(True):
    if header_mode and line == "\n":
        header_mode = False
    if header_mode:
        (key, value) = line.split(":", 1)
        headers[key.strip()] = value.strip()
    else:
        body.append(line)
body = "".join(body)

# do the request, with the separated headers and body
h1.request("POST", "http://localhost:8080/server", body, headers)

This is picked up perfectly well by web.py, so it's clear that email.mime.multipart is suitable for creating MIME messages to be transported over HTTP, with the exception of its header handling.
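For readers on Python 3, where httplib became http.client, the same header/body split could be sketched without the line-by-line loop. This is only an illustration under assumptions, not the code the answer was tested with: the function name `build_related_request` and its parameters are made up here, and it has not been run against web.py.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.parser import BytesHeaderParser


def build_related_request(payload, filename="file.zip", field="package"):
    """Return (headers, body) for a multipart/related POST carrying one file.

    Hypothetical helper: payload is the file content as bytes.
    """
    msg = MIMEMultipart("related")
    part = MIMEBase("application", "zip")
    # Produces: Content-Disposition: file; name="..."; filename="..."
    part.add_header("Content-Disposition", "file", name=field, filename=filename)
    part.set_payload(payload)
    encoders.encode_base64(part)
    msg.attach(part)

    # Serialise once, then cut at the first blank line: everything before it
    # is the top-level MIME header block, everything after it is the body.
    raw = msg.as_bytes()
    head, _, body = raw.partition(b"\n\n")

    # Let the email parser read the header block so folded lines are handled,
    # then collapse any folding whitespace for use as HTTP header values.
    parsed = BytesHeaderParser().parsebytes(head)
    headers = {key: " ".join(value.split()) for key, value in parsed.items()}
    return headers, body


# Usage: the payload here is just placeholder bytes, not a real archive.
headers, body = build_related_request(b"PK\x05\x06" + b"\x00" * 18)
# The pair could then be sent with http.client, for example:
#   conn = http.client.HTTPConnection("localhost:8080")
#   conn.request("POST", "/server", body, headers)
```

Cutting the serialised bytes, rather than reading the headers off the message object before serialising, matters because the boundary parameter is only generated when the message is flattened.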

My other overall concern is scalability. Neither this solution nor the others proposed here scale well, as they read the entire contents of the file into a variable before bundling it up in the MIME message. A better solution would be one that serialises on demand as the content is piped out over the HTTP connection. It's not urgent for me to fix that, but I'll come back here with a solution if I get to it.
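One possible shape for such an on-demand serialiser, sketched in Python 3 and not taken from the original answer: a generator that writes the part headers, then base64-encodes the file one chunk at a time. The helper name `stream_related_request` and the chunk size are assumptions, the multipart framing is written by hand rather than by the email package, and it assumes `fileobj.read(n)` returns `n` bytes until end of file (true for ordinary files opened in binary mode).

```python
import base64
import io
import uuid
from email.parser import BytesParser

# 57 raw bytes encode to one 76-character base64 line, and 57 is a multiple
# of 3, so full chunks encode without padding in the middle of the stream.
CHUNK_SIZE = 57 * 1024


def stream_related_request(fileobj, filename, chunk_size=CHUNK_SIZE):
    """Return (headers, body_iter) for a one-part multipart/related POST.

    Hypothetical helper: body_iter yields bytes, reading fileobj lazily.
    """
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    boundary = uuid.uuid4().hex
    headers = {
        "Content-Type": 'multipart/related; boundary="%s"' % boundary,
        "MIME-Version": "1.0",
    }

    def body_iter():
        yield (
            "--%s\r\n"
            "Content-Type: application/zip\r\n"
            'Content-Disposition: file; name="package"; filename="%s"\r\n'
            "Content-Transfer-Encoding: base64\r\n"
            "\r\n" % (boundary, filename)
        ).encode("ascii")
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            # encodebytes emits "\n" line ends; switch them to CRLF.
            yield base64.encodebytes(chunk).replace(b"\n", b"\r\n")
        yield ("--%s--\r\n" % boundary).encode("ascii")

    return headers, body_iter()


# Usage: stream an in-memory "file" and check that it parses back intact.
sample = bytes(range(256)) * 300
headers, chunks = stream_related_request(io.BytesIO(sample), "file.zip", chunk_size=5700)
wire = b"".join(chunks)
parsed = BytesParser().parsebytes(
    b"Content-Type: " + headers["Content-Type"].encode("ascii") + b"\r\n\r\n" + wire
)
roundtrip = parsed.get_payload()[0].get_payload(decode=True)
```

In Python 3, `http.client.HTTPConnection.request` accepts an iterable of bytes as the body, so `chunks` could be passed to it directly instead of being joined as in the check above. Whether a given server, web.py included, accepts a body sent that way is not something this sketch verifies.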
