How to decode the filename parameter of Content-Disposition header in HTTP?

Views: 335 | Published: 2018/7/10 14:51:55 | Tags: python python-3.x http-headers

Problem description

This question provides background on the filename parameter.

I need to write a script to access some files on a web server. The filenames contain CJK characters, which cannot be encoded in ASCII.

$ curl -I 'http://bj.baidupcs.com/file/f6f258963f3c5daaa154ed441db232e1?xcode=f5a142e99df965f6a3b4c502a3c55a73283ef282da2f5c14&fid=1107408242-250528-2625488475&time=1373046574&sign=FDTAXER-DCb740ccc5511e5e8fedcff06b081203-QSIMrWw%2FICWQuExpdtyijM0vbMM%3D&to=bb&fm=N,Q,U&expires=8h&rt=sh&r=210487178&logid=3893215518&sh=1'
......
Content-Disposition: attachment;filename="【动漫之家汉化组】[最强会长黑神][第192话][黑神目泷依然健在][END].zip"
......

As you can see, cURL decodes the filename properly. Firefox can also figure out the correct filename.

I wrote my script in Python. I tried requests first:

>>> import requests
>>> r=requests.head('http://bj.baidupcs.com/file/f6f258963f3c5daaa154ed441db232e1?xcode=f5a142e99df965f6a3b4c502a3c55a73283ef282da2f5c14&fid=1107408242-250528-2625488475&time=1373046574&sign=FDTAXER-DCb740ccc5511e5e8fedcff06b081203-QSIMrWw%2FICWQuExpdtyijM0vbMM%3D&to=bb&fm=N,Q,U&expires=8h&rt=sh&r=210487178&logid=3893215518&sh=1')
>>> r.headers['content-disposition']
'attachment;filename="ã\x80\x90å\x8a¨æ¼«ä¹\x8bå®¶æ±\x89å\x8c\x96ç»\x84ã\x80\x91[æ\x9c\x80å¼ºä¼\x9aé\x95¿é»\x91ç¥\x9e][ç¬¬192è¯\x9d][é»\x91ç¥\x9eç\x9b®æ³·ä¾\x9dç\x84¶å\x81¥å\x9c¨][END].zip"'

The filename looks like a weird representation of Python bytes. The problem is that the whole value is already a Python string, and I can't think of a way to get at the actual bytes in order to decode them.

>>> type(r.headers['content-disposition'])
<class 'str'>

Under the hood, requests relies (via urllib3) on the http.client standard library. I tried http.client directly but got the same thing:

>>> import http.client
>>> conn = http.client.HTTPConnection("bj.baidupcs.com")
>>> conn.request('HEAD', '/file/f6f258963f3c5daaa154ed441db232e1?xcode=f5a142e99df965f6a3b4c502a3c55a73283ef282da2f5c14&fid=1107408242-250528-2625488475&time=1373046574&sign=FDTAXER-DCb740ccc5511e5e8fedcff06b081203-QSIMrWw%2FICWQuExpdtyijM0vbMM%3D&to=bb&fm=N,Q,U&expires=8h&rt=sh&r=210487178&logid=3893215518&sh=1')
>>> r=conn.getresponse()
>>> r.getheader('content-disposition')
'attachment;filename="ã\x80\x90å\x8a¨æ¼«ä¹\x8bå®¶æ±\x89å\x8c\x96ç»\x84ã\x80\x91[æ\x9c\x80å¼ºä¼\x9aé\x95¿é»\x91ç¥\x9e][ç¬¬192è¯\x9d][é»\x91ç¥\x9eç\x9b®æ³·ä¾\x9dç\x84¶å\x81¥å\x9c¨][END].zip"'

I'm using Python 3 on Windows.
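
One possible way to get back to the actual bytes, offered as a sketch rather than a definitive fix: in Python 3, http.client decodes raw header bytes as ISO-8859-1, so re-encoding the string with that codec should return the bytes the server sent. The sketch below assumes the server put raw UTF-8 bytes in the header, which the output above suggests. The helper names recover_header_text and extract_filename are hypothetical, and the regular expression only handles the simple quoted filename="..." form shown in this question.

```python
import re
from typing import Optional


def recover_header_text(value: str, encoding: str = "utf-8") -> str:
    """Undo the ISO-8859-1 decoding applied to header bytes.

    Assumption: the server sent the header as raw bytes in `encoding`.
    """
    raw = value.encode("iso-8859-1")  # back to the bytes sent on the wire
    return raw.decode(encoding)       # decode them with the real encoding


def extract_filename(content_disposition: str) -> Optional[str]:
    """Return the quoted filename parameter, or None if it is absent."""
    match = re.search(r'filename="([^"]*)"', content_disposition)
    return match.group(1) if match else None


# A shortened version of the header value from the session above.
header = 'attachment;filename="\xe3\x80\x90[END].zip"'
print(extract_filename(recover_header_text(header)))  # 【[END].zip
```

If the server used a different encoding (GBK, for example), the decode step would raise UnicodeDecodeError or produce wrong characters, so the encoding argument has to match what the server actually sent.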
