How to download a file using Python in a 'smarter' way?

Views: 27 | Published: 2021/12/11 10:35:34 | Tags: python, http, download

Problem description

I need to download several files via HTTP in Python.

The most obvious way to do it is to use urllib2:

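import urllib2

# Fetch the resource and save it locally.
# 'wb' writes the bytes unchanged, which matters for PDFs and other binary files.
u = urllib2.urlopen('http://server.com/file.html')
localFile = open('file.html', 'wb')
localFile.write(u.read())
localFile.close()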

But I'll have to deal with URLs that are nasty in some way, for example: http://server.com/!Run.aspx/someoddtext/somemore?id=121&m=pdf. When downloaded via a browser, the file has a human-readable name, e.g. accounts.pdf.

Is there any way to handle that in Python, so that I don't need to know the file names and hardcode them into my script?

Recommended answer

Download scripts like that tend to send a header telling the user agent what to name the file:

Content-Disposition: attachment; filename="the filename.ext"

If you can grab that header, you can get the proper filename.
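As a rough illustration of pulling the name out of such a header value, here is a minimal Python 3 sketch. It leans on the standard library's email.message.Message parser rather than a hand-written regular expression. The function name filename_from_content_disposition is hypothetical, and stripping directory components is an added precaution that the original answer does not mention.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value, or None."""
    if not header_value:
        return None
    # Let the standard library's header parser handle quoting and parameters.
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return None
    # Keep only the last path component so a hostile header cannot
    # point outside the download directory.
    return os.path.basename(name.replace("\\", "/")) or None


print(filename_from_content_disposition('attachment; filename="the filename.ext"'))
# prints: the filename.ext
```

A header without a filename parameter yields None, so the caller still needs a fallback name of its own.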

There's another thread that offers a little bit of code for grabbing the Content-Disposition header:

import urllib2

remotefile = urllib2.urlopen('http://example.com/somefile.zip')
# The raw Content-Disposition header value sent by the server
remotefile.info()['Content-Disposition']
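urllib2 exists only in Python 2; in Python 3 the same functionality lives in urllib.request. Putting the pieces together, a download helper might look roughly like the sketch below. The names pick_filename and download, the "download" default, and the fallback to the last URL path segment are all assumptions made for illustration, not part of the original answer.

```python
import os
import shutil
import urllib.request
from urllib.parse import unquote, urlsplit


def pick_filename(headers, url, default="download"):
    """Choose a local name: header first, then URL path, then a default."""
    # headers is an email.message.Message; the headers object returned by
    # urlopen() for HTTP responses is a subclass of it.
    name = headers.get_filename()
    if not name:
        # Fall back to the last segment of the URL path (query string ignored).
        name = unquote(urlsplit(url).path).rsplit("/", 1)[-1]
    # Drop any directory components before using the name locally.
    name = os.path.basename(name.replace("\\", "/"))
    return name or default


def download(url, directory="."):
    """Save url into directory and return the path of the written file."""
    with urllib.request.urlopen(url) as response:
        target = os.path.join(directory, pick_filename(response.headers, response.url))
        with open(target, "wb") as out:
            # Stream the body to disk instead of reading it all into memory.
            shutil.copyfileobj(response, out)
    return target
```

Calling download() once per URL in a list would cover the "several files" case from the question. Note that two responses reporting the same name would overwrite each other in this sketch.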
