Download image from URL using Python urllib but receiving HTTP Error 403: Forbidden
Problem description
I want to download an image file from a URL using the Python module "urllib.request". This works for some websites (e.g. mangastream.com) but not for another (mangadoom.co), where I receive the error "HTTP Error 403: Forbidden". What could be the problem in the latter case, and how can I fix it?
I am using Python 3.4 on OS X.
    import urllib.request

    # does not work
    img_url = 'http://mangadoom.co/wp-content/manga/5170/886/005.png'
    img_filename = 'my_img.png'
    urllib.request.urlretrieve(img_url, img_filename)
At the end of the error message it said:
... HTTPError: HTTP Error 403: Forbidden
However, it works for another website:
    # works
    img_url = 'http://img.mangastream.com/cdn/manga/51/3140/006.png'
    img_filename = 'my_img.png'
    urllib.request.urlretrieve(img_url, img_filename)
I have tried the solutions from the posts below, but none of them work on mangadoom.co:
Downloading a picture via urllib and python
How do I copy a remote image in python?
The solution in the following post also does not fit, because my case is downloading an image:
urllib2.HTTPError: HTTP Error 403: Forbidden
Non-Python solutions are also welcome. Any suggestions would be much appreciated.
Solution
This website is blocking the user-agent used by urllib, so you need to change it in your request. Unfortunately, I don't think urlretrieve supports this directly.
I recommend using the excellent requests library; the code becomes (from here):
    import requests
    import shutil

    r = requests.get('http://mangadoom.co/wp-content/manga/5170/886/005.png', stream=True)
    if r.status_code == 200:
        with open("img.png", 'wb') as f:
            # Let urllib3 decode gzip/deflate content before copying the raw stream
            r.raw.decode_content = True
            shutil.copyfileobj(r.raw, f)
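As a quick way to check what urllib identifies itself as, the following sketch prints the default headers of a freshly built opener. It uses only the standard library and makes no network request; the variable names are illustrative.

```python
import urllib.request

# A freshly built opener carries the headers urllib sends by default.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# On CPython the User-agent value typically looks like 'Python-urllib/3.x',
# which is easy for a server to recognize and reject.
print(default_headers)
```

A server that filters on this value would answer with 403 regardless of which image is requested.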
Note that this website does not seem to forbid the requests user-agent. But if it needs to be modified, that is easy:
    r = requests.get('http://mangadoom.co/wp-content/manga/5170/886/005.png',
                     stream=True, headers={'User-agent': 'Mozilla/5.0'})
Also relevant: changing user-agent in urllib
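If adding a dependency is not an option, a urllib-only variant might look like the sketch below. It assumes the server only checks the User-Agent header, which may not hold for every site. build_request and download are hypothetical helper names, not part of urllib, and 'Mozilla/5.0' is just an example value.

```python
import shutil
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request carrying a custom User-Agent header (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent="Mozilla/5.0"):
    """Fetch url with the custom header and stream the body to filename."""
    request = build_request(url, user_agent)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses such as 403.
    with urllib.request.urlopen(request) as response, open(filename, "wb") as f:
        shutil.copyfileobj(response, f)


# Example call (performs a network request, so it is left commented out):
# download('http://mangadoom.co/wp-content/manga/5170/886/005.png', 'my_img.png')
```

Passing a Request object to urlopen keeps the header change local to this one call instead of altering a global opener.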