Trying to scrape an image from an image URL (using Python urllib) but getting HTML instead

Published: 2021/7/16 21:44:32. Tags: python, web-scraping, urllib, scrape


Question

I've tried to get the image from the following URL:

http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg

I can right-click and save it from the browser, but when I tried to use urlretrieve like this:

import urllib
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg'
urllib.urlretrieve( img_url, 'cover.jpg')

I found that the saved file is HTML instead of a .jpg image, but I don't know why. Could you tell me why my method does not work? Is there an option that mimics the browser's right-click "Save as"?

Recommended answer

You can use Requests. If you haven't installed it yet, run pip install requests.

Your download returned HTML because, when no Referer header is provided, the server redirects this img_url to another HTML page (that is the HTML page you just downloaded).
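
A quick way to confirm this diagnosis is to look at the first bytes of the file that was saved. The sketch below is only an illustration: the helper name sniff_payload is hypothetical, and it relies on the fact that JPEG files begin with the bytes FF D8 FF, while an HTML page usually starts with a doctype or an html tag.

```python
def sniff_payload(data: bytes) -> str:
    """Guess whether downloaded bytes are a JPEG image or an HTML page."""
    # JPEG files start with the marker bytes FF D8 FF.
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    # HTML pages usually start with a doctype or <html>, possibly after whitespace.
    head = data[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

For example, reading 'cover.jpg' with open('cover.jpg', 'rb') and passing the bytes to sniff_payload would be expected to return 'html' for the failed download described in the question.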

The following Requests code first finds the redirect URL and then adds it to the HTTP Referer header.

import requests

img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg'

# First request: do not follow the 302 redirect, so we can read where it points
r = requests.get(img_url, allow_redirects=False)

# Use the redirect target (e.g. 'http://upic.me/show/55132055') as the Referer
headers = {'Referer': r.headers['Location']}

# Second request: fetch the image itself, now with the Referer header set
r = requests.get(img_url, headers=headers)
r.raise_for_status()                # fail loudly on an HTTP error status

filename = img_url.split('/')[-1]   # file name is the last path segment of img_url
with open(filename, 'wb') as fh:    # 'wb' writes the raw bytes in binary mode
    fh.write(r.content)
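
Since the question used urllib, here is a rough equivalent with only the Python 3 standard library. It is a sketch under assumptions, not a tested recipe for this site: the names NoRedirect, find_redirect_target, build_image_request, filename_from_url and download_image are hypothetical helpers, and it assumes the server still answers with a redirect whose Location header is an absolute URL.

```python
import posixpath
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def find_redirect_target(url):
    """Return the Location header of a 3xx response, or None if not redirected."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url):
            return None
    except urllib.error.HTTPError as err:
        if 300 <= err.code < 400:
            return err.headers.get("Location")
        raise


def build_image_request(url, referer=None):
    """Build a Request for the image, adding a Referer header when one is known."""
    headers = {"Referer": referer} if referer else {}
    return urllib.request.Request(url, headers=headers)


def filename_from_url(url):
    """Take the last path segment of the URL, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


def download_image(url):
    """Download url to a local file named after its last path segment."""
    referer = find_redirect_target(url)
    request = build_image_request(url, referer)
    filename = filename_from_url(url)
    with urllib.request.urlopen(request) as resp, open(filename, "wb") as fh:
        fh.write(resp.read())
    return filename
```

Calling download_image(img_url) would mirror the two-step Requests version above: one request to discover the redirect target, and a second one that sends it as the Referer.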
