Downloading images from Google Search using Python gives error?

Views: 33 | Published: 2021/9/24 18:46:47 | Tags: python, web, web-scraping


Question

This is my code:

import os
import sys
import time
from urllib import FancyURLopener
import urllib2
import simplejson

# Define search term
searchTerm = "parrot"

# Replace spaces ' ' in search term for '%20' in order to comply with request
searchTerm = searchTerm.replace(' ','%20')


# Start FancyURLopener with defined version 
class MyOpener(FancyURLopener): 
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'

myopener = MyOpener()

# Set count to 0
count= 0

for i in range(0,10):
    # Notice that the start changes for each iteration in order to request a new set of images for each loop
    url = ('https://ajax.googleapis.com/ajax/services/search/images?' + 'v=1.0&q='+searchTerm+'&start='+str(i*10)+'&userip=MyIP')
    print url
    request = urllib2.Request(url, None, {'Referer': 'testing'})
    response = urllib2.urlopen(request)

    # Get results using JSON
    results = simplejson.load(response)
    data = results['responseData']
    dataInfo = data['results']

    # Iterate for each result and get unescaped url
    for myUrl in dataInfo:
        count = count + 1
        my_url = myUrl['unescapedUrl']
        myopener.retrieve(myUrl['unescapedUrl'],str(count)+'.jpg')

But after downloading some images, I get the following error:

Traceback (most recent call last):
  File "C:\Python27\img_google3.py", line 37, in <module>
    dataInfo = data['results']
TypeError: 'NoneType' object has no attribute '__getitem__'

What could be causing this?

I have to download images from Google as part of training neural networks for image classification.

Recommended answer

The error message tells you that results['responseData'] is None, so data is None and data['results'] fails. You need to look at what you actually get in results (e.g. print(results)) to figure out how to access the data you want.

I get the following when your error occurs:

{u'responseData': None, # hence the error
 u'responseDetails': u'out of range start', # what went wrong
 u'responseStatus': 400} # http response code for "Bad request"

Eventually you request a URL (here https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=parrot&start=90&userip=MyIP) whose start offset is beyond the available search results. For lower offsets, such as ...&start=0&..., I get sensible content in results.

You need to check whether you get anything back, e.g.:

if results["responseStatus"] == 200:
    # response was OK, do your thing
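
As a minimal sketch of that check, the parsing step could return an empty list instead of raising when responseData is missing. The helper name extract_image_urls and the two sample dictionaries are made up for illustration; the keys are the ones used in the question's code and in the error response shown earlier.

```python
def extract_image_urls(results):
    """Return the image URLs from one parsed response, or [] on failure.

    `results` is the dictionary obtained by parsing the JSON response.
    """
    # A non-200 status (e.g. 400 for "out of range start") means no usable data.
    if results.get("responseStatus") != 200:
        return []
    # Guard against responseData being None even on an OK status.
    data = results.get("responseData") or {}
    return [item["unescapedUrl"] for item in data.get("results", [])]


# Hypothetical sample responses, shaped like the ones discussed in this answer.
ok = {
    "responseStatus": 200,
    "responseDetails": None,
    "responseData": {"results": [{"unescapedUrl": "http://example.com/a.jpg"}]},
}
bad = {
    "responseStatus": 400,
    "responseDetails": "out of range start",
    "responseData": None,
}

print(extract_image_urls(ok))
print(extract_image_urls(bad))
```

In the question's loop, an empty list from such a helper could be treated as the signal to break out, since later start values would presumably be out of range as well.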

Also, you could make your URL-building code simpler and avoid the string concatenation:

template = 'https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q={}&start={}&userip=MyIP'
url = template.format(searchTerm, i * 10)
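
The manual '%20' replacement can also be avoided by letting the standard library encode the query string. The following is a sketch for Python 3, where urlencode lives in urllib.parse (in the Python 2 used in the question it is urllib.urlencode). The build_url helper and its page_size parameter are hypothetical names; the endpoint and query parameters are copied from the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://ajax.googleapis.com/ajax/services/search/images"


def build_url(search_term, page, page_size=10):
    """Build the request URL for one page of results."""
    params = {
        "v": "1.0",
        "q": search_term,           # raw term; urlencode escapes it
        "start": page * page_size,  # offset of the first result on this page
        "userip": "MyIP",           # placeholder, as in the question
    }
    return BASE_URL + "?" + urlencode(params)


print(build_url("red parrot", 2))
```

Note that urlencode escapes a space as '+' rather than '%20'; both are conventional encodings of a space inside a query string.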
