Download images from Google image search (Python)

Question

I am a beginner at web scraping. I first followed https://www.youtube.com/watch?v=ZAUNEEtzsrg to download images with a specific tag (e.g. cat), and it works! But I ran into a new problem: I can only download about 100 images. The cause seems to be AJAX: only the first page of HTML is loaded, not the full set of results. So it seems we must simulate scrolling down to download the next 100 images or more.

My code: https://drive.google.com/file/d/0Bwjk-LKe_AohNk9CNXVQbGRxMHc/edit?usp=sharing

In summary, my questions are:

  1. How can I download all the images from a Google image search using Python source code? (Please give me some examples :) )

Are there any web scraping techniques I need to know?

Recommended answer

My final solution is to use icrawler.

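# Import path as given in the original answer; newer icrawler releases
# expose the built-in crawlers from icrawler.builtin instead.
from icrawler.examples import GoogleImageCrawler

# Downloaded images are saved under this directory
google_crawler = GoogleImageCrawler('your_image_dir')
# Request up to 1000 images of at least 200x200 pixels, with 4 download threads
google_crawler.crawl(keyword='sunny', offset=0, max_num=1000,
                     date_min=None, date_max=None, feeder_thr_num=1,
                     parser_thr_num=1, downloader_thr_num=4,
                     min_size=(200,200), max_size=None)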

The advantage of this framework is that it includes 5 built-in crawlers (Google, Bing, Baidu, Flickr, and a general-purpose crawler), but it still returns only 100 images when crawling Google.
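
Separately from icrawler, the scrolling idea raised in the question could be sketched roughly as follows. This is only a minimal sketch, not something the answer itself used: it assumes a Selenium-style driver object that exposes `execute_script`, and the function name `scroll_until_stable`, the pause length, and the round limit are illustrative choices. Whether Google's results page keeps loading more images this way is not confirmed here.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the page bottom repeatedly until the page height stops growing.

    `driver` is assumed to be a Selenium-style WebDriver, i.e. any object
    with an `execute_script(js)` method. Returns the number of scrolls done.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the bottom so the page requests its next batch of results
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the AJAX request time to append new results
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so stop scrolling
        last_height = new_height
    return rounds
```

With Selenium installed, a call might look like `driver = webdriver.Chrome()`, then `driver.get(search_url)` (where `search_url` is a placeholder for the results page) and `scroll_until_stable(driver)`, after which the image tags can be read from `driver.page_source`. Taking the driver as a parameter keeps the scrolling logic independent of any particular browser.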
