How can I download images on a page using puppeteer?

Problem description

I'm new to web scraping and want to download all images on a webpage using puppeteer:

const puppeteer = require('puppeteer');

let scrape = async () => {
  // Actual Scraping goes Here...

  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  await page.goto('https://memeculture69.tumblr.com/');

  //   Right click and save images

};

scrape().then((value) => {
    console.log(value); // Success!
});

I have looked at the API docs but could not figure out how to achieve this, so I would appreciate your help.

Recommended answer

Here is another example. It runs a generic Google search and downloads the Google logo image shown at the top left of the results page.

const puppeteer = require('puppeteer');
const fs = require('fs');

async function run() {
    const browser = await puppeteer.launch({
        headless: false
    });
    const page = await browser.newPage();
    await page.setViewport({ width: 1200, height: 1200 });
    await page.goto('https://www.google.com/search?q=.net+core&rlz=1C1GGRV_enUS785US785&oq=.net+core&aqs=chrome..69i57j69i60l3j69i65j69i60.999j0j7&sourceid=chrome&ie=UTF-8');

    const IMAGE_SELECTOR = '#tsf > div:nth-child(2) > div > div.logo > a > img';
    let imageHref = await page.evaluate((sel) => {
        return document.querySelector(sel).getAttribute('src').replace('/', '');
    }, IMAGE_SELECTOR);

    console.log("https://www.google.com/" + imageHref);
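    // Navigate straight to the image and write the raw response body to disk
    const viewSource = await page.goto("https://www.google.com/" + imageHref);
    try {
        await fs.promises.writeFile(".googles-20th-birthday-us-5142672481189888-s.png", await viewSource.buffer());
        console.log("The file was saved!");
    } catch (err) {
        console.log(err);
    }

    // Wait for the browser to shut down before run() resolves
    await browser.close();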
}

run();

If you have a list of images to download, you could change the selector programmatically as needed and work through the list, downloading the images one at a time.
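As a rough sketch of that idea, the code below collects the src attribute of every img element on a page, resolves each one against the page URL, and then saves the images one at a time using the same page.goto / response.buffer() approach as the example above. The helper names (resolveImageUrls, fileNameFor, downloadAllImages) and the index-prefixed file naming are assumptions made for illustration, not part of puppeteer. Images that a page only loads lazily or sets via CSS backgrounds would not be picked up by this.

```javascript
const fs = require('fs');
const path = require('path');

// Resolve raw src attribute values against the page URL, keeping only
// unique http(s) URLs. Empty values and data: URIs are skipped.
function resolveImageUrls(srcs, baseUrl) {
    const seen = new Set();
    for (const src of srcs) {
        if (!src) continue;
        let url;
        try {
            url = new URL(src, baseUrl);
        } catch (err) {
            continue; // not a parsable URL
        }
        if (url.protocol === 'http:' || url.protocol === 'https:') {
            seen.add(url.href);
        }
    }
    return [...seen];
}

// Hypothetical naming scheme: list index plus the last path segment,
// so two images with the same file name do not overwrite each other.
function fileNameFor(url, index) {
    const base = path.basename(new URL(url).pathname) || 'image';
    return `${index}-${base}`;
}

// Visit pageUrl, then download every image found into outDir.
// Resolves to the number of files written.
async function downloadAllImages(pageUrl, outDir) {
    const puppeteer = require('puppeteer'); // loaded only when actually scraping
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto(pageUrl);

        const srcs = await page.$$eval('img', (imgs) =>
            imgs.map((img) => img.getAttribute('src')));
        const urls = resolveImageUrls(srcs, page.url());

        await fs.promises.mkdir(outDir, { recursive: true });
        let saved = 0;
        for (const [index, url] of urls.entries()) {
            // Navigate to the image itself and save the response body
            const response = await page.goto(url);
            if (!response || !response.ok()) continue;
            const file = path.join(outDir, fileNameFor(url, index));
            await fs.promises.writeFile(file, await response.buffer());
            saved += 1;
        }
        return saved;
    } finally {
        await browser.close();
    }
}
```

For the page in the question, this could be called as downloadAllImages('https://memeculture69.tumblr.com/', './images').then(console.log), which would log how many files were written. Reusing a single page and navigating to each image keeps the sketch close to the answer above; listening for image responses with page.on('response', ...) would be another option.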
