Selecting href attributes with Puppeteer

Question

I am trying to extract a few URLs from this page with Puppeteer.

However, all my script returns is undefined:

const puppeteer = require('puppeteer');

async function run() {
    const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
    const page = await browser.newPage();

    await page.goto('https://divisare.com/');

    let projects = await page.evaluate((sel) => {
        return document.getElementsByClassName(sel)
    }, 'homepage-project-image');

    var aNode = projects[0].href;

    console.log(aNode);
    console.log(projects.length)

    browser.close();
}
run();

However, when I run something like the code below, I at least get the correct count of the links I am trying to extract.

let projects = await page.evaluate((sel) => {
    return document.getElementsByClassName(sel).length
}, 'homepage-project-image');

console.log(projects);

Am I accessing my projects HTMLCollection incorrectly? What am I missing here? Thanks.

Recommended answer

Puppeteer cannot return a non-serializable value from an evaluate call (see this issue and the following PR). An HTMLCollection of DOM nodes is such a value, so it does not survive the trip from the page back to the script.

One workaround is to read the href inside the page and return only the string:

let projects = await page.evaluate((sel) => {
    return document.getElementsByClassName(sel)[0].href;
}, 'homepage-project-image');

Remember that document.getElementsByClassName returns an HTMLCollection, which has no map method, so to iterate over the results you need to convert it to an array first, with something like:

let projects = await page.evaluate((sel) => {
    return Array.from(document.getElementsByClassName(sel)).map(node => node.href);
}, 'homepage-project-image');
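
The same idea can also be packaged as a small reusable helper. The sketch below is not part of the original answer: the names collectHrefs and extractHrefs are hypothetical, and it assumes Puppeteer's page.$$eval, which runs a function in the page against every element matching a CSS selector. It also drops elements that have no href, which is an extra choice made here for illustration.

```javascript
// Hypothetical helper names; a sketch, not the original answer's code.

// Runs inside the page. It turns the matched elements into plain strings,
// which are serializable and can therefore be sent back to the Node script.
// Elements without an href (for example, non-anchor elements) are skipped.
function collectHrefs(nodes) {
  return Array.from(nodes, (node) => node.href).filter(Boolean);
}

// Runs in Node. `page` is assumed to be a Puppeteer Page object and
// `className` a single class name without the leading dot.
async function extractHrefs(page, className) {
  return page.$$eval(`.${className}`, collectHrefs);
}
```

With the setup from the question, this might be called after page.goto(...) as `const urls = await extractHrefs(page, 'homepage-project-image');`, which should give an array of URL strings rather than DOM nodes.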
