Node.js puppeteer - Downloading/accessing an XML file and processing the content

Problem description

Node.js puppeteer - How do I download, access and process an XML file and its content in puppeteer?

When I click the link:

await page.evaluate(() => {
    document.querySelector('#datagrid > div > a:nth-child(2)').click();
});

... I can download an XML file that looks like this:

XML file:

<table>
    <row>
        <column>Titel01</column>
        <column>Titel02</column>
        <column>Titel03</column>
        <column>Titel04</column>
        <column>Titel05</column>
        <column>Titel06</column>
        <column>Titel07</column>
        <column>Titel08</column>
        <column>Titel09</column>
        <column>Titel10</column>
        <column>Titel11</column>
        <column>Titel12</column>
        <column>Titel13</column>
        <column>Titel14</column>
        <column>Titel15</column>
        <column>Titel16</column>
    </row>
    <row>
        <column>Value01</column>
        <column/>
        <column>Value03</column>
        <column>Value04</column>
        <column>Value05</column>
        <column>Value06</column>
        <column>Value07</column>
        <column>Value08</column>
        <column>Value09</column>
        <column>Value10</column>
        <column>Value11</column>
        <column>Value12</column>
        <column>Value13</column>
        <column>Value14</column>
        <column>Value15</column>
        <column>Value16</column>
    </row>
    ... // starting possible more rows
    <row>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column/>
        <column>Value15B</column>
        <column>Value16B</column>
    </row>
    ... // possible 
</table>  

How can I access the values and store them in variables for further processing in puppeteer?

Recommended answer

I don't know whether this is the best solution, but it works. Instead of calling .click(), return the link's href value with document.querySelector('#datagrid > div > a:nth-child(2)').href; and pass it to another .goto call. Once the XML page has loaded, you can parse it. Here is a full example:

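// Runs inside an async function with an existing puppeteer `page`
const xmlUrl = await page.evaluate(() => {
    // Return the link target instead of clicking it
    return document.querySelector('#datagrid > div > a:nth-child(2)').href;
});

// Open the XML file in the same tab
await page.goto(xmlUrl, {waitUntil: 'load'});

const json = await page.evaluate(() => {
    const columns = document.getElementsByTagName('column');
    const values = {values: []};

    // for...of visits only the elements; for...in would also visit
    // 'length', 'item' and 'namedItem' and push undefined entries
    for (const column of columns) {
        // textContent is defined on plain XML elements as well as HTML ones
        values.values.push(column.textContent);
    }

    return JSON.stringify(values); // return the values of all columns
});

console.log(JSON.parse(json)); // all column values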
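
The example above returns one flat list, so the row boundaries from the XML file are lost. As a possible follow-up, here is a minimal sketch that keeps the values grouped by <row> and then maps each data row to the titles in the first row. The helper names readRows and toRecords are hypothetical and not part of puppeteer. The sketch assumes that page has already navigated to the XML URL and that the first <row> always holds the titles.

```javascript
// Hypothetical helper: collect the text of every <column>, grouped by <row>.
// `page` is a puppeteer Page that has already navigated to the XML URL.
async function readRows(page) {
  return page.evaluate(() =>
    Array.from(document.querySelectorAll('row'), (row) =>
      Array.from(row.querySelectorAll('column'), (col) => col.textContent.trim())
    )
  );
}

// Pure function: treat the first row as titles and the remaining rows as data.
// Missing cells are filled with an empty string.
function toRecords(rows) {
  const [headers = [], ...data] = rows;
  return data.map((cells) =>
    Object.fromEntries(headers.map((title, i) => [title, cells[i] ?? '']))
  );
}
```

Calling toRecords(await readRows(page)) would then give one object per data row, keyed by Titel01, Titel02 and so on, with an empty string wherever the file contains an empty <column/>. Keeping toRecords free of any browser calls means it can be checked without launching puppeteer.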
