Puppeteer / Node.js to click a button as long as it exists, and when it no longer exists, commence action

Question

There is a web page that contains many rows of data that are continually updated.

There is a fixed number of rows, so old rows are cycled out and not stored anywhere.

The page is broken up by a "load more" button that keeps appearing until all of the stored rows are displayed on the page.

I need to write a script in Puppeteer / Node.js that clicks that button until it no longer exists on the page, and then reads all the text on the page. (I have that second part of the script finished.)

I am new to Puppeteer and not sure how to set this up. Any help would be greatly appreciated.

I added this block:

  const cssSelector = await page.evaluate(() => document.cssSelector('.u-field-button Button-button-18U-i'));

  // Click the "load more" button repeatedly until it no longer appears
  const isElementVisible = async (page, cssSelector) => {
    await page.waitForSelector(cssSelector, { visible: true, timeout: 2000 })
    .catch(() => {
      return false;
    });
    return true;
  };

  let loadMoreVisible = await isElementVisible(page, cssSelector);
  while (loadMoreVisible) {
    await page.click(cssSelector);
    loadMoreVisible = await isElementVisible(page, cssSelector);
  }

But I get this error:

Error: Evaluation failed: TypeError: document.cssSelector is not a function
    at __puppeteer_evaluation_script__:1:17
    at ExecutionContext.evaluateHandle (/Users/reallymemorable/node_modules/puppeteer/lib/ExecutionContext.js:124:13)
    at process.internalTickCallback (internal/process/next_tick.js:77:7)
  -- ASYNC --
    at ExecutionContext.<anonymous> (/Users/reallymemorable/node_modules/puppeteer/lib/helper.js:144:27)
    at ExecutionContext.evaluate (/Users/reallymemorable/node_modules/puppeteer/lib/ExecutionContext.js:58:31)
    at ExecutionContext.<anonymous> (/Users/reallymemorable/node_modules/puppeteer/lib/helper.js:145:23)
    at Frame.evaluate (/Users/reallymemorable/node_modules/puppeteer/lib/FrameManager.js:439:20)
    at process.internalTickCallback (internal/process/next_tick.js:77:7)
  -- ASYNC --
    at Frame.<anonymous> (/Users/reallymemorable/node_modules/puppeteer/lib/helper.js:144:27)
    at Page.evaluate (/Users/reallymemorable/node_modules/puppeteer/lib/Page.js:736:43)
    at Page.<anonymous> (/Users/reallymemorable/node_modules/puppeteer/lib/helper.js:145:23)
    at /Users/reallymemorable/Documents/scripts.scrapers/squarespace.ip.scraper/squarespace5.js:32:34
    at process.internalTickCallback (internal/process/next_tick.js:77:7)
(node:8009) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:8009) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Recommended answer

Here is what I'd recommend you do in order to achieve this. I'm going to ignore the fact that your data always has a fixed number of rows (maybe this will change in future) and instead set you up to handle an unknown number of rows by continually clicking on the "load more" button.

The first thing to do is set up a method that decides whether the "load more" button is displayed in the UI. You can do that by writing a method as follows:

const isElementVisible = async (page, cssSelector) => {
  let visible = true;
  await page
    .waitForSelector(cssSelector, { visible: true, timeout: 2000 })
    .catch(() => {
      visible = false;
    });
  return visible;
};

Once you pass in the required CSS selector (in this case the selector for your "load more" button), this method returns true if the button is displayed and false if it is not.

The timeout is set to 2000 (milliseconds) because you want to keep checking whether this button is displayed. Without it, the timeout would default to 30000, which is far too long to have your code hanging around waiting once the button is gone. I find that 2000 is a nice compromise.

The purpose of the catch block is to catch the error that is thrown when the element is no longer displayed. You want to ignore that error, because the point where the button is no longer displayed is exactly the point you are trying to reach. You know the button won't be displayed after some number of clicks, and that's fine, so you catch the error to bypass it cleanly when that happens.
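As a minimal sketch, the same check can also be written with try/catch. This form makes one pitfall easier to see: in the block from the question, `return false` sits inside the `.catch()` callback, so it only returns from that callback and the outer function still goes on to `return true`. The `timeout` parameter and the `makeFakePage` stand-in below are assumptions added for illustration; a real Puppeteer `page` would be passed in practice.

```javascript
// Visibility check written with try/catch instead of .catch().
// `page` is expected to be a Puppeteer Page; only waitForSelector is used.
const isElementVisible = async (page, cssSelector, timeout = 2000) => {
  try {
    // Resolves once the element is present and visible.
    await page.waitForSelector(cssSelector, { visible: true, timeout });
    return true;
  } catch (err) {
    // Rejects when the element does not become visible within `timeout` ms.
    return false;
  }
};

// Hypothetical stand-in for a Puppeteer page, for illustration only:
// it treats only the selectors listed in `visibleSelectors` as visible.
const makeFakePage = (visibleSelectors) => ({
  waitForSelector: async (selector) => {
    if (!visibleSelectors.includes(selector)) {
      throw new Error(`waiting for selector "${selector}" failed`);
    }
  },
});
```

Swallowing every error this way is the same trade-off the original helper makes: a mistyped selector also ends up as `false`.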

The next step is to do something like this, so that your code keeps clicking on the "load more" button until it is no longer clickable (i.e. displayed):

let loadMoreVisible = await isElementVisible(page, selectorForLoadMoreButton);
while (loadMoreVisible) {
  await page
    .click(selectorForLoadMoreButton)
    .catch(() => {});
  loadMoreVisible = await isElementVisible(page, selectorForLoadMoreButton);
}

This repeatedly checks whether the button is visible in your UI, clicks it if it is displayed, and repeats the process until the button is no longer displayed. This ensures that all rows of data are displayed in the UI before you continue with the remainder of your script.

You will also need a catch block on the click action, as shown above. The reason is that headless mode moves very quickly, sometimes too quickly for the UI to keep up with it. Usually, on the very last display of the "load more" button, the isElementVisible method executes before the UI has updated to remove the button, so it returns true when, in fact, the selector is no longer displayed. This then triggers an exception from the click request, since the element is no longer there. For me, the cleanest way to work around this is to add that empty catch block on the click instruction so that, if this happens, the click action is bypassed cleanly without failing your entire script.
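Putting the helper and the loop together, a sketch of the whole "click until gone" step might look like the following. The `clickWhileVisible` name, the `maxClicks` safety cap, and the `makeFakePage` stand-in are assumptions added for illustration and are not part of the original answer; with a real Puppeteer `page`, only `waitForSelector` and `click` are called.

```javascript
// Visibility helper from the answer, with the timeout exposed as a parameter.
const isElementVisible = async (page, cssSelector, timeout = 2000) => {
  let visible = true;
  await page
    .waitForSelector(cssSelector, { visible: true, timeout })
    .catch(() => {
      visible = false;
    });
  return visible;
};

// Clicks `selector` for as long as it is visible and returns the number of
// click attempts. `maxClicks` is an assumed safety cap against endless loops.
const clickWhileVisible = async (page, selector, maxClicks = 1000) => {
  let clicks = 0;
  while (clicks < maxClicks && (await isElementVisible(page, selector))) {
    // Swallow click errors: the button may vanish between the check and the click.
    await page.click(selector).catch(() => {});
    clicks += 1;
  }
  return clicks;
};

// Hypothetical fake page: the button disappears after `pagesLeft` clicks.
const makeFakePage = (pagesLeft) => ({
  waitForSelector: async () => {
    if (pagesLeft <= 0) throw new Error('timeout');
  },
  click: async () => {
    pagesLeft -= 1;
  },
});
```

In a real script this would be called as `await clickWhileVisible(page, selectorForLoadMoreButton);` right before the code that reads the text from the page.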

Update 1:

You're just using the CSS selector incorrectly. Your selector should be:

const cssSelector = '.u-field-button Button-button-18U-i'; // This is your CSS selector for the element

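You don't need to use the evaluate method for that. The selector is just a string that you pass to Puppeteer's page methods, and document.cssSelector is not a DOM function, which is why the evaluation fails.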
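To illustrate that a selector is only a string, here is a small sketch that passes one straight to `page.$`, which resolves to an element handle or to `null` when nothing matches. The `.load-more` selector, the `elementExists` name, and the `makeFakePage` stand-in are placeholders for illustration. Note also that a class name in a CSS selector needs a leading dot, so the second token of the selector above is treated as a tag name unless it is written as `.Button-button-18U-i`; which form is correct depends on the page's markup, which is not shown here.

```javascript
// A selector is a plain string; no page.evaluate call is needed to build it.
// '.load-more' is a placeholder, not the selector from the question.
const loadMoreSelector = '.load-more';

// page.$(selector) resolves to an element handle, or null when nothing matches.
const elementExists = async (page, selector) =>
  (await page.$(selector)) !== null;

// Hypothetical stand-in for a Puppeteer page, for illustration only.
const makeFakePage = (presentSelectors) => ({
  $: async (selector) =>
    presentSelectors.includes(selector) ? { selector } : null,
});
```

This only tests presence in the DOM, not visibility, so it is not a drop-in replacement for isElementVisible when a button is hidden rather than removed.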

Update 2:

I've added some improvements. I tested this code extensively on a few different sites and found that my own logic wasn't quite right for a "one size fits all" approach to clicking on this sort of button, which is probably why you're getting those exceptions. I've updated my original answer with all the changes.

Just a quick note: I've updated both the isElementVisible method and the while loop.

Hope this helps!
