How to handle elements that load after an AJAX request in Puppeteer

Question

I'm trying to do web scraping with Puppeteer. The element I need to handle loads late. When I click the search button, the results load via AJAX, so the element I want to select is in the search results but not in the initial page load. The screenshot Puppeteer produces contains the search results, and if I output the HTML source I can see the element there too, but I'm not sure why I cannot select it.

Recommended answer

You can use await page.waitForSelector(cssSelector); to make Puppeteer wait for an element matching the selector to appear before continuing with the next steps in your script (pass { visible: true } as the second argument to require that the element is also visible). By default the wait times out after 30 seconds, but you can set any timeout you like.
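
As a minimal sketch of that call with a custom timeout: the helper name waitForText, the 10-second default, and any selector passed to it are illustrative assumptions, not part of the original answer.

```javascript
// Sketch: wait for an element to show up, then read its text.
// `page` is a Puppeteer Page; selector and timeout values are hypothetical.
async function waitForText(page, selector, timeoutMs = 10000) {
  // Resolves with an ElementHandle once a matching element exists and is visible.
  // Rejects with a timeout error if that does not happen within timeoutMs.
  const handle = await page.waitForSelector(selector, {
    visible: true,
    timeout: timeoutMs,
  });
  // Run a function in the page context to read the element's text.
  return handle.evaluate((el) => el.textContent.trim());
}
```

It could be called as await waitForText(page, '.result-item', 5000), where .result-item stands in for whatever selector matches your search results.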

So in your case I would:


  • Enter your search text into the search bar.
  • Click on the search button (this will execute your AJAX call to load the results).
  • Use await page.waitForSelector(cssSelector); to make Puppeteer wait until an element that you know will appear in the UI after the search has become visible.
  • Now that Puppeteer has registered the element as visible, you know that any actions you wish to perform on it will also execute correctly.
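
The steps above can be sketched as a single helper. This is only an illustration under stated assumptions: the function name searchAndCollect and every selector passed to it are hypothetical, and the real page may need different interactions.

```javascript
// Sketch of the steps above. All selectors and the query are hypothetical.
async function searchAndCollect(page, { inputSelector, buttonSelector, resultSelector, query }) {
  // 1. Enter the search text into the search bar.
  await page.type(inputSelector, query);
  // 2. Click the search button, which triggers the AJAX request.
  await page.click(buttonSelector);
  // 3. Wait until at least one result is visible (default timeout: 30 seconds).
  await page.waitForSelector(resultSelector, { visible: true });
  // 4. The results are now in the DOM, so they can be read (or clicked).
  return page.$$eval(resultSelector, (els) =>
    els.map((el) => el.textContent.trim())
  );
}
```

With a real browser, page would come from puppeteer.launch() followed by browser.newPage() and page.goto(url), and the helper would be called with the selectors of the site being scraped.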

If you omit the waitForSelector() call, you may find that the element does get displayed, yet Puppeteer still fails, for example when you try to click it. This is because page.click() and the other Puppeteer methods that interact with elements do not wait long, if at all, for the element to appear, and the script (especially in headless mode) can move on to the next instruction before the UI has finished updating.
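
One way to guard against this is to pair every interaction with a wait. The helper below is a sketch of that idea; clickWhenReady is a hypothetical name, and the 30-second default simply mirrors Puppeteer's default timeout.

```javascript
// Sketch: wait for the element first, then click the handle that was returned.
// Calling page.click(selector) directly may fail if the AJAX results
// have not been rendered yet.
async function clickWhenReady(page, selector, timeoutMs = 30000) {
  const handle = await page.waitForSelector(selector, {
    visible: true,
    timeout: timeoutMs,
  });
  // The handle refers to the element that was just found, so the click
  // targets exactly that element.
  await handle.click();
}
```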

So by adding the extra waitForSelector calls, you also make your scripts much more robust, especially when data is generated dynamically, as it is in your case.
