Efficient practice to scrape a page with client-side output?

Views: 112 Published: 2019/6/8 20:01:22 Tags: javascript python web-scraping pyqt pyqt4

Problem description

I want a script that will scrape a certain web page every hour, and will look for a certain string inside that page.

However, when I open that page and use `view-source:`, I cannot see that string in the source. I was told that this is because the string I'm looking for comes from an element that is rendered on the client side (JavaScript), so I can see it only when I manually inspect that element, for example with the Chrome console.
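The symptom described above can be reproduced with a minimal Python sketch. Everything named here is an assumption for illustration: the sample HTML, the `status` element, the `app.js` script, and the target text "Sold out" are hypothetical and not taken from the actual page. The point is only that a plain HTTP fetch returns the markup as the server sends it, before any JavaScript runs, so text that the script inserts later is not in it.

```python
from urllib.request import urlopen


def fetch_static_html(url: str, timeout: float = 10.0) -> str:
    """Return the HTML exactly as the server sends it; no JavaScript is executed."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def contains_target(html: str, target: str) -> bool:
    """Plain substring check against the given markup."""
    return target in html


# Hypothetical static source: the <div> is empty until app.js fills it in
# the browser, so the rendered text never appears in the raw HTML.
SAMPLE_STATIC_HTML = """
<html>
  <body>
    <div id="status"></div>
    <script src="app.js"></script>
  </body>
</html>
"""

if __name__ == "__main__":
    # The placeholder element is present, but the rendered text is not.
    print(contains_target(SAMPLE_STATIC_HTML, 'id="status"'))
    print(contains_target(SAMPLE_STATIC_HTML, "Sold out"))
```

Calling `fetch_static_html` on a real URL gives the same kind of result as `view-source:`, which is why a substring check on it fails for client-rendered text.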

Which practice / programming language / environment would be the most efficient way to achieve what I want, considering that I want to run the script from my web host's server, which has 2.25 GB of RAM?

Someone suggested that I use PyQt4, but my web host warned me that this will kill my RAM and hurt server performance. I should note that the script is supposed to be very simple and scrape only a single page, once an hour.
