Browserless web scraping of an Ajax page


Question

I have tried using Selenium after reading some tutorials on web scraping.

The aim is to scrape a page that loads the required data through an Ajax call, which is made after the initial page load.

The second aim is to run the Selenium code in the background (without opening a browser window) so that the page loads, including the Ajax call, and then to retrieve the final HTML and process it locally.

The code so far is as follows (adapted from the tutorial at http://www.geekonweb.com/net/web-page-scraping-using-selenium-and-net/):

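using System.Web.Mvc;
using OpenQA.Selenium.Chrome;

public ActionResult Index()
{
    // The path below should be the folder that contains chromedriver.exe
    // (IEDriverServer.exe is only needed for InternetExplorerDriver).
    using (var chrome = new ChromeDriver(@"file path"))
    {
        // Load the page. Setting chrome.Url also navigates, so one call is enough.
        chrome.Navigate().GoToUrl(@"<url>");

        // Extract the HTML as it stands right now; Ajax content may not be there yet.
        // var retval = chrome.ExecuteScript("return document.body.outerHTML");
        string result = chrome.PageSource;
    }

    return View();
}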

So far I have not found a way to run Selenium silently (without a GUI). Please help if this can be done.

Secondly, how can Selenium be told to wait for the Ajax call to finish before retrieving the data?

Regards,

Recommended answer

Here is a question on how to wait until an element is present: http://stackoverflow.com/questions/6992993/selenium-c-sharp-webdriver-wait-until-element-is-present. Waiting for an element that only appears once the Ajax response has been rendered is how you wait for the Ajax call itself.
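The wait can be sketched roughly as follows, using WebDriverWait from the Selenium.Support package. This is an illustration under assumptions, not the code from the linked answer: the element id "ajax-content", the 10-second timeout, and the AjaxScraper class name are all hypothetical and would need to match the real page.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

public static class AjaxScraper
{
    // Loads the page, waits for an Ajax-inserted element, then returns the HTML.
    // driverFolder is the folder containing chromedriver.exe.
    public static string GetRenderedHtml(string driverFolder, string url)
    {
        using (var chrome = new ChromeDriver(driverFolder))
        {
            chrome.Navigate().GoToUrl(url);

            // Poll for up to 10 seconds (an assumed timeout). WebDriverWait
            // ignores "element not found" errors between polls and throws
            // WebDriverTimeoutException if the element never shows up.
            var wait = new WebDriverWait(chrome, TimeSpan.FromSeconds(10));

            // "ajax-content" is a hypothetical id for an element that the
            // Ajax response adds to the page.
            wait.Until(d => d.FindElement(By.Id("ajax-content")));

            // The page source now includes the Ajax-loaded markup.
            return chrome.PageSource;
        }
    }
}
```

Calling AjaxScraper.GetRenderedHtml(@"file path", @"<url>") from the controller action would replace the direct read of PageSource, which can run before the Ajax content arrives.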

Here is a question on whether it's possible to run Selenium headless.
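One possible way to run without a visible window, assuming a Chrome and ChromeDriver version recent enough to support Chrome's `--headless` switch, is to pass that argument through ChromeOptions. This is only a sketch: the linked question may recommend a different tool, and the HeadlessChrome class name is hypothetical.

```csharp
using OpenQA.Selenium.Chrome;

public static class HeadlessChrome
{
    // Creates a ChromeDriver that runs Chrome without opening a window.
    // driverFolder is the folder containing chromedriver.exe.
    public static ChromeDriver Create(string driverFolder)
    {
        var options = new ChromeOptions();

        // Assumption: the installed Chrome supports headless mode.
        options.AddArgument("--headless");

        return new ChromeDriver(driverFolder, options);
    }
}
```

The returned driver is used exactly like a normal ChromeDriver (Navigate, PageSource, and so on) and should still be disposed when the scrape is finished.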
