Get HTML source code from CefSharp web browser

Question

I am using a CefSharp.Wpf.ChromiumWebBrowser (version 47.0.3.0) to load a web page. At some point after the page has loaded, I want to get its source code.

I called:

wb.GetBrowser().MainFrame.GetSourceAsync()

However, it does not appear to return all of the source code (I believe this is because there are child frames).

If I call:

wb.GetBrowser().MainFrame.ViewSource() 

I can see that it lists all of the source code (including the inner frames).

I would like to get the same result as ViewSource(). Could someone point me in the right direction, please?

Update: added code example

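Note: the address the web browser points to will only work up to and including 10 March 2016. After that it may display different data, which is not the data I am looking at.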

In the frmSelection.xaml file:

<cefSharp:ChromiumWebBrowser Name="wb" Grid.Column="1" Grid.Row="0" />

In the frmSelection.xaml.cs file:

public partial class frmSelection : UserControl
{
    private System.Windows.Threading.DispatcherTimer wbTimer = new System.Windows.Threading.DispatcherTimer();

    public frmSelection()
    {
        InitializeComponent();

        // This timer will start when a web page has been loaded.
        // It will wait 4 seconds and then call wbTimer_Tick which
        // will then see if data can be extracted from the web page.
        wbTimer.Interval = new TimeSpan(0, 0, 4);
        wbTimer.Tick += new EventHandler(wbTimer_Tick);

        wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";

        wb.FrameLoadEnd += new EventHandler<CefSharp.FrameLoadEndEventArgs>(wb_FrameLoadEnd);
    }

    void wb_FrameLoadEnd(object sender, CefSharp.FrameLoadEndEventArgs e)
    {
        if (wbTimer.IsEnabled)
            wbTimer.Stop();

        wbTimer.Start();
    }

    void wbTimer_Tick(object sender, EventArgs e)
    {
        wbTimer.Stop();
        string html = GetHTMLFromWebBrowser();
    }

    private string GetHTMLFromWebBrowser()
    {
        // Call the ViewSource method, which will open up Notepad and display the html.
        // This is just so I can compare it to the html returned by GetSourceAsync().
        // This is displaying all the html code (including child frames).
        wb.GetBrowser().MainFrame.ViewSource();

        // Get the html source code from the main frame.
        // This is displaying only code in the main frame and not any child frames of it.
        Task<String> taskHtml = wb.GetBrowser().MainFrame.GetSourceAsync();

        string response = taskHtml.Result;
        return response;
    }
}

Recommended answer

I don't think I quite get this DispatcherTimer solution. I would do it like this:

public frmSelection()
{
    InitializeComponent();

    wb.FrameLoadEnd += WebBrowserFrameLoadEnded;
    wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
}

private void WebBrowserFrameLoadEnded(object sender, FrameLoadEndEventArgs e)
{
    if (e.Frame.IsMain)
    {
        wb.ViewSource();
        wb.GetSourceAsync().ContinueWith(taskHtml =>
        {
            var html = taskHtml.Result;
        });
    }
}

I did a diff on the output of ViewSource and the text in the html variable, and they are the same, so I can't reproduce your problem here.

That said, I noticed that the main frame gets loaded pretty late, so you have to wait quite a while until Notepad pops up with the source.
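
If the main frame's source does turn out to be missing child-frame content on some page, one possible approach is to ask each frame for its own source and concatenate the results. The following is only a sketch, not something taken from the answer above: FrameSourceHelper and GetAllFramesSourceAsync are hypothetical names, and it assumes the IBrowser.GetFrameIdentifiers() and IBrowser.GetFrame(long) members found in CefSharp releases of that era (later releases changed the frame identifier API, so check the version you use).

```csharp
using System.Text;
using System.Threading.Tasks;
using CefSharp;

// Hypothetical helper, not part of CefSharp.
public static class FrameSourceHelper
{
    // Collects the source of every frame the browser currently reports
    // and returns it as one string, one section per frame.
    public static async Task<string> GetAllFramesSourceAsync(IBrowser browser)
    {
        var combined = new StringBuilder();

        // Assumption: frame identifiers are of type long, as in older
        // CefSharp releases; newer releases use a different identifier API.
        foreach (long id in browser.GetFrameIdentifiers())
        {
            IFrame frame = browser.GetFrame(id);
            if (frame == null)
                continue; // the frame may have gone away in the meantime

            // Await instead of reading Task.Result, so the calling
            // thread is not blocked while the source is retrieved.
            string source = await frame.GetSourceAsync();

            combined.AppendLine("<!-- frame: " + frame.Url + " -->");
            combined.AppendLine(source);
        }

        return combined.ToString();
    }
}
```

From an async FrameLoadEnd handler this could be called as `string html = await FrameSourceHelper.GetAllFramesSourceAsync(wb.GetBrowser());`, ideally once e.Frame.IsMain is true so that the page has finished loading.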
