WebBrowser Threads don't seem to be closing

Problem description

I am using WebBrowser to render JavaScript on web pages so that I can scrape the rendered source code, but after several page loads CPU usage spikes to 100% and the thread count climbs with it.

I'm assuming that the threads are not closing properly once the page has been rendered. I am trying to open the browser, extract the source code, close the browser, and then move on to the next page.

I am able to get the rendered page, but the program doesn't get very far before bogging down. I tried adding wb.Stop(), but that didn't help. Memory doesn't seem to be the problem (it stays at a constant 70% or so).

Here is my source code.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Threading;

namespace Abot.Demo
{
    // Threaded version
    public class HeadlessBrowser
    {
        private static string GeneratedSource { get; set; }
        private static string URL { get; set; }

        public static string GetGeneratedHTML(string url)
        {
            URL = url;

            Thread t = new Thread(new ThreadStart(WebBrowserThread));
            t.SetApartmentState(ApartmentState.STA);
            t.Start();
            t.Join();

            return GeneratedSource;
        }

        private static void WebBrowserThread()
        {
            WebBrowser wb = new WebBrowser();
            wb.Navigate(URL);

            wb.DocumentCompleted +=
                new WebBrowserDocumentCompletedEventHandler(
                    wb_DocumentCompleted);

            while (wb.ReadyState != WebBrowserReadyState.Complete);
                //Application.DoEvents();

            //Added this line, because the final HTML takes a while to show up
            GeneratedSource = wb.Document.Body.InnerHtml;

            wb.Dispose();
            wb.Stop();
        }

        private static void wb_DocumentCompleted(object sender,
            WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser wb = (WebBrowser)sender;
            GeneratedSource = wb.Document.Body.InnerHtml;
        }

    }
}

Any advice would be greatly appreciated.

Thanks.

Recommended answer

WebBrowser is specifically designed to be used from inside a Windows Forms project. It is not designed to be used from outside one.

Among other things, it is designed to rely on an application loop (message loop), which exists in pretty much any desktop GUI application. You don't have one, and that is what is causing problems for you, because the browser depends on that loop for its event-based style of programming.

A quick word to any future readers who happen to be reading this and who are actually building a WinForms, WPF, or other application that already has a message loop: do not apply the following code. You should only ever have one message loop in your application. Creating several is setting yourself up for a nightmare.

Since you have no application loop, you need to create a new one, specify some code to run within it, allow it to pump messages, and then tear it down once you have your result.

public static string GetGeneratedHTML(string url)
{
    string result = null;
    ThreadStart pumpMessages = () =>
    {
        EventHandler idleHandler = null;
        idleHandler = (s, e) =>
        {
            Application.Idle -= idleHandler;

            WebBrowser wb = new WebBrowser();
            wb.DocumentCompleted += (s2, e2) =>
            {
                result = wb.Document.Body.InnerHtml;
                wb.Dispose();
                Application.Exit();
            };
            wb.Navigate(url);
        };
        Application.Idle += idleHandler;
        Application.Run();
    };
    if (Thread.CurrentThread.GetApartmentState() == ApartmentState.STA)
        pumpMessages();
    else
    {
        Thread t = new Thread(pumpMessages);
        t.SetApartmentState(ApartmentState.STA);
        t.Start();
        t.Join();
    }
    return result;
}
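
Since the question describes visiting pages one after another, a caller might look roughly like the sketch below. The names CrawlExample and FetchAll, and the fetch parameter, are hypothetical and not part of the original answer; fetch stands in for GetGeneratedHTML. The sketch assumes that method can safely be called repeatedly, each call creating and tearing down its own message loop.

```csharp
using System;
using System.Collections.Generic;

public static class CrawlExample
{
    // Hypothetical helper: fetches each URL in order, one at a time.
    // `fetch` stands in for GetGeneratedHTML from the answer above.
    public static Dictionary<string, string> FetchAll(
        IEnumerable<string> urls, Func<string, string> fetch)
    {
        var pages = new Dictionary<string, string>();
        foreach (string url in urls)
        {
            // Blocks until fetch returns, so at most one call is in flight.
            string html = fetch(url);

            // Defensive check: skip URLs for which no HTML was captured.
            if (html != null)
                pages[url] = html;
        }
        return pages;
    }
}
```

It could be called as CrawlExample.FetchAll(urls, GetGeneratedHTML), qualifying GetGeneratedHTML with its containing class as needed. Keeping the loop sequential means only one browser instance and one message loop exist at any moment.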
