WebClient hangs until timeout


Question

I am trying to download a web page using WebClient, but the call hangs until the WebClient timeout is reached and then fails with an exception.

The following code hangs:

WebClient client = new WebClient();
string url = "https://www.nasdaq.com/de/symbol/aapl/dividend-history";
string page = client.DownloadString(url);

With a different URL, the transfer works fine. For example,

WebClient client = new WebClient();
string url = "https://www.ariva.de/apple-aktie";
string page = client.DownloadString(url);

completes very quickly and leaves the whole HTML in the page variable.

Using HttpClient or WebRequest/WebResponse gives the same result for the first URL: the call blocks until the timeout exception is thrown.

Both URLs load fine in a browser, in roughly 2-5 seconds. Any idea what the problem is, and what solutions are available?

I noticed that when using a WebBrowser control on a Windows Forms dialog, the first URL loads with 20+ JavaScript errors, each of which has to be confirmed with a click. The same errors can be observed with the developer tools open in a browser when accessing the first URL.

However, WebClient does NOT act on the response it gets. It does not run the JavaScript and does not load referenced images, CSS or other scripts, so this should not be the problem.

Thanks!

Ralf

Recommended answer

The first site, https://www.nasdaq.com/de/symbol/aapl/dividend-history, requires:

  • ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
  • ServicePointManager.ServerCertificateValidationCallback
  • A User-Agent header to be set
  • A CookieContainer is, apparently, not required. It should be set anyway.

The User-Agent is important here. If a recent User-Agent is specified in WebRequest.UserAgent, the website will activate the HTTP 2.0 protocol and HSTS (HTTP Strict Transport Security), which are supported/understood only by recent browsers (as a reference, Firefox 56 or newer).

It is therefore necessary to use a less recent browser as the User-Agent; otherwise the website will expect (and wait for) a dynamic response. With an older User-Agent, the website will activate the HTTP 1.1 protocol.
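As a minimal sketch of the User-Agent point above, applied to the WebClient code from the question: the class and member names (LegacyUserAgentDownload, CreateClient, Download) are hypothetical, and the User-Agent value is an IE11-style string chosen as an example of an "older" browser. Whether a plain WebClient is sufficient for this particular site is an assumption, not something the answer confirms.

```csharp
using System.Net;

// Hypothetical helper for illustration; not part of the original answer.
public static class LegacyUserAgentDownload
{
    // IE11-style User-Agent. Assumption: any sufficiently old browser string
    // leads the site to answer over HTTP 1.1.
    public const string OldUserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW32; Trident/7.0; rv:11.0) like Gecko";

    // Builds a WebClient that sends the older User-Agent with its request.
    public static WebClient CreateClient()
    {
        // Process-wide setting: use TLS 1.2 for HTTPS connections.
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        var client = new WebClient();
        client.Headers.Add("User-Agent", OldUserAgent);
        return client;
    }

    // Downloads the page as a string, using a fresh client for each call.
    public static string Download(string url)
    {
        using (WebClient client = CreateClient())
        {
            return client.DownloadString(url);
        }
    }
}
```

A call such as LegacyUserAgentDownload.Download("https://www.nasdaq.com/de/symbol/aapl/dividend-history") would then replace the three lines from the question. A new client is created per call so that the header is always set before the request is sent.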

The second site, https://www.ariva.de/apple-aktie, requires:

  • ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
  • No server certificate validation is required
  • No specific User-Agent is required

I suggest setting up a WebRequest (or a corresponding HttpClient setup) this way:
(WebClient could work, but it would probably require a derived custom class)

private async void button1_Click(object sender, EventArgs e)
{
    button1.Enabled = false;
    Uri uri = new Uri("https://www.nasdaq.com/de/symbol/aapl/dividend-history");
    string destinationFile = "[Some Local File]";
    await HTTPDownload(uri, destinationFile);
    button1.Enabled = true;
}


CookieContainer httpCookieJar = new CookieContainer();

//The 32bit IE11 header is the User-agent used here
public async Task HTTPDownload(Uri resourceURI, string filePath)
{
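    // Process-wide settings: they affect every request made after this point.
    ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
    // Accepts every server certificate, which disables validation; restrict this in production code.
    ServicePointManager.ServerCertificateValidationCallback += (s, cert, ch, sec) => { return true; };
    ServicePointManager.DefaultConnectionLimit = 50;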

    HttpWebRequest httpRequest = WebRequest.CreateHttp(resourceURI);

    try
    {
        httpRequest.CookieContainer = httpCookieJar;
        httpRequest.Timeout = (int)TimeSpan.FromSeconds(15).TotalMilliseconds;
        httpRequest.AllowAutoRedirect = true;
        httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        httpRequest.ServicePoint.Expect100Continue = false;
        httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW32; Trident/7.0; rv:11.0) like Gecko";
        httpRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        httpRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip, deflate;q=0.8");
        httpRequest.Headers.Add(HttpRequestHeader.CacheControl, "no-cache");

        using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
        using (Stream responseStream = httpResponse.GetResponseStream())
        {
            if (httpResponse.StatusCode == HttpStatusCode.OK)
            {
                try
                {
                    int buffersize = 132072;
                    using (FileStream fileStream = File.Create(filePath, buffersize, FileOptions.Asynchronous))
                    {
                        int read;
                        byte[] buffer = new byte[buffersize];
                        while ((read = await responseStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                        {
                            await fileStream.WriteAsync(buffer, 0, read);
                        }
                    };
                }
                catch (DirectoryNotFoundException) { /* Log or throw */}
                catch (PathTooLongException) { /* Log or throw */}
                catch (IOException) { /* Log or throw */}
            }
        };
    }
    catch (WebException) { /* Log and message */} 
    catch (Exception) { /* Log and message */}
}
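
The answer mentions that a corresponding HttpClient setup is possible. The following is a rough sketch of what that might look like, mirroring the settings of the HttpWebRequest code; the names HttpClientSketch, CreateClient and DownloadStringAsync are hypothetical. It deliberately omits the accept-all certificate callback, on the assumption that the server certificate validates normally, so it is not a one-to-one port.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical HttpClient counterpart of HTTPDownload; names are illustrative.
public static class HttpClientSketch
{
    // Same IE11-style User-Agent idea as in the HttpWebRequest version.
    public const string OldUserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW32; Trident/7.0; rv:11.0) like Gecko";

    // Creates an HttpClient with cookies, redirects, decompression,
    // a 15-second timeout and the older User-Agent.
    public static HttpClient CreateClient(CookieContainer cookies)
    {
        // Process-wide setting: use TLS 1.2 for HTTPS connections.
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        var handler = new HttpClientHandler
        {
            CookieContainer = cookies,
            UseCookies = true,
            AllowAutoRedirect = true,
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };

        var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(15) };

        // TryAddWithoutValidation stores the header values without strict parsing.
        client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", OldUserAgent);
        client.DefaultRequestHeaders.TryAddWithoutValidation(
            "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        client.DefaultRequestHeaders.ExpectContinue = false;
        return client;
    }

    // Fetches the page body as a string; throws on non-success status codes.
    public static async Task<string> DownloadStringAsync(HttpClient client, Uri uri)
    {
        using (HttpResponseMessage response = await client.GetAsync(uri))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

In a Windows Forms handler like button1_Click, one client would typically be created once, kept in a field, and reused for every call to DownloadStringAsync, rather than being created per request.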

The payload returned by the first website (nasdaq.com) is 101,562 bytes long.
The payload returned by the second website (www.ariva.de) is 56,919 bytes long.
