How many connections in HttpClient

Question

I have to download about 16k documents and the same number of HTML pages from the internet, and this number will increase in the future. Currently I am just using Parallel.ForEach to download and work on the data in parallel. This does not seem to fully utilize my resources, however, so I am planning to bring async/await into play to have as many downloads running asynchronously as possible, though I will probably have to limit that.

How many open connections can a single HttpClient have? What other factors will I have to keep in mind when creating that many connections? I am aware that I should reuse the same HttpClient, and I have also read this answer, but I doubt that I can really have several billion connections open at once.

Recommended answer

First, good call on switching from Parallel.ForEach to async/await. With Parallel.ForEach, each in-flight download typically occupies a thread that sits blocked while it waits on the network. By breaking from the constraints of threads, you'll be able to increase concurrency by orders of magnitude.


"I have doubts that I can really have several billion connections open at once."

Let's say you could. Do you think the job would complete any faster than if you had, say, 1000 open at once? The limit you're going to hit first is bandwidth (or possibly the server refusing requests), not concurrent connections. So if your goal is to complete the job as fast as possible, I would suggest that the maximum number of connections you can possibly have open at once isn't even relevant.

That said, .NET imposes default limits. Assuming you're on full framework or .NET Core 2.x, the limit can be changed programmatically via ServicePointManager.DefaultConnectionLimit, which caps concurrent connections per host and has a default value of just 2. Set it to something much bigger.
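A minimal sketch of raising that limit follows. It assumes a runtime where ServicePointManager governs HttpClient connections, as the answer describes. The helper name `CreateSharedClient` and the value 100 are illustrative placeholders, not recommendations. The sketch also sets `HttpClientHandler.MaxConnectionsPerServer`, a per-handler property available on .NET Core and newer full-framework versions; whether your runtime honors one setting, the other, or both is something to verify for your own target.

```csharp
using System.Net;
using System.Net.Http;

public static class HttpSetup
{
    // Hypothetical helper: builds the single HttpClient the application shares.
    public static HttpClient CreateSharedClient(int maxConnectionsPerHost = 100)
    {
        // Process-wide default for connections per host.
        // Set it once at startup, before the first request is sent.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;

        // Assumption: the target runtime exposes this property
        // (.NET Core, or a recent full-framework version).
        var handler = new HttpClientHandler
        {
            MaxConnectionsPerServer = maxConnectionsPerHost
        };

        return new HttpClient(handler);
    }
}
```

Raising the limit only removes the cap. It does not by itself decide how many requests your code issues at once, which is what the throttling step below is for.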

Next, I would suggest setting up your code to perform the downloads concurrently up to some limit, using either SemaphoreSlim or TPL Dataflow. Both approaches are well covered in answers to this question. Then start experimenting until you find an optimal number. It's hard to say what that is. Maybe start with 50. If it goes well, increase it to 100 and see if the overall job completes any faster. If you start getting socket exceptions or error responses from the server, dial it down.
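The SemaphoreSlim variant of that throttling might look roughly like the sketch below. The class name `ThrottledDownloader`, the method `DownloadAllAsync`, and the optional `fetch` delegate (added only so the throttle can be exercised without a network) are hypothetical names for illustration. The default of 50 is the answer's suggested starting point, not a tuned value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloader
{
    // One shared client for the whole process, as the question already plans.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: fetches every URL with at most maxConcurrency
    // requests in flight. Results come back in the same order as the input.
    public static async Task<string[]> DownloadAllAsync(
        IEnumerable<string> urls,
        int maxConcurrency = 50,
        Func<string, Task<string>> fetch = null)
    {
        // Default to a plain GET on the shared client.
        fetch = fetch ?? (url => Client.GetStringAsync(url));

        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            // All tasks are created up front, but each one waits for a
            // semaphore slot before it actually starts its request.
            var tasks = urls.Select(async url =>
            {
                await throttle.WaitAsync();
                try
                {
                    return await fetch(url);
                }
                finally
                {
                    // Free the slot even if the request failed.
                    throttle.Release();
                }
            }).ToList();

            return await Task.WhenAll(tasks);
        }
    }
}
```

Timing a full run of `await ThrottledDownloader.DownloadAllAsync(urls, 50)` and then repeating it with 100 is one way to carry out the experiment described above.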
