Is async HttpClient from .Net 4.5 a bad choice for intensive load applications?

Question

I recently created a simple application for testing the HTTP call throughput that can be generated in an asynchronous manner vs a classical multithreaded approach.

The application is able to perform a predefined number of HTTP calls and, at the end, it displays the total time needed to perform them. During my tests, all HTTP calls were made to my local IIS server and they retrieved a small text file (12 bytes in size).

The most important part of the code for the asynchronous implementation is listed below:

public async void TestAsync()
{
    this.TestInit();
    HttpClient httpClient = new HttpClient();

    for (int i = 0; i < NUMBER_OF_REQUESTS; i++)
    {
        ProcessUrlAsync(httpClient);
    }
}

private async void ProcessUrlAsync(HttpClient httpClient)
{
    HttpResponseMessage httpResponse = null;

    try
    {
        Task<HttpResponseMessage> getTask = httpClient.GetAsync(URL);
        httpResponse = await getTask;

        Interlocked.Increment(ref _successfulCalls);
    }
    catch (Exception ex)
    {
        Interlocked.Increment(ref _failedCalls);
    }
    finally
    { 
        if(httpResponse != null) httpResponse.Dispose();
    }

    lock (_syncLock)
    {
        _itemsLeft--;
        if (_itemsLeft == 0)
        {
            _utcEndTime = DateTime.UtcNow;
            this.DisplayTestResults();
        }
    }
}

The most important part of the multithreading implementation is listed below:

public void TestParallel2()
{
    this.TestInit();
    ServicePointManager.DefaultConnectionLimit = 100;

    for (int i = 0; i < NUMBER_OF_REQUESTS; i++)
    {
        Task.Run(() =>
        {
            try
            {
                this.PerformWebRequestGet();
                Interlocked.Increment(ref _successfulCalls);
            }
            catch (Exception ex)
            {
                Interlocked.Increment(ref _failedCalls);
            }

            lock (_syncLock)
            {
                _itemsLeft--;
                if (_itemsLeft == 0)
                {
                    _utcEndTime = DateTime.UtcNow;
                    this.DisplayTestResults();
                }
            }
        });
    }
}

private void PerformWebRequestGet()
{ 
    HttpWebRequest request = null;
    HttpWebResponse response = null;

    try
    {
        request = (HttpWebRequest)WebRequest.Create(URL);
        request.Method = "GET";
        request.KeepAlive = true;
        response = (HttpWebResponse)request.GetResponse();
    }
    finally
    {
        if (response != null) response.Close();
    }
}

Running the tests revealed that the multithreaded version was faster. It took around 0.6 seconds to complete 10k requests, while the async one took around 2 seconds for the same amount of load. This was a bit of a surprise, because I was expecting the async one to be faster. Maybe it was because my HTTP calls were very fast. In a real-world scenario, where the server performs a more meaningful operation and where there is also some network latency, the results might be reversed.

However, what really concerns me is the way HttpClient behaves when the load is increased. Since it takes around 2 seconds to deliver 10k messages, I thought it would take around 20 seconds to deliver 10 times as many, but running the test showed that it needs around 50 seconds for the 100k messages. Furthermore, it usually takes more than 2 minutes to deliver 200k messages, and often a few thousand of them (3-4k) fail with the following exception:

An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.

I checked the IIS logs and the operations that failed never got to the server. They failed within the client. I ran the tests on a Windows 7 machine with the default ephemeral port range of 49152 to 65535. Running netstat showed that around 5-6k ports were being used during the tests, so in theory there should have been many more available. If the lack of ports was indeed the cause of the exceptions, it means that either netstat didn't properly report the situation or HttpClient only uses a maximum number of ports after which it starts throwing exceptions.
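As a side note, port usage can also be inspected from code instead of netstat. The following is only an illustrative sketch (not part of the original tester) that uses the System.Net.NetworkInformation API to summarize active TCP connections by state:

// Illustrative only: dumps a netstat-like summary of active TCP connections.
// Requires: using System; using System.Linq; using System.Net.NetworkInformation;
private static void DumpTcpConnectionSummary()
{
    TcpConnectionInformation[] connections =
        IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();

    // Group by state to see how many sockets are Established vs. lingering in TimeWait.
    foreach (var group in connections.GroupBy(c => c.State))
    {
        Console.WriteLine("{0}: {1}", group.Key, group.Count());
    }

    Console.WriteLine("Total TCP connections: {0}", connections.Length);
}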

By contrast, the multithreaded approach of generating HTTP calls behaved very predictably. It took around 0.6 seconds for 10k messages, around 5.5 seconds for 100k messages and, as expected, around 55 seconds for 1 million messages. None of the messages failed. Furthermore, while it ran, it never used more than 55 MB of RAM (according to Windows Task Manager). The memory used when sending messages asynchronously grew proportionally with the load; it used around 500 MB of RAM during the 200k messages test.

I think there are two main reasons for the above results. The first one is that HttpClient seems to be very greedy in creating new connections with the server. The high number of used ports reported by netstat means that it probably doesn't benefit much from HTTP keep-alive.

The second is that HttpClient doesn't seem to have a throttling mechanism. In fact, this seems to be a general problem related to async operations. If you need to perform a very large number of operations, they will all be started at once and their continuations will be executed as they become available. In theory this should be fine, because in async operations the load is on external systems, but as shown above this is not entirely the case. Having a large number of requests started at once increases memory usage and slows down the entire execution.

I managed to obtain better results, memory and execution time wise, by limiting the maximum number of asynchronous requests with a simple but primitive delay mechanism:

public async void TestAsyncWithDelay()
{
    this.TestInit();
    HttpClient httpClient = new HttpClient();

    for (int i = 0; i < NUMBER_OF_REQUESTS; i++)
    {
        if (_activeRequestsCount >= MAX_CONCURENT_REQUESTS)
            await Task.Delay(DELAY_TIME);

        ProcessUrlAsyncWithReqCount(httpClient);
    }
}
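ProcessUrlAsyncWithReqCount is not listed above; the sketch below is only my assumed reconstruction of what it roughly looks like, not the original code. It mirrors ProcessUrlAsync but maintains the _activeRequestsCount field that the loop above checks.

// Assumed reconstruction of the helper used by TestAsyncWithDelay (not the original listing).
private async void ProcessUrlAsyncWithReqCount(HttpClient httpClient)
{
    Interlocked.Increment(ref _activeRequestsCount);
    HttpResponseMessage httpResponse = null;

    try
    {
        httpResponse = await httpClient.GetAsync(URL);
        Interlocked.Increment(ref _successfulCalls);
    }
    catch (Exception)
    {
        Interlocked.Increment(ref _failedCalls);
    }
    finally
    {
        if (httpResponse != null) httpResponse.Dispose();
        // Free a slot so the producer loop can issue the next request.
        Interlocked.Decrement(ref _activeRequestsCount);
    }
}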

It would be really useful if HttpClient included a mechanism for limiting the number of concurrent requests. When using the Task class (which is based on the .Net thread pool) throttling is automatically achieved by limiting the number of concurrent threads.
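For example, such throttling can be layered on top of HttpClient with a SemaphoreSlim. The following is only a sketch of the idea, assuming the constants and counter fields from the tests above (and that MAX_CONCURENT_REQUESTS is a const):

// Sketch: bound the number of in-flight requests with a semaphore.
private readonly SemaphoreSlim _throttle = new SemaphoreSlim(MAX_CONCURENT_REQUESTS);

public async Task TestAsyncThrottled()
{
    this.TestInit();
    HttpClient httpClient = new HttpClient();

    var tasks = new List<Task>();
    for (int i = 0; i < NUMBER_OF_REQUESTS; i++)
    {
        tasks.Add(ProcessUrlThrottledAsync(httpClient));
    }

    // Completion of all requests can be awaited instead of counting items left.
    await Task.WhenAll(tasks);
}

private async Task ProcessUrlThrottledAsync(HttpClient httpClient)
{
    // Wait for a free slot before issuing the request, release it when done.
    await _throttle.WaitAsync();
    try
    {
        using (HttpResponseMessage response = await httpClient.GetAsync(URL))
        {
            Interlocked.Increment(ref _successfulCalls);
        }
    }
    catch (Exception)
    {
        Interlocked.Increment(ref _failedCalls);
    }
    finally
    {
        _throttle.Release();
    }
}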

For a complete overview, I have also created a version of the async test based on HttpWebRequest rather than HttpClient and managed to obtain much better results. For a start, it allows setting a limit on the number of concurrent connections (with ServicePointManager.DefaultConnectionLimit or via config), which means that it never ran out of ports and never failed on any request (HttpClient, by default, is based on HttpWebRequest, but it seems to ignore the connection limit setting).
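The HttpWebRequest-based async code is not listed here; a minimal sketch of what such a call can look like in .NET 4.5 (names are mine, not the original test code) is:

// Sketch of an async GET built on HttpWebRequest; GetResponseAsync is available since .NET 4.5
// and the underlying ServicePoint honours ServicePointManager.DefaultConnectionLimit.
private async Task PerformWebRequestGetAsync()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);
    request.Method = "GET";
    request.KeepAlive = true;

    using (WebResponse response = await request.GetResponseAsync())
    {
        // The body is ignored here, just as in the synchronous PerformWebRequestGet.
    }
}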

The async HttpWebRequest approach was still about 50-60% slower than the multithreaded one, but it was predictable and reliable. Its only downside was that it used a huge amount of memory under heavy load; for example, it needed around 1.6 GB for sending 1 million requests. By limiting the number of concurrent requests (as I did above for HttpClient) I managed to reduce the memory used to just 20 MB and obtain an execution time only 10% slower than the multithreaded approach.

After this lengthy presentation, my questions are: Is the HttpClient class from .Net 4.5 a bad choice for intensive load applications? Is there a way to throttle it that would fix the problems I mentioned above? How about the async flavor of HttpWebRequest?

UPDATE (thanks to @Stephen Cleary)

As it turns out, HttpClient, just like HttpWebRequest (on which it is based by default), can have its number of concurrent connections to the same host limited with ServicePointManager.DefaultConnectionLimit. The strange thing is that, according to MSDN, the default value for the connection limit is 2. I also checked that on my side with the debugger, which confirmed that 2 is indeed the default value. However, it seems that unless a value is explicitly set for ServicePointManager.DefaultConnectionLimit, the default value is ignored. Since I didn't explicitly set a value for it during my HttpClient tests, I think it was ignored.

After setting ServicePointManager.DefaultConnectionLimit to 100, HttpClient became reliable and predictable (netstat confirms that only 100 ports are used). It is still slower than async HttpWebRequest (by about 40%), but, strangely, it uses less memory. For the test involving 1 million requests, it used a maximum of 550 MB, compared to 1.6 GB for the async HttpWebRequest.
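For reference, the change amounts to setting the limit once before any request is issued (sketch only; the exact wiring into the test code is not shown here):

// Must be set before the first request creates the ServicePoint for the target host.
ServicePointManager.DefaultConnectionLimit = 100;
HttpClient httpClient = new HttpClient();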

So, while HttpClient in combination with ServicePointManager.DefaultConnectionLimit seems to ensure reliability (at least for the scenario where all calls are made to the same host), its performance still looks negatively impacted by the lack of a proper throttling mechanism. Something that would limit the concurrent number of requests to a configurable value and queue the rest would make it much more suitable for high scalability scenarios.

Answer

Besides the tests mentioned in the question, I recently created some new ones involving much fewer HTTP calls (5,000 compared to 1 million previously) but with requests that took much longer to execute (500 milliseconds compared to around 1 millisecond previously). Both tester applications, the synchronous multithreaded one (based on HttpWebRequest) and the asynchronous I/O one (based on HttpClient), produced similar results: about 10 seconds to execute, using around 3% of the CPU and 30 MB of memory. The only difference between the two was that the multithreaded one used 310 threads to execute, while the asynchronous one used just 22. So in an application that combined both I/O-bound and CPU-bound operations, the asynchronous version would have produced better results, because there would have been more CPU time available for the threads performing CPU operations, which are the ones that actually need it (threads waiting for I/O operations to complete are just wasting it).

As a conclusion to my tests, asynchronous HTTP calls are not the best option when dealing with very fast requests. The reason is that when running a task that contains an asynchronous I/O call, the thread on which the task is started exits as soon as the asynchronous call is made, and the rest of the task is registered as a callback. Then, when the I/O operation completes, the callback is queued for execution on the first available thread. All of this creates overhead, which makes fast I/O operations more efficient when executed on the thread that started them.

Asynchronous HTTP calls are a good option when dealing with long or potentially long I/O operations, because they don't keep any threads busy waiting for the I/O operations to complete. This decreases the overall number of threads used by an application, allowing more CPU time to be spent on CPU-bound operations. Furthermore, in applications that only allocate a limited number of threads (as is the case with web applications), asynchronous I/O prevents thread pool depletion, which can happen when performing I/O calls synchronously.

So, async HttpClient is not a bottleneck for intensive load applications. It is just that by its nature it is not very well suited for very fast HTTP requests; instead, it is ideal for long or potentially long ones, especially inside applications that only have a limited number of threads available. Also, it is good practice to limit concurrency via ServicePointManager.DefaultConnectionLimit with a value high enough to ensure a good level of parallelism, but low enough to prevent ephemeral port depletion. You can find more details on the tests and conclusions presented for this question here.
