Why does cancellation block for so long when cancelling a lot of HTTP requests?

Problem description

I have some code that performs batch HTML page processing using content from one specific host. It tries to make a large number (~400) of simultaneous HTTP requests using HttpClient. I believe that the maximum number of simultaneous connections is restricted by ServicePointManager.DefaultConnectionLimit, so I'm not applying my own concurrency restrictions.

After sending all of the requests asynchronously to HttpClient using Task.WhenAll, the entire batch operation can be cancelled using CancellationTokenSource and CancellationToken. The progress of the operation is viewable via a user interface, and a button can be clicked to perform the cancellation.

The call to CancellationTokenSource.Cancel() blocks for roughly 5 - 30 seconds. This causes the user interface to freeze. I suspect that this occurs because the method is calling the code that registered for cancellation notification.
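
The suspicion above can be checked in isolation, without any HTTP traffic. The following is a minimal sketch, assuming artificial callbacks that simply sleep; the names CancelCallbackDemo and TimeCancel are hypothetical and not part of the original code. It relies on the fact that Cancel() invokes the callbacks registered on the token before it returns (here on the calling thread, since no synchronization context is captured).

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CancelCallbackDemo
{
    // Registers `callbackCount` callbacks that each block for `callbackMilliseconds`,
    // then returns how long the Cancel() call itself took.
    public static TimeSpan TimeCancel(int callbackCount, int callbackMilliseconds)
    {
        using (var cts = new CancellationTokenSource())
        {
            for (var i = 0; i < callbackCount; i++)
                cts.Token.Register(() => Thread.Sleep(callbackMilliseconds));

            var stopwatch = Stopwatch.StartNew();
            cts.Cancel(); // runs every registered callback before returning
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }
    }

    public static void Main()
    {
        // Hypothetical numbers: 20 callbacks that each sleep for 10 ms.
        Console.WriteLine("Cancel() took {0} ms", TimeCancel(20, 10).TotalMilliseconds);
    }
}
```

If the measured time grows with the number and cost of the callbacks, slow cancellation callbacks (or callbacks that have to wait for something else) are a plausible cause of the freeze.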

I have considered the following options:

  1. Limiting the number of simultaneous HTTP request tasks. I consider this a work-around because HttpClient already seems to queue excess requests itself.
  2. Performing the CancellationTokenSource.Cancel() method call in a non-UI thread. This didn't work too well; the task didn't actually run until most of the others had finished. I think an async version of the method would work well, but I couldn't find one. Also, I have the impression that it's suitable to use the method in a UI thread.

Demonstration

Code

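using System;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class Program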
{
    private const int desiredNumberOfConnections = 418;

    static void Main(string[] args)
    {
        ManyHttpRequestsTest().Wait();

        Console.WriteLine("Finished.");
        Console.ReadKey();
    }

    private static async Task ManyHttpRequestsTest()
    {
        using (var client = new HttpClient())
        using (var cancellationTokenSource = new CancellationTokenSource())
        {
            var requestsCompleted = 0;

            using (var allRequestsStarted = new CountdownEvent(desiredNumberOfConnections))
            {
                Action reportRequestStarted = () => allRequestsStarted.Signal();
                Action reportRequestCompleted = () => Interlocked.Increment(ref requestsCompleted);
                Func<int, Task> getHttpResponse = index => GetHttpResponse(client, cancellationTokenSource.Token, reportRequestStarted, reportRequestCompleted);
                var httpRequestTasks = Enumerable.Range(0, desiredNumberOfConnections).Select(getHttpResponse);

                Console.WriteLine("HTTP requests batch being initiated");
                var httpRequestsTask = Task.WhenAll(httpRequestTasks);

                Console.WriteLine("Starting {0} requests (simultaneous connection limit of {1})", desiredNumberOfConnections, ServicePointManager.DefaultConnectionLimit);
                allRequestsStarted.Wait();

                Cancel(cancellationTokenSource);
                await WaitForRequestsToFinish(httpRequestsTask);
            }

            Console.WriteLine("{0} HTTP requests were completed", requestsCompleted);
        }
    }

    private static void Cancel(CancellationTokenSource cancellationTokenSource)
    {
        Console.Write("Cancelling...");

        var stopwatch = Stopwatch.StartNew();
        cancellationTokenSource.Cancel();
        stopwatch.Stop();

        Console.WriteLine("took {0} seconds", stopwatch.Elapsed.TotalSeconds);
    }

    private static async Task WaitForRequestsToFinish(Task httpRequestsTask)
    {
        Console.WriteLine("Waiting for HTTP requests to finish");

        try
        {
            await httpRequestsTask;
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("HTTP requests were cancelled");
        }
    }

    private static async Task GetHttpResponse(HttpClient client, CancellationToken cancellationToken, Action reportStarted, Action reportFinished)
    {
        var getResponse = client.GetAsync("http://www.google.com", cancellationToken);

        reportStarted();
        using (var response = await getResponse)
            response.EnsureSuccessStatusCode();
        reportFinished();
    }
}

Why does cancellation block for so long? Also, is there anything that I'm doing wrong or could be doing better?

Recommended answer

"Performing the CancellationTokenSource.Cancel() method call in a non-UI thread. This didn't work too well; the task didn't actually run until most of the others had finished."

What this tells me is that you're probably suffering from "threadpool exhaustion": your threadpool queue has so many items in it (from HTTP requests completing) that it takes a while to get through them all. Cancellation is probably blocked waiting on some threadpool work item to execute, and that item can't skip to the head of the queue.

This suggests that you do need to go with option 1 from your consideration list. Throttle your own work so that the threadpool queue remains relatively short. This is good for app responsiveness overall anyway.
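
As a rough illustration of what throttling your own work can look like without any extra packages, the sketch below caps the number of in-flight operations with a SemaphoreSlim. It is only a sketch under stated assumptions: the names ThrottleDemo and RunThrottledAsync are hypothetical, and Task.Delay stands in for the real HTTP request so that the example runs without a network.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs `work` once per index in [0, count), with at most `maxConcurrency` running at a time.
    // Returns the highest number of operations that were observed in flight at once.
    public static async Task<int> RunThrottledAsync(
        int count, int maxConcurrency, Func<int, CancellationToken, Task> work, CancellationToken cancellationToken)
    {
        var sync = new object();
        var inFlight = 0;
        var peak = 0;

        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = Enumerable.Range(0, count).Select(async index =>
            {
                // Operations beyond the limit wait here instead of starting their work.
                await gate.WaitAsync(cancellationToken);
                lock (sync) { inFlight++; peak = Math.Max(peak, inFlight); }
                try
                {
                    await work(index, cancellationToken);
                }
                finally
                {
                    lock (sync) { inFlight--; }
                    gate.Release();
                }
            }).ToList();

            // WhenAll only completes once every task has finished, so the gate is safe to dispose afterwards.
            await Task.WhenAll(tasks);
        }

        return peak;
    }

    public static void Main()
    {
        // Task.Delay is a placeholder for something like client.GetAsync(uri, token).
        var peak = RunThrottledAsync(100, 20, (i, ct) => Task.Delay(50, ct), CancellationToken.None)
            .GetAwaiter().GetResult();
        Console.WriteLine("Peak concurrency observed: {0}", peak);
    }
}
```

With this shape, only a bounded number of operations have started when the token is cancelled; the remainder fail at WaitAsync without ever issuing a request.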

My favorite way to throttle async work is to use Dataflow. Something like this:

// Requires the System.Threading.Tasks.Dataflow package (using System.Threading.Tasks.Dataflow;).
// cancellationToken is assumed to come from the caller's CancellationTokenSource.
var httpClient = new HttpClient(); // HttpClient's request methods (GetAsync etc.) are thread-safe, so one shared instance is enough.
var block = new ActionBlock<Uri>(
    async uri =>
    {
        using (var result = await httpClient.GetAsync(uri, cancellationToken))
        {
            // do more stuff with result.
        }
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20, CancellationToken = cancellationToken });
for (int i = 0; i < 1000; i++)
    block.Post(new Uri("http://www.server.com/req" + i));
block.Complete();
await block.Completion; // completes when everything is done; throws OperationCanceledException if canceled.

As an alternative, you could use Task.Factory.StartNew passing in TaskCreationOptions.LongRunning so your task gets a new thread (not affiliated with threadpool) which would allow it to start immediately and call Cancel from there. But you should probably solve the threadpool exhaustion problem instead.
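
A minimal sketch of that alternative, assuming the default task scheduler (which typically gives a LongRunning task its own thread rather than a pool thread). The names DedicatedThreadCancel and CancelOnDedicatedThread are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DedicatedThreadCancel
{
    // Calls Cancel() from a LongRunning task so that it does not have to wait in the threadpool queue.
    // The returned task completes once Cancel() has returned.
    public static Task CancelOnDedicatedThread(CancellationTokenSource cancellationTokenSource)
    {
        return Task.Factory.StartNew(
            () => cancellationTokenSource.Cancel(),
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            // Report which kind of thread ends up running the cancellation callback.
            cts.Token.Register(() =>
                Console.WriteLine("Callback on threadpool thread: {0}", Thread.CurrentThread.IsThreadPoolThread));

            CancelOnDedicatedThread(cts).Wait();
            Console.WriteLine("IsCancellationRequested: {0}", cts.IsCancellationRequested);
        }
    }
}
```

In a UI application the returned task could be awaited from the button handler, which keeps the UI thread free while the callbacks run. The callbacks themselves still take as long as they take, so this only moves the blocking elsewhere.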
