How to efficiently make 1000s of web requests as quickly as possible


Question

I need to make 100,000s of lightweight (i.e. small Content-Length) web requests from a C# console app. What is the fastest way I can do this (i.e. have completed all the requests in the shortest possible time) and what best practices should I follow? I can't fire and forget because I need to capture the responses.

Presumably I'd want to use the async web request methods; however, I'm wondering what the impact of the overhead of storing all the Task continuations and marshalling would be.

Memory consumption is not an overall concern; the objective is speed.

Presumably I'd also want to make use of all the cores available.

So I can do something like this:

Parallel.ForEach(iterations, i =>
{
    var response = await MakeRequest(i);
    // do thing with response
});

but that won't make me go any faster than my number of cores allows...

I can do this:

Parallel.ForEach(iterations, i =>
{
    var response = MakeRequest(i);
    response.GetAwaiter().OnCompleted(() =>
    {
        // do thing with response
    });
});

but how do I keep my program running after the ForEach? Holding on to all the Tasks and WhenAll-ing them feels bloated. Are there any existing patterns or helpers that provide some kind of Task queue?

Is there any way to do better than this, and how should I handle throttling/error detection? For instance, if the remote endpoint is slow to respond, I don't want to continue spamming it.

I know I also need to do:

ServicePointManager.DefaultConnectionLimit = int.MaxValue;

Anything else necessary?

Recommended answer

The Parallel class does not work with async loop bodies so you can't use it. Your loop body completes almost immediately and returns a task. There is no parallelism benefit here.
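
The point above can be seen in a small experiment. This is only a sketch: MakeRequest here is a hypothetical stand-in that waits 500 ms instead of calling a real endpoint, and the iteration count is arbitrary. Because Parallel.ForEach takes an Action<T>, the async lambda becomes an async void method, so ForEach returns as soon as every request has been started, not when it has finished.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelAsyncPitfall
{
    // Hypothetical stand-in for a web request: waits 500 ms and echoes its input.
    private static async Task<int> MakeRequest(int i)
    {
        await Task.Delay(500);
        return i;
    }

    public static void Main()
    {
        int completed = 0;

        // The async lambda is converted to Action<int>, i.e. async void.
        // Each invocation returns at its first await, so ForEach only
        // waits for the requests to be started, not for them to complete.
        Parallel.ForEach(Enumerable.Range(0, 20), async i =>
        {
            await MakeRequest(i);
            Interlocked.Increment(ref completed);
        });

        Console.WriteLine($"Completed when ForEach returned: {Volatile.Read(ref completed)}");

        // Crude wait, only so the demo can show the requests finishing later.
        Thread.Sleep(2000);
        Console.WriteLine($"Completed two seconds later: {Volatile.Read(ref completed)}");
    }
}
```

The first count is typically far below 20, which is why the program in the question would need some other way to wait for the responses.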

This is a very easy problem. Use one of the standard solutions for processing a series of items asynchronously with a given degree of parallelism (DOP). This one is good: http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx (use the last piece of code).
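
For readers who cannot reach the link, here is a minimal sketch of one such solution, written from memory in the style of a partitioner-based ForEachAsync helper and not copied from the post. MakeRequest, the item count of 1000, and the DOP of 100 are all placeholder assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncEnumerableExtensions
{
    // Processes every item with at most `dop` bodies in flight at once.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        // GetPartitions hands out `dop` enumerators that pull from the same
        // source, so each worker takes the next item whenever it is free.
        var workers = Partitioner.Create(source)
            .GetPartitions(dop)
            .Select(partition => Task.Run(async () =>
            {
                using (partition)
                {
                    while (partition.MoveNext())
                    {
                        await body(partition.Current);
                    }
                }
            }));

        return Task.WhenAll(workers);
    }
}

public static class Program
{
    // Hypothetical stand-in for the real web request.
    private static async Task<string> MakeRequest(int i)
    {
        await Task.Delay(50);
        return $"response {i}";
    }

    public static async Task Main()
    {
        var responses = new ConcurrentDictionary<int, string>();

        // Only `dop` worker tasks exist, not one task per request.
        await Enumerable.Range(0, 1000).ForEachAsync(dop: 100, body: async i =>
        {
            responses[i] = await MakeRequest(i);
        });

        Console.WriteLine($"Captured {responses.Count} responses");
    }
}
```

Awaiting the returned task keeps the console app alive until all requests finish. Note that in this sketch an exception thrown by body stops that worker, so real code would probably catch and record failures inside body. On .NET 6 and later, the built-in Parallel.ForEachAsync with ParallelOptions.MaxDegreeOfParallelism covers similar ground.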

You need to empirically determine the right DOP. Simply try different values. There is no theoretical way to derive the best value because it is dependent on many things.
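
One way to run that experiment is to time the same batch at several DOP values. The sketch below uses a SemaphoreSlim gate as the throttle; MakeRequest is again a hypothetical stand-in (a 20 ms delay), and the batch size and candidate DOP values are arbitrary assumptions. Real measurements should be taken against the actual endpoint, since a simulated delay says nothing about how the server or network behaves under load.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DopBenchmark
{
    // Hypothetical stand-in for the real request; replace with the actual call.
    private static Task MakeRequest(int i) => Task.Delay(20);

    // Runs `count` requests with at most `dop` in flight and returns the elapsed time.
    private static async Task<TimeSpan> RunAsync(int count, int dop)
    {
        using (var gate = new SemaphoreSlim(dop))
        {
            var stopwatch = Stopwatch.StartNew();

            // One task per request is created up front; the semaphore
            // limits how many of them are past the gate at any moment.
            var tasks = Enumerable.Range(0, count).Select(async i =>
            {
                await gate.WaitAsync();
                try
                {
                    await MakeRequest(i);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            await Task.WhenAll(tasks);
            return stopwatch.Elapsed;
        }
    }

    public static async Task Main()
    {
        // Candidate values are arbitrary; widen or narrow the range as needed.
        foreach (var dop in new[] { 5, 20, 100 })
        {
            var elapsed = await RunAsync(count: 200, dop: dop);
            Console.WriteLine($"DOP {dop,4}: {elapsed.TotalMilliseconds:F0} ms");
        }
    }
}
```

Against a real server the curve usually flattens or gets worse past some point, and that knee is the value to pick.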

The connection limit is the only limit that's in your way.
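
As a rough illustration of where that limit lives: on .NET Framework, ServicePointManager.DefaultConnectionLimit defaults to 2 connections per host for a console app, so it has to be raised. On .NET Core and later, HttpClient does not consult ServicePointManager, and the equivalent knob is HttpClientHandler.MaxConnectionsPerServer. The value 200 below is an arbitrary placeholder, not a recommendation.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class ConnectionLimits
{
    public static void Main()
    {
        // .NET Framework: per-host cap used by HttpWebRequest, and by
        // HttpClient on that runtime. Newer runtimes treat this type as
        // legacy and may emit an obsolescence warning for it.
        ServicePointManager.DefaultConnectionLimit = 200; // placeholder value

        // .NET Core and later: the limit is set on the HttpClient handler.
        var handler = new HttpClientHandler { MaxConnectionsPerServer = 200 };

        // Reuse one HttpClient for all requests so connections are pooled.
        using (var client = new HttpClient(handler))
        {
            Console.WriteLine($"ServicePointManager limit: {ServicePointManager.DefaultConnectionLimit}");
            Console.WriteLine($"Handler limit: {handler.MaxConnectionsPerServer}");
        }
    }
}
```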

response.GetAwaiter().OnCompleted

Not sure what you tried to accomplish there... If you comment I'll explain the misunderstanding.
