Best multi-thread approach for multiple web requests
Question
I want to create a program that crawls my websites and checks them for HTTP errors and other problems. I want to do this with multiple threads, each of which accepts parameters such as the URL to crawl. Only X threads should be active at any time, even when Y tasks are already waiting to be executed.
What is the best strategy for this: ThreadPool, Tasks, Threads, or something else?
Recommended answer
Here's an example that shows how to queue up a bunch of tasks but limit the number that run concurrently. It uses a Queue<Task> to keep track of tasks that are ready to run and a Dictionary<int, Task> to keep track of tasks that are running. When a task finishes, it invokes a callback method to remove itself from the Dictionary. An async method launches queued tasks as space becomes available.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

namespace MinimalTaskDemo
{
    class Program
    {
        private static readonly Queue<Task> WaitingTasks = new Queue<Task>();
        private static readonly Dictionary<int, Task> RunningTasks = new Dictionary<int, Task>();
        public static int MaxRunningTasks = 100; // vary this to dynamically throttle launching new tasks

        static void Main(string[] args)
        {
            var tokenSource = new CancellationTokenSource();
            var token = tokenSource.Token;
            Worker.Done = new Worker.DoneDelegate(WorkerDone);
            for (int i = 0; i < 1000; i++) // queue some tasks
            {
                // task state (i) will be our key for RunningTasks; the token goes only to
                // the worker, so a started task always runs and always reports back via Done
                WaitingTasks.Enqueue(new Task(id => new Worker().DoWork((int)id, token), i));
            }
            LaunchTasks();
            Console.ReadKey(); // press a key to cancel whatever is still pending
            if (RunningTasks.Count > 0)
            {
                lock (WaitingTasks) WaitingTasks.Clear();
                tokenSource.Cancel();
                Console.ReadKey(); // press a key again to exit
            }
        }

        static async void LaunchTasks()
        {
            // keep checking until we're done
            while ((WaitingTasks.Count > 0) || (RunningTasks.Count > 0))
            {
                // launch tasks when there's room; the queue lock keeps Main from
                // clearing the queue between the Count check and Dequeue
                lock (WaitingTasks)
                {
                    while ((WaitingTasks.Count > 0) && (RunningTasks.Count < MaxRunningTasks))
                    {
                        Task task = WaitingTasks.Dequeue();
                        lock (RunningTasks) RunningTasks.Add((int)task.AsyncState, task);
                        task.Start();
                    }
                }
                UpdateConsole();
                await Task.Delay(300); // wait before checking again
            }
            UpdateConsole(); // all done
        }

        static void UpdateConsole()
        {
            Console.Write("\rwaiting: {0,3:##0} running: {1,3:##0} ", WaitingTasks.Count, RunningTasks.Count);
        }

        // callback from finished worker
        static void WorkerDone(int id)
        {
            lock (RunningTasks) RunningTasks.Remove(id);
        }
    }

    internal class Worker
    {
        public delegate void DoneDelegate(int taskId);
        public static DoneDelegate Done { private get; set; }
        private static readonly Random Rnd = new Random();

        // async void: the Task that calls this returns at the first await,
        // so completion is reported through the Done callback instead
        public async void DoWork(int id, CancellationToken token)
        {
            int steps;
            lock (Rnd) steps = Rnd.Next(20); // Random is not thread-safe; pick the step count once
            for (int i = 0; i < steps; i++)
            {
                if (token.IsCancellationRequested) break;
                await Task.Delay(100); // simulate work
            }
            Done(id);
        }
    }
}
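For the crawler described in the question, the same "at most X running, Y waiting" behaviour can also be sketched with SemaphoreSlim and Task.WhenAll. This is a different technique from the Queue/Dictionary approach in the answer, offered only as a minimal sketch: the names ThrottledRunner, RunAsync, and the work delegate are hypothetical and do not appear in the original, and the delegate stands in for whatever request each URL needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the original answer.
public static class ThrottledRunner
{
    // Calls `work` once per input, with at most `maxConcurrency` calls in flight.
    public static async Task<TResult[]> RunAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        int maxConcurrency,
        Func<TInput, CancellationToken, Task<TResult>> work,
        CancellationToken token = default(CancellationToken))
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList creates every task up front; each one waits at the
            // semaphore until a slot is free, then runs its work.
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync(token);
                try
                {
                    return await work(input, token);
                }
                finally
                {
                    gate.Release(); // free the slot even if work throws
                }
            }).ToList();

            // WhenAll returns the results in the same order as the inputs.
            return await Task.WhenAll(tasks);
        }
    }
}
```

With a shared HttpClient instance named client (an assumption here), passing (url, ct) => client.GetAsync(url, ct) as the work delegate would fetch a list of URLs with no more than maxConcurrency requests outstanding. Unlike the polling loop above, a waiting item starts as soon as a slot is released rather than on the next 300 ms check.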