Error handling for Tasks inside Task.WhenAll
Question
I'm trying to create a web scraper that queries a lot of URLs in parallel and waits for their responses using Task.WhenAll(). However, if any one of the Tasks is unsuccessful, WhenAll fails. I expect many of the Tasks to return a 404 and wish to handle or ignore those. For example:
IEnumerable<string> urls = Enumerable.Range(1, 1000).Select(i => "https://somewebsite.com/" + i);
List<Task<string>> tasks = new List<Task<string>>();
foreach (string url in urls)
{
    tasks.Add(Task.Run(() => {
        try
        {
            return (new HttpClient()).GetStringAsync(url);
        }
        catch (HttpRequestException)
        {
            return Task.FromResult<string>("");
        }
    }));
}
var responseStrings = await Task.WhenAll(tasks);
This never hits the catch block, and WhenAll fails at the first 404. How can I get WhenAll to ignore exceptions and return only the results of the Tasks that completed successfully? Better yet, could it be done somewhere in the code below?
var tasks = Enumerable.Range(1, 1000).Select(i => (new HttpClient()).GetStringAsync("https://somewebsite.com/" + i));
var responseStrings = await Task.WhenAll(tasks);
Thanks for your help.
Recommended answer
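You need to use await to observe the exception. In the question's code, the lambda returns the Task from GetStringAsync without awaiting it, so the try block exits before the request fails and the exception only surfaces later, when WhenAll is awaited. Awaiting inside the try block, as in the helper below, lets the catch block see it: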
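var tasks = Enumerable.Range(1, 1000).Select(i => TryGetStringAsync("https://somewebsite.com/" + i));
var responseStrings = await Task.WhenAll(tasks);
// Failed requests come back as null, so filter them out.
var validResponses = responseStrings.Where(x => x != null);

// Assumed here: a single shared HttpClient, since the original snippet does not show where httpClient is declared.
private static readonly HttpClient httpClient = new HttpClient();

private async Task<string> TryGetStringAsync(string url)
{
    try
    {
        // Awaiting inside the try block is what lets the catch below observe the failure.
        return await httpClient.GetStringAsync(url);
    }
    catch (HttpRequestException)
    {
        return null;
    }
}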
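As an alternative for the case where the individual tasks cannot be wrapped, the failure can be caught around WhenAll itself and the completed tasks inspected afterwards. The following is only a sketch under stated assumptions: FakeFetchAsync is a hypothetical stand-in for the HTTP call (it fails for odd ids so the example runs without a network), and CollectSuccessfulAsync is an illustrative helper name, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for GetStringAsync: odd ids fail, even ids succeed.
    public static async Task<string> FakeFetchAsync(int id)
    {
        await Task.Yield();
        if (id % 2 == 1)
            throw new HttpRequestException("404 for " + id);
        return "page " + id;
    }

    // Illustrative helper: waits for every task, then keeps only the successful results.
    public static async Task<List<string>> CollectSuccessfulAsync(IEnumerable<Task<string>> source)
    {
        // Materialize the sequence so each task is created exactly once.
        var tasks = source.ToList();
        try
        {
            await Task.WhenAll(tasks);
        }
        catch (HttpRequestException)
        {
            // await rethrows only the first failure; by this point every task has
            // finished, so each one can be inspected individually below.
        }
        return tasks
            .Where(t => t.Status == TaskStatus.RanToCompletion)
            .Select(t => t.Result)
            .ToList();
    }

    public static async Task Main()
    {
        var results = await CollectSuccessfulAsync(
            Enumerable.Range(1, 6).Select(i => FakeFetchAsync(i)));
        Console.WriteLine(string.Join(", ", results)); // prints "page 2, page 4, page 6"
    }
}
```

Compared with the TryGetStringAsync approach, this keeps the failed Task objects around, so their exceptions can still be logged via each task's Exception property if needed.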