Downloading files asynchronously and in parallel


Question

Edit

I've changed the title of the question to reflect the issue I had, and I've also posted an answer showing how to achieve this easily.

I am trying to make the second method return Task<TResult> rather than Task (which is what the first method returns), but each attempt to fix it triggers a cascade of errors:

  • I added return before await body(partition.Current);
  • The compiler then asked for a return statement below it, so I added return null
  • Now the select clause complains that it cannot infer the type argument from the query
  • I changed Task.Run to Task.Run<TResult>, without success

How can I fix this?

The first method comes from http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx, and the second method is the overload I'm trying to create.

public static class Extensions
{
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }

    public static Task ForEachAsync<T, TResult>(this IEnumerable<T> source, int dop, Func<T, Task<TResult>> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
}

Usage example:

With this method I'd like to download multiple files in parallel and asynchronously:

private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    Artist artist = await GetArtist();
    IEnumerable<string> enumerable = artist.Reviews.Select(s => s.ImageUrl);
    string[] downloadFile = await DownloadFiles(enumerable);
}

public static async Task<string[]> DownloadFiles(IEnumerable<string> enumerable)
{
    if (enumerable == null) throw new ArgumentNullException("enumerable");
    await enumerable.ForEachAsync(5, s => DownloadFile(s));
    // Incomplete, the above statement is void and can't be returned
}

public static async Task<string> DownloadFile(string address)
{
    /* Download a file from specified address, 
        * return destination file name on success or null on failure */

    if (address == null)
    {
        return null;
    }

    Uri result;
    if (!Uri.TryCreate(address, UriKind.Absolute, out result))
    {
        Debug.WriteLine(string.Format("Couldn't create URI from specified address: {0}", address));
        return null;
    }

    try
    {
        using (var client = new WebClient())
        {
            string fileName = Path.GetTempFileName();
            await client.DownloadFileTaskAsync(address, fileName);
            Debug.WriteLine(string.Format("Downloaded file saved to: {0} ({1})", fileName, address));
            return fileName;
        }
    }
    catch (WebException webException)
    {
        Debug.WriteLine(string.Format("Couldn't download file from specified address: {0}", webException.Message));
        return null;
    }
}

Recommended answer

I solved it and am posting the solution here in case it helps anyone with the same issue.

My initial need was a small helper that downloads images quickly but drops the connection if the server does not respond quickly enough, all of this in parallel and asynchronously.

This helper returns a tuple containing the remote path, the local path, and the exception if one occurred, which is useful because it's always good to know why a download failed. I don't think I've missed any of the situations that can occur during a download, but comments are welcome.

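  • You specify a list of URLs to download
  • You can specify a local file name to save each download to; if you don't, one is generated for you
  • Optionally, you specify a duration after which a download is cancelled (handy for slow or unreachable servers)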
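
Once the downloads have finished, the tuples described above can be separated into successes and failures. The following is only a minimal sketch: DownloadReport and Split are hypothetical names that do not appear in the original code, and the tuple layout (remote path, local path, exception) is assumed to match the helper shown further down.

```csharp
using System;
using System.Collections.Generic;

public static class DownloadReport
{
    // Hypothetical helper (not part of the original answer): splits the
    // (remotePath, localPath, exception) tuples into the local paths of
    // successful downloads and one message per failed download.
    public static void Split(
        IEnumerable<Tuple<string, string, Exception>> results,
        out List<string> downloaded,
        out List<string> failures)
    {
        if (results == null) throw new ArgumentNullException("results");

        downloaded = new List<string>();
        failures = new List<string>();

        foreach (var result in results)
        {
            if (result.Item3 == null)
            {
                // No exception recorded: Item2 holds the local file name.
                downloaded.Add(result.Item2);
            }
            else
            {
                // Item1 can be null when a null remote path was passed in.
                failures.Add(string.Format("{0}: {1}",
                    result.Item1 ?? "(null)", result.Item3.Message));
            }
        }
    }
}
```

Calling DownloadReport.Split(results, out downloaded, out failures) after the awaited download call would then give one list to display and one list to log.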

You can use DownloadFileTaskAsync on its own, or combine it with the ForEachAsync helper for parallel and asynchronous downloads.

Code, with an example of how to use it:

private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    IEnumerable<string> enumerable = new string[] { /* your URLs here */ };
    var results = new List<Tuple<string, string, Exception>>();
    await enumerable.ForEachAsync(s => DownloadFileTaskAsync(s, null, 1000), (url, t) => results.Add(t));
}

/// <summary>
///     Downloads a file from a specified Internet address.
/// </summary>
/// <param name="remotePath">Internet address of the file to download.</param>
/// <param name="localPath">
///     Local file name where to store the content of the download, if null a temporary file name will
///     be generated.
/// </param>
/// <param name="timeOut">Duration in milliseconds before cancelling the operation.</param>
/// <returns>A tuple containing the remote path, the local path and an exception if one occurred.</returns>
private static async Task<Tuple<string, string, Exception>> DownloadFileTaskAsync(string remotePath,
    string localPath = null, int timeOut = 3000)
{
    try
    {
        if (remotePath == null)
        {
            Debug.WriteLine("DownloadFileTaskAsync (null remote path): skipping");
            throw new ArgumentNullException("remotePath");
        }

        if (localPath == null)
        {
            Debug.WriteLine(
                string.Format(
                    "DownloadFileTaskAsync (null local path): generating a temporary file name for {0}",
                    remotePath));
            localPath = Path.GetTempFileName();
        }

        using (var client = new WebClient())
        {
            TimerCallback timerCallback = c =>
            {
                var webClient = (WebClient) c;
                if (!webClient.IsBusy) return;
                webClient.CancelAsync();
                Debug.WriteLine(string.Format("DownloadFileTaskAsync (time out due): {0}", remotePath));
            };
            using (var timer = new Timer(timerCallback, client, timeOut, Timeout.Infinite))
            {
                await client.DownloadFileTaskAsync(remotePath, localPath);
            }
            Debug.WriteLine(string.Format("DownloadFileTaskAsync (downloaded): {0}", remotePath));
            return new Tuple<string, string, Exception>(remotePath, localPath, null);
        }
    }
    catch (Exception ex)
    {
        return new Tuple<string, string, Exception>(remotePath, null, ex);
    }
}

public static class Extensions
{
    public static Task ForEachAsync<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, Task<TResult>> taskSelector, Action<TSource, TResult> resultProcessor)
    {
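        // All tasks start immediately; the semaphore only serializes calls to
        // resultProcessor, so a single permit lets it touch non-thread-safe state.
        var oneAtATime = new SemaphoreSlim(1, 1);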
        return Task.WhenAll(
            from item in source
            select ProcessAsync(item, taskSelector, resultProcessor, oneAtATime));
    }

    private static async Task ProcessAsync<TSource, TResult>(
        TSource item,
        Func<TSource, Task<TResult>> taskSelector, Action<TSource, TResult> resultProcessor,
        SemaphoreSlim oneAtATime)
    {
        TResult result = await taskSelector(item);
        await oneAtATime.WaitAsync();
        try
        {
            resultProcessor(item, result);
        }
        finally
        {
            oneAtATime.Release();
        }
    }
}

I haven't changed the signature of ForEachAsync to let you choose the level of parallelism; adjust it as you wish.
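
One possible way to make the level of parallelism configurable is sketched below. It is an assumption-laden illustration rather than the code from either blog post: SelectAsync, ThrottledExtensions, and maxDegreeOfParallelism are hypothetical names. Unlike the ForEachAsync above, which starts every operation at once, this version acquires the semaphore before starting each operation, so at most maxDegreeOfParallelism of them are in flight, and it returns the results as an array in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledExtensions
{
    // Hypothetical extension (not from the original posts): runs taskSelector
    // for every item with a bounded number of operations in flight and
    // returns the results in the order of the source sequence.
    public static async Task<TResult[]> SelectAsync<TSource, TResult>(
        this IEnumerable<TSource> source,
        int maxDegreeOfParallelism,
        Func<TSource, Task<TResult>> taskSelector)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (taskSelector == null) throw new ArgumentNullException("taskSelector");
        if (maxDegreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException("maxDegreeOfParallelism");

        using (var throttle = new SemaphoreSlim(maxDegreeOfParallelism, maxDegreeOfParallelism))
        {
            // ToList forces the query to run now, so every task exists
            // before WhenAll is awaited.
            List<Task<TResult>> tasks = source
                .Select(item => RunThrottledAsync(item, taskSelector, throttle))
                .ToList();

            // WhenAll completes only after every task has finished, so the
            // semaphore is no longer in use when it is disposed.
            return await Task.WhenAll(tasks);
        }
    }

    private static async Task<TResult> RunThrottledAsync<TSource, TResult>(
        TSource item, Func<TSource, Task<TResult>> taskSelector, SemaphoreSlim throttle)
    {
        await throttle.WaitAsync(); // wait for a free slot before starting
        try
        {
            return await taskSelector(item);
        }
        finally
        {
            throttle.Release(); // free the slot even if the operation failed
        }
    }
}
```

With the DownloadFile method from the question, a call such as await urls.SelectAsync(5, url => DownloadFile(url)) would yield a string[] of local file names while keeping at most five downloads running at a time.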

Sample output:

DownloadFileTaskAsync (null local path): generating a temporary file name for http://cache.thephoenix.com/secure/uploadedImages/The_Phoenix/Music/CD_Review/main_OTR_Britney480.jpg
DownloadFileTaskAsync (null local path): generating a temporary file name for http://ssimg.soundspike.com/artists/britneyspears_femmefatale_cd.jpg
DownloadFileTaskAsync (null local path): generating a temporary file name for http://a323.yahoofs.com/ymg/albumreviewsuk__1/albumreviewsuk-526650850-1301400550.jpg?ymm_1xEDE5bu0tMi
DownloadFileTaskAsync (null remote path): skipping
DownloadFileTaskAsync (time out due): http://hangout.altsounds.com/geek/gars/images/3/9/8/5/2375.jpg
DownloadFileTaskAsync (time out due): http://www.beat.com.au/sites/default/files/imagecache/630_315sr/images/article/header/2011/april/britney-spears-femme-fatale.jpg
DownloadFileTaskAsync (time out due): http://cache.thephoenix.com/secure/uploadedImages/The_Phoenix/Music/CD_Review/main_OTR_Britney480.jpg
DownloadFileTaskAsync (downloaded): http://newblog.thecmuwebsite.com/wp-content/uploads/2009/12/britneyspears1.jpg
DownloadFileTaskAsync (downloaded): http://newblog.thecmuwebsite.com/wp-content/uploads/2009/12/britneyspears1.jpg
DownloadFileTaskAsync (downloaded): http://static.guim.co.uk/sys-images/Music/Pix/site_furniture/2011/3/22/1300816812640/Femme-Fatale.jpg
DownloadFileTaskAsync (downloaded): http://www.sputnikmusic.com/images/albums/72328.jpg

What used to take up to 1 minute now takes barely 10 seconds for the same result :)

Many thanks to the author of these two posts:

http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx

http://blogs.msdn.com/b/pfxteam/archive/2012/03/04/10277325.aspx
