Consuming an IEnumerable multiple times in one pass

Question

Is it possible to write a higher-order function that causes an IEnumerable to be consumed multiple times but in only one pass and without reading all the data into memory? [See Edit below for a clarification of what I'm looking for.]

For example, in the code below the enumerable is mynums (onto which I've tagged a .Trace() in order to see how many times we enumerate it). The goal is to figure out whether it has any numbers greater than 5, as well as the sum of all of the numbers. Both_TwoPass processes the enumerable with both functions, but it enumerates it twice. In contrast, Both_NonStream only enumerates it once, but at the expense of reading it into memory. In principle it is possible to carry out both of these tasks in a single pass and in a streaming fashion, as shown by Any5Sum, but that is a solution specific to these two tasks. Is it possible to write a function with the same signature as Both_* that is the best of both worlds?

(It seems to me that this should be possible using threads. Is there a better solution using, say, async?)

Edit: Below is a clarification of what I'm looking for. I've included a very down-to-earth description of each property in square brackets.

I'm looking for a function Both with the following characteristics:

  1. It has signature (S1, S2) Both<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> f1, Func<IEnumerable<T>, S2> f2) (and produces the "right" output!)
  2. It only iterates the first argument, tt, once. [What I mean by this is that when passed mynums (as defined below) it only outputs mynums: 0 1 2 ... once. This precludes function Both_TwoPass.]
  3. It processes the data from the first argument, tt, in a streaming fashion. [What I mean by this is that, for example, there may be insufficient memory to hold all the items from tt simultaneously, thus precluding function Both_NonStream.]

using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp
{
    static class Extensions
    {
        public static IEnumerable<T> Trace<T>(this IEnumerable<T> tt, string msg = "")
        {
            Console.Write(msg);
            try
            {
                foreach (T t in tt)
                {
                    Console.Write(" {0}", t);
                    yield return t;
                }
            }
            finally
            {
                Console.WriteLine('.');
            }
        }

        public static (S1, S2) Both_TwoPass<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> f1, Func<IEnumerable<T>, S2> f2)
        {
            return (f1(tt), f2(tt));
        }

        public static (S1, S2) Both_NonStream<T, S1, S2>(this IEnumerable<T> tt, Func<IEnumerable<T>, S1> f1, Func<IEnumerable<T>, S2> f2)
        {
            var tt2 = tt.ToList();
            return (f1(tt2), f2(tt2));
        }

        public static (bool, int) Any5Sum(this IEnumerable<int> ii)
        {
            int sum = 0;
            bool any5 = false;
            foreach (int i in ii)
            {
                sum += i;
                any5 |= i > 5; // or: if (!any5) any5 = i > 5;
            }
            return (any5, sum);
        }

    }
    class Program
    {
        static void Main()
        {
            var mynums = Enumerable.Range(0, 10).Trace("mynums:");
            Console.WriteLine("TwoPass: (any > 5, sum) = {0}", mynums.Both_TwoPass(tt => tt.Any(k => k > 5), tt => tt.Sum()));
            Console.WriteLine("NonStream: (any > 5, sum) = {0}", mynums.Both_NonStream(tt => tt.Any(k => k > 5), tt => tt.Sum()));
            Console.WriteLine("Manual: (any > 5, sum) = {0}", mynums.Any5Sum());
        }
    }
}

Recommended answer

I think you and I are describing the same thing in the comments. There is no need to create such a "special-purpose IEnumerable", though, because the BlockingCollection<> class already exists for such producer-consumer scenarios. You'd use it as follows...
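
The answer's own code is cut off here, so the following is not the author's implementation. It is a minimal sketch of how a BlockingCollection<>-based Both might look, under these assumptions: each consumer function runs on its own thread-pool task, each gets its own bounded queue, and the calling thread enumerates tt exactly once and copies every item into both queues. The name Both and its first three parameters come from the question; the boundedCapacity parameter, the Offer helper, and the 50 ms retry interval are illustrative choices, not part of the original.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class StreamingExtensions
{
    // Sketch only: boundedCapacity is a hypothetical tuning knob that caps
    // how many items may be buffered per consumer at any one time.
    public static (S1, S2) Both<T, S1, S2>(
        this IEnumerable<T> tt,
        Func<IEnumerable<T>, S1> f1,
        Func<IEnumerable<T>, S2> f2,
        int boundedCapacity = 16)
    {
        using (var q1 = new BlockingCollection<T>(boundedCapacity))
        using (var q2 = new BlockingCollection<T>(boundedCapacity))
        {
            // Each consumer sees its queue as an ordinary IEnumerable<T>.
            Task<S1> t1 = Task.Run(() => f1(q1.GetConsumingEnumerable()));
            Task<S2> t2 = Task.Run(() => f2(q2.GetConsumingEnumerable()));
            try
            {
                // The single pass over the source happens here.
                foreach (T t in tt)
                {
                    Offer(q1, t1, t);
                    Offer(q2, t2, t);
                    // Nothing left to feed once both consumers have returned.
                    if (t1.IsCompleted && t2.IsCompleted) break;
                }
            }
            finally
            {
                // Signal end-of-data so the consuming enumerables terminate.
                q1.CompleteAdding();
                q2.CompleteAdding();
            }
            Task.WaitAll(t1, t2); // rethrows consumer exceptions, if any
            return (t1.Result, t2.Result);
        }
    }

    // Hypothetical helper: add an item, but stop retrying once the consumer
    // has finished. Without this, a consumer that stops early (such as Any)
    // would leave its queue full and block the producer forever.
    private static void Offer<T>(BlockingCollection<T> queue, Task consumer, T item)
    {
        while (!consumer.IsCompleted && !queue.TryAdd(item, 50)) { }
    }
}

static class Demo
{
    static void Main()
    {
        var result = Enumerable.Range(0, 10)
            .Both(tt => tt.Any(k => k > 5), tt => tt.Sum());
        Console.WriteLine("Both: (any > 5, sum) = {0}", result); // (True, 45)
    }
}
```

In this sketch memory use is bounded by the queue capacity rather than by the length of the source, which is what distinguishes it from Both_NonStream. The trade-off is that f1 and f2 run concurrently on different threads, so they must not share unsynchronized state, and a slow consumer throttles the producer (and therefore the other consumer) once its queue fills.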
