Best practice for parameter: IEnumerable vs. IList vs. IReadOnlyCollection


Problem description

I get when one would return an IEnumerable from a method—when there's value in deferred execution. And returning a List or IList should pretty much be only when the result is going to be modified, otherwise I'd return an IReadOnlyCollection, so the caller knows what he's getting isn't intended for modification (and this lets the method even reuse objects from other callers).

However, on the parameter input side, I'm a little less clear. I could take an IEnumerable, but what if I need to enumerate more than once?

The saying "Be conservative in what you send, be liberal in what you accept" suggests taking an IEnumerable is good, but I'm not really sure.

For example, if there are no elements in the following IEnumerable parameter, a significant amount of work can be saved in this method by checking .Any() first, which requires ToList() before that to avoid enumerating twice.

public IEnumerable<Data> RemoveHandledForDate(IEnumerable<Data> data, DateTime dateTime) {
   var dataList = data.ToList();

   if (!dataList.Any()) {
      return dataList;
   }

   var handledDataIds = new HashSet<int>(
      GetHandledDataForDate(dateTime) // Expensive database operation
         .Select(d => d.DataId)
   );

   return dataList.Where(d => !handledDataIds.Contains(d.DataId));
}

So I'm wondering what is the best signature, here? One possibility is IList<Data> data, but accepting a list suggests that you plan to modify it, which is not correct—this method doesn't touch the original list, so IReadOnlyCollection<Data> seems better.

But IReadOnlyCollection forces callers to do ToList().AsReadOnly() every time which gets a bit ugly, even with a custom extension method .AsReadOnlyCollection. And that's not being liberal in what is accepted.

What is best practice in this situation?

This method is not returning an IReadOnlyCollection because there may be value in the final Where using deferred execution as the whole list is not required to be enumerated. However, the Select is required to be enumerated because the cost of doing .Contains would be horrible without the HashSet.

I don't have a problem with calling ToList, it just occurred to me that if I need a List to avoid multiple enumeration, why do I not just ask for one in the parameter? So the question here is, if I don't want an IEnumerable in my method, should I really accept one in order to be liberal (and ToList it myself), or should I put the burden on the caller to ToList().AsReadOnly()?
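One possible middle ground, sketched here under the assumption that the code targets .NET 4.5 or later (where `List<T>` and arrays implement `IReadOnlyCollection<T>`): keep the `IEnumerable<T>` signature and copy only when the argument is not already a materialized collection. The helper name `AsCollection` is made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Materialize
{
    // Hypothetical helper: returns the argument itself when it is already a
    // materialized collection, otherwise enumerates it exactly once into a list.
    public static IReadOnlyCollection<T> AsCollection<T>(IEnumerable<T> source)
    {
        return source as IReadOnlyCollection<T> ?? source.ToList();
    }
}
```

With this, a caller holding a `List<Data>` or a `Data[]` pays no copy, and a caller holding a lazy query still gets a single enumeration. The trade-off is that the method then shares the caller's collection rather than owning a private snapshot.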

Further Information for those unfamiliar with IEnumerables

The real problem here is not the cost of Any() vs. ToList(). I understand that enumerating the entire list costs more than doing Any(). However, assume that the caller will consume all items in the IEnumerable returned from the above method, and assume that the source IEnumerable<Data> data parameter comes from the result of this method:

public IEnumerable<Data> GetVeryExpensiveDataForDate(DateTime dateTime) {
    // This query is very expensive no matter how many rows are returned.
    // It costs 5 seconds on each `.GetEnumerator` call to get 1 value or 1000
    return MyDataProvider.Where(d => d.DataDate == dateTime);
}

Now if you do this:

var myData = GetVeryExpensiveDataForDate(todayDate);
var unhandledData = RemoveHandledForDate(myData, todayDate);
foreach (var data in unhandledData) {
   messageBus.Dispatch(data); // fully enumerate
}

And if RemoveHandledForDate does Any and then does Where, you'll incur the 5-second cost twice instead of once. This is why you should always take extreme pains to avoid enumerating an IEnumerable more than once. Do not rely on your knowledge that in fact it's harmless, because some future hapless developer may call your method some day with a newly implemented IEnumerable you never thought of, which has different characteristics.
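A minimal sketch of that double cost, using a counter in place of the 5-second query. The names `ExpensiveQuery`, `AnyThenWhere`, and `MaterializeFirst` are invented for this illustration, and the filter `x != 2` stands in for the handled-ID check.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    // Incremented each time the deferred query actually starts running.
    public static int QueryRuns;

    // Stand-in for an expensive deferred query: an iterator method whose
    // body is re-run from the top on every enumeration.
    public static IEnumerable<int> ExpensiveQuery()
    {
        QueryRuns++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Enumerates the source twice: once for Any(), once for Where().
    public static List<int> AnyThenWhere(IEnumerable<int> source)
    {
        if (!source.Any()) return new List<int>();
        return source.Where(x => x != 2).ToList();
    }

    // Enumerates the source once, then works on the in-memory copy.
    public static List<int> MaterializeFirst(IEnumerable<int> source)
    {
        var list = source.ToList();
        if (list.Count == 0) return list;
        return list.Where(x => x != 2).ToList();
    }
}
```

Passing `ExpensiveQuery()` to `AnyThenWhere` leaves `QueryRuns` at 2, while `MaterializeFirst` leaves it at 1; both return the same two items.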

The contract for an IEnumerable says that you can enumerate it. It does NOT promise anything about the performance characteristics of doing so more than once.

In fact, some IEnumerables are volatile and won't return any data upon a subsequent enumeration! Switching to one would be a totally breaking change if combined with multiple enumeration (and a very hard to diagnose one if the multiple enumeration was added later).
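As an illustration of such a volatile source, here is a small sketch (the `Drain` method is hypothetical, not from any library) of a sequence that consumes a queue as it is read:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class VolatileSourceDemo
{
    // Yields items by removing them from the queue, so reading the
    // sequence consumes the underlying data.
    public static IEnumerable<string> Drain(Queue<string> queue)
    {
        while (queue.Count > 0)
            yield return queue.Dequeue();
    }

    // Counts the same sequence twice and returns both counts.
    public static int[] CountTwice(IEnumerable<string> messages)
    {
        int first = messages.Count();
        int second = messages.Count();
        return new[] { first, second };
    }
}
```

With a queue holding three items, `CountTwice(Drain(queue))` sees 3 on the first pass and 0 on the second, because the first pass emptied the queue. A method that calls Any() and then Where() on such a source would silently lose the first item.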

Don't do multiple enumeration of an IEnumerable.

If you accept an IEnumerable parameter, you are in effect promising to enumerate it exactly 0 or 1 times.

Solution

There are definitely ways around this that let you accept IEnumerable<T>, enumerate it only once, and make sure you don't query the database multiple times. Solutions I can think of:

  • Instead of using Any and Where, use the enumerator directly. Call MoveNext instead of Any to see whether there are any items in the collection, then keep iterating manually on the same enumerator after making your database query.
  • Use Lazy to initialize your HashSet.
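A rough sketch of the enumerator-based option, written generically so it stands alone. The parameters `getId` and `loadHandledIds` are placeholders for `d => d.DataId` and the expensive `GetHandledDataForDate(...)` lookup; they are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SingleEnumeration
{
    public static IEnumerable<T> RemoveHandled<T>(
        IEnumerable<T> data,
        Func<T, int> getId,
        Func<IEnumerable<int>> loadHandledIds)
    {
        using (var e = data.GetEnumerator())
        {
            // Replaces Any(): an empty source exits before the expensive call.
            if (!e.MoveNext()) yield break;

            var handled = new HashSet<int>(loadHandledIds());

            // Replaces Where(): continue on the same enumerator, starting
            // with the item that MoveNext has already fetched.
            do
            {
                if (!handled.Contains(getId(e.Current)))
                    yield return e.Current;
            } while (e.MoveNext());
        }
    }
}
```

Because this is an iterator method, nothing runs until the caller starts enumerating the result, and the source is walked exactly once per enumeration of that result.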

The first one seems ugly, the second one might actually make a lot of sense:

public IEnumerable<Data> RemoveHandledForDate(IEnumerable<Data> data, DateTime dateTime)
{
    var ids = new Lazy<HashSet<int>>(
        () => new HashSet<int>(
            GetHandledDataForDate(dateTime) // Expensive database operation
                .Select(d => d.DataId)
        ));

    return data.Where(d => !ids.Value.Contains(d.DataId));
}
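To check when the expensive lookup actually runs with the Lazy version, here is a small self-contained sketch that replaces the database call with a counter. `LoadHandledIds` and `Loads` are stand-ins invented for the demonstration, and plain ints take the place of Data objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyFilterDemo
{
    // Counts how often the stand-in for the expensive lookup is called.
    public static int Loads;

    private static IEnumerable<int> LoadHandledIds()
    {
        Loads++;
        return new[] { 2 };
    }

    public static IEnumerable<int> RemoveHandled(IEnumerable<int> ids)
    {
        // The HashSet is built on first access to handled.Value, which only
        // happens when the Where predicate is evaluated for a first item.
        var handled = new Lazy<HashSet<int>>(
            () => new HashSet<int>(LoadHandledIds()));

        return ids.Where(id => !handled.Value.Contains(id));
    }
}
```

Calling `RemoveHandled` does not touch the lookup at all; it runs once when the first element of a non-empty source is tested, and never for an empty source. The source itself is still enumerated once per enumeration of the result, so a caller who iterates the result twice will also iterate `data` twice.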
