How to count occurrences of unique values in a Dictionary?


Question

I have a Dictionary with strings as keys and doubles as values.

I want to count the occurrences of each value in this Dictionary, and I also want to know which value each count belongs to (for instance, which value is repeated).

For example:

key1, 2
key2, 2
key3, 3
key4, 2
key5, 5
key6, 5

I would like to get a list:

2 - 3 (times)
3 - 1 (once)
5 - 2 (twice)

How can I do this?

Recommended answer

The first thing to note is that you don't actually care about the keys of the dictionary, so step one is to ignore them as irrelevant to the task in hand. We're going to work with the Values property of the dictionary, and the work is much the same as for any other collection of integers (or indeed any other enumerable of any type we can compare for equality).

There are two common approaches to this problem, both of which are well worth knowing.

The first uses another dictionary to hold the count of each value:

//Start with setting up the dictionary you described.
Dictionary<string, int> dict = new Dictionary<string, int>{
    {"key1", 2},
    {"key2", 2},
    {"key3", 3},
    {"key4", 2},
    {"key5", 5},
    {"key6", 5}
};
//Create a different dictionary to store the counts.
Dictionary<int, int> valCount = new Dictionary<int, int>();
//Iterate through the values, setting count to 1 or incrementing current count.
foreach(int i in dict.Values)
    if(valCount.ContainsKey(i))
        valCount[i]++;
    else
        valCount[i] = 1;
//Finally some code to output this and prove it worked:
foreach(KeyValuePair<int, int> kvp in valCount)//note - not sorted, that must be added if needed
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);
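
The comment above notes that the output is not sorted. As a minimal sketch of one way to add that, assuming LINQ is available and that ascending order by value is what is wanted, OrderBy can be applied to the counts before printing. The variable name sorted is only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same sample data as in the answer.
var dict = new Dictionary<string, int>
{
    {"key1", 2}, {"key2", 2}, {"key3", 3},
    {"key4", 2}, {"key5", 5}, {"key6", 5}
};

// Count each value exactly as in the first approach.
var valCount = new Dictionary<int, int>();
foreach (int i in dict.Values)
{
    if (valCount.ContainsKey(i))
        valCount[i]++;
    else
        valCount[i] = 1;
}

// Dictionary enumeration order is not guaranteed, so sort explicitly by the counted value.
List<KeyValuePair<int, int>> sorted = valCount.OrderBy(pair => pair.Key).ToList();

foreach (KeyValuePair<int, int> kvp in sorted)
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);
```

With the sample data this prints the counts in the order the question asks for: 2 - 3, 3 - 1, 5 - 2.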

Hopefully this is pretty straightforward. The other approach is more complicated but has some advantages:

//Start with setting up the dictionary you described.
Dictionary<string, int> dict = new Dictionary<string, int>{
    {"key1", 2},
    {"key2", 2},
    {"key3", 3},
    {"key4", 2},
    {"key5", 5},
    {"key6", 5}
};
IEnumerable<IGrouping<int, int>> grp = dict.Values.GroupBy(x => x);
//Two options now. One is to use the results directly such as with the
//equivalent code to output this and prove it worked:
foreach(IGrouping<int, int> item in grp)//note - not sorted, that must be added if needed
    Console.WriteLine("{0} - {1}", item.Key, item.Count());
//Alternatively, we can put these results into another collection for later use:
Dictionary<int, int> valCount = grp.ToDictionary(g => g.Key, g => g.Count());
//Finally some code to output this and prove it worked:
foreach(KeyValuePair<int, int> kvp in valCount)//note - not sorted, that must be added if needed
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);

(We'd probably use var rather than the verbose IEnumerable<IGrouping<int, int>>, but it's worth being precise when explaining code.)

In a straight comparison, this version is inferior: it is both more complicated to understand and less efficient. However, learning this approach allows for some concise and useful variants of the same technique, so it's worth examining.

GroupBy() takes an enumeration and creates another enumeration of groups, where each group has a Key and is itself an enumeration of the items that share that key. The lambda x => x means that each item is grouped by its own value, but we have the flexibility to use different grouping rules. The contents of grp look a bit like:

{
  {Key=2, {2, 2, 2}}
  {Key=3, {3}}
  {Key=5, {5, 5}}
}

So, if we loop through this and, for each group, pull out the Key and call Count() on the group, we get the results we want.
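
To illustrate the point about different grouping rules, here is a small sketch that groups by something other than the value itself. The data is hypothetical: it uses doubles (as in the question) that are only approximately equal, and it assumes that rounding to the nearest whole number is an acceptable notion of "the same value". The names readings and roundedCounts are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: doubles that are nearly, but not exactly, equal.
var readings = new Dictionary<string, double>
{
    {"key1", 2.01}, {"key2", 1.98}, {"key3", 3.2},
    {"key4", 2.04}, {"key5", 5.0},  {"key6", 4.97}
};

// Group by the value rounded to the nearest whole number instead of by the value itself.
Dictionary<double, int> roundedCounts = readings.Values
    .GroupBy(x => Math.Round(x))
    .ToDictionary(g => g.Key, g => g.Count());

// Sort by key only for display.
foreach (KeyValuePair<double, int> kvp in roundedCounts.OrderBy(pair => pair.Key))
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);
```

Only the key selector changed; the loop that reads Key and Count() is the same as before.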

Now, in the first case we built up our counts in a single O(n) pass, while here we build up the groups in an O(n) pass and then obtain the counts in a second pass over the groups, making it less efficient. GroupBy also has to hold every group's elements in memory, where the first version stores only one counter per distinct value. It's also a bit harder to understand, so why bother mentioning it?

Well, the first reason is that once we do understand it, we can turn the lines:

IEnumerable<IGrouping<int, int>> grp = dict.Values.GroupBy(x => x);
foreach(IGrouping<int, int> item in grp)
    Console.WriteLine("{0} - {1}", item.Key, item.Count());

into:

foreach(var item in dict.Values.GroupBy(x => x))
  Console.WriteLine("{0} - {1}", item.Key, item.Count());

This is quite concise, and it becomes idiomatic with use. It's especially nice if we then want to go on and do something more complicated with the value-count pairs, as we can chain this into another operation.

The version that puts the results into a dictionary can be even more concise:

var valCount = dict.Values.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count());

There: your whole question answered in one short line, rather than the 6 lines (not counting comments) of the first version.

(Some might prefer to replace dict.Values.GroupBy(x => x) with dict.GroupBy(x => x.Value), which will have exactly the same results once we run Count() on it. If you aren't immediately sure why, try to work it out.)
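
For readers who want to check that claim, the following sketch compares the two forms on the sample data. It also shows one thing the pair-based grouping keeps that the values-only grouping does not: the original keys. The names fromValues, fromPairs and keysByValue are illustrative, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same sample data as in the answer.
var dict = new Dictionary<string, int>
{
    {"key1", 2}, {"key2", 2}, {"key3", 3},
    {"key4", 2}, {"key5", 5}, {"key6", 5}
};

// Grouping the bare values and grouping the key-value pairs by value
// produce groups of the same sizes, so the counts match.
Dictionary<int, int> fromValues = dict.Values.GroupBy(x => x)
    .ToDictionary(g => g.Key, g => g.Count());
Dictionary<int, int> fromPairs = dict.GroupBy(x => x.Value)
    .ToDictionary(g => g.Key, g => g.Count());

// The groups of pairs still carry the dictionary keys, so we can list them too.
Dictionary<int, List<string>> keysByValue = dict.GroupBy(x => x.Value)
    .ToDictionary(g => g.Key, g => g.Select(pair => pair.Key).OrderBy(k => k).ToList());

foreach (KeyValuePair<int, List<string>> entry in keysByValue.OrderBy(e => e.Key))
    Console.WriteLine("{0} - {1} ({2})", entry.Key, fromPairs[entry.Key], string.Join(", ", entry.Value));
```

If only the counts are needed, either form will do; grouping the pairs is the one to reach for when it also matters which keys share a value.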

The other advantage is that GroupBy gives us more flexibility in other cases. For these reasons, people who are used to GroupBy are quite likely to start off with the one-line concision of dict.Values.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count()); and then change to the more verbose but more efficient form of the first version (where we increment running totals in the new dictionary) if it proves to be a performance hotspot.
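
If the counting does turn out to be a hotspot, one possible shape for the more efficient form is a small reusable helper. This is only a sketch: CountOccurrences is a hypothetical name, not a framework method, and it swaps the ContainsKey check of the first version for TryGetValue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (the name is illustrative): counts occurrences in a single pass.
static Dictionary<T, int> CountOccurrences<T>(IEnumerable<T> source) where T : notnull
{
    var counts = new Dictionary<T, int>();
    foreach (T item in source)
    {
        // TryGetValue leaves 'current' at 0 when the item has not been seen yet.
        counts.TryGetValue(item, out int current);
        counts[item] = current + 1;
    }
    return counts;
}

// Same sample data as in the answer.
var dict = new Dictionary<string, int>
{
    {"key1", 2}, {"key2", 2}, {"key3", 3},
    {"key4", 2}, {"key5", 5}, {"key6", 5}
};

Dictionary<int, int> valCount = CountOccurrences(dict.Values);

foreach (KeyValuePair<int, int> kvp in valCount) // not sorted; add ordering if needed
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);
```

Because the helper is generic, the same code works for the doubles mentioned in the question, or for any other type that can be compared for equality.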
