Counting distinct values in Excel - FREQUENCY function


Problem description

I was tasked with counting the number of distinct strings in a column in Excel. A quick Google search yielded the following formula:

=SUM(IF(FREQUENCY(MATCH(B2:B10,B2:B10,0),MATCH(B2:B10,B2:B10,0))>0,1))

Consider the data:

A B C D A B E C

Now, the MATCH function returns an array (because its first argument is an array):

1 2 3 4 1 2 7 3

So far so good. What I don't understand is how the FREQUENCY function works here, in particular how it treats bins that are duplicated (for example, bin 1 appears twice in the data above). The result of the FREQUENCY function is:

2 2 2 1 0 0 1 0 0

Thanks

Taras

Solution

EDIT: I realised how your solution was working - amended to reflect this.

FREQUENCY counts how many entries of the search array fall into each of your bins. Here's how it's working:

Search array: 1 2 3 4 1 2 7 3

Bins: 1 2 3 4 1 2 7 3

Bin 1 => there are two 1's => 2

Bin 2 => there are two 2's => 2

Bin 3 => there are two 3's => 2

Bin 4 => there is one 4 => 1

Bin 1 repeated => 1 already counted => 0

Bin 2 repeated => 2 already counted => 0

Bin 7 => there is one 7 => 1

Bin 3 repeated => 3 already counted => 0

It almost seems that the solution exploits a quirk of FREQUENCY: it won't count the same bin twice, although you might expect the second bin with value 1 to be non-zero as well. But that's how it works. Because only the first occurrence of a bin value receives the count and every duplicate bin gets 0, the number of entries greater than zero equals the number of distinct values. (The ninth element of the result is there because FREQUENCY always returns one more element than there are bins; it counts values above the highest bin, which is 0 here.)
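To make the walk-through above concrete, here is a minimal Python sketch that emulates the two Excel functions on the sample data. The helper names `match_exact` and `frequency` are hypothetical, and the emulation assumes only the behaviour described in this thread (a duplicate bin yields 0, and there is one extra trailing element for values above the highest bin). It is not a full reimplementation of Excel.

```python
def match_exact(values):
    """Emulate MATCH(range, range, 0): 1-based position of each value's first occurrence."""
    first_pos = {}
    for i, v in enumerate(values, start=1):
        first_pos.setdefault(v, i)  # keep only the first position seen
    return [first_pos[v] for v in values]


def frequency(data, bins):
    """Emulate FREQUENCY(data, bins) under the assumptions stated above."""
    sorted_bins = sorted(set(bins))
    seen = set()
    counts = []
    for b in bins:
        if b in seen:
            counts.append(0)  # duplicate bin: its values were already counted
            continue
        seen.add(b)
        idx = sorted_bins.index(b)
        lower = sorted_bins[idx - 1] if idx > 0 else float("-inf")
        # a bin counts the values in the interval (next lower bin, this bin]
        counts.append(sum(1 for d in data if lower < d <= b))
    # extra trailing element: values above the highest bin
    counts.append(sum(1 for d in data if d > sorted_bins[-1]))
    return counts


data = ["A", "B", "C", "D", "A", "B", "E", "C"]
positions = match_exact(data)                # [1, 2, 3, 4, 1, 2, 7, 3]
freq = frequency(positions, positions)       # [2, 2, 2, 1, 0, 0, 1, 0, 0]
distinct = sum(1 for c in freq if c > 0)     # 5
print(positions, freq, distinct)
```

Each distinct string contributes exactly one non-zero entry, so counting the non-zero entries gives the same result as `len(set(data))`.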

Here's an alternative approach which you might find useful. It can also be used to calculate the number of distinct values:

Suppose your string range is B2:B10. Fill down in another column

=(MATCH(B2,B$2:B2,0)-(ROW(B2)-ROW(B$2)))>0

The row should change as you copy down, so the second row should be, for example:

=(MATCH(B3,B$2:B3,0)-(ROW(B3)-ROW(B$2)))>0

This returns TRUE if the current row contains the first instance of a string. With match type 0, MATCH returns the position of the first exact match within the range from B$2 down to the current row, and that position exceeds the row offset ROW(B3)-ROW(B$2) only when the first match is the current row itself. Therefore, if you count the number of TRUEs with COUNTIF(), you get the number of distinct strings.
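The helper-column idea can be sketched in Python in the same way. This is only an illustration: it assumes the MATCH call does an exact match (match type 0), and `first_occurrence_flags` is a hypothetical name.

```python
def first_occurrence_flags(values):
    """For each row k, emulate (MATCH(Bk, B$2:Bk, 0) - (ROW(Bk) - ROW(B$2))) > 0."""
    flags = []
    for k in range(1, len(values) + 1):
        prefix = values[:k]                      # the growing range B$2:Bk
        pos = prefix.index(values[k - 1]) + 1    # 1-based position of the first exact match
        offset = k - 1                           # ROW(Bk) - ROW(B$2)
        flags.append(pos - offset > 0)           # True only when pos == k
    return flags


data = ["A", "B", "C", "D", "A", "B", "E", "C"]
flags = first_occurrence_flags(data)
distinct = flags.count(True)   # plays the role of COUNTIF(range, TRUE)
print(flags, distinct)
```

A row is flagged only when its value has not appeared earlier in the column, so the number of TRUE flags equals the number of distinct strings.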
