Use data.table to calculate the percentage of occurrence depending on the category in another column
Problem description
I have recently been working with data.table in R, which is popular and efficient. I have now run into a problem that I think data.table could solve.
I have a data set like this:
event | group_ind
1 | group1
1 | group1
1 | group1
2 | group1
2 | group1
1 | group2
1 | group2
2 | group2
2 | group3
2 | group3
Now I want to know how often event 1 occurs in each group, as a percentage. For this data set the result is obvious: 60% for event 1 in group1, 67% in group2, and 0% in group3. The real data set has many more observations and more than two event types, and its rows are not sorted in any particular order. I can get what I want in a naive way in R (by counting the occurrences in the event column and dividing by the total number of observations in each group), but I think there should be a neater way to do it.
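The count-and-divide approach described above can be sketched in base R as follows. This is only an illustration: the object names `DT`, `counts`, and `shares` are assumptions, and the data frame is rebuilt by hand from the example table.

```r
# Rebuild the example data as a plain data.frame (the name DT is an assumption)
DT <- data.frame(
  event     = c(1, 1, 1, 2, 2, 1, 1, 2, 2, 2),
  group_ind = rep(c("group1", "group2", "group3"), times = c(5, 3, 2))
)

# Count each (group, event) pair, then divide every row by its group total
counts <- table(DT$group_ind, DT$event)
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Each row of `shares` holds one group and each column one event type, so the rows sum to 1.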
So the result I want would look like this:
event | group_ind | percentage
1 | group1 | 0.6
2 | group1 | 0.4
1 | group2 | 0.67
2 | group2 | 0.33
1 | group3 | 0
2 | group3 | 1
I hope this can be done in data.table. Thanks very much for the help.
Recommended answer
A simple solution would be just
setDT(DT)[, .(event = 1:2, percentage = tabulate(event)/.N), by = group_ind]
# group_ind event percentage
# 1: group1 1 0.6000000
# 2: group1 2 0.4000000
# 3: group2 1 0.6666667
# 4: group2 2 0.3333333
# 5: group3 1 0.0000000
# 6: group3 2 1.0000000
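One caveat with the one-liner above: `tabulate(event)` sizes its result by the largest value present in each group, so a group that contains only event 1 would return a single count rather than two. A hedged variant that fixes the number of bins is sketched below. It assumes the events are coded as positive integers 1..K; `n_events` and `res` are illustrative names, and the data is rebuilt from the example table.

```r
library(data.table)

DT <- data.table(
  event     = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L),
  group_ind = rep(c("group1", "group2", "group3"), times = c(5, 3, 2))
)

# Assumption: events are coded as positive integers 1..n_events
n_events <- max(DT$event)

# nbins forces every group to return exactly n_events counts,
# including zeros for event values the group never contains
res <- DT[, .(event = seq_len(n_events),
              percentage = tabulate(event, nbins = n_events) / .N),
          by = group_ind]
print(res)
```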
Though a more general solution would be to use unique on event (and also to pre-order it, as suggested by @EdM).
setDT(DT)[order(event), .(event = unique(event), percentage = tabulate(event)/.N), by = group_ind]
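Note that `unique(event)` and `tabulate(event)` can return vectors of different lengths when a group lacks some event value, or when the events are not coded as consecutive positive integers. As an alternative sketch that does not depend on the coding of `event`, the counts can be joined onto a grid of all group/event combinations. This is not the answer's method; the names `counts`, `grid`, and `res` are illustrative, and the data is rebuilt from the example table.

```r
library(data.table)

DT <- data.table(
  event     = c(1, 1, 1, 2, 2, 1, 1, 2, 2, 2),
  group_ind = rep(c("group1", "group2", "group3"), times = c(5, 3, 2))
)

# Count each observed (group, event) pair
counts <- DT[, .N, by = .(group_ind, event)]

# Build every group/event combination so that absent pairs get a row too
grid <- CJ(group_ind = unique(DT$group_ind), event = unique(DT$event))
res <- counts[grid, on = .(group_ind, event)]
res[is.na(N), N := 0L]

# Convert the counts to within-group shares
res[, percentage := N / sum(N), by = group_ind]
res[, N := NULL]
print(res)
```

Because the event values are only ever compared, never used as bin indices, the same code should also work when `event` is a character or factor column.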