Count number of times a combination of events appears in dataframe columns (extension)


Question

This is an extension of the question "Count number of times combination of events occurs in dataframe columns". I will restate the question so that it is all here:

I have a data frame, and I want to count the number of times each combination of events in two columns occurs (in any order), with a zero if a combination doesn't appear.

For example, say I have

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c', 'c'), 
                 'y' = c('c', 'c', 'a', 'a', 'a', 'b'))

so

x y  
a c  
b c  
c a  
c a  
c a  
c b



a and b do not occur together, a and c occur together 4 times (rows 1, 3, 4 and 5 of the table) and b and c occur together twice (rows 2 and 6), so I would want to return

x-y num  
a-b 0  
a-c 4  
b-c 2  

I hope this makes sense. Thanks in advance.

Recommended answer

As said, you can do this with factor() and expand.grid() (or another way to get all possible combinations). Passing the full set of combinations as factor levels makes table() report a zero for any combination that never occurs.

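# Build every ordered pair of events, then drop pairs of an event with itself
all.possible <- expand.grid(c('a','b','c'), c('a','b','c'))
all.possible <- all.possible[all.possible[, 1] != all.possible[, 2], ]
# Sort within each pair so that 'c-a' and 'a-c' collapse to the same label
all.possible <- unique(apply(all.possible, 1, function(x) paste(sort(x), collapse='-')))

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c', 'c'), 
                 'y' = c('c', 'c', 'a', 'a', 'a', 'b'))
# Label each row the same way; all.possible as the factor levels keeps unseen pairs at zero
table(factor(apply(df , 1, function(x) paste(sort(x), collapse='-')), levels=all.possible))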
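
The "another way to get all possible combinations" mentioned above could be combn(). The following is only a sketch of that variant, not the answerer's code. It assumes the six-row example data from the question, derives the event names from the data instead of hard-coding them, and uses illustrative variable names (events, all_pairs, row_pairs, counts, result) that do not appear in the original.

```r
# Example data, assumed to match the six-row table in the question.
df <- data.frame(x = c("a", "b", "c", "c", "c", "c"),
                 y = c("c", "c", "a", "a", "a", "b"),
                 stringsAsFactors = FALSE)

# All distinct events seen in either column, sorted so pair labels are stable.
events <- sort(unique(c(df$x, df$y)))

# Every unordered pair of distinct events: "a-b", "a-c", "b-c".
all_pairs <- combn(events, 2, FUN = paste, collapse = "-")

# Order-insensitive label per row: the smaller value first, the larger second.
row_pairs <- paste(pmin(df$x, df$y), pmax(df$x, df$y), sep = "-")

# Tabulate against the full set of levels so that unseen pairs get a zero.
counts <- table(factor(row_pairs, levels = all_pairs))

# Reshape into the two-column "x-y" / "num" layout asked for in the question.
result <- data.frame(`x-y` = names(counts), num = as.vector(counts),
                     check.names = FALSE)
print(result)
```

With this data the counts should come out as 0 for a-b, 4 for a-c and 2 for b-c. A row where x and y hold the same event would get a label such as "a-a", which is not among the levels, so table() silently leaves it out. Whether that is the desired behaviour depends on the data.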
