Count Occurrences of a List in R

Views: 23  Published: 2021/12/30 16:21:14  Tags: r count group-by

Problem Description

I have a list of roughly 100,000 occurrences of items being ordered together. I have pasted each pair into one column so that I can count the number of times each combination occurs.

4845   Curly Fries California Burger   1
4846   French Fries California Burger  1
4847   Hamburger California Burger     1
4848   $1 Fountain Drinks Curly Fries  1
4849   $1 Fountain Drinks Curly Fries  1
4850   California Burger Curly Fries   1
4851   Curly Fries Curly Fries         1

I have explored the aggregate function, which gives me the following error:

aggregate(t1$count,list(t1$pc), sum)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

I have also tried variations of ddply:

ddply(t1,t1$pc,transform,occurances=sum(t1$count))

But I get this error:

Error in UseMethod("as.quoted") : 
no applicable method for 'as.quoted' applied to an object of class "c('matrix', 'list')"

I am assuming I get this because I am essentially trying to "group" by a character value. I have also explored tapply and recast based on answers to similar questions, but to no avail.

How can I get this count of combinations?

For consideration, here is a sample of the items listed separately (again, apologies for the formatting issues):

                       Var1                      Var2 Var3
2               Onion Rings               Onion Rings    1
3  Pineapple Cheddar Burger               Onion Rings    1
4               Onion Rings  Pineapple Cheddar Burger    1
5  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
5               Onion Rings               Onion Rings    1
6  Pineapple Cheddar Burger               Onion Rings    1
7               Onion Rings  Pineapple Cheddar Burger    1
8  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
9             Fountain Soda             Fountain Soda    1
10             French Fries             Fountain Soda    1

Recommended Answer

Your initial approach was pretty close to what I think you want. Combining those into a single factor will definitely work, provided you combine them in the same order, so that you don't end up with both "Fries, Burger" and "Burger, Fries" as separate groups.
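To make the ordering point concrete, here is a minimal sketch in base R. The `orders` data frame and its four rows are made up for illustration; only the column names `Var1` and `Var2` follow the question. It compares a naive pasted key with one that always puts the alphabetically smaller item first, using `pmin` and `pmax` on the character columns.

```r
# Hypothetical sample data (not from the question); column names follow the post.
orders <- data.frame(
  Var1 = c("Fries", "Burger", "Burger", "Soda"),
  Var2 = c("Burger", "Fries", "Burger", "Fries"),
  stringsAsFactors = FALSE
)

# Naive key: order-sensitive, so the same pair can appear under two labels.
naive_key <- paste(orders$Var1, orders$Var2, sep = ", ")

# Order-insensitive key: the alphabetically smaller item always comes first.
sorted_key <- paste(pmin(orders$Var1, orders$Var2),
                    pmax(orders$Var1, orders$Var2), sep = ", ")

naive_counts <- table(naive_key)
sorted_counts <- table(sorted_key)

print(naive_counts)
print(sorted_counts)
```

With the naive key, "Fries, Burger" and "Burger, Fries" are counted as two different combinations; with the sorted key they fall into one group. This only covers pairs; for more than two items per order, a row-wise sort would be needed instead.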

There may be an easier way of doing what you want, but I can't think of what that is. Nevertheless, I think this does what you're looking for:

# Let's assume your data looks like this:
> df
                       Var1                      Var2 Var3
1               Onion Rings               Onion Rings    1
2  Pineapple Cheddar Burger               Onion Rings    1
3               Onion Rings  Pineapple Cheddar Burger    1
4  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
5               Onion Rings               Onion Rings    1
6  Pineapple Cheddar Burger               Onion Rings    1
7               Onion Rings  Pineapple Cheddar Burger    1
8  Pineapple Cheddar Burger  Pineapple Cheddar Burger    1
9             Fountain Soda             Fountain Soda    1
10             French Fries             Fountain Soda    1

# Now, for each row
#     1. sort the Var1 and Var2,
#     2. combine the sorted vars, and
#     3. convert them back into a factor

df$sortcomb <- as.factor(apply(df[,1:2], 1, function(x) paste(sort(x), collapse=", ")))

table(df$sortcomb) # then use table as per normal

library(plyr) # ddply and .() come from the plyr package
ddply(df, .(sortcomb), summarize, count=length(sortcomb)) # or ddply
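Both error messages in the question mention a list ('x' must be atomic; class "c('matrix', 'list')"), which suggests the pasted column `t1$pc` was a list rather than a plain character vector. That is an inference from the messages, not something the question confirms. The sketch below rebuilds the sample `df` shown above in base R, builds the sorted key as an atomic character vector with `vapply`, and then sums `Var3` per combination with `aggregate`, the function the asker originally tried.

```r
# Sample rows copied from the df shown above; Var3 is the per-row count of 1.
df <- data.frame(
  Var1 = c("Onion Rings", "Pineapple Cheddar Burger", "Onion Rings",
           "Pineapple Cheddar Burger", "Onion Rings",
           "Pineapple Cheddar Burger", "Onion Rings",
           "Pineapple Cheddar Burger", "Fountain Soda", "French Fries"),
  Var2 = c("Onion Rings", "Onion Rings", "Pineapple Cheddar Burger",
           "Pineapple Cheddar Burger", "Onion Rings", "Onion Rings",
           "Pineapple Cheddar Burger", "Pineapple Cheddar Burger",
           "Fountain Soda", "Fountain Soda"),
  Var3 = 1,
  stringsAsFactors = FALSE
)

# vapply guarantees a plain character vector, so the grouping key is atomic.
df$sortcomb <- vapply(seq_len(nrow(df)), function(i) {
  paste(sort(c(df$Var1[i], df$Var2[i])), collapse = ", ")
}, character(1))

# Sum the per-row counts within each order-insensitive combination.
counts <- aggregate(Var3 ~ sortcomb, data = df, FUN = sum)
print(counts)
```

If a column in your own data turns out to be a list (check with `is.list(t1$pc)`), flattening it to a character vector before grouping should let `aggregate` or `table` work in the same way.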
