dcast error: 'Aggregation function missing: defaulting to length'


Problem description

My df looks like this:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0
4     1    A       3
4     1    B       3
4     2    A       1
4     2    B       3

I want to restructure by Id and get:

Id   A    B …  Z    
3    5    3      
4    4    6        

I tried:

df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")

and got the error:

Aggregation function missing: defaulting to length

I can't figure out what to put in the fun.aggregate. What's the problem?

Solution

The reason why you are getting this warning is in the description of fun.aggregate (see ?dcast):

aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified

So, an aggregation function is needed when there is more than one value for one spot in the wide dataframe.

An explanation based on your data:

When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get:

  Id Task A B
1  3    1 2 3
2  3    2 3 0
3  4    1 3 3
4  4    2 1 3

This makes sense, because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get this (including a warning message):

Aggregation function missing: defaulting to length
  Id A B
1  3 2 2
2  4 2 2
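The two calls above can be reproduced with a small sketch. It assumes the reshape2 package is installed and that dcast comes from it (data.table also provides a dcast, which may word its message differently); the data frame is rebuilt by hand from the table in the question.

```r
library(reshape2)  # assumption: dcast() here is reshape2::dcast

# Data frame rebuilt by hand from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Id + Task + Type identifies exactly one row, so nothing needs aggregating
wide_task <- dcast(df, Id + Task ~ Type, value.var = "Freq")

# Id + Type matches two rows per cell, so dcast falls back to length()
# and each cell holds a row count instead of a Freq value
wide_id <- dcast(df, Id ~ Type, value.var = "Freq")

print(wide_task)
print(wide_id)
```

Note that the second call still returns a data frame; the cells simply contain counts (2 everywhere) rather than frequencies.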

Now, looking back at the top part of your data:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0

You see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while you can only put one value in each spot of the wide dataframe. Therefore dcast wants to aggregate these values into one value. The default aggregation function is length, but you can use other aggregation functions like sum, mean, sd or a custom function by specifying them with fun.aggregate.
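One way to check this before reshaping is to count how many rows fall into each output cell. The following is a minimal base-R sketch, assuming the data frame from the question rebuilt by hand; the variable names are only illustrative.

```r
# Data frame rebuilt by hand from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Rows per Id x Type cell: any count above 1 means dcast has to aggregate
cell_counts <- table(df$Id, df$Type)
print(cell_counts)

# Id alone does not identify a single row per Type ...
needs_aggregation <- anyDuplicated(df[, c("Id", "Type")]) > 0

# ... but Id together with Task does
unique_with_task <- anyDuplicated(df[, c("Id", "Task", "Type")]) == 0
```

Here every Id x Type cell contains two rows, which is exactly the number that the default length aggregation reports.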

For example, with fun.aggregate = sum you get:

  Id A B
1  3 5 3
2  4 4 6

Now there is no warning because dcast is being told what to do when there is more than one value: return the sum of the values.
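Put together, the fix might look like the sketch below. It assumes reshape2 is installed and rebuilds the question's data by hand; the mean and custom-function variants are only illustrations of other choices for fun.aggregate, not something the question asked for.

```r
library(reshape2)  # assumption: dcast() here is reshape2::dcast

# Data frame rebuilt by hand from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Sum the Freq values that share an Id and a Type (the result asked for)
wide_sum <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)

# Same layout, averaging instead of summing
wide_mean <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = mean)

# Custom function (illustrative): how many tasks had a non-zero Freq
wide_nonzero <- dcast(df, Id ~ Type, value.var = "Freq",
                      fun.aggregate = function(x) sum(x > 0))

print(wide_sum)
```

Which function is appropriate depends on what a cell is supposed to mean; sum matches the desired output in the question because the frequencies of the two tasks are added per Id.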
