dcast warning: 'Aggregation function missing: defaulting to length'


Question

My df looks like this:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0
4     1    A       3
4     1    B       3
4     2    A       1
4     2    B       3

I want to restructure by Id and get:

Id   A    B …  Z    
3    5    3      
4    4    6        

I tried:

df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")

and got the following warning:

Aggregation function missing: defaulting to length

I can't figure out what to put in the fun.aggregate. What's the problem?

Recommended answer

The reason why you are getting this warning is in the description of fun.aggregate (see ?dcast):

aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified

So, an aggregation function is needed when there is more than one value for one spot in the wide dataframe.
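One way to see whether an aggregation function will be needed is to count how many rows would land in each output cell before reshaping. The sketch below uses only base R and rebuilds the data from the question by hand; the object names per_cell_full and per_cell_id are illustrative, not part of the original post.

```r
# Data from the question, rebuilt by hand.
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Rows per output cell when Id + Task identify the rows and Type the columns.
per_cell_full <- aggregate(Freq ~ Id + Task + Type, data = df, FUN = length)

# Rows per output cell when only Id identifies the rows.
per_cell_id <- aggregate(Freq ~ Id + Type, data = df, FUN = length)

max(per_cell_full$Freq)  # 1: every cell receives exactly one value
max(per_cell_id$Freq)    # 2: every cell would receive two values
```

Any count above 1 means the formula does not identify a single observation per cell, so dcast has to aggregate.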

An explanation based on your data:

When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get:

  Id Task A B
1  3    1 2 3
2  3    2 3 0
3  4    1 3 3
4  4    2 1 3

Which is logical because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get this (including a warning message):

Aggregation function missing: defaulting to length
  Id A B
1  3 2 2
2  4 2 2

Now, looking back at the top part of your data:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0

You can see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while the wide data frame has room for only one value in each cell. Therefore dcast needs to aggregate these values into one. The default aggregation function is length, but you can use other aggregation functions such as sum, mean, sd, or a custom function by specifying them with fun.aggregate.

For example, with fun.aggregate = sum you get:

  Id A B
1  3 5 3
2  4 4 6

Now there is no warning because dcast is being told what to do when there is more than one value: return the sum of the values.
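For completeness, here is a minimal sketch of the calls discussed above. It assumes dcast comes from the reshape2 package (the post does not say which package is loaded) and rebuilds the data by hand. The helper count_nonzero is a hypothetical custom aggregator added only for illustration.

```r
library(reshape2)  # assumption: dcast() is the reshape2 version

# Data from the question, rebuilt by hand.
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Sum of Freq per Id and Type; the aggregator is explicit, so no message.
wide_sum <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)

# Mean of Freq per Id and Type.
wide_mean <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = mean)

# Hypothetical custom aggregator: number of non-zero values in each cell.
count_nonzero <- function(x) sum(x != 0)
wide_nonzero <- dcast(df, Id ~ Type, value.var = "Freq",
                      fun.aggregate = count_nonzero)

print(wide_sum)
```

The wide_sum result should match the table shown above for fun.aggregate = sum. The other two calls differ only in the function used to collapse the two values in each cell.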
