dcast error: ‘Aggregation function missing: defaulting to length’
My df looks like this:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0
4 1 A 3
4 1 B 3
4 2 A 1
4 2 B 3
I want to restructure by Id and get:
Id A B … Z
3 5 3
4 4 6
I tried:
df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")
and got the error:
Aggregation function missing: defaulting to length
I can't figure out what to put in fun.aggregate. What's the problem?
The reason you are getting this message is given in the description of fun.aggregate (see ?dcast):
aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified
So, an aggregation function is needed when there is more than one value for one spot in the wide dataframe.
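A quick way to check whether this applies is to count how many rows fall into each output cell. The following is a minimal base R sketch, assuming the data frame from the question is rebuilt as df; the names cells_id_task and cells_id are made up for illustration. Any count above 1 means dcast needs an aggregation function.

```r
# Rebuild the example data from the question (column types are an assumption)
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Rows per cell when the wide table is keyed by Id + Task (rows) and Type (columns)
cells_id_task <- aggregate(Freq ~ Id + Task + Type, data = df, FUN = length)

# Rows per cell when the wide table is keyed by Id only
cells_id <- aggregate(Freq ~ Id + Type, data = df, FUN = length)

max(cells_id_task$Freq)  # 1: every cell is identified by a single row
max(cells_id$Freq)       # 2: two rows compete for each cell
```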
An explanation based on your data:
When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get:
  Id Task A B
1  3    1 2 3
2  3    2 3 0
3  4    1 3 3
4  4    2 1 3
This is logical, because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get this (including the message):
Aggregation function missing: defaulting to length
  Id A B
1  3 2 2
2  4 2 2
Now, looking back at the top part of your data:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0
You can see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while the wide data frame has room for only one value per Id in each Type column. Therefore dcast wants to aggregate these values into one value. The default aggregation function is length, but you can use other aggregation functions like sum, mean, sd or a custom function by specifying them with fun.aggregate.
For example, with fun.aggregate = sum you get:
  Id A B
1  3 5 3
2  4 4 6
Now there is no warning because dcast is being told what to do when there is more than one value: return the sum of the values.
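Putting it together, here is a reproducible sketch of the call that produces the table asked for in the question. It assumes dcast comes from the reshape2 package (the message text matches that package) and that the data frame is rebuilt as below; df_mean is an illustrative name that does not appear in the original post.

```r
library(reshape2)  # assumed source of dcast()

# Example data from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# One row per Id, one column per Type; values sharing a cell are summed
df_wide <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)
df_wide
#   Id A B
# 1  3 5 3
# 2  4 4 6

# Any function that reduces a vector to a single value works the same way
df_mean <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = mean)
```

Dropping Task from the left-hand side of the formula is what collapses the result to one row per Id; fun.aggregate only decides how the values that land in the same cell are combined.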