Error with custom aggregate function for a cast() call in R reshape2

Problem description

I want to use R to summarize numerical data in a table with non-unique rownames to a result table with unique row-names with values summarized using a custom function. The summarization logic is: use the mean of values if the ratio of the maximum to the minimum value is < 1.5, else use median. Because the table is very large, I am trying to use the melt() and cast() functions in the reshape2 package.

library(reshape2)

# example table with non-unique row-names
tab <- data.frame(gene=rep(letters[1:3], each=3), s1=runif(9), s2=runif(9))
# melt
tab.melt <- melt(tab, id=1)
# function to summarize with logic: mean if max/min < 1.5, else median
summarize <- function(x){ifelse(max(x)/min(x)<1.5, mean(x), median(x))}
# cast with summarized values
dcast(tab.melt, gene~variable, summarize)

The last line of code above results in an error notice.

Error in vapply(indices, fun, .default) : 
  values must be type 'logical',
 but FUN(X[[1]]) result is type 'double'
In addition: Warning messages:
1: In max(x) : no non-missing arguments to max; returning -Inf
2: In min(x) : no non-missing arguments to min; returning Inf

What am I doing wrong? Note that if the summarize function were to just return min(), or max(), there is no error, though there is the warning message about 'no non-missing arguments.' Thank you for any suggestion.

(The actual table I want to work with is a 200x10000 one.)

Recommended answer

Short answer: provide a value for fill, as follows:

acast(tab.melt, gene~variable, summarize, fill=0)
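The same fill argument can also be passed to dcast(), which is what the question calls. The following is a minimal sketch of that, assuming reshape2 is installed; the set.seed() call and the requireNamespace() guard are additions for reproducibility and are not part of the original post.

```r
# Sketch: the question's example with `fill` supplied to dcast().
# Assumes the reshape2 package is available; skipped otherwise.
if (requireNamespace("reshape2", quietly = TRUE)) {
  set.seed(1)  # added for reproducibility, not in the original post
  tab <- data.frame(gene = rep(letters[1:3], each = 3),
                    s1 = runif(9), s2 = runif(9))
  tab.melt <- reshape2::melt(tab, id.vars = "gene")

  # Aggregate function from the question: mean if max/min < 1.5, else median.
  summarize <- function(x) ifelse(max(x) / min(x) < 1.5, mean(x), median(x))

  # With fill = 0 the default is a double, so the return type is consistent.
  res <- reshape2::dcast(tab.melt, gene ~ variable, summarize, fill = 0)
  print(res)
}
```

Because fill is given explicitly, reshape2 no longer needs to call the function on a zero-length vector, so the warnings from min() and max() should disappear as well.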

Long answer: It appears your function gets wrapped as follows, before being passed to vapply in the vaggregate function (dcast calls cast which calls vaggregate which calls vapply):

fun <- function(i) {
    if (length(i) == 0) 
        return(.default)
    .fun(.value[i], ...)
}

To find out what .default should be, this code is executed:

if (is.null(.default)) {
    .default <- .fun(.value[0])
}

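i.e. .value[0], a zero-length vector, is passed to the function. When x is numeric(0), min(x) returns Inf and max(x) returns -Inf (each with a warning), so max(x)/min(x) is NaN. NaN itself is a double, but the comparison NaN < 1.5 yields NA, and ifelse() with an NA test returns NA, which is of type logical. So when vapply is executed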

vapply(indices, fun, .default)

with the default value being of type logical (vapply uses it as the template for the return type), the call fails as soon as the function returns a double.
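The type mismatch can be reproduced in base R without reshape2. The sketch below uses made-up values and a hypothetical helper, summarize_safe, which is not part of the original answer. It shows one possible alternative to fill: returning a double for empty input. It assumes all values are positive, since a NaN ratio would make the plain if() fail.

```r
# The aggregate function from the question.
summarize <- function(x) ifelse(max(x) / min(x) < 1.5, mean(x), median(x))

# What gets computed as the default when `fill` is not supplied:
# the function applied to a zero-length vector.
default <- suppressWarnings(summarize(numeric(0)))
typeof(default)   # "logical": ifelse() returns NA when its test is NA

# Using that logical NA as the vapply() template reproduces the error.
values <- c(0.2, 0.5, 0.9)   # made-up data for illustration
msg <- tryCatch(
  vapply(list(1:3), function(i) summarize(values[i]), default),
  error = function(e) conditionMessage(e)
)
msg

# Hypothetical variant that always returns a double, even for empty input.
summarize_safe <- function(x) {
  if (length(x) == 0) return(NA_real_)
  if (max(x) / min(x) < 1.5) mean(x) else median(x)
}
typeof(summarize_safe(numeric(0)))   # "double"
ok <- vapply(list(1:3), function(i) summarize_safe(values[i]),
             summarize_safe(numeric(0)))
ok
```

With a function like summarize_safe, missing gene/variable combinations would be filled with NA rather than 0, which may or may not be preferable depending on the downstream analysis.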
