How to pass na.rm as argument to tapply?


Problem description

I'd like to calculate the mean and sd from a data frame with one column for the parameter and one column for a group identifier. How can I calculate them when using tapply? I could use sd(v1, group, na.rm=TRUE), but I can't fit the na.rm=TRUE into the statement when using tapply. na.omit is not an option: I have a whole bunch of parameters and have to go through them step by step, without losing half of the data frame by excluding every row that has a missing value.

data("weightgain", package = "HSAUR")
tapply(weightgain$weightgain, list(weightgain$source, weightgain$type), mean)

The same goes for the by statement.

x<-c(1,2,3,4,5,6,7,8,9,NA)
y<-c(2,3,NA,3,4,NA,2,3,NA,2)
group<-rep((factor(LETTERS[1:2])),5)
df<-data.frame(x,y,group)
df

by(df$x,df$group,summary)
by(df$x,df$group,mean)

sd(df$x) #result: NA
sd(df$x, na.rm=TRUE) #result: 2.738613

Any ideas how to get this done?

Recommended answer

I think this should do what you want.

  1. Select the columns you want:

v = c("x", "y")  # or, equivalently:
v = colnames(df)[1:2]

  2. Use sapply to iterate over v and pass the values to tapply:

    sapply(v, function(i) tapply(df[[i]], df$group, sd, na.rm=TRUE))
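
The call above works because tapply forwards any arguments given after FUN on to FUN through `...`, so na.rm=TRUE can simply be appended. Below is a minimal sketch of this using only base R and the toy data from the question. The result names (sd_x, mean_x, sd_x_anon, mean_x_by, stats) are illustrative and not part of the original answer.

```r
# Same toy data as in the question
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)
y <- c(2, 3, NA, 3, 4, NA, 2, 3, NA, 2)
group <- rep(factor(LETTERS[1:2]), 5)
df <- data.frame(x, y, group)

# Arguments after FUN are forwarded to FUN via `...`
sd_x   <- tapply(df$x, df$group, sd,   na.rm = TRUE)
mean_x <- tapply(df$x, df$group, mean, na.rm = TRUE)

# Equivalent form that wraps the call in an anonymous function
sd_x_anon <- tapply(df$x, df$group, function(z) sd(z, na.rm = TRUE))

# by() forwards extra arguments to FUN in the same way
mean_x_by <- by(df$x, df$group, mean, na.rm = TRUE)

# Mean and sd per group for several columns at once
v <- c("x", "y")
stats <- lapply(v, function(i) {
  cbind(mean = tapply(df[[i]], df$group, mean, na.rm = TRUE),
        sd   = tapply(df[[i]], df$group, sd,   na.rm = TRUE))
})
names(stats) <- v
```

Printing stats$x or stats$y then shows one row per group with the mean and sd computed from the non-missing values only, and no rows of df are dropped along the way.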
    
