R: How to calculate mean for each row with missing values using dplyr
Question
I want to calculate means over several columns for each row in my dataframe containing missing values, and place results in a new column called 'means.' Here's my dataframe:
df <- data.frame(A=c(3,4,5),B=c(0,6,8),C=c(9,NA,1))
A B C
1 3 0 9
2 4 6 NA
3 5 8 1
The code below successfully accomplishes the task if columns have no missing values, such as columns A and B.
library(dplyr)
df %>%
rowwise() %>%
mutate(means=mean(A:B, na.rm=T))
A B C means
<dbl> <dbl> <dbl> <dbl>
1 3 0 9 1.5
2 4 6 NA 5.0
3 5 8 1 6.5
However, if a column has missing values, such as C, then I get an error:
> df %>% rowwise() %>% mutate(means=mean(A:C, na.rm=T))
Error: NA/NaN argument
Ideally, I'd like to implement it with dplyr.
Recommended answer
df %>%
mutate(means=rowMeans(., na.rm=TRUE))
The dot (.) is a "pronoun" that references the data frame df that was piped into mutate().
A B C means
1 3 0 9 4.000000
2 4 6 NA 5.000000
3 5 8 1 4.666667
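For context on why the attempt in the question fails: inside mean(), A:C is not a column selection but R's sequence operator applied to the row's values. The base R sketch below reuses the numbers from rows 1 and 2 of df (the names seq_ab and res are only illustrative). It suggests that the A:B result matched only because the mean of a sequence from A to B happens to equal the mean of its two endpoints.

```r
# Row 1 has A = 3 and B = 0, so A:B expands to the sequence 3, 2, 1, 0
seq_ab <- 3:0
mean(seq_ab)   # 1.5, equal to mean(c(3, 0)) only by coincidence

# Row 1 with A:C becomes 3:9; its mean is 6, not the mean of A, B and C (4)
mean(3:9)
mean(c(3, 0, 9))

# Row 2 has C = NA, and a sequence cannot end at NA, hence the error
res <- tryCatch(4:NA, error = function(e) conditionMessage(e))
res
```

So even without the NA, mean(A:C) would not have averaged the three columns.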
You can also select only specific columns to include, using all the usual methods (column names, indices, grep, etc.).
df %>%
mutate(means=rowMeans(.[ , c("A","C")], na.rm=TRUE))
A B C means
1 3 0 9 6
2 4 6 NA 4
3 5 8 1 3
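Since the question asks for a dplyr-style solution, here is a possible alternative that keeps rowwise(). It is a sketch that assumes dplyr 1.0.0 or later, where c_across() is available. It is not part of the original answer, and the name result is only illustrative.

```r
library(dplyr)

df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# c_across() gathers the selected columns of the current row into one vector,
# so mean() receives the actual values of A, B and C rather than a sequence
result <- df %>%
  rowwise() %>%
  mutate(means = mean(c_across(A:C), na.rm = TRUE)) %>%
  ungroup()   # drop the row-wise grouping once the per-row work is done

result$means
```

On larger data frames rowMeans() is likely to be faster, since it is vectorised over the whole data frame, while rowwise() evaluates mean() once per row.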