Conditionally sum dynamic columns in R
Problem description
I am trying to conditionally sum across many columns depending on whether they are greater than or less than 0. I am surprised I cannot find a dplyr or data.table workaround for this. I want to calculate 4 new columns for a large data.frame (the columns to calculate are at the bottom of the post).
dat2 = matrix(rnorm(100), nrow = 10)
colnames(dat2) = paste0('V', 1:10)
dat2 %>% as.data.frame() %>%
rowwise() %>%
select_if(function(col){mean(col)>0}) %>%
mutate(sum_pos=rowSums(.)) ##Obviously doesn't work
These are the simple statistics I want to calculate. (Yes, these apply statements work, but there are other things I want to do in my dplyr chain, which is why I am looking for a dplyr or data.table way. The columns that are positive or negative differ for each row, so I cannot grab a fixed list of columns; it must be done dynamically, by row.)
#Calculate these, but in a dplyr chain?
n_pos=apply(dat2,1,function(x) sum((x>0)))
n_neg=apply(dat2,1,function(x) sum((x<0)))
sum_pos=apply(dat2,1,function(x) sum(x[(x>0)]))
sum_neg=apply(dat2,1,function(x) sum(x[(x<0)]))
Recommended answer
We don't need rowwise together with rowSums, because rowSums already computes the sum for each row without any grouping.
library(dplyr)
dat2 %>%
as.data.frame() %>%
select_if(~ is.numeric(.) && mean(.) > 0) %>%
mutate(sum_pos = rowSums(.))
Based on the description, it seems the condition is not on the column mean; what is wanted is the row-wise sum of the positive values and of the negative values, computed separately.
dat2 %>%
as.data.frame %>%
mutate(sum_pos = rowSums(. * NA^(. < 0), na.rm = TRUE),
sum_neg = rowSums(.[1:10] * NA^(.[1:10] > 0), na.rm = TRUE) )
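The question also asks for the per-row counts (n_pos, n_neg), which the snippet above does not compute. A minimal sketch of getting all four columns in one dplyr chain might look like the following. It assumes the data are the ten numeric columns V1 to V10 from the question and contain no NA values. The set.seed call, the res name, and the use of pmax/pmin instead of the NA^ trick are choices made here for illustration, not part of the original answer.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the example is reproducible
dat2 <- matrix(rnorm(100), nrow = 10)
colnames(dat2) <- paste0("V", 1:10)

res <- dat2 %>%
  as.data.frame() %>%
  mutate(
    # comparing a data frame with 0 yields a logical matrix;
    # rowSums counts the TRUE entries in each row
    n_pos   = rowSums(.[1:10] > 0),
    n_neg   = rowSums(.[1:10] < 0),
    # pmax(x, 0) zeroes out the negatives, pmin(x, 0) zeroes out the positives,
    # so the row sums only pick up values of the wanted sign
    sum_pos = rowSums(pmax(as.matrix(.[1:10]), 0)),
    sum_neg = rowSums(pmin(as.matrix(.[1:10]), 0))
  )
```

Inside the chain, `.` refers to the data frame piped into mutate, so `.[1:10]` always selects the original ten columns and never the newly created ones. If the data could contain NA values, the pmax/pmin version would propagate them, and the NA^ approach with na.rm = TRUE may be the better fit.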