dplyr: group_by, subset and summarise

Question

Say I have a data frame consisting of a number of rows, like this:

df <- data.frame(Group = c(0,0,1,1,1,0),V1=c(0,0,0,4,5,7), V2=c(0,3,0,4,0,1))

  Group V1 V2
1     0  0  0
2     0  0  3
3     1  0  0
4     1  4  4
5     1  5  0
6     0  7  1

Group is binary, and V1 and V2 are zero-inflated (many observations == 0). I'd like to subset each column in turn to remove the 0 observations and then calculate quantiles on the remaining data. Crucially, I'd like to remove the 0s for a given variable only, not whole rows, because I want to reset and subset again for the next column.

My code for the quantiles is below. Is there any way I can sneak in the subset function, or do I need a different approach?

library(dplyr)

# Functions for quantiles
quant25 <- function(x) quantile(x, probs = 0.25, na.rm = TRUE)
quant50 <- function(x) quantile(x, probs = 0.50, na.rm = TRUE)
quant75 <- function(x) quantile(x, probs = 0.75, na.rm = TRUE)

# Grouped calls on these functions
group_by(df, Group) %>%
  summarise_each(funs(quant25, quant50, quant75), V1, V2)


Recommended answer

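I think I've figured this one out for my purposes: df[,2:3][df[,2:3]==0] <- NA recodes the 0 observations in V1 and V2 as missing. Because the quantile functions are called with na.rm=TRUE, the NAs are dropped for each column separately while the rows stay intact, and the rest seems to work as expected. (Thanks, Jaap)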
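For reference, here is a minimal end-to-end sketch of that approach on the example data. It is not part of the original answer: it assumes dplyr 1.0.0 or later, where across() is available, and uses it in place of summarise_each() and funs(), which are deprecated in current dplyr. The names df_na and res and the q25/q50/q75 labels are illustrative only.

```r
library(dplyr)

df <- data.frame(Group = c(0, 0, 1, 1, 1, 0),
                 V1 = c(0, 0, 0, 4, 5, 7),
                 V2 = c(0, 3, 0, 4, 0, 1))

# Recode zeros as NA in the value columns only (columns 2 and 3);
# no rows are removed, so each column keeps its own non-zero values.
df_na <- df
df_na[, 2:3][df_na[, 2:3] == 0] <- NA

# Per-group quantiles; na.rm = TRUE drops the recoded zeros column by column.
# names = FALSE strips the "25%"-style names that quantile() attaches.
res <- df_na %>%
  group_by(Group) %>%
  summarise(across(c(V1, V2),
                   list(q25 = ~ quantile(.x, 0.25, na.rm = TRUE, names = FALSE),
                        q50 = ~ quantile(.x, 0.50, na.rm = TRUE, names = FALSE),
                        q75 = ~ quantile(.x, 0.75, na.rm = TRUE, names = FALSE))))

print(res)
```

The result has one row per Group and columns named V1_q25, V1_q50, and so on. If you would rather leave df untouched, subsetting inside the summary function, for example ~ quantile(.x[.x != 0], 0.25, names = FALSE), should give the same quantiles without the NA recoding step.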
