R: subset rows by criteria and by factor group

Published: 2020/10/17 0:21:10. Tags: r dataframe subset rows

Question

I have this data.frame with a lot of NAs:

df <- data.frame(a = rep(letters[1:3], each = 3), 
                 b = c(NA, NA, NA, 1, NA, 3, NA, NA, 7))
df
  a  b
1 a NA
2 a NA
3 a NA
4 b  1
5 b NA
6 b  3
7 c NA
8 c NA
9 c  7

I would like to subset this data frame to keep only the rows of factor groups that have at least two non-NA values, such as this:

  a  b
1 b 1
2 b NA
3 b 3

I have tried this function but it doesn't work:

subset(df, sum(!is.na(b)) < 1, by = a)

[1] a b
<0 rows> (or 0-length row.names)

Any suggestion? (other packages solutions are welcome)

Recommended answer

We can use data.table. Convert the data.frame to a data.table with setDT(df), group by 'a', and return the group's rows (.SD) only if the sum of the logical vector !is.na(b), i.e. the number of non-NA elements in that group, is greater than 1.

library(data.table)
setDT(df)[,if(sum(!is.na(b))>1) .SD , by = a]
#   a  b
#1: b  1
#2: b NA
#3: b  3

Or using dplyr, with the same logic, after grouping by 'a', we filter the rows.

library(dplyr)
df %>% 
    group_by(a) %>%
    filter(sum(!is.na(b))>1)
#      a     b
#  <fctr> <dbl>
#1      b     1
#2      b    NA
#3      b     3

Or in base R, using ave:

df[with(df, ave(b, a, FUN = function(x) sum(!is.na(x))>1)!=0),]
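The base R one-liner is dense, so here is a minimal sketch that spells out the same idea in two steps. The variable names n_non_na and result are made up for illustration, and the threshold is written as >= 2 to match the "at least two values" wording of the question.

```r
# Same toy data as in the question.
df <- data.frame(a = rep(letters[1:3], each = 3),
                 b = c(NA, NA, NA, 1, NA, 3, NA, NA, 7))

# Step 1: for every row, count the non-NA values of b within that row's group.
# ave() returns a vector as long as its input, so each group's count is
# repeated on every row of that group.
n_non_na <- ave(as.integer(!is.na(df$b)), df$a, FUN = sum)

# Step 2: keep only the rows whose group has at least two non-NA values.
result <- df[n_non_na >= 2, ]
result
```

Unlike the expected output in the question, this keeps the original row names (4, 5, 6). As for the attempt in the question, subset() has no by argument, so sum(!is.na(b)) is evaluated once over the whole column rather than per group, and the comparison < 1 points the wrong way for "no less than two".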
