How to filter columns based on values in dplyr?


Problem description

I want to delete all the rows whose last value is zero and all the columns whose last value is zero.

This is a dummy (reproducible) example of my dataset:

library(dplyr)

x = c("apples" ,1,0,1,2)
y = c("bananas",0,0,0,0)
z = c("apples" ,2,0,4,6)
t = c("rowsum" ,3,0,5,8)

my_table = rbind(x,y,z,t)
colnames(my_table) = c("product","day1","day2","day3","colsum")

# rbind() yields a character matrix, so convert the numeric columns back to integer
my_table = as.tbl(as.data.frame(my_table)) %>% 
  mutate(day1 = as.integer(as.character(day1)),
         day2 = as.integer(as.character(day2)),
         day3 = as.integer(as.character(day3)),
         colsum = as.integer(as.character(colsum)))

The dummy example produces this output:

> my_table
# A tibble: 4 × 5
  product  day1  day2  day3 colsum
   <fctr> <int> <int> <int>  <int>
1  apples     1     0     1      2
2 bananas     0     0     0      0
3  apples     2     0     4      6
4  rowsum     3     0     5      8

Now I remove the rows with a final value of zero:

my_table = my_table %>% 
  filter(colsum > 0)

> my_table
# A tibble: 3 × 5
  product  day1  day2  day3 colsum
   <fctr> <int> <int> <int>  <int>
1  apples     1     0     1      2
2  apples     2     0     4      6
3  rowsum     3     0     5      8

The problem is that I would like to do something like this:

# code that does NOT work
my_table = my_table %>% 
  filter(my_table[nrow(my_table)] > 0)

to obtain:

> my_table
# A tibble: 3 × 4
  product  day1  day3 colsum
   <fctr> <int> <int>  <int>
1  apples     1     1      2
2  apples     2     4      6
3  rowsum     3     5      8

Update: Solution by @Patronius (works with dplyr 0.5.0)

my_table %>% 
  filter(colsum > 0) %>% 
  select_if(function(.) last(.) != 0)

# A tibble: 3 × 4
  product  day1  day3 colsum
   <fctr> <int> <int>  <int>
1  apples     1     1      2
2  apples     2     4      6
3  rowsum     3     5      8
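
For readers on a newer dplyr, the same pipeline can probably be written with select(where(...)), since select_if has been superseded from dplyr 1.0.0 onward. The sketch below assumes dplyr 1.0.0 or later. It rebuilds the example table directly with tibble(), so product is a character column here rather than a factor, and the name result is only illustrative.

```r
library(dplyr)

# Rebuild the example table directly as a tibble (same values as the dummy data).
# Assumption: product is stored as character here, not as a factor.
my_table <- tibble(
  product = c("apples", "bananas", "apples", "rowsum"),
  day1    = c(1L, 0L, 2L, 3L),
  day2    = c(0L, 0L, 0L, 0L),
  day3    = c(1L, 0L, 4L, 5L),
  colsum  = c(2L, 0L, 6L, 8L)
)

result <- my_table %>%
  filter(colsum > 0) %>%                        # drop rows whose total is zero
  select(where(function(col) last(col) != 0))   # keep columns whose last value is not zero

print(result)
```

The row filter runs first, so last() looks at the rowsum row of the already-filtered table.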


Recommended answer

You can use dplyr's select_if and last:

my_table %>%
  select_if(function(.) last(.) != 0)

Note that this keeps the factor column product, because the last element of the product factor is not zero.
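
To see which columns the predicate keeps, it can be evaluated on each column separately. This is a minimal sketch that rebuilds the example table with tibble() and stores product as a factor, as in the question. The name keep is only illustrative.

```r
library(dplyr)

# Same values as the dummy data, with product as a factor.
my_table <- tibble(
  product = factor(c("apples", "bananas", "apples", "rowsum")),
  day1    = c(1L, 0L, 2L, 3L),
  day2    = c(0L, 0L, 0L, 0L),
  day3    = c(1L, 0L, 4L, 5L),
  colsum  = c(2L, 0L, 6L, 8L)
)

# Apply the predicate to every column: TRUE means the column is kept.
keep <- sapply(my_table, function(col) last(col) != 0)
print(keep)
# Only day2 is FALSE. For product, the last value "rowsum" is compared
# with 0 as text, so the comparison is TRUE and the column is kept.
```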
