Remove rows whose sum is zero, using one column as the key


Question

My goal is to remove the rows whose values sum to zero across all columns except one specific column. In this example, the excluded column is id, which identifies each row.

library(data.table)
sample_DT <- data.table(id = paste("GENE", 1:10, sep = "_"), laptop = c(1, 2, 3, 0, 5), desktop = c(2, 1, 4, 0, 3)) # three columns, 10 rows (the 5-element vectors are recycled)

GENE_4 and GENE_9 need to be removed, because their laptop and desktop values sum to zero.

My current approach uses a messy dplyr pipeline to compute the sum per row and store it in a new column. Before computing the sum, I drop the id column:

library(dplyr)
perGene_summed_sample <- sample_DT %>% select(-c("id")) %>% dplyr::mutate(allele_count = rowSums(., na.rm = TRUE))

I then store the indices of the rows where allele_count is zero:

throw_genes <- which(perGene_summed_sample$allele_count == 0)

Later, I add the id column back and use those indices to drop the flagged rows, and so on.

This looks really clumsy. Is there a better way?

Edit: I removed the column names containing "sample", since they are not the actual test case. I had used them in a rush when creating the example data.table.

Recommended answer

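Removing rows where all columns except one are zero is straightforward. Assuming the values are non-negative counts, a row sums to zero exactly when every column other than id is zero, so keeping the rows with a positive sum is enough: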

sample_DT[rowSums(sample_DT[, -1]) > 0, ]
#         id laptop desktop
# 1:  GENE_1      1       2
# 2:  GENE_2      2       1
# 3:  GENE_3      3       4
# 4:  GENE_5      5       3
# 5:  GENE_6      1       2
# 6:  GENE_7      2       1
# 7:  GENE_8      3       4
# 8: GENE_10      5       3
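
The one-liner above relies on id being the first column and on the counts being non-negative with no missing values. If any of that might not hold, a variant that selects the count columns by name could be safer. The following is only a minimal sketch under those assumptions; count_cols, kept, and dropped_ids are illustrative names that do not appear in the original question or answer.

```r
library(data.table)

# Same toy data as in the question, with the 5-element vectors repeated
# explicitly so that each column has 10 values.
sample_DT <- data.table(
  id      = paste("GENE", 1:10, sep = "_"),
  laptop  = rep(c(1, 2, 3, 0, 5), times = 2),
  desktop = rep(c(2, 1, 4, 0, 3), times = 2)
)

# Assumption: every column other than "id" is a numeric count column.
count_cols <- setdiff(names(sample_DT), "id")

# Keep the rows whose counts do not sum to zero.
# na.rm = TRUE means missing counts are treated as 0 in the sum.
kept <- sample_DT[rowSums(sample_DT[, ..count_cols], na.rm = TRUE) != 0]

# ids of the rows that were dropped, for reference
dropped_ids <- setdiff(sample_DT$id, kept$id)
```

Here kept holds the eight remaining rows and dropped_ids contains GENE_4 and GENE_9. Note that with negative values a non-zero row such as (2, -2) would also sum to zero and be dropped; if that matters, testing each column for zero separately would be the more accurate check.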
