Remove rows that sum to zero, using one column as the key
Question
My goal is to remove rows whose sum across columns is zero, excluding one specific column. In this example, the id column is the key and should be left out of the sum.
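library(data.table)
# Create a data.table with three columns and 10 rows; the five values are repeated twice.
sample_DT <- data.table(id = paste("GENE", 1:10, sep = "_"), laptop = rep(c(1, 2, 3, 0, 5), times = 2), desktop = rep(c(2, 1, 4, 0, 3), times = 2))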
GENE_4 and GENE_9 need to be removed from the data table, because their sum (laptop plus desktop) is zero.
I then use some messy dplyr piping to get the sum per row and add it as a new column. Before that, I remove the id column:
library(dplyr)
perGene_summed_sample <- sample_DT %>% select(-c("id")) %>% dplyr::mutate(allele_count = rowSums(., na.rm = TRUE))
I then store the indices of the rows where allele_count is zero:
throw_genes <- which(perGene_summed_sample$allele_count == 0)
Later, I add the id column back and keep only the rows whose indices were not flagged for removal, and so on.
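For reference, the workflow described above can be sketched end to end as follows. This is only an illustration of the steps as stated, written without dplyr; the names value_cols, allele_count and kept_DT are placeholders chosen here, and the guard against an empty index vector is an addition rather than part of the original description.

```r
library(data.table)

sample_DT <- data.table(
  id      = paste("GENE", 1:10, sep = "_"),
  laptop  = rep(c(1, 2, 3, 0, 5), times = 2),
  desktop = rep(c(2, 1, 4, 0, 3), times = 2)
)

# Step 1: sum every column except the key column, row by row.
value_cols   <- setdiff(names(sample_DT), "id")   # hypothetical helper name
allele_count <- rowSums(sample_DT[, .SD, .SDcols = value_cols], na.rm = TRUE)

# Step 2: record the indices of the rows whose sum is zero.
throw_genes <- which(allele_count == 0)

# Step 3: drop those rows from the original table, id column included.
# Negative indexing with an empty vector would drop everything, hence the guard.
kept_DT <- if (length(throw_genes) > 0) sample_DT[-throw_genes] else sample_DT
```

With the sample data, throw_genes holds the positions of GENE_4 and GENE_9, and kept_DT retains the other eight rows.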
This looks really bad. Is there a better way?
Edit: removed the column names containing sample, as that is not the actual test case. I had used sample in a rush when creating the data.table data.
Recommended answer
Removing rows where all columns except one are zero is straightforward:
sample_DT[ rowSums(sample_DT[,-1]) > 0, ]
#         id laptop desktop
# 1:  GENE_1      1       2
# 2:  GENE_2      2       1
# 3:  GENE_3      3       4
# 4:  GENE_5      5       3
# 5:  GENE_6      1       2
# 6:  GENE_7      2       1
# 7:  GENE_8      3       4
# 8: GENE_10      5       3
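The one-liner above selects the value columns by position (everything but the first column) and keeps rows with a positive sum, which is equivalent to "not all zero" only when the values are non-negative. A slightly more defensive variant, offered as a sketch rather than as part of the original answer, excludes the key column by name, ignores NA values, and tests the sum against zero directly. The names value_cols, keep and filtered_DT are made up for this example.

```r
library(data.table)

sample_DT <- data.table(
  id      = paste("GENE", 1:10, sep = "_"),
  laptop  = rep(c(1, 2, 3, 0, 5), times = 2),
  desktop = rep(c(2, 1, 4, 0, 3), times = 2)
)

# Select the columns to sum by name, so the key column may sit anywhere.
value_cols <- setdiff(names(sample_DT), "id")

# Logical vector, one entry per row: TRUE when the row sum is non-zero.
keep <- sample_DT[, rowSums(.SD, na.rm = TRUE) != 0, .SDcols = value_cols]

# Subset the original table; the id column stays attached throughout.
filtered_DT <- sample_DT[keep]
```

Because the id column is never dropped, there is no need to add it back afterwards or to track row indices separately. Note that if the data can contain negative values, a zero sum no longer implies that every value is zero, so the right test depends on which of the two conditions is actually wanted.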