data.table equivalent of dplyr::filter_at

Views: 70  Published: 2020/10/15 20:35:50  Tags: r, dplyr, data.table


Question

Consider the data:

library(data.table)
library(magrittr)

vec1 <- c("Iron", "Copper")

vec2 <- c("Defective", "Passed", "Error")

set.seed(123)
a1 <- sample(x = vec1, size = 20, replace = T)
b1 <- sample(x = vec2, size = 20, replace = T)

set.seed(1234)
a2 <- sample(x = vec1, size = 20, replace = T)
b2 <- sample(x = vec2, size = 20, replace = T)

DT <- data.table(
  c(1:20), a1, b1, a2, b2
) %>% .[order(V1)]

names(DT) <- c("id", "prod_name_1", "test_1", "prod_name_2", "test_2")

I need to keep the rows whose value for test_1 OR test_2 is "Passed". In other words, if neither of these columns has the specified value, the row is dropped. With dplyr, we can use the filter_at() verb:

> # dplyr solution...
> 
> cols <- grep(x = names(DT), pattern = "test", value = T, ignore.case = T)
> 
> 
> DT %>% 
+   dplyr::filter_at(.vars = grep(x = names(DT), pattern = "test", value = T, ignore.case = T), 
+                    dplyr::any_vars(. == "Passed")) -> DT.2
> 
> DT.2
  id prod_name_1 test_1 prod_name_2    test_2
1  3        Iron Passed      Copper Defective
2  5      Copper Passed      Copper Defective
3  7      Copper Passed        Iron    Passed
4  8      Copper Passed        Iron     Error
5 11      Copper  Error      Copper    Passed
6 14      Copper  Error      Copper    Passed
7 16      Copper Passed      Copper     Error

Cool. Is there a similar way to perform this operation in data.table?

This is the closest I have gotten:

> lapply(seq_along(cols), function(x){
+   
+   setkeyv(DT, cols[[x]])
+   
+   DT["Passed"]
+   
+ }) %>% 
+   do.call(rbind,.) %>% 
+   unique -> DT.3
> 
> DT.3
   id prod_name_1 test_1 prod_name_2    test_2
1:  3        Iron Passed      Copper Defective
2:  5      Copper Passed      Copper Defective
3:  8      Copper Passed        Iron     Error
4: 16      Copper Passed      Copper     Error
5:  7      Copper Passed        Iron    Passed
6: 11      Copper  Error      Copper    Passed
7: 14      Copper  Error      Copper    Passed
> 
> identical(data.table(DT.2)[order(id)], DT.3[order(id)])
[1] TRUE

Does anyone have a more elegant solution? Preferably something contained in a single verb, like dplyr::filter_at().

Recommended answer

We can specify 'cols' in .SDcols, loop over the Subset of Data.table (.SD) with lapply to compare each column against "Passed", Reduce the resulting list to a single logical vector with |, and use that vector to subset the rows:

res2 <- DT[DT[,  Reduce(`|`, lapply(.SD, `==`, "Passed")), .SDcols = cols]]
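To make the one-liner easier to follow, here is a minimal sketch that breaks it into its three steps. The table toy and the intermediate names flags, keep and res are illustrative assumptions, not part of the original post; only data.table and base R functions are used.

```r
library(data.table)

# Toy data (hypothetical, much smaller than the question's DT)
toy <- data.table(
  id     = 1:4,
  test_1 = c("Passed", "Error", "Defective", "Error"),
  test_2 = c("Error", "Passed", "Defective", "Passed")
)
cols <- c("test_1", "test_2")

# Step 1: compare every column listed in .SDcols against "Passed";
# the result holds one logical column per tested column
flags <- toy[, lapply(.SD, `==`, "Passed"), .SDcols = cols]

# Step 2: fold the logical columns together element-wise with OR,
# giving a single logical vector with one entry per row
keep <- Reduce(`|`, flags)

# Step 3: use that vector in i to subset the rows
res <- toy[keep]
print(res)
```

Swapping `|` for `&` in step 2 would keep only rows where every tested column matches, which is roughly what dplyr::all_vars() expresses.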

Comparing with the dplyr output from the question (called DT.2 there, stored here as res1):

identical(as.data.table(res1), res2)
#[1] TRUE
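Since the question asks for something that reads like a single verb, the same idea can be wrapped in a small function. This is only a sketch: filter_cols and its arguments are hypothetical names, not functions from data.table or dplyr, and the toy table is invented for the example.

```r
library(data.table)

# Hypothetical helper (not part of data.table or dplyr):
# keep rows where the chosen columns equal `value`, combining the
# per-column comparisons with `combine` (`|` = any, `&` = all)
filter_cols <- function(dt, cols, value, combine = `|`) {
  stopifnot(is.data.table(dt), all(cols %in% names(dt)))
  # Pull out the selected columns, then compare each one to `value`
  flags <- lapply(dt[, .SD, .SDcols = cols], `==`, value)
  # Collapse the list of logical vectors into one row filter
  keep <- Reduce(combine, flags)
  dt[keep]
}

# Toy data (hypothetical)
toy <- data.table(
  id     = 1:4,
  test_1 = c("Passed", "Error", "Defective", "Passed"),
  test_2 = c("Error", "Passed", "Defective", "Passed")
)

# Select the columns by pattern, as in the question
test_cols <- grep("test", names(toy), value = TRUE, ignore.case = TRUE)

any_passed <- filter_cols(toy, test_cols, "Passed")        # any column matches
all_passed <- filter_cols(toy, test_cols, "Passed", `&`)   # every column matches
print(any_passed)
print(all_passed)
```

Unlike the keyed approach in the question, this version does not call setkeyv(), so the input table is not reordered by reference.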
