What is the equivalent of mutate_at (dplyr) in data.table?
Problem description
I am trying to move some of my slower dplyr processes to data.table, but I cannot find an efficient way to reproduce a mutate_at-style approach in data.table, especially when it comes to applying more than one function to multiple columns and naming the new variables that are created.
Below I use mutate_at to apply two different functions to two columns, with matching names for the new columns and a group_by statement. I want to be able to replicate this easily in data.table.
library(dplyr)   # provides %>%, group_by, mutate_at, vars, funs
library(tibble)
library(zoo)
Data = tibble(A = rep(c(1,2),50),
              B = 1:100,
              C = 101:200)
Data %>%
  group_by(A) %>%
  mutate_at(vars(B,C), funs(Roll.Mean.Week = 7 * rollapply(., width = 7, mean, align = "right", fill = 0, na.rm = TRUE, partial = TRUE),
                            Roll.Mean.Two.Week = 7 * rollapply(., width = 14, mean, align = "right", fill = 0, na.rm = TRUE, partial = TRUE))) %>%
  ungroup()
Recommended answer
With data.table, we can specify the columns of interest in .SDcols, loop through .SD with lapply, and apply the function of interest. Here, the function rollapply is repeated with only the width parameter changing, so it is better to wrap it in a function (f1) rather than repeat all the arguments. For each column, the results of f1 can be collected in a list; the nested list is then flattened with unlist(recursive = FALSE) and assigned (:=) to the new columns.
library(data.table)
library(zoo)
nm1 <- c("B", "C")
# New names in the order unlist() returns the results:
# both widths for B first, then both widths for C
nm2 <- paste0(rep(nm1, each = 2), c("_Roll.Mean.Week", "_Roll.Mean.Two.Week"))
# Right-aligned rolling mean; partial = TRUE averages over shorter windows at the start.
# Unlike the question's code, the result is not multiplied by 7.
f1 <- function(x, width) rollapply(x, width = width, mean,
    align = "right", fill = 0, na.rm = TRUE, partial = TRUE)
setDT(Data)[, (nm2) := unlist(lapply(.SD, function(x)
    list(f1(x, 7), f1(x, 14))), recursive = FALSE), by = A, .SDcols = nm1]
head(Data)
#   A B   C B_Roll.Mean.Week B_Roll.Mean.Two.Week C_Roll.Mean.Week C_Roll.Mean.Two.Week
#1: 1 1 101                1                    1              101                  101
#2: 2 2 102                2                    2              102                  102
#3: 1 3 103                2                    2              102                  102
#4: 2 4 104                3                    3              103                  103
#5: 1 5 105                3                    3              103                  103
#6: 2 6 106                4                    4              104                  104
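The order of the new column names matters because unlist(recursive = FALSE) flattens the nested list column by column: all results for the first column come before any result for the second. A minimal base-R sketch, with a toy list standing in for .SD and made-up element names (week, two_week) and placeholder computations, shows that order:

```r
# Toy stand-in for .SD: a list with two "columns" (illustrative data only)
cols <- list(B = 1:3, C = 4:6)

# For each column, return a list with two results
# (the names week/two_week and the computations are placeholders)
nested <- lapply(cols, function(x) list(week = x * 1, two_week = x * 2))

# Flatten one level only, keeping each result as a whole vector
flat <- unlist(nested, recursive = FALSE)

names(flat)
# "B.week" "B.two_week" "C.week" "C.two_week"
```

The left-hand side of := is matched to this flattened list by position, so the character vector of new names should list both results for B first and then both results for C.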
Note that funs is deprecated in the tidyverse (dplyr); in its place, use list(~ ...) for several functions, or just a single ~ formula.
Data %>%
  group_by(A) %>%
  mutate_at(vars(B, C), list(Roll.Mean.Week = ~ f1(., 7),
                             Roll.Mean.Two.Week = ~ f1(., 14))) %>%
  ungroup()
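On the data.table side, a similar naming convenience can be approximated by looping over a named list of functions and building the new column names from the list names. The following is only a sketch under assumptions: the table dt, the list fns, and its entries (cumsum, cummax) are hypothetical stand-ins chosen so the example needs nothing beyond data.table, and they are not part of the original answer.

```r
library(data.table)

# Small illustrative table (not the question's data)
dt <- data.table(A = rep(c(1, 2), 3), B = 1:6, C = 101:106)

cols <- c("B", "C")
# Hypothetical named list of functions; the names become column suffixes
fns <- list(cum_sum = cumsum, cum_max = cummax)

for (fn_name in names(fns)) {
  # e.g. "B_cum_sum", "C_cum_sum"
  new_cols <- paste(cols, fn_name, sep = "_")
  # Apply one function to every column in .SDcols, within each group of A
  dt[, (new_cols) := lapply(.SD, fns[[fn_name]]), by = A, .SDcols = cols]
}

names(dt)
```

Because each pass of the loop handles one function, the results of lapply(.SD, ...) line up with cols directly and no flattening with unlist is needed. The trade-off is one grouped assignment per function instead of a single call.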