data.table中的mutate_at(dplyr)等价于什么? [英] What is the equivalent of mutate_at (dplyr) in data.table?
问题描述
我正在尝试将dplyr中一些较慢的进程移至使用data.table,但是似乎找不到在data.table中使用"mutate_at"类型方法的有效方法.特别是在命名新创建的变量&时,将多个函数应用于多个列.
I am trying to move some of my slower processes in dplyr to using data.table, however can not seem to find an efficient way of using a "mutate_at" type approach in data.table. Especially, when it comes to naming the new variables created & applying more than 1 function to multiple columns.
下面,我使用mutate_at将2个不同的函数应用到2个不同的列上,并使用group by语句将它们与相关的命名+关联.我希望能够轻松地在data.table中复制它.
Below I use mutate_at to apply 2 different functions to 2 different columns with associated naming + using a group by statement. I want to be able to replicate this easily in data.table.
library(tibble)
library(zoo)
Data = tibble(A = rep(c(1,2),50),
B = 1:100,
C = 101:200)
Data %>%
group_by(A) %>%
mutate_at(vars(B,C), funs(Roll.Mean.Week = 7 * rollapply(., width = 7, mean, align = "right", fill = 0, na.rm = T, partial = T),
Roll.Mean.Two.Week = 7 * rollapply(., width = 14, mean, align = "right", fill = 0, na.rm = T, partial = T))) %>%
ungroup()
推荐答案
使用data.table
,我们可以指定.SDcols
中感兴趣的列,使用lapply
遍历.SD
并应用感兴趣的功能.在此,仅在width
参数中进行更改的情况下重复功能rollapply
.因此,最好创建一个函数以避免重复整个参数.另外,在应用函数(f1
)时,输出可以保存在list
中,以后可以保存在unlist
中,并使用recursive = FALSE
并将(:=
)分配给感兴趣的列
With data.table
, we can specify the columns of interest in .SDcols
, loop through the .SD
with lapply
and apply the function of interest. Here, the funcion rollapply
is repeated with only change in width
parameter. So, it may be better to create a function to avoid repeating the whole arguments. Also, while applying the function (f1
), the output can be kept in a list
, later unlist
with recursive = FALSE
and assign (:=
) to columns of interest
library(data.table)
library(zoo)
nm1 <- c("B", "C")
nm2 <- paste0(nm1, "_Roll.Mean.Week")
nm3 <- paste0(nm1, "_Roll.Mean.Two.Week")
f1 <- function(x, width) rollapply(x, width = width, mean,
align = "right", fill = 0, na.rm = TRUE, partial = TRUE)
setDT(Data)[, c(nm2, nm3) := unlist(lapply(.SD, function(x)
list(f1(x, 7), f1(x, 14))), recursive = FALSE), by = A, .SDcols = nm1]
head(Data)
# A B C B_Roll.Mean.Week C_Roll.Mean.Week B_Roll.Mean.Two.Week C_Roll.Mean.Two.Week
#1: 1 1 101 1 1 101 101
#2: 2 2 102 2 2 102 102
#3: 1 3 103 2 2 102 102
#4: 2 4 104 3 3 103 103
#5: 1 5 105 3 3 103 103
#6: 2 6 106 4 4 104 104
请注意,funs
在tidyverse
中已被弃用,而在其位置上,可以使用list(~
或仅使用~
Note that funs
is deprecated in tidyverse
and in its place, can use list(~
or just ~
Data %>%
group_by(A) %>%
mutate_at(vars(B,C), list(Roll.Mean.Week = ~f1(., 7),
Roll.Mean.Two.Week = ~ f1(., 14)))%>%
ungroup()
这篇关于data.table中的mutate_at(dplyr)等价于什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!