data.table 中 mutate_at (dplyr) 的等价物是什么? [英] What is the equivalent of mutate_at (dplyr) in data.table?

查看:20
本文介绍了data.table 中 mutate_at (dplyr) 的等价物是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试将 dplyr 中的一些较慢的进程移动到使用 data.table,但是似乎无法找到一种在 data.table 中使用mutate_at"类型方法的有效方法.特别是,在命名创建的新变量时对多列应用 1 个以上的函数.

I am trying to move some of my slower processes in dplyr to using data.table, however can not seem to find an efficient way of using a "mutate_at" type approach in data.table. Especially, when it comes to naming the new variables created & applying more than 1 function to multiple columns.

下面我使用 mutate_at 将 2 个不同的函数应用到 2 个不同的列,并使用关联命名 + 使用 group by 语句.我希望能够在 data.table 中轻松复制.

Below I use mutate_at to apply 2 different functions to 2 different columns with associated naming + using a group by statement. I want to be able to replicate this easily in data.table.

library(tibble)
library(zoo)

Data = tibble(A = rep(c(1,2),50),
              B = 1:100,
              C = 101:200)

Data %>% 
    group_by(A) %>% 
    mutate_at(vars(B,C), funs(Roll.Mean.Week = 7 * rollapply(., width = 7, mean, align = "right", fill = 0, na.rm = T, partial = T),
                              Roll.Mean.Two.Week = 7 * rollapply(., width = 14, mean, align = "right", fill = 0, na.rm = T, partial = T))) %>% 
    ungroup()

推荐答案

通过data.table,我们可以在.SDcols中指定感兴趣的列,循环遍历.SDlapply 并应用感兴趣的函数.在这里,函数 rollapply 被重复,仅改变 width 参数.因此,最好创建一个函数以避免重复整个参数.此外,在应用函数 (f1) 时,输出可以保存在 list 中,稍后使用 recursive = FALSE<unlist/code> 并将 (:=) 分配给感兴趣的列

With data.table, we can specify the columns of interest in .SDcols, loop through the .SD with lapply and apply the function of interest. Here, the funcion rollapply is repeated with only change in width parameter. So, it may be better to create a function to avoid repeating the whole arguments. Also, while applying the function (f1), the output can be kept in a list, later unlist with recursive = FALSE and assign (:=) to columns of interest

library(data.table)
library(zoo)
nm1 <- c("B", "C")
nm2 <- paste0(nm1, "_Roll.Mean.Week")
nm3 <- paste0(nm1, "_Roll.Mean.Two.Week")
f1 <- function(x, width) rollapply(x, width = width, mean,
        align = "right", fill = 0, na.rm = TRUE, partial = TRUE)
setDT(Data)[, c(nm2, nm3) := unlist(lapply(.SD, function(x)
  list(f1(x, 7), f1(x, 14))), recursive = FALSE), by = A, .SDcols = nm1]
head(Data)
#   A B   C B_Roll.Mean.Week C_Roll.Mean.Week B_Roll.Mean.Two.Week C_Roll.Mean.Two.Week
#1: 1 1 101                1                1                  101                  101
#2: 2 2 102                2                2                  102                  102
#3: 1 3 103                2                2                  102                  102
#4: 2 4 104                3                3                  103                  103
#5: 1 5 105                3                3                  103                  103
#6: 2 6 106                4                4                  104                  104

<小时>

请注意,funstidyverse 中已被弃用,取而代之的是,可以使用 list(~ 或仅使用 ~>


Note that funs is deprecated in tidyverse and in its place, can use list(~ or just ~

Data %>% 
    group_by(A) %>% 
    mutate_at(vars(B,C), list(Roll.Mean.Week =  ~f1(., 7),
                              Roll.Mean.Two.Week = ~ f1(., 14)))%>% 
    ungroup()

这篇关于data.table 中 mutate_at (dplyr) 的等价物是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆