What is the equivalent of mutate_at (dplyr) in data.table?


Question

I am trying to move some of my slower dplyr processes to data.table, but I cannot find an efficient way to do a "mutate_at"-style operation in data.table, especially when it comes to naming the newly created variables and applying more than one function to multiple columns.

Below I use mutate_at to apply two different functions to two different columns, with matching names for the new columns, inside a group_by. I want to be able to replicate this easily in data.table.

library(dplyr)   # provides %>%, group_by, mutate_at, vars, funs
library(tibble)
library(zoo)

Data <- tibble(A = rep(c(1, 2), 50),
               B = 1:100,
               C = 101:200)

Data %>% 
    group_by(A) %>% 
    mutate_at(vars(B, C), funs(Roll.Mean.Week = 7 * rollapply(., width = 7, mean, align = "right", fill = 0, na.rm = TRUE, partial = TRUE),
                               Roll.Mean.Two.Week = 7 * rollapply(., width = 14, mean, align = "right", fill = 0, na.rm = TRUE, partial = TRUE))) %>% 
    ungroup()

Recommended answer

With data.table, we can specify the columns of interest in .SDcols, loop over .SD with lapply, and apply the function of interest. Here, rollapply is called twice with only the width argument changing, so it is better to wrap it in a function (f1) instead of repeating all the arguments. For each column, the two results can be kept in a list; unlist with recursive = FALSE then flattens the nested list by one level (giving B at width 7, B at width 14, C at width 7, C at width 14), and := assigns the result to the new columns, whose names must be listed in that same order. Note that f1 returns the plain rolling mean, without the 7 * factor used in the question.

library(data.table)
library(zoo)
nm1 <- c("B", "C")
nm2 <- paste0(nm1, "_Roll.Mean.Week")
nm3 <- paste0(nm1, "_Roll.Mean.Two.Week")
f1 <- function(x, width) rollapply(x, width = width, mean,
        align = "right", fill = 0, na.rm = TRUE, partial = TRUE)
# unlist(..., recursive = FALSE) returns the columns as B(7), B(14), C(7), C(14),
# so interleave the names with rbind() to match that order
setDT(Data)[, c(rbind(nm2, nm3)) := unlist(lapply(.SD, function(x)
  list(f1(x, 7), f1(x, 14))), recursive = FALSE), by = A, .SDcols = nm1]
head(Data)
#   A B   C B_Roll.Mean.Week B_Roll.Mean.Two.Week C_Roll.Mean.Week C_Roll.Mean.Two.Week
#1: 1 1 101                1                    1              101                  101
#2: 2 2 102                2                    2              102                  102
#3: 1 3 103                2                    2              102                  102
#4: 2 4 104                3                    3              103                  103
#5: 1 5 105                3                    3              103                  103
#6: 2 6 106                4                    4              104                  104
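
As a hedged alternative, the sketch below avoids relying on the order in which a nested list is flattened: it issues one := per window width, so each set of new names is built right next to the values it labels. The table name DT and the lookup vector widths are assumptions made up for this illustration; f1 is the helper from the answer, repeated so the block runs on its own.

```r
library(data.table)
library(zoo)

# Same toy data as in the question (DT is a name chosen for this sketch)
DT <- data.table(A = rep(c(1, 2), 50), B = 1:100, C = 101:200)

cols <- c("B", "C")

# Hypothetical lookup: column-name suffix -> rolling window width
widths <- c(Roll.Mean.Week = 7, Roll.Mean.Two.Week = 14)

# Helper from the answer: right-aligned rolling mean with partial windows
f1 <- function(x, width) rollapply(x, width = width, mean,
        align = "right", fill = 0, na.rm = TRUE, partial = TRUE)

# One := per width; lapply(.SD, ...) returns the columns in the order of `cols`,
# which is the same order used to build the new names
for (suffix in names(widths)) {
  DT[, paste(cols, suffix, sep = "_") := lapply(.SD, f1, width = widths[[suffix]]),
     by = A, .SDcols = cols]
}

head(DT)
```

This costs one grouped pass per width instead of a single pass, which may matter on very large tables, but the name-to-value pairing is easier to check by eye.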


Note that funs() is deprecated in the tidyverse; in its place, use list(~ ...) or just a bare ~ formula.

Data %>% 
    group_by(A) %>% 
    mutate_at(vars(B,C), list(Roll.Mean.Week =  ~f1(., 7),
                              Roll.Mean.Two.Week = ~ f1(., 14)))%>% 
    ungroup()
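
For completeness, here is a minimal sketch of the same call written with across(), which dplyr 1.0.0 introduced as the successor to the scoped verbs such as mutate_at. It assumes dplyr >= 1.0.0 is installed; the name Result is made up for this illustration, and f1 is repeated from the answer so the block is self-contained.

```r
library(dplyr)
library(zoo)

# Same toy data as in the question
Data <- tibble(A = rep(c(1, 2), 50), B = 1:100, C = 101:200)

# Helper from the answer: right-aligned rolling mean with partial windows
f1 <- function(x, width) rollapply(x, width = width, mean,
        align = "right", fill = 0, na.rm = TRUE, partial = TRUE)

# across() applies each function in the named list to each selected column;
# new columns are named "<column>_<function name>" by default
Result <- Data %>%
    group_by(A) %>%
    mutate(across(c(B, C),
                  list(Roll.Mean.Week = ~ f1(.x, 7),
                       Roll.Mean.Two.Week = ~ f1(.x, 14)))) %>%
    ungroup()

head(Result)
```

The default naming pattern can be changed through the .names argument of across() if a different scheme is preferred.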
