Automate dplyr's mutate function


Question

What is the best way to automate the mutate function within a single dplyr pipeline?

It is easiest to demonstrate with an example. In the first part of the example I create new columns based on the values of the variable carb. Now imagine I need to automate this step, so that the code iterates over all unique values of carb and creates a new column for each value.

Is there any way to do this?

library(tidyverse)

cr <- 
  mtcars %>% 
  group_by(gear) %>% 
  nest()


# This is 'by-hand' approach of what I would like to do - How to automate it? E.g. we do not know all values of 'carb'

cr$data[[1]] %>% 
  mutate(VARIABLE1 = 
           case_when(carb == 1 ~ hp/mpg,
                          TRUE ~ 0)) %>%
  mutate(VARIABLE2 = 
          case_when(carb == 2 ~ hp/mpg,
                         TRUE ~ 0)) %>%
  mutate(VARIABLE4 =
          case_when(carb == 4 ~ hp/mpg,
                         TRUE ~ 0))

# This is a pseudo-idea of what I need (not valid R). Is there any way to change the iteration value within ONE dplyr pipeline?

vals <- cr$data[[1]] %>% pull(carb) %>% sort %>% unique()

for (i in vals) {
  message(i)

cr$data[[1]] %>% 
  mutate(paste('VARIABLE', i, sep = '') =  case_when(carb == i ~ hp/mpg, # At this line, all i shall be first element of vals
                          TRUE ~ 0)) %>% 
  mutate(paste('VARIABLE', i, sep = '') =  case_when(carb == i ~ hp/mpg, # At this line, all i shall be second element of vals
                          TRUE ~ 0)) %>% 
  mutate(paste('VARIABLE', i, sep = '') =  case_when(carb == i ~ hp/mpg, # At this line, all i shall be third element of vals
                                                            TRUE ~ 0))
}

Recommended answer

One way is to use the fastDummies package.

To process one data frame at a time:

library(fastDummies)

cr$data[[1]] %>%
  dummy_cols(select_columns = 'carb') %>%
  # across() needs dplyr >= 1.0.0; it replaces the deprecated funs()
  mutate(across(starts_with('carb_'), ~ .x * hp / mpg))
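The dummy columns come out named carb_1, carb_2 and so on, while the question asks for VARIABLE1, VARIABLE2 and so on. A minimal sketch of one way to rename them, assuming dplyr >= 1.0.0 for across() and rename_with(); the object name res is only illustrative:

```r
library(dplyr)
library(fastDummies)

res <- mtcars %>%
  # one 0/1 indicator column per distinct value of carb
  dummy_cols(select_columns = "carb") %>%
  # turn each indicator into hp/mpg where it is 1, and 0 elsewhere
  mutate(across(starts_with("carb_"), ~ .x * hp / mpg)) %>%
  # rename carb_<value> to VARIABLE<value>
  rename_with(~ sub("^carb_", "VARIABLE", .x), starts_with("carb_"))
```

The renaming step is optional and only matters if downstream code expects the VARIABLE prefix.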

You can also create the dummy columns first and then group by gear. The calculation does not use the value of gear, so the order does not matter:

cr_new <- mtcars %>%
  dummy_cols(select_columns = 'carb') %>%
  # across() needs dplyr >= 1.0.0; it replaces the deprecated funs()
  mutate(across(starts_with('carb_'), ~ .x * hp / mpg)) %>%
  group_by(gear) %>%
  nest()
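If adding a package is not an option, the loop from the question can probably be made to work with tidy evaluation instead. The sketch below is not part of the original answer: add_ratio_cols() is a hypothetical helper name, and it relies on the `!!name :=` syntax that dplyr supports for computed column names. It is applied to every nested data frame, so each gear group only gets columns for the carb values it actually contains.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: adds one VARIABLE<value> column per distinct carb value
add_ratio_cols <- function(df) {
  vals <- sort(unique(df$carb))
  for (v in vals) {
    new_name <- paste0("VARIABLE", v)
    # !!new_name := ... sets the column name from a string
    df <- df %>%
      mutate(!!new_name := if_else(carb == v, hp / mpg, 0))
  }
  df
}

cr <- mtcars %>%
  group_by(gear) %>%
  nest() %>%
  # run the helper on each nested data frame
  mutate(data = map(data, add_ratio_cols))
```

Because the helper works per group, the nested data frames can end up with different sets of columns. Creating the columns before grouping, as in the answer above, gives every group the same columns.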
