Mutate multiple variables to create multiple new variables

Views: 74 | Published: 2020/10/26 2:36:05 | Tags: r, dplyr, tidyverse, tidyselect


Question

Let's say I have a tibble where I need to take multiple variables and mutate them into multiple new variables.

As an example, here is a simple tibble:

library(dplyr)  # also re-exports tribble() and %>%

tb <- tribble(
  ~x, ~y1, ~y2, ~y3, ~z,
  1, 2, 4, 6, 2,
  2, 1, 2, 3, 3,
  3, 6, 4, 2, 1
)

I want to subtract variable z from every variable whose name starts with "y", and add the results to tb as new variables. Also, suppose I don't know how many "y" variables I have. I want the solution to fit nicely within a tidyverse / dplyr workflow.

In essence, I don't understand how to mutate multiple variables into multiple new variables. I'm not sure whether mutate can be used in this instance. I've tried mutate_if, but I don't think I'm using it right (and I get an error):

tb %>% mutate_if(starts_with("y"), funs(.-z))

#Error: No tidyselect variables were registered

Thanks in advance!

Recommended answer

Because you are selecting columns by name, you need to use mutate_at rather than mutate_if, which selects columns based on the values within them.

tb %>% mutate_at(vars(starts_with("y")), funs(. - z))
#> # A tibble: 3 x 5
#>       x    y1    y2    y3     z
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     0     2     4     2
#> 2     2    -2    -1     0     3
#> 3     3     5     3     1     1
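
To make the distinction concrete, here is a minimal sketch contrasting the two verbs. It assumes a hypothetical tibble `tb2` (the question's data plus an illustrative character column `id`, which is not in the original), and the result names `by_type` and `by_name` are likewise made up for the example.

```r
library(dplyr)

# Hypothetical variant of the question's tibble with an extra character column
tb2 <- tribble(
  ~x, ~y1, ~y2, ~y3, ~z, ~id,
  1, 2, 4, 6, 2, "a",
  2, 1, 2, 3, 3, "b",
  3, 6, 4, 2, 1, "c"
)

# mutate_if(): the predicate is.numeric() is evaluated on each column's values,
# so every numeric column is transformed and `id` is left alone
by_type <- tb2 %>% mutate_if(is.numeric, ~ . * 10)

# mutate_at(): vars() selects columns by name, so only y1, y2 and y3 change
by_name <- tb2 %>% mutate_at(vars(starts_with("y")), ~ . - z)
```

This is why passing starts_with("y") to mutate_if fails: it is a name-based selection helper, not a predicate on column contents.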

To create new columns instead of overwriting the existing ones, give the function a name inside funs; the name is appended to each column name as a suffix.

# add suffix
tb %>% mutate_at(vars(starts_with("y")), funs(mod = . - z))
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z y1_mod y2_mod y3_mod
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

# remove suffix, add prefix
tb %>%
  mutate_at(vars(starts_with("y")),  funs(mod = . - z)) %>%
  rename_at(vars(ends_with("_mod")), funs(paste("mod", gsub("_mod", "", .), sep = "_")))
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z mod_y1 mod_y2 mod_y3
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

Edit: As of dplyr 0.8.0, funs() is deprecated (source1 & source2); use list() with formulas instead

tb %>% mutate_at(vars(starts_with("y")), list(~ . - z))
#> # A tibble: 3 x 5
#>       x    y1    y2    y3     z
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     0     2     4     2
#> 2     2    -2    -1     0     3
#> 3     3     5     3     1     1

tb %>% mutate_at(vars(starts_with("y")), list(mod = ~ . - z))
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z y1_mod y2_mod y3_mod
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

tb %>%
  mutate_at(vars(starts_with("y")),  list(mod = ~ . - z)) %>%
  rename_at(vars(ends_with("_mod")), list(~ paste("mod", gsub("_mod", "", .), sep = "_")))
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z mod_y1 mod_y2 mod_y3
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

Edit 2: dplyr 1.0.0+ has the across() function, which simplifies this task even further

Basic usage

across() has two primary arguments:

The first argument, .cols, selects the columns you want to operate on. It uses tidy selection (like select()) so you can pick variables by position, name, and type.

The second argument, .fns, is a function or list of functions to apply to each column. This can also be a purrr style formula (or list of formulas) like ~ .x / 2. (This argument is optional, and you can omit it if you just want to get the underlying data; you'll see that technique used in vignette("rowwise").)
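
As a rough illustration of those two arguments, the sketch below selects columns by position, by name, and by type, and passes .fns either as a plain function or as a purrr-style formula. It assumes dplyr 1.0.0 or later and reuses the question's tibble; the result names (`by_position`, `by_name`, `by_type`) are made up for the example.

```r
library(dplyr)

tb <- tribble(
  ~x, ~y1, ~y2, ~y3, ~z,
  1, 2, 4, 6, 2,
  2, 1, 2, 3, 3,
  3, 6, 4, 2, 1
)

# .cols by position (columns 2 to 4), .fns as a purrr-style formula
by_position <- tb %>% mutate(across(2:4, ~ .x / 2))

# .cols by name, .fns as a plain function
by_name <- tb %>% mutate(across(c(x, z), sqrt))

# .cols by type, using the where() selection helper
by_type <- tb %>% mutate(across(where(is.numeric), ~ .x * 10))
```

Without a .names argument, each call overwrites the selected columns in place.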

# Control how the names are created with the `.names` argument which 
# takes a [glue](http://glue.tidyverse.org/) spec:
tb %>% 
  mutate(
    across(starts_with("y"), ~ .x - z, .names = "mod_{col}")
  )
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z mod_y1 mod_y2 mod_y3
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

tb %>% 
  mutate(
    across(num_range(prefix = "y", range = 1:3), ~ .x - z, .names = "mod_{col}")
  )
#> # A tibble: 3 x 8
#>       x    y1    y2    y3     z mod_y1 mod_y2 mod_y3
#>   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2      0      2      4
#> 2     2     1     2     3     3     -2     -1      0
#> 3     3     6     4     2     1      5      3      1

### Multiple functions
tb %>% 
  mutate(
    across(c(matches("x"), contains("z")), ~ max(.x, na.rm = TRUE), .names = "max_{col}"),
    across(c(y1:y3), ~ .x - z, .names = "mod_{col}")
  )
#> # A tibble: 3 x 10
#>       x    y1    y2    y3     z max_x max_z mod_y1 mod_y2 mod_y3
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1     2     4     6     2     3     3      0      2      4
#> 2     2     1     2     3     3     3     3     -2     -1      0
#> 3     3     6     4     2     1     3     3      5      3      1
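
The example above uses several across() calls. As a complementary sketch, a single across() call can also take a named list of functions in .fns. The function names `mod` and `ratio` and the result name `res` are illustrative choices, not part of the original answer, and dplyr 1.0.0 or later is assumed.

```r
library(dplyr)

tb <- tribble(
  ~x, ~y1, ~y2, ~y3, ~z,
  1, 2, 4, 6, 2,
  2, 1, 2, 3, 3,
  3, 6, 4, 2, 1
)

# Apply two named functions to every "y" column;
# {fn} is the name from the list and {col} is the input column name
res <- tb %>%
  mutate(
    across(
      starts_with("y"),
      list(mod = ~ .x - z, ratio = ~ .x / z),
      .names = "{fn}_{col}"
    )
  )

names(res)
```

Each selected column yields one new column per function, so this adds six columns (mod_y1, ratio_y1, and so on) to the original five.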

Created on 2018-10-29 by the reprex package (v0.2.1)
