Automatically generate new variable names using dplyr mutate


Question

I would like to create variable names dynamically while using dplyr; although, I’d be fine with a non-dplyr solution as well.

For example:

data(iris)
library(dplyr) 

iris <- iris %>%
  group_by(Species) %>%
  mutate(
    lag_Sepal.Length = lag(Sepal.Length),
    lag_Sepal.Width  = lag(Sepal.Width),
    lag_Petal.Length = lag(Petal.Length)
  ) %>%
  ungroup

head(iris)

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species lag_Sepal.Length lag_Sepal.Width
             (dbl)       (dbl)        (dbl)       (dbl)  (fctr)            (dbl)           (dbl)
    1          5.1         3.5          1.4         0.2  setosa               NA              NA
    2          4.9         3.0          1.4         0.2  setosa              5.1             3.5
    3          4.7         3.2          1.3         0.2  setosa              4.9             3.0
    4          4.6         3.1          1.5         0.2  setosa              4.7             3.2
    5          5.0         3.6          1.4         0.2  setosa              4.6             3.1
    6          5.4         3.9          1.7         0.4  setosa              5.0             3.6
    Variables not shown: lag_Petal.Length (dbl)

But, instead of doing this three times, I want to create 100 of these "lag" variables that take in the name: lag_original variable name. I’m trying to figure out how to do this without typing the new variable name 100 times, but I’m coming up short.

I’ve looked into this example and this example elsewhere on SO. They are similar, but I’m not quite able to piece together the specific solution I need. Any help is appreciated!

Edit
Thanks to @BenFasoli for the inspiration. I took his answer and tweaked it just a bit to get the solution I needed. I also used This RStudio Blog and This SO post. The "lag" in the variable name is trailing instead of leading, but I can live with that.

My final code is posted here in case it’s helpful to anyone else:

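lagged <- iris %>%
  group_by(Species) %>%
  mutate_at(
    vars(Sepal.Length:Petal.Length),
    # A named list yields columns called <name>_lag; the original post
    # used funs("lag" = lag), which dplyr has deprecated since 0.8.0.
    list(lag = lag)) %>%
  ungroup()

head(lagged)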

# A tibble: 6 x 8
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_lag Sepal.Width_lag
         <dbl>       <dbl>        <dbl>       <dbl>  <fctr>            <dbl>           <dbl>
1          5.1         3.5          1.4         0.2  setosa               NA              NA
2          4.9         3.0          1.4         0.2  setosa              5.1             3.5
3          4.7         3.2          1.3         0.2  setosa              4.9             3.0
4          4.6         3.1          1.5         0.2  setosa              4.7             3.2
5          5.0         3.6          1.4         0.2  setosa              4.6             3.1
6          5.4         3.9          1.7         0.4  setosa              5.0             3.6
# ... with 1 more variables: Petal.Length_lag <dbl>
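
If the trailing "_lag" is a nuisance, one possible follow-up is to rename the new columns after the fact. The sketch below assumes dplyr 1.0.0 or later (for rename_with()) and reads the data through datasets::iris so it starts from an unmodified copy. The regular expression and the "lag_" prefix are only illustrative choices, not something from the original post.

```r
library(dplyr)

lagged <- datasets::iris %>%
  group_by(Species) %>%
  # Creates Sepal.Length_lag, Sepal.Width_lag and Petal.Length_lag.
  mutate_at(vars(Sepal.Length:Petal.Length), list(lag = dplyr::lag)) %>%
  ungroup() %>%
  # Move the suffix to the front: "<name>_lag" becomes "lag_<name>".
  # Only columns whose names end in "_lag" are touched.
  rename_with(~ paste0("lag_", sub("_lag$", "", .x)), ends_with("_lag"))

names(lagged)
```

The original five columns keep their names, so nothing downstream that refers to Sepal.Length and friends should need to change.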


Recommended answer

You can use mutate_all (or mutate_each for specific columns) then prepend lag_ to the column names.

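data(iris)
library(dplyr)

lag_iris <- iris %>%
  group_by(Species) %>%
  # Lag every non-grouping column within each species
  # (funs(lag(.)) in the original post; funs() is deprecated since dplyr 0.8.0).
  mutate_all(lag) %>%
  ungroup()
# Prefix every column name, including Species, with "lag_".
colnames(lag_iris) <- paste0('lag_', colnames(lag_iris))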

head(lag_iris)

  lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
             <dbl>           <dbl>            <dbl>           <dbl>      <fctr>
1               NA              NA               NA              NA      setosa
2              5.1             3.5              1.4             0.2      setosa
3              4.9             3.0              1.4             0.2      setosa
4              4.7             3.2              1.3             0.2      setosa
5              4.6             3.1              1.5             0.2      setosa
6              5.0             3.6              1.4             0.2      setosa
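
Note that the approach above replaces the original columns and also renames Species. With newer dplyr there may be a way to get the leading prefix in one step while keeping the originals. This is a minimal sketch, assuming dplyr 1.0.0 or later (for across() and its .names argument). Selecting columns with where(is.numeric) is my own choice for illustration; any tidyselect expression such as Sepal.Length:Petal.Length should work in its place.

```r
library(dplyr)

# Lag every numeric column within each species and store the result in a
# new column named lag_<original column name>; the originals are kept.
lag_iris <- datasets::iris %>%
  group_by(Species) %>%
  mutate(across(where(is.numeric), dplyr::lag, .names = "lag_{.col}")) %>%
  ungroup()

head(lag_iris)
```

The "{.col}" placeholder in .names stands for the name of the column being transformed, so the same call scales to 100 columns without typing any of the new names.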
