Automatically generate new variable names using dplyr mutate
Question
I would like to create variable names dynamically while using dplyr, although a non-dplyr solution would be fine as well.
For example:
data(iris)
library(dplyr)
iris <- iris %>%
  group_by(Species) %>%
  mutate(
    lag_Sepal.Length = lag(Sepal.Length),
    lag_Sepal.Width = lag(Sepal.Width),
    lag_Petal.Length = lag(Petal.Length)
  ) %>%
  ungroup()
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species lag_Sepal.Length lag_Sepal.Width
(dbl) (dbl) (dbl) (dbl) (fctr) (dbl) (dbl)
1 5.1 3.5 1.4 0.2 setosa NA NA
2 4.9 3.0 1.4 0.2 setosa 5.1 3.5
3 4.7 3.2 1.3 0.2 setosa 4.9 3.0
4 4.6 3.1 1.5 0.2 setosa 4.7 3.2
5 5.0 3.6 1.4 0.2 setosa 4.6 3.1
6 5.4 3.9 1.7 0.4 setosa 5.0 3.6
Variables not shown: lag_Petal.Length (dbl)
But instead of doing this three times, I want to create 100 of these "lag" variables, each named lag_ followed by the original variable name. I'm trying to figure out how to do this without typing the new variable name 100 times, but I'm coming up short.
I've looked into this example and this example elsewhere on SO. They are similar, but I'm not quite able to piece together the specific solution I need. Any help is appreciated!
Edit
Thanks to @BenFasoli for the inspiration. I took his answer and tweaked it just a bit to get the solution I needed.
I also used this RStudio Blog post and this SO post. The "lag" in the variable name is trailing instead of leading, but I can live with that.
My final code is posted here in case it's helpful to anyone else:
lagged <- iris %>%
  group_by(Species) %>%
  mutate_at(
    vars(Sepal.Length:Petal.Length),
    funs("lag" = lag)) %>%
  ungroup()
head(lagged)
# A tibble: 6 x 8
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_lag Sepal.Width_lag
<dbl> <dbl> <dbl> <dbl> <fctr> <dbl> <dbl>
1 5.1 3.5 1.4 0.2 setosa NA NA
2 4.9 3.0 1.4 0.2 setosa 5.1 3.5
3 4.7 3.2 1.3 0.2 setosa 4.9 3.0
4 4.6 3.1 1.5 0.2 setosa 4.7 3.2
5 5.0 3.6 1.4 0.2 setosa 4.6 3.1
6 5.4 3.9 1.7 0.4 setosa 5.0 3.6
# ... with 1 more variables: Petal.Length_lag <dbl>
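The trailing suffix comes from how mutate_at names its results. As a rough sketch of one way to get a leading prefix instead, assuming dplyr 1.0.0 or later (newer than the version that produced the output above), across() accepts a .names glue template. The result name lagged_prefixed is a hypothetical name chosen here for illustration.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where across() and its .names argument exist.
lagged_prefixed <- datasets::iris %>%
  group_by(Species) %>%
  mutate(across(
    Sepal.Length:Petal.Length,   # columns to lag
    dplyr::lag,                  # lag within each Species group
    .names = "lag_{.col}"        # new columns are named lag_<original name>
  )) %>%
  ungroup()

names(lagged_prefixed)
```

Because .names is set, the original columns are kept and the lagged copies are appended after them, so the same call should scale to 100 columns by widening the selection.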
Recommended answer
You can use mutate_all (or mutate_each for specific columns; mutate_each has since been deprecated in favor of mutate_at), then prepend lag_ to the column names.
data(iris)
library(dplyr)
lag_iris <- iris %>%
  group_by(Species) %>%
  mutate_all(funs(lag(.))) %>%
  ungroup()
colnames(lag_iris) <- paste0('lag_', colnames(lag_iris))
head(lag_iris)
lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 NA NA NA NA setosa
2 5.1 3.5 1.4 0.2 setosa
3 4.9 3.0 1.4 0.2 setosa
4 4.7 3.2 1.3 0.2 setosa
5 4.6 3.1 1.5 0.2 setosa
6 5.0 3.6 1.4 0.2 setosa
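Note that this replaces the original values and also renames the grouping column to lag_Species. Since the question allows a non-dplyr solution, here is a minimal base R sketch that keeps the original columns and adds prefixed lag columns next to them. The names lag1, cols_to_lag, and iris_lagged are hypothetical, introduced only for this example, and the loop is an alternative technique, not the method used in the answer above.

```r
data(iris)

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag1 <- function(x) c(NA, head(x, -1))

# Columns to lag; with 100 columns this could be built programmatically,
# e.g. setdiff(names(iris), "Species").
cols_to_lag <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

iris_lagged <- iris
for (col in cols_to_lag) {
  # ave() applies lag1 within each Species group and keeps the row order.
  iris_lagged[[paste0("lag_", col)]] <-
    ave(iris_lagged[[col]], iris_lagged$Species, FUN = lag1)
}

head(iris_lagged)
```

The new column name is built with paste0() inside the loop, so nothing has to be typed per variable.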