dplyr: mutate_at + coalesce: dynamic names of columns
Question
I've been trying for a while to combine mutate_at with coalesce in a case where the column names are generated dynamically.
In my example there are only five columns, but the real data has many more (and not all columns should be included in the coalesce step).
Example data frame:
data_example <- data.frame(
aa = c(1, NA, NA),
bb = c(NA, NA, 2),
cc = c(6, 7, 8),
aa_extra = c(2, 2, NA),
bb_extra = c(1, 2, 3)
)
Expected output:
  aa bb cc aa_extra bb_extra
1  1  1  6        2        1
2  2  2  7        2        2
3 NA  2  8       NA        3
Expected output as a structure:
structure(list(aa = c(1, 2, NA), bb = c(1, 2, 2), cc = c(6, 7,
8), aa_extra = c(2, 2, NA), bb_extra = c(1, 2, 3)), class = "data.frame", row.names = c(NA,
-3L))
I've tried something like this, but without success (the error is "Only strings can be converted to symbols"). I would like to avoid creating extra variables and keep everything inside the mutate_at expression, since this is part of a longer dplyr "flow".
data_example %>%
dplyr::mutate_at(
gsub("_extra", "", grep("_extra$",
colnames(.),
perl = T,
value = T)),
dplyr::funs(
dplyr::coalesce(., !!! dplyr::sym(paste0(., "_extra")))
)
)
I've also tried this (no error, but the values for column bb are wrong):
data_example %>%
dplyr::mutate_at(
gsub("_extra", "", grep("_extra$",
colnames(.),
perl = T,
value = T)),
dplyr::funs(
dplyr::coalesce(., !!as.name(paste0(names(.), "_extra")))
)
)
How can I get the name of the column being processed and pass it to coalesce?
Recommended answer
We can split the dataset into a list of data.frames after removing the "_extra" substring from the column names, then loop through the list with map, coalesce the columns within each group, and finally bind the result with the "_extra" columns from the original dataset.
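library(tidyverse)
data_example %>%
  # group columns by base name, so "aa" and "aa_extra" land in the same group
  split.default(str_remove(names(.), "_extra")) %>%
  # collapse each group into one column, taking the first non-NA value per row
  map_df(~ coalesce(!!! .x)) %>%
  # or use
  # map_df(reduce, coalesce) %>%
  # add the original "_extra" columns back
  bind_cols(., select(data_example, ends_with("extra")))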
# A tibble: 3 x 5
#      aa    bb    cc aa_extra bb_extra
#   <dbl> <dbl> <dbl>    <dbl>    <dbl>
# 1     1     1     6        2        1
# 2     2     2     7        2        2
# 3    NA     2     8       NA        3
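If the original column order matters, or if the step should stay a single call inside a longer pipe, one alternative is to wrap the pairing logic in a small helper. The sketch below is not part of the original answer: the function name coalesce_with_suffix and its suffix argument are hypothetical, and it assumes that every suffixed column has the same type as its base column.

```r
library(dplyr)

data_example <- data.frame(
  aa = c(1, NA, NA),
  bb = c(NA, NA, 2),
  cc = c(6, 7, 8),
  aa_extra = c(2, 2, NA),
  bb_extra = c(1, 2, 3)
)

# Hypothetical helper: for every column ending in `suffix`, fill the NAs of
# the matching base column (same name without the suffix) from it.
coalesce_with_suffix <- function(df, suffix = "_extra") {
  extra_cols <- names(df)[endsWith(names(df), suffix)]
  for (extra_col in extra_cols) {
    # strip the suffix to get the base column name
    base_col <- substr(extra_col, 1, nchar(extra_col) - nchar(suffix))
    # skip suffixed columns that have no base column to fill
    if (base_col %in% names(df)) {
      df[[base_col]] <- coalesce(df[[base_col]], df[[extra_col]])
    }
  }
  df
}

result <- data_example %>%
  coalesce_with_suffix()

print(result)
```

Because the helper takes a data frame and returns one, it can sit between other dplyr verbs, and columns without a suffixed partner (cc here) pass through unchanged.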