Rename list of dataframe columns to mimic joined suffixes


Problem description

I have a list of dataframes:

dd <- list()

dd$data <- list(
  ONE = data.frame(inAll = c(1.1,1.2,1.3), inAll_2 = c(1.4,1.5,1.6)),
  TWO = data.frame(inAll = c(2.1,2.2,2.3), inAll_2 = c(2.4,2.5,2.6)),
  THREE = data.frame(inAll = c(3.1,3.2,3.3), inAll_2 = c(3.4,3.5,3.6)),
  FOUR = data.frame(inAll = c(4.1,4.2,4.3), inAll_2 = c(4.4,4.5,4.6)),
  FIVE = data.frame(inAll = c(5.1,5.2,5.3), inAll_2 = c(5.4,5.5,5.6)),
  SIX = data.frame(inAll = c(6.1,6.2,6.3), inAll_2 = c(6.4,6.5,6.6))
)

And then I reduce those dataframes using suffixes:

reduce(dd$data, full_join, by = "inAll", suffix = c("_x", "_y"))
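
To see which column names the renamed list is supposed to mimic, it can help to inspect the names of the reduced result directly. The following is a minimal sketch that assumes dplyr and purrr are installed; `frames` is an illustrative stand-in for dd$data with the same column layout, not the original object.

```r
library(dplyr)
library(purrr)

# Illustrative stand-in for dd$data: six data frames sharing inAll and inAll_2
frames <- map(
  set_names(1:6, c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX")),
  ~ data.frame(inAll = .x + c(0.1, 0.2, 0.3), inAll_2 = .x + c(0.4, 0.5, 0.6))
)

# Successively full-join the frames; the extra arguments are passed to full_join()
joined <- reduce(frames, full_join, by = "inAll", suffix = c("_x", "_y"))

# These are the names that the per-frame renaming should reproduce
names(joined)
```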

The output I want is

map(dd$data, list())

BUT I want the column names to match the suffixed names in the reduced dataset.

How do I expand on this map call so that the columns in the list are renamed to reflect the names in the reduced result?

I tried looking at the join source code for this, and it looks like columns that match by name but are not joined on are handled as follows:

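  • first they are given the suffix _x, then _y
  • then _x_x and _y_y, and so on
  • if an odd number of list items have the duplicated column, the last one gets no suffix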
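
The pattern in the list above can be sketched as a small base R helper. `join_suffixes` is a hypothetical name, and the function only encodes the pattern described here. It is not taken from the dplyr source.

```r
# Hypothetical helper: suffixes expected for k data frames that all share a
# non-key column, following the pattern described in the list above.
join_suffixes <- function(k, suffix = c("_x", "_y")) {
  pairs <- k %/% 2                                  # number of complete (x, y) pairs
  level <- rep(seq_len(pairs), each = 2)            # 1, 1, 2, 2, 3, 3, ...
  out <- strrep(rep(suffix, times = pairs), level)  # "_x", "_y", "_x_x", "_y_y", ...
  if (k %% 2 == 1) out <- c(out, "")                # odd count: the last one stays bare
  out
}

join_suffixes(6)
# "_x" "_y" "_x_x" "_y_y" "_x_x_x" "_y_y_y"
join_suffixes(5)
# "_x" "_y" "_x_x" "_y_y" ""
```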

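Note that in my real data these dataframes usually have other columns, and those columns are not always in the same order, so I want to avoid anything brittle such as matching by index!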

new_names <- function(df) {
  # logic about new suffixes somehow
  map(df,list())
}

Desired Output

A list that looks like this:

dd$data2 <- list(
  ONE = data.frame(inAll = c(1.1,1.2,1.3), inAll_2_x = c(1.4,1.5,1.6)),
  TWO = data.frame(inAll = c(2.1,2.2,2.3), inAll_2_y = c(2.4,2.5,2.6)),
  THREE = data.frame(inAll = c(3.1,3.2,3.3), inAll_2_x_x = c(3.4,3.5,3.6)),
  FOUR = data.frame(inAll = c(4.1,4.2,4.3), inAll_2_y_y = c(4.4,4.5,4.6)),
  FIVE = data.frame(inAll = c(5.1,5.2,5.3), inAll_2_x_x_x = c(5.4,5.5,5.6)),
  SIX = data.frame(inAll = c(6.1,6.2,6.3), inAll_2_y_y_y = c(6.4,6.5,6.6))
)

Recommended answer

We can build the repeated suffix strings from '_x' and '_y' with strrep (from base R), replicating them with rep, then loop over the list and the suffixes in parallel with map2 and use rename_at to rename the inAll_2 column by pasting (str_c) the matching suffix onto its name:

library(dplyr)
library(purrr)
library(stringr)
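# Number of suffix levels: each level covers one (_x, _y) pair of data frames
n <- ceiling(length(dd$data) / 2)
# Build "_x", "_y", "_x_x", "_y_y", ...: repeat each suffix once per level
suffixes <- strrep(rep(c('_x', '_y'), n), rep(seq_len(n), each = 2))
# Rename inAll_2 by name (not by position) in each data frame
map2(dd$data, suffixes, ~ {
  nm <- .y
  .x %>%
    rename_at(vars(inAll_2), ~ str_c(., nm))
})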
#$ONE
#  inAll inAll_2_x
#1   1.1       1.4
#2   1.2       1.5
#3   1.3       1.6

#$TWO
#  inAll inAll_2_y
#1   2.1       2.4
#2   2.2       2.5
#3   2.3       2.6

#$THREE
#  inAll inAll_2_x_x
#1   3.1         3.4
#2   3.2         3.5
#3   3.3         3.6

#$FOUR
#  inAll inAll_2_y_y
#1   4.1         4.4
#2   4.2         4.5
#3   4.3         4.6

#$FIVE
#  inAll inAll_2_x_x_x
#1   5.1           5.4
#2   5.2           5.5
#3   5.3           5.6

#$SIX
#  inAll inAll_2_y_y_y
#1   6.1           6.4
#2   6.2           6.5
#3   6.3           6.6
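
The call above pairs the list with a suffix vector of even length and names the column inAll_2 explicitly, so it appears to assume an even number of dataframes and a single shared column. As a hedged sketch that is not part of the original answer, the same idea can be extended in base R to rename every shared non-key column by name. `rename_like_join` and `demo` are hypothetical names. The sketch assumes suffixes are assigned per column in list order, and that no original column name already ends in one of the suffixes.

```r
# Hypothetical generalisation: suffix each non-key column according to its
# position among the data frames that contain it, matching columns by name.
rename_like_join <- function(frames, by = "inAll", suffix = c("_x", "_y")) {
  # Suffixes for a column that occurs in k of the data frames
  suffixes_for <- function(k) {
    pairs <- k %/% 2
    out <- strrep(rep(suffix, times = pairs), rep(seq_len(pairs), each = 2))
    if (k %% 2 == 1) c(out, "") else out
  }
  # Every non-key column name that occurs anywhere in the list
  cols <- setdiff(unique(unlist(lapply(frames, names))), by)
  for (col in cols) {
    # Positions of the data frames that contain this column, in list order
    has_col <- vapply(frames, function(df) col %in% names(df), logical(1))
    holders <- unname(which(has_col))
    sfx <- suffixes_for(length(holders))
    for (j in seq_along(holders)) {
      df <- frames[[holders[j]]]
      names(df)[names(df) == col] <- paste0(col, sfx[j])
      frames[[holders[j]]] <- df
    }
  }
  frames
}

# Illustrative input: odd length, differing column order, one extra column
demo <- list(
  ONE   = data.frame(inAll = 1:3, inAll_2 = 4:6, extra = 7:9),
  TWO   = data.frame(inAll_2 = 4:6, inAll = 1:3),
  THREE = data.frame(inAll = 1:3, inAll_2 = 4:6)
)
lapply(rename_like_join(demo), names)
```

Because the columns are located by name, the extra column in ONE and the swapped order in TWO do not affect which column receives a suffix, and a column that occurs in only one dataframe keeps its name.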
