Convert a list of data frames into a single data frame with list name

Problem description

I am hoping to determine an efficient way to convert a list of data frames into a single data frame. Below is my reproducible MWE:

set.seed(1)
ABAge = runif(100)
ABPoints = rnorm(100)
ACAge = runif(100)
ACPoints = rnorm(100)
BCAge = runif(100)
BCPoints = rnorm(100)

# stringsAsFactors = FALSE keeps ID as character (before R 4.0.0 the default was TRUE)
ids <- paste0("ID", 1:100)
A_B <- data.frame(ID = ids, Age = ABAge, Points = ABPoints, stringsAsFactors = FALSE)
A_C <- data.frame(ID = ids, Age = ACAge, Points = ACPoints, stringsAsFactors = FALSE)
B_C <- data.frame(ID = ids, Age = BCAge, Points = BCPoints, stringsAsFactors = FALSE)

listFormat <- list("A_B" = A_B, "A_C" = A_C, "B_C" = B_C)

# The desired result, built by hand for comparison
dfFormat <- data.frame(ID = paste0("ID", 1:100), A_B.Age = ABAge, A_B.Points = ABPoints, A_C.Age = ACAge, A_C.Points = ACPoints, B_C.Age = BCAge, B_C.Points = BCPoints, stringsAsFactors = FALSE)

This results in a data frame (dfFormat) that looks like this:

'data.frame':   100 obs. of  7 variables:
 $ ID        : chr  "ID1" "ID2" "ID3" "ID4" ...
 $ A_B.Age   : num  0.266 0.372 0.573 0.908 0.202 ...
 $ A_B.Points: num  0.398 -0.612 0.341 -1.129 1.433 ...
 $ A_C.Age   : num  0.6737 0.0949 0.4926 0.4616 0.3752 ...
 $ A_C.Points: num  0.409 1.689 1.587 -0.331 -2.285 ...
 $ B_C.Age   : num  0.814 0.929 0.147 0.75 0.976 ...
 $ B_C.Points: num  1.474 0.677 0.38 -0.193 1.578 ...

and a list of data frames (listFormat) that looks like this:

List of 3
 $ A_B:'data.frame':    100 obs. of  3 variables:
  ..$ ID    : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
  ..$ Age   : num [1:100] 0.266 0.372 0.573 0.908 0.202 ...
  ..$ Points: num [1:100] 0.398 -0.612 0.341 -1.129 1.433 ...
 $ A_C:'data.frame':    100 obs. of  3 variables:
  ..$ ID    : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
  ..$ Age   : num [1:100] 0.6737 0.0949 0.4926 0.4616 0.3752 ...
  ..$ Points: num [1:100] 0.409 1.689 1.587 -0.331 -2.285 ...
 $ B_C:'data.frame':    100 obs. of  3 variables:
  ..$ ID    : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
  ..$ Age   : num [1:100] 0.814 0.929 0.147 0.75 0.976 ...
  ..$ Points: num [1:100] 1.474 0.677 0.38 -0.193 1.578 ...

I am hoping to come up with an automated way to convert listFormat to dfFormat. As can be seen in the above objects, there are two main conditions:

1) If there is a common column (same name and contents) in each component of listFormat (ID in these examples), it is not repeated in the output dfFormat (in this example, dfFormat has a single ID column).

2) The remaining columns in each component of listFormat become columns in dfFormat, named with the component name (e.g. "A_B"), followed by a dot, followed by the original column name (e.g. Age), giving e.g. "A_B.Age" in dfFormat.
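The two conditions can be checked mechanically before combining anything. The following is only a sketch on a small stand-in list (`lst`, `shared_cols` and `target_names` are hypothetical names, not part of the original post): it finds the columns that have the same name and identical contents in every component, then builds the column names that condition 2 would produce.

```r
# Toy stand-in for listFormat (hypothetical data: 3 rows, 2 components)
ids <- paste0("ID", 1:3)
lst <- list(
  A_B = data.frame(ID = ids, Age = c(0.1, 0.2, 0.3), Points = c(1, 2, 3),
                   stringsAsFactors = FALSE),
  A_C = data.frame(ID = ids, Age = c(0.4, 0.5, 0.6), Points = c(4, 5, 6),
                   stringsAsFactors = FALSE)
)

# Column names that occur in every component
common_names <- Reduce(intersect, lapply(lst, names))

# Condition 1: keep only those whose contents match the first component everywhere
is_shared <- vapply(common_names, function(nm) {
  all(vapply(lst, function(d) identical(d[[nm]], lst[[1]][[nm]]), logical(1)))
}, logical(1))
shared_cols <- common_names[is_shared]

# Condition 2: "<component name>.<column name>" for every remaining column
target_names <- c(
  shared_cols,
  unlist(lapply(names(lst), function(n) {
    paste(n, setdiff(names(lst[[n]]), shared_cols), sep = ".")
  }))
)

print(shared_cols)
print(target_names)
```

Here Age and Points appear in both components but with different values, so only ID counts as shared.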

I have tried various approaches using unlist() and sapply() but have been unsuccessful thus far. What is an efficient way to accomplish this?

Recommended answer

Copy listFormat to L in case we need to preserve the input. Remove the ID column from every component of L except the first, cbind what is left together, and then fix up the name of the first column: because L is a named list, cbind prefixes each column with its component name, so the first column comes out as "A_B.ID" and is renamed to "ID". No packages are used.

L <- listFormat
L[-1] <- lapply(L[-1], transform, ID = NULL)
DF <- do.call(cbind, L)
names(DF)[1] <- "ID"

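giving (the numbers differ from the question's output, presumably because the random data was drawn from a different seed state; the structure is what matters):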

> str(DF)
'data.frame':   100 obs. of  7 variables:
 $ ID        : chr  "ID1" "ID2" "ID3" "ID4" ...
 $ A_B.Age   : num  0.9932 0.1451 0.6166 0.0372 0.9039 ...
 $ A_B.Points: num  0.4752 0.0288 1.0548 0.6113 0.0651 ...
 $ A_C.Age   : num  0.912 0.761 0.618 0.895 0.507 ...
 $ A_C.Points: num  -0.515 -0.945 0.398 0.502 -1.021 ...
 $ B_C.Age   : num  0.7935 0.2747 0.0487 0.6307 0.3499 ...
 $ B_C.Points: num  -0.963 -1.772 1.716 -0.819 0.577 ...
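The cbind approach relies on every component listing its IDs in the same row order. If that cannot be guaranteed, one alternative is to join on the key instead. This is a sketch of a different technique from the answer above, again in base R; `lst`, `key`, `prefixed` and `merged` are hypothetical names, and the toy data deliberately reverses the row order of the second component.

```r
# Toy stand-in for listFormat; A_C lists its IDs in reverse order
ids <- paste0("ID", 1:3)
lst <- list(
  A_B = data.frame(ID = ids, Age = c(0.1, 0.2, 0.3), Points = c(1, 2, 3),
                   stringsAsFactors = FALSE),
  A_C = data.frame(ID = rev(ids), Age = c(0.6, 0.5, 0.4), Points = c(6, 5, 4),
                   stringsAsFactors = FALSE)
)

key <- "ID"  # assumed name of the shared column

# Prefix every non-key column with the name of its list element
prefixed <- Map(function(d, nm) {
  other <- names(d) != key
  names(d)[other] <- paste(nm, names(d)[other], sep = ".")
  d
}, lst, names(lst))

# Join the components pairwise on the key; rows are matched by ID, not by position
merged <- Reduce(function(x, y) merge(x, y, by = key), prefixed)

str(merged)
```

Note that merge() returns rows sorted by the key and, by default, keeps only IDs present in every component, so the row order of the result may differ from the input. When the components are known to be aligned row by row, the cbind version is simpler and avoids the join.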
