Combining lists of different lengths into a data frame

Question

I have data like the SampleData below, which contains lists of different lengths that I'd like to combine into a data frame like the Desired Result below. I've tried using lapply with cbind.na from the qpcR package, as in the example below, but for some reason it won't let me turn the result into a data frame. If I use cbind.na on just two of the lists, it combines them and pads the end with NA as I want, but when I try it inside lapply, it just leaves them as a list of lists of different lengths. Any tips are greatly appreciated.

SampleData<-list(list(1,2,3),list(1,2),list(3,4,6,7))

Desired Result:
structure(list(V1 = c(1, 2, 3, NA), V2 = c(1, 2, NA, NA), V3 = c(3, 
4, 6, 7)), .Names = c("V1", "V2", "V3"), row.names = c(NA, -4L
), class = "data.frame")


Example Code:

lapply(SampleData,qpcR:::cbind.na)

Recommended answer

My first instinct looking at your data is that, by using a data.frame, you are implicitly stating that items across a row are paired. That is, in your example, the "3" of $V1 and the "6" of $V3 are meant to be associated with each other. (If you look at mtcars, each column of the first row is associated directly and solely with the "Mazda RX4".) If this is not true, then warping them into a data.frame like this misrepresents your data and is likely to encourage incorrect analysis/assumptions.

Assuming that they are in fact "paired", my next instinct is to try something like do.call(cbind, SampleData), but that leads to recycled data, which is not what you want. So the trick to prevent recycling is to force them all to be the same length.
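To make the recycling concrete, here is a minimal sketch. It flattens each inner list with unlist before binding, which is an assumption made only so the resulting matrix is easy to read; the name vecs is likewise illustrative.

```r
SampleData <- list(list(1, 2, 3), list(1, 2), list(3, 4, 6, 7))

# Flatten each inner list to a numeric vector (done here only to keep
# the resulting matrix readable; not part of the original answer).
vecs <- lapply(SampleData, unlist)

# cbind() recycles the shorter vectors up to the longest length (4).
# It warns when a vector's length does not divide the row count evenly,
# so the warning is suppressed for this demonstration.
recycled <- suppressWarnings(do.call(cbind, vecs))
recycled
```

The first column comes back as 1, 2, 3, 1 and the second as 1, 2, 1, 2, with no NA anywhere. Padding each element to the same length before binding avoids this.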

# Length of the longest inner list
maxlen <- max(lengths(SampleData))
# Pad every shorter list with NA up to that length
SampleData2 <- lapply(SampleData, function(lst) c(lst, rep(NA, maxlen - length(lst))))

We can rename first:

names(SampleData2) <- paste("V", seq_along(SampleData2), sep = "")

Since the data appears homogeneous (and it should be, if you intend to put each element in a column of a data.frame), it is useful to un-list it:

SampleData3 <- lapply(SampleData2, unlist)

Then it's as simple as:

as.data.frame(SampleData3)
#   V1 V2 V3
# 1  1  1  3
# 2  2  2  4
# 3  3 NA  6
# 4 NA NA  7
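
For reuse, the same idea can be wrapped in a single function. The sketch below is a variation, not the answer's exact steps: it pads by extending the length of each unlisted vector (base R fills the new positions with NA) instead of appending rep(NA, ...). The helper name pad_to_df is hypothetical, and the code assumes every inner list is non-empty and holds values of one atomic type.

```r
# Hypothetical helper (not from the original answer): pad each inner
# list with NA by extending its length, then bind the results as columns.
pad_to_df <- function(lst) {
  maxlen <- max(lengths(lst))
  padded <- lapply(lst, function(x) {
    v <- unlist(x)        # flatten the inner list to an atomic vector
    length(v) <- maxlen   # extending an atomic vector fills with NA
    v
  })
  # Unnamed matrix columns are named V1, V2, ... by as.data.frame()
  as.data.frame(do.call(cbind, padded))
}

SampleData <- list(list(1, 2, 3), list(1, 2), list(3, 4, 6, 7))
result <- pad_to_df(SampleData)
result
```

Because every vector already has the same length when cbind is called, no recycling takes place.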
