Convert Mixed-Length Named List to data.frame
I have a list of the following format:
[[1]]
[[1]]$a
[1] 1

[[1]]$b
[1] 3

[[1]]$c
[1] 5


[[2]]
[[2]]$c
[1] 2

[[2]]$a
[1] 3
There is a predefined list of possible "keys" (a, b, and c, in this case), and each element in the list ("row") will have values defined for one or more of these keys. I'm looking for a fast way to get from the list structure above to a data.frame, which in this case would look like the following:

  a  b c
1 1  3 5
2 3 NA 2
Any help would be appreciated!
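For anyone who wants to reproduce the example, here is a minimal sketch that builds the two-element list printed above and writes out the target data.frame by hand. The names `x`, `keys`, and `target` are illustrative and do not appear in the original post.

```r
# Illustrative names: `x`, `keys`, and `target` are not from the original post.
x <- list(
  list(a = 1, b = 3, c = 5),  # first "row": all three keys present
  list(c = 2, a = 3)          # second "row": key b missing, keys out of order
)
keys <- c("a", "b", "c")

# The desired result, constructed by hand for comparison
target <- data.frame(a = c(1, 3), b = c(3, NA), c = c(5, 2))
print(target)
```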
Appendix
I'm dealing with a table that will have up to 50,000 rows and 3-6 columns, with most of the values specified. I'll be taking the table in from JSON and trying to quickly get it into data.frame structure.
Here's some code to create a sample list of the scale with which I'll be working:
ids <- c("a", "b", "c")

createList <- function(approxSize = 100) {
  set.seed(1234)
  fifth <- round(approxSize / 5)
  list <- list()
  list[1:(fifth * 5)] <- rep(
    list(list(a = 1, b = 2, c = 3),
         list(a = 3, b = 4, c = 5),
         list(a = 7, c = 9),
         list(c = 6, a = 8, b = 3),
         list(b = 6)),
    fifth)
  list
}
Just create a list with approxSize of 50,000 to test the performance on a list of this size.

Solution

Here's my initial thought. It doesn't speed up your approach, but it does simplify the code considerably:
# makeDF <- function(List, Names) {
#     m <- t(sapply(List, function(X) unlist(X)[Names]))
#     as.data.frame(m)
# }

## vapply() is a bit faster than sapply()
makeDF <- function(List, Names) {
    # unlist(X)[Names] reorders each element by key and yields NA for missing keys
    m <- t(vapply(List,
                  FUN = function(X) unlist(X)[Names],
                  FUN.VALUE = numeric(length(Names))))
    as.data.frame(m)
}

## Test timing with a 50k-item list
ll <- createList(50000)
nms <- c("a", "b", "c")

system.time(makeDF(ll, nms))
#    user  system elapsed
#    0.47    0.00    0.47
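As a quick check of the idea on the two-row example from the question, here is a small self-contained sketch. It is a variant rather than the code above: `makeDF2`, `template`, and `small` are hypothetical names, and the only change is that `FUN.VALUE` is given names. With an unnamed `FUN.VALUE`, `vapply()` takes the result names from the first function call, so the column labels would depend on the first list element having every key; naming the template avoids relying on that.

```r
# Hypothetical variant of makeDF(); `makeDF2`, `template`, and `small`
# are illustrative names, not part of the original answer.
makeDF2 <- function(List, Names) {
  # Named template: vapply() takes the result's dimnames from FUN.VALUE
  # when it is named, so column labels come from Names directly.
  template <- setNames(numeric(length(Names)), Names)
  m <- t(vapply(List,
                FUN = function(X) unlist(X)[Names],
                FUN.VALUE = template))
  as.data.frame(m)
}

# The two-element list from the question
small <- list(list(a = 1, b = 3, c = 5),
              list(c = 2, a = 3))

res <- makeDF2(small, c("a", "b", "c"))
print(res)
```

Like the original, this assumes every value is a single number; an element holding a string or a longer vector would make `unlist()` return something that no longer matches the numeric template, and `vapply()` would stop with an error.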