R - Extracting information from list of lists of data.frames
Question
I have two needs, both connected to a dataset similar to the reproducible one below. I have a list of 18 entities, each composed of a list of 17-19 data.frames. A reproducible dataset follows (it contains matrices instead of data.frames, but I do not suppose that makes a difference):
test <- list(list(matrix(10:(50-1), ncol = 10), matrix(60:(100-1), ncol = 10), matrix(110:(150-1), ncol = 10)),
list(matrix(200:(500-1), ncol = 10), matrix(600:(1000-1), ncol = 10), matrix(1100:(1500-1), ncol = 10)))
- First, I need to split each data.frame/matrix into two parts (by a given number of rows) and save the result to a new list of lists.
- Second, I need to extract and save a given column (or columns) from every data.frame in the list of lists.
I have no idea how to go about this other than with for(), but I am sure it should be possible with the apply() family of functions.
Thanks for reading.
My expected output is as follows:
extractedColumns <- list(list(matrix(10:(50-1), ncol = 10)[, 2], matrix(60:(100-1), ncol = 10)[, 2], matrix(110:(150-1), ncol = 10)[, 2]),
list(matrix(200:(500-1), ncol = 10)[, 2], matrix(600:(1000-1), ncol = 10)[, 2], matrix(1100:(1500-1), ncol = 10)[, 2]))
numToSubset <- 3
subsetFrames <- list(list(list(matrix(10:(50-1), ncol = 10)["first length - numToSubset rows", ], matrix(10:(50-1), ncol = 10)["last numToSubset rows", ]),
list(matrix(60:(100-1), ncol = 10)["first length - numToSubset rows", ], matrix(60:(100-1), ncol = 10)["last numToSubset rows", ]),
list(matrix(110:(150-1), ncol = 10)["first length - numToSubset rows", ], matrix(110:(150-1), ncol = 10)["last numToSubset rows", ])),
etc...)
It looks quite messy; I hope you can follow what I want.
Recommended answer
You can use two nested lapply calls:
lapply(test, function(x) lapply(x, '[', c(2, 3)))
Output:
[[1]]
[[1]][[1]]
[1] 11 12
[[1]][[2]]
[1] 61 62
[[1]][[3]]
[1] 111 112
[[2]]
[[2]][[1]]
[1] 201 202
[[2]][[2]]
[1] 601 602
[[2]][[3]]
[1] 1101 1102
Explanation
The first lapply is applied to the two lists in test. Each of those two lists contains another three elements. The second lapply iterates over those three elements and subsets each one with c(2, 3); the '[' passed to the second lapply is R's subsetting function.
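To make the role of '[' concrete, here is a minimal sketch showing that passing '[' to lapply is the same as writing the subsetting out in an anonymous function. The list x and the result names a, b and d are made up for illustration; the matrices mirror two of those in the question.

```r
# Two small matrices shaped like the ones in the question (4 rows x 10 columns)
x <- list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10))

# `[` is an ordinary function, so these three calls are equivalent:
a <- lapply(x, '[', c(2, 3))                  # extra arguments are passed on to `[`
b <- lapply(x, function(y) y[c(2, 3)])        # the same thing written out
d <- lapply(x, function(y) `[`(y, c(2, 3)))   # calling `[` explicitly
```

The anonymous-function form is usually easier to extend, for example when a row index and a column index are both needed.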
Note: For a matrix, [ with a single index vector subsets elements 2 and 3 (in column-major order), whereas the same call subsets columns 2 and 3 when used on a data.frame.
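The difference between the two types can be checked on a small example. This is only a sketch: m and df are illustrative names, and df is simply the question's first matrix converted with as.data.frame.

```r
m  <- matrix(10:49, ncol = 10)   # 4 rows x 10 columns, like the first matrix in the question
df <- as.data.frame(m)           # same data as a data.frame with columns V1..V10

m[c(2, 3)]    # matrix: elements 2 and 3 in column-major order
df[c(2, 3)]   # data.frame: a data.frame containing columns V2 and V3

# Giving both a row and a column index behaves the same for both types
m[, 2]        # second column of the matrix, as a vector
df[, 2]       # second column of the data.frame, as a vector
```

If the real data are data.frames and a whole column is wanted, the two-index form y[, 2] avoids the ambiguity.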
lapply is very flexible when used with anonymous functions. By changing the code to:
#change rows and columns into what you need
lapply(test, function(x) lapply(x, function(y) y[rows, columns]))
You can specify any combination of rows or columns you want.
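As a sketch of how this pattern could cover both needs from the question, the code below extracts column 2 from every matrix and splits every matrix into its first nrow - numToSubset rows and its last numToSubset rows. It is an illustration rather than part of the original answer: the name subsetFrames and the use of drop = FALSE are choices made here, and it assumes every matrix has at least numToSubset rows.

```r
test <- list(
  list(matrix(10:(50-1), ncol = 10), matrix(60:(100-1), ncol = 10), matrix(110:(150-1), ncol = 10)),
  list(matrix(200:(500-1), ncol = 10), matrix(600:(1000-1), ncol = 10), matrix(1100:(1500-1), ncol = 10))
)

# Second need: extract column 2 from every matrix in the list of lists
extractedColumns <- lapply(test, function(x) lapply(x, function(y) y[, 2]))

# First need: split every matrix into two parts by row
# (assumes nrow(y) >= numToSubset for every matrix)
numToSubset <- 3
subsetFrames <- lapply(test, function(x) lapply(x, function(y) {
  n <- nrow(y)
  list(
    y[seq_len(n - numToSubset), , drop = FALSE],                      # first n - numToSubset rows
    y[seq_len(numToSubset) + (n - numToSubset), , drop = FALSE]       # last numToSubset rows
  )
}))
```

drop = FALSE keeps each part a matrix even when it has a single row, as happens here for the 4-row matrices; the same indexing works unchanged on data.frames.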