R - Extracting information from list of lists of data.frames

Problem description

I have two needs, both connected to a dataset similar to the reproducible one below. I have a list of 18 entities, each composed of a list of 17-19 data.frames. Reproducible dataset follows (there are matrices instead of data.frames, but I do not suppose that makes a difference):

test <- list(list(matrix(10:(50-1), ncol = 10), matrix(60:(100-1), ncol = 10), matrix(110:(150-1), ncol = 10)),
             list(matrix(200:(500-1), ncol = 10), matrix(600:(1000-1), ncol = 10), matrix(1100:(1500-1), ncol = 10)))

  1. I need to subset each dataframe/matrix into two parts (by a given number of rows) and save to a new list of lists
  2. Secondly, I need to extract and save a given column(s) out of every data.frame in a list of lists.

I have no idea how to go about doing it other than with for(), but I am sure it should be possible with the apply() family of functions.

Thanks for reading.

My expected output is as follows:

extractedColumns <- list(list(matrix(10:(50-1), ncol = 10)[, 2], matrix(60:(100-1), ncol = 10)[, 2], matrix(110:(150-1), ncol = 10)[, 2]),
                         list(matrix(200:(500-1), ncol = 10)[, 2], matrix(600:(1000-1), ncol = 10)[, 2], matrix(1100:(1500-1), ncol = 10)[, 2]))


numToSubset <- 3
subsetFrames <- list(list(list(matrix(10:(50-1), ncol = 10)["first length - numToSubset rows", ], matrix(10:(50-1), ncol = 10)["last numToSubset rows", ]),
                           list(matrix(60:(100-1), ncol = 10)["first length - numToSubset rows", ], matrix(60:(100-1), ncol = 10)["last numToSubset rows", ]),
                                list(matrix(110:(150-1), ncol = 10)["first length - numToSubset rows", ], matrix(110:(150-1), ncol = 10)["last numToSubset rows", ])),
                      etc...)

It looks rather messy, but I hope you can follow what I want.

Recommended answer

You can use two nested lapply calls:

lapply(test, function(x) lapply(x, '[', c(2, 3)))

Output:

[[1]]
[[1]][[1]]
[1] 11 12

[[1]][[2]]
[1] 61 62

[[1]][[3]]
[1] 111 112


[[2]]
[[2]][[1]]
[1] 201 202

[[2]][[2]]
[1] 601 602

[[2]][[3]]
[1] 1101 1102

Explanation

The first lapply is applied to the two lists in test. Each of those two lists contains another three elements. The second lapply iterates over those three elements and subsets each one with the index c(2, 3) (that's the '[' function in the second lapply).

Note: In the case of a matrix, [ with a single index will subset elements 2 and 3, but the same call will subset columns 2 and 3 when used on a data.frame.
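
To make the difference concrete, here is a small sketch using the question's test data. The names m and df are only illustrative, and the last line is one possible way (not the only one) to build the extractedColumns list the question asks for, by indexing with an explicit empty row position:

```r
# Data from the question, written with the arithmetic already evaluated
test <- list(
  list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10), matrix(110:149, ncol = 10)),
  list(matrix(200:499, ncol = 10), matrix(600:999, ncol = 10), matrix(1100:1499, ncol = 10))
)

m  <- test[[1]][[1]]     # a 4 x 10 integer matrix
df <- as.data.frame(m)   # same data as a data.frame with columns V1..V10

elems <- m[c(2, 3)]      # matrix + single index: elements 2 and 3 (column-major order)
cols  <- df[c(2, 3)]     # data.frame + single index: columns 2 and 3

# Column 2 of every matrix (or data.frame) in the list of lists.
# Use y[, 2, drop = FALSE] instead if a one-column matrix is preferred over a vector.
extractedColumns <- lapply(test, function(x) lapply(x, function(y) y[, 2]))
```

Writing the comma explicitly, as in y[, 2], selects a column for both matrices and data.frames, so the same code works whichever of the two the real data contains.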

lapply is very flexible with the use of anonymous functions. By changing the code into:

# replace rows and columns with the indices you need
lapply(test, function(x) lapply(x, function(y) y[rows, columns]))

You can specify any combination of rows or columns you want.
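
For the first need in the question (splitting every matrix into its leading rows and its last numToSubset rows), the same nested pattern can be used. The following is a sketch rather than a definitive solution: splitRows is a hypothetical helper name, and it assumes every element has more than numToSubset rows.

```r
# Data from the question, written with the arithmetic already evaluated
test <- list(
  list(matrix(10:49, ncol = 10), matrix(60:99, ncol = 10), matrix(110:149, ncol = 10)),
  list(matrix(200:499, ncol = 10), matrix(600:999, ncol = 10), matrix(1100:1499, ncol = 10))
)

numToSubset <- 3

# Hypothetical helper: split one matrix/data.frame into two parts by rows.
# head(y, -n) keeps everything except the last n rows; tail(y, n) keeps the last n rows.
# Both functions work on matrices and data.frames and keep the two-dimensional shape.
splitRows <- function(y, n) {
  list(first = head(y, -n), last = tail(y, n))
}

# Outer lapply walks the entities, inner lapply walks the matrices of each entity;
# n = numToSubset is passed through to splitRows as an extra argument.
subsetFrames <- lapply(test, function(x) lapply(x, splitRows, n = numToSubset))
```

Each element of subsetFrames[[i]][[j]] is then a list with the parts first and last. Note that for matrices without row names, tail() may label the rows with their original positions, which affects printing only.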
