Subset data frames in a list based on the contents of a vector


Question

I have a list of five data frames. Each data frame contains one dimension column and four value columns. I would like to subset each data frame in the list based on the contents of a vector.

# Five data frames with the same columns: x (dimension) and y2..y5 (random values)
df  <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
df2 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
df3 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
df4 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
df5 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
frames <- list(df, df2, df3, df4, df5)

So in this example, my list is "frames". Let's say I have the following vector:

subs <- 50:60

My goal is to subset the list of data frames so that each data frame contains only the rows where the value of the first column is in the subs vector.

Any suggestions?

Thanks, Ben

Recommended answer

It seems to me that almost all of your questions concern a list of data frames with the same columns, which leads you to use lapply loops for every single operation (which seems highly inefficient).
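For reference, the per-frame lapply approach mentioned above might look like the following minimal sketch. The two small stand-in frames and the name subsetted are assumptions made for illustration; they are not from the original post.

```r
# Small stand-in data (hypothetical: two frames instead of five)
set.seed(1)
frames <- list(
  data.frame(x = 1:100, y2 = runif(100, 0, 100)),
  data.frame(x = 1:100, y2 = runif(100, 0, 100))
)
subs <- 50:60

# Keep only the rows whose first column is in subs, one data frame at a time
subsetted <- lapply(frames, function(d) d[d[[1]] %in% subs, , drop = FALSE])

# Number of rows kept in each frame
sapply(subsetted, nrow)
# [1] 11 11
```

This answers the question directly, but every later operation on the list needs another lapply call of the same kind.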

Alternatively, you could vectorize most of your operations by binding all the data frames in the list into a single object while keeping an ID for each data.frame. When you have finished all the data manipulations, you can split them back into a list using split.

Here's an example using the development version of data.table on GitHub (you could achieve similar results using tidyr::unnest)

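library(data.table)
# Stack the data frames into one data.table; ID records each frame's position in the list.
# %in% keeps every row whose x appears in subs. The original answer used
# x %between% subs, but %between% expects a length-2 c(lower, upper) range, and only
# subs[1] and subs[2] were used, which is why the printed output below stops at x = 51.
Res <- rbindlist(frames, idcol = "ID")[x %in% subs]
# Output printed in the original answer (random values; with %in%, each ID has 11 rows, x = 50 to 60):
#     ID  x        y2       y3        y4       y5
#  1:  1 50 54.692889 58.51886 12.754368 35.61516
#  2:  1 51 21.206308 12.77442 52.440787 93.67734
#  3:  2 50 12.655685 84.55044  3.194644 54.46706
#  4:  2 51 83.840276 61.32614 61.139038 92.39402
#  5:  3 50 54.847797 20.68419 19.585931 48.87072
#  6:  3 51 75.510691 68.17955 98.696579 91.48688
#  7:  4 50 63.203071 95.94132 41.835923 60.68250
#  8:  4 51 75.481676 51.67619 80.393557 24.48381
#  9:  5 50 65.744847 50.36983 86.548843 83.31730
# 10:  5 51  4.956835 57.25666 27.106395 32.92020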
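The same bind-then-filter idea can also be sketched with dplyr. This is an assumed alternative rather than the answerer's code: it uses bind_rows with its .id argument instead of unnest, and the stand-in frames and the name res are hypothetical.

```r
library(dplyr)

# Small stand-in data (hypothetical: two frames instead of five)
set.seed(1)
frames <- list(
  data.frame(x = 1:100, y2 = runif(100, 0, 100)),
  data.frame(x = 1:100, y2 = runif(100, 0, 100))
)
subs <- 50:60

# Stack the frames, tagging each row with the position of its source frame,
# then keep only the rows whose x is in subs
res <- bind_rows(frames, .id = "ID") %>%
  filter(x %in% subs)

# Rows kept per source frame
table(res$ID)
```

Note that bind_rows stores ID as a character column ("1", "2", ...) when the list has no names.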

Eventually (after finishing all the data manipulations) you would just do

split(Res, Res$ID)

in order to get the data.frames back into a list.
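As a rough end-to-end sketch of the round trip, the following binds, filters, and splits back into a list of plain data frames. It assumes a data.table version whose split method accepts the by and keep.by arguments; the stand-in frames and the name back are hypothetical.

```r
library(data.table)

# Small stand-in data (hypothetical: two frames instead of five)
set.seed(1)
frames <- list(
  data.frame(x = 1:100, y2 = runif(100, 0, 100)),
  data.frame(x = 1:100, y2 = runif(100, 0, 100))
)
subs <- 50:60

# Bind into one table with an ID column, then filter once
Res <- rbindlist(frames, idcol = "ID")[x %in% subs]

# Split back by ID, dropping the helper ID column from each piece
back <- split(Res, by = "ID", keep.by = FALSE)

# Convert each piece from data.table to a plain data.frame
back <- lapply(back, as.data.frame)

# Number of rows in each piece
sapply(back, nrow)
```

Dropping ID with keep.by = FALSE returns each piece with the same columns as the original data frames; split(Res, Res$ID) keeps the ID column instead.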
