How to subset data.frames stored in a list?

Views: 145 | Published: 2017/3/26 2:14:40 | Tags: r, list, dataframe, subset, lapply


Problem description

I created a list and I stored one data frame in each component. Now I would like to filter those data frames keeping only the rows that have NA in a specific column. I would like the result of this operation to be another list containing data frames with only those rows having NA in that column.

Here is some code to clarify what I am saying. Assume d1 and d2 are my data frames:

set.seed(1)

d1<-data.frame(a=rnorm(5), b=c(rep(2006, times=4),NA))
d2<-data.frame(a=1:5, b=c(2007, 2007, NA, NA, 2007))  

print(d1)
 a    b
 1.3011543 2006
 0.3780023 2006
-0.3101449 2006
-1.3927445 2006
-1.0726218   NA

print(d2)
a    b
1 2007
2 2007
3   NA
4   NA
5 2007
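
which I place in a list:

ls<-list()

for (i in 1:2){

  str<-paste("d", i, sep="")   # build the object name: "d1", then "d2"
  dat<-get(str)                # look up the data frame with that name
  ls[[str]]<-dat               # store it in the list under the same name

}
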
Now I would like to filter each list component so as to keep only the rows where column b is NA. To do this I tried the following command, knowing from the start that it would fail. My problem is that I don't know whether subset() is the right function to use and, if it is, I don't know how to refer to each data frame (that is, the first argument of subset()):

lsNA<-lapply(ls, subset(ls, is.na(b)))

Can you please help me get past my severe limitations?

Solution

lapply's second argument is a function (subset), and extra arguments to subset are passed as the ... arguments to lapply. Hence:

my.ls <- list(d1 = d1, d2 = d2)
my.lsNA <- lapply(my.ls, subset, is.na(b))
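
To make the argument passing concrete, here is a small sketch. It rebuilds the two data frames (with fixed values for a in d1 instead of rnorm, an assumption made only so the example is reproducible), runs the lapply call above, and compares it with an explicit anonymous function that indexes with [. The names by.subset and by.index are illustrative and do not come from the original post.

```r
# Example data; the values of `a` in d1 are fixed here (assumption) rather than random
d1 <- data.frame(a = c(1.3, 0.4, -0.3, -1.4, -1.1), b = c(rep(2006, times = 4), NA))
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007))
my.ls <- list(d1 = d1, d2 = d2)

# lapply calls subset(my.ls[[i]], is.na(b)) for each element:
# `subset` is the function, `is.na(b)` is forwarded through `...`
by.subset <- lapply(my.ls, subset, is.na(b))

# The same filter written out with an anonymous function and `[` indexing
by.index <- lapply(my.ls, function(df) df[is.na(df$b), , drop = FALSE])

print(by.subset)
print(by.index)
```

Both calls should keep only the rows where b is NA. The [ form avoids the non-standard evaluation that subset relies on; the R help page for subset describes it as a convenience function intended for interactive use, so the explicit form may be the safer choice inside functions.
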
(I am also showing you how to easily create the list of data.frames without using get, and recommend you don't use ls as a variable name since it is also the name of a rather common function.)
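
If the data frames really do have to be collected by name, for example because there are many of them, a sketch using mget from base R might look like the following. It assumes the objects d1 and d2 exist in the current environment; the toy data and the names df.list and df.listNA are illustrative and do not come from the original post.

```r
# Toy data (assumption), smaller than the data frames in the question
d1 <- data.frame(a = 1:3, b = c(2006, NA, 2006))
d2 <- data.frame(a = 1:3, b = c(NA, NA, 2007))

# mget() looks up several objects by name and returns them as a named list,
# which replaces the for/paste/get loop from the question
df.list <- mget(paste0("d", 1:2), envir = environment())

# Same filter as in the answer, applied to the list built by mget()
df.listNA <- lapply(df.list, subset, is.na(b))
print(df.listNA)
```

Calling the list df.list rather than ls also keeps the base function ls() unmasked.
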
