How to save data frames in a list
Problem description
I have some data frames that are produced on different iterations of my code; say there are about 100 iterations. On each iteration I write the data frame to df, which I use to store the incoming frame.
The data frames look like this:
first iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
second iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
third iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
and so on
Now, at the end, I want to have all the data frames in a list so that I can process the list in other operations. How do I do this?
Here is some sample code:
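data = list.files(pattern = "\\.csv$")  # pattern is a regular expression; match names ending in .csv
data1 = lapply(data, function(x) read.csv(x, header = TRUE))
files = length(data1)
for (i in seq_len(files))  # 1:length(files) would run only once, since files is a single number
{
......
code
......
}
df ## say some df is generated each time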
From the comments, I understand you are trying to generate a list of data.frame objects over sequential iterations of some algorithm - each of which produces a new data.frame.
Suppose we have some function f() that generates a new data.frame from some source, and perhaps uploads the data.frame before returning it.
f <- function() {
# read a file, do some work, produce a dataframe, etc
df # return the new data.frame()
}
The problem with using append or something similar to add the new data.frame to the list is that it has a habit of "unrolling" the frame and merging it in: a data.frame is itself a list of columns, so each column ends up as a separate list element.
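A minimal sketch of that "unrolling" behaviour, using a small made-up data frame (the names df, frames, unrolled and wrapped are illustrative only and do not come from the question):

```r
# Hypothetical two-column data frame, used only for illustration
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

frames <- list()

# append() concatenates with c(), so the data frame is treated as a list of columns
unrolled <- append(frames, df)
length(unrolled)   # 2: one element per column
names(unrolled)    # "x" "y"

# Wrapping the frame in list() keeps it together as a single element
wrapped <- append(frames, list(df))
length(wrapped)               # 1
is.data.frame(wrapped[[1]])   # TRUE
```

So append can be made to work if the frame is wrapped in list() first, but forgetting the wrapper silently produces a list of columns rather than a list of frames.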
Instead, your code needs a structure like this:
output_list <- list()  # A list to hold the generated frames
while (more_work_to_do) {
  df <- f()  # One iteration
  output_list[[length(output_list) + 1]] <- df
}
# At this point, output_list is a list of the generated data frames
# with all their internal structure preserved.
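Applied to CSV files like those in the question, the same structure might look like the sketch below. It is only an illustration: process_one() is a hypothetical stand-in for the per-file work, and two small files are written to a temporary directory first (from the built-in iris data) so that the code runs on its own.

```r
# Write two small CSV files to a temporary directory so the sketch is self-contained
csv_dir <- tempfile("frames_")
dir.create(csv_dir)
write.csv(head(iris[, 1:4], 5), file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(tail(iris[, 1:4], 5), file.path(csv_dir, "b.csv"), row.names = FALSE)

# Hypothetical per-file step: read one CSV and return a data frame
process_one <- function(path) {
  read.csv(path, header = TRUE)
}

csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

output_list <- list()  # one element per generated data frame
for (path in csv_files) {
  df <- process_one(path)
  output_list[[length(output_list) + 1]] <- df
}

length(output_list)                       # 2
all(sapply(output_list, is.data.frame))   # TRUE
```

When each iteration depends only on its own input, as here, lapply(csv_files, process_one) should build the same list without an explicit loop.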
It's important to use the [[]] operator for the insert. With single brackets, a data frame of several columns does not fit into one slot, and R reports "number of items to replace is not a multiple of replacement length". The length(output_list)+1 construct simply means "one past the current end of the list" and in effect does an append for you without needing to maintain a separate counter.
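The difference between the two bracket forms can be seen in a small sketch (df, with_single and with_double are illustrative names, not part of the original answer):

```r
df <- data.frame(x = 1:3, y = 4:6)  # hypothetical frame for illustration

# Single brackets: two columns compete for one slot.
# R keeps only the first column and issues the message quoted above as a warning.
with_single <- list()
with_single[length(with_single) + 1] <- df
is.data.frame(with_single[[1]])   # FALSE: only column x was stored

# Double brackets: the whole data frame is stored as one element.
with_double <- list()
with_double[[length(with_double) + 1]] <- df
is.data.frame(with_double[[1]])   # TRUE
```

Because the single-bracket form only warns, the loop would keep running and quietly store one column per iteration, which is easy to miss.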
Here's an example:
> f<-function() { data.frame(x=rnorm(5), y=rnorm(5)) }
> output_list <- list()
> for (i in 1:5) output_list[[length(output_list)+1]] <- f()
> length(output_list)
[1] 5
> str(output_list)
List of 5
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] -0.347 0.194 -0.406 -0.384 2.24
..$ y: num [1:5] -0.756 0.3417 -0.7542 0.1612 -0.0494
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 0.667 -0.186 0.602 -0.239 1.516
..$ y: num [1:5] 0.263 -1.322 0.604 -0.135 -0.339
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 1.064 -0.365 -1.584 0.163 0.142
..$ y: num [1:5] -0.0782 1.3314 0.0797 -0.4096 0.4819
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] -2.0448 -0.4228 -0.5305 -0.0611 0.4114
..$ y: num [1:5] -0.608 -0.74 -0.196 -0.957 0.653
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 0.582 -1.029 -1.222 1.755 0.259
..$ y: num [1:5] 1.733 0.319 -0.597 -1.814 0.446
> output_list
[[1]]
x y
1 -0.3474823 -0.75595301
2 0.1941049 0.34170577
3 -0.4055180 -0.75424689
4 -0.3838479 0.16122522
5 2.2397387 -0.04936943
[[2]]
x y
1 0.6674517 0.2625242
2 -0.1859460 -1.3219586
3 0.6020241 0.6042548
4 -0.2387514 -0.1345904
5 1.5158875 -0.3392787
[[3]]
x y
1 1.0639814 -0.07823834
2 -0.3645768 1.33144410
3 -1.5839606 0.07973743
4 0.1630311 -0.40957609
5 0.1420562 0.48187377
[[4]]
x y
1 -2.04475082 -0.6083283
2 -0.42280601 -0.7396052
3 -0.53048188 -0.1961052
4 -0.06107144 -0.9571272
5 0.41136718 0.6526753
[[5]]
x y
1 0.5821866 1.7325293
2 -1.0289847 0.3186825
3 -1.2218606 -0.5971967
4 1.7548963 -1.8136810
5 0.2592219 0.4463977
>