Splitting a large data frame into smaller segments

Problem description

I have the following data frame with 100 rows, and I want to break it up into 10 data frames of 10 rows each. I could do the following and get the desired result.

df <- data.frame(one = rnorm(100), two = rnorm(100), three = rnorm(100))

df1 = df[1:10,]
df2 = df[11:20,]
df3 = df[21:30,]
df4 = df[31:40,]
df5 = df[41:50,]
...

Of course, this isn't an elegant way to perform the task when the initial data frame is larger, or when its row count doesn't divide evenly into segments.

So, given the above, let's say we have the following data frame.

df <- data.frame(one = rnorm(1123), two = rnorm(1123), three = rnorm(1123))

Now I want to split it into new data frames of 200 rows each, plus a final data frame holding the remaining rows. What would be a more elegant (aka "quick") way to perform this task?

Recommended answer

 > str(split(df, (as.numeric(rownames(df))-1) %/% 200))
List of 6
 $ 0:'data.frame':  200 obs. of  3 variables:
  ..$ one  : num [1:200] -1.592 1.664 -1.231 0.269 0.912 ...
  ..$ two  : num [1:200] 0.639 -0.525 0.642 1.347 1.142 ...
  ..$ three: num [1:200] -0.45 -0.877 0.588 1.188 -1.977 ...
 $ 1:'data.frame':  200 obs. of  3 variables:
  ..$ one  : num [1:200] -0.0017 1.9534 0.0155 -0.7732 -1.1752 ...
  ..$ two  : num [1:200] -0.422 0.869 0.45 -0.111 0.073 ...
  ..$ three: num [1:200] -0.2809 1.31908 0.26695 0.00594 -0.25583 ...
 $ 2:'data.frame':  200 obs. of  3 variables:
  ..$ one  : num [1:200] -1.578 0.433 0.277 1.297 0.838 ...
  ..$ two  : num [1:200] 0.913 0.378 0.35 -0.241 0.783 ...
  ..$ three: num [1:200] -0.8402 -0.2708 -0.0124 -0.4537 0.4651 ...
 $ 3:'data.frame':  200 obs. of  3 variables:
  ..$ one  : num [1:200] 1.432 1.657 -0.72 -1.691 0.596 ...
  ..$ two  : num [1:200] 0.243 -0.159 -2.163 -1.183 0.632 ...
  ..$ three: num [1:200] 0.359 0.476 1.485 0.39 -1.412 ...
 $ 4:'data.frame':  200 obs. of  3 variables:
  ..$ one  : num [1:200] -1.43 -0.345 -1.206 -0.925 -0.551 ...
  ..$ two  : num [1:200] -1.343 1.322 0.208 0.444 -0.861 ...
  ..$ three: num [1:200] 0.00807 -0.20209 -0.56865 1.06983 -0.29673 ...
 $ 5:'data.frame':  123 obs. of  3 variables:
  ..$ one  : num [1:123] -1.269 1.555 -0.19 1.434 -0.889 ...
  ..$ two  : num [1:123] 0.558 0.0445 -0.0639 -1.934 -0.8152 ...
  ..$ three: num [1:123] -0.0821 0.6745 0.6095 1.387 -0.382 ...
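
The grouping factor in the call above relies on integer division: subtracting 1 from each row number and applying `%/% 200` maps rows 1 to 200 to group 0, rows 201 to 400 to group 1, and so on. As a minimal sketch of the same idea on a toy scale (the names `idx`, `toy`, `chunks`, and `sizes` are made up for illustration and are not part of the original answer):

```r
# Toy illustration: 7 row positions split into chunks of 3.
# Positions 1..7 become 0..6, and integer division by 3 yields the group ids.
idx <- (seq_len(7) - 1) %/% 3
print(idx)
# [1] 0 0 0 1 1 1 2

# split() cuts the data frame wherever the group id changes.
toy <- data.frame(x = 1:7)
chunks <- split(toy, idx)

# Rows per chunk: two full chunks of 3 and a remainder chunk of 1.
sizes <- sapply(chunks, nrow)
print(sizes)
```

The result is an ordinary named list, so an individual piece can be pulled out by its group label, for example `chunks[["0"]]`.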

If some code might have changed the row names, it would be safer to group by row position instead:

 split(df, (seq_len(nrow(df)) - 1) %/% 200)
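
To see why the position-based version is safer, here is a rough sketch that compares the two approaches after the row names have stopped running from 1 to `nrow(df)`. The helper `split_rows` and the variables `sub`, `by_name`, and `by_pos` are hypothetical names introduced only for this example, and `set.seed(1)` is there just to make the run repeatable.

```r
set.seed(1)
df <- data.frame(one = rnorm(1123), two = rnorm(1123), three = rnorm(1123))

# Hypothetical helper: split a data frame into consecutive chunks of `size` rows,
# grouping by row position rather than by row name.
split_rows <- function(d, size) {
  split(d, (seq_len(nrow(d)) - 1) %/% size)
}

# Subsetting keeps the original row names, so they now have gaps.
sub <- df[df$one > 0, ]

# Grouping by row name: chunk sizes depend on which rows survived the filter.
by_name <- split(sub, (as.numeric(rownames(sub)) - 1) %/% 200)

# Grouping by position: every chunk except possibly the last has 200 rows.
by_pos <- split_rows(sub, 200)

print(sapply(by_name, nrow))
print(sapply(by_pos, nrow))
```

The sketch uses `seq_len(nrow(d))` rather than `seq(nrow(d))` because `seq(0)` returns `1 0`, which would misbehave on a data frame with zero rows.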
