Split up a dataframe by number of rows

Views: 217 Published: 2017/3/25 23:55:53 r split dataframe

Problem description

I have a dataframe made up of 400'000 rows and about 50 columns. As this dataframe is so large, it is too computationally taxing to work with. I would like to split this dataframe up into smaller ones, after which I will run the functions I would like to run, and then reassemble the dataframe at the end.

There is no grouping variable that I would like to use to split up this dataframe. I would just like to split it up by number of rows. For example, I would like to split this 400'000-row table into 400 1'000-row dataframes. How might I do this?

Solution

Make your own grouping variable.

d <- split(my_data_frame, rep(1:400, each = 1000))
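
To see what this produces, here is a minimal sketch on a toy data frame. `toy_df` and its columns are made up for illustration; it has 12 rows and is split into 3 chunks of 4 rows each, the same pattern as 400 chunks of 1000.

```r
# Hypothetical toy data: 12 rows instead of 400'000
toy_df <- data.frame(id = 1:12, value = (1:12) * 10)

# Grouping vector 1,1,1,1,2,2,2,2,3,3,3,3: one label per row
groups <- rep(1:3, each = 4)

# split() returns a named list with one data frame per label
chunks <- split(toy_df, groups)

length(chunks)      # number of chunks
names(chunks)       # labels, converted to character names
chunks[["2"]]$id    # ids of the rows that landed in the second chunk
```

Each element of `chunks` is an ordinary data frame, so any function that works on the full table can be applied to one piece at a time.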

You should also consider the ddply function from the plyr package, or the group_by() function from dplyr.

Edited for brevity, after Hadley's comments.

If you don't know how many rows are in the data frame, or if the number of rows might not be an exact multiple of your desired chunk size, you can do

chunk <- 1000
n <- nrow(my_data_frame)
r <- rep(1:ceiling(n / chunk), each = chunk)[1:n]
d <- split(my_data_frame, r)
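
The question also asks about running a function on each piece and reassembling the result. A minimal sketch of that round trip, assuming a toy data frame and a placeholder per-chunk function (`toy_df` and `process_chunk` are hypothetical, standing in for the real data and whatever computation is actually needed):

```r
# Hypothetical toy data: 10 rows, so the last chunk is shorter
toy_df <- data.frame(id = 1:10, value = (1:10) / 2)

chunk <- 4
n <- nrow(toy_df)

# Labels 1,1,1,1,2,2,2,2,3,3: rep() overshoots to 12 labels,
# and [seq_len(n)] trims them back to one per row
r <- rep(seq_len(ceiling(n / chunk)), each = chunk)[seq_len(n)]
d <- split(toy_df, r)

# Placeholder for the real work: adds a column computed on the chunk
process_chunk <- function(df) {
  df$doubled <- df$value * 2
  df
}

processed <- lapply(d, process_chunk)

# Stack the chunks back together. Chunks come out in label order and
# rows keep their order within each chunk, so the original order survives.
result <- do.call(rbind, processed)
rownames(result) <- NULL

sapply(d, nrow)   # chunk sizes; the last one holds the leftover rows
```

For many chunks, `do.call(rbind, ...)` can itself become slow; `data.table::rbindlist()` or `dplyr::bind_rows()` are commonly used alternatives if those packages are available.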

You could also use

r <- ggplot2::cut_width(1:n, chunk, boundary = 0)
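
If loading ggplot2 just for this is undesirable, the same grouping can be computed with plain arithmetic. This is a swapped-in base R alternative, not something from the original answer, and `n` and `chunk` below are toy values chosen for illustration.

```r
n <- 10      # hypothetical number of rows
chunk <- 4   # desired rows per chunk

# Row i goes to chunk ceiling(i / chunk): rows 1-4 -> 1, 5-8 -> 2, 9-10 -> 3
r_arith <- ceiling(seq_len(n) / chunk)

# The rep()-based labels from earlier in the answer, for comparison
r_rep <- rep(seq_len(ceiling(n / chunk)), each = chunk)[seq_len(n)]

all(r_arith == r_rep)   # both constructions assign every row the same label
```

Either vector can be passed to `split()` as the grouping argument.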

For future readers, methods based on the dplyr and data.table packages will probably be (much) faster for doing group-wise operations on data frames.
