How to cut data in even pieces in R?
Problem description
Here is the problem: I have a dataset, let's say:
a <- c(0,0,0,0,1,1,1,1,1,1)
I want to cut it into even pieces (e.g. 5 pieces). The problem is that I cannot use quantile or cut on the values, because some values repeat and so the breakpoints are not distinct.
> quantile(a)
0% 25% 50% 75% 100%
0 0 1 1 1
(repeated breakpoints)
> cut(a, 5)
[1] (-0.001,0.199] (-0.001,0.199] (-0.001,0.199] (-0.001,0.199] (0.801,1]
[6] (0.801,1] (0.801,1] (0.801,1] (0.801,1] (0.801,1]
Levels: (-0.001,0.199] (0.199,0.4] (0.4,0.6] (0.6,0.801] (0.801,1]
(only two of the levels are used)
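To make the breakpoint problem concrete, here is a small sketch that asks quantile for the six breakpoints five pieces would need and passes them to cut. The names brks and res are only illustrative, and tryCatch is used so the script keeps running when cut rejects the breaks.

```r
a <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)

# Six breakpoints would be needed to delimit five quantile-based pieces
brks <- quantile(a, probs = seq(0, 1, by = 0.2))

# Several breakpoints coincide because the values of a repeat
has_duplicates <- anyDuplicated(brks) > 0

# cut() refuses non-unique breaks; capture the error message instead of stopping
res <- tryCatch(cut(a, breaks = brks), error = function(e) conditionMessage(e))
print(has_duplicates)
print(res)
```

Here res ends up holding an error message rather than a factor, which is why cutting on the values themselves does not work for this data.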
I know I can produce a vector like this:
b <- c(1,1,2,2,3,3,4,4,5,5)
and use it for sampling. Or I can use a for loop and count instances. But that needs loops and some clumsy coding. I am looking for a simple and efficient (R-style) function that does better than this.
(I can write it but I don't want to reinvent the wheel.)
Recommended answer
You can use cut, but you have to use it on the numerical indices of the vector, i.e., seq(a), not on the vector itself.
Then you split the vector into pieces of equal length with split:
split(a, cut(seq(a), 5, labels = FALSE))
This returns a list of five short vectors.
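As a rough illustration of what happens in between, the sketch below stores the intermediate group labels before splitting. It uses seq_along(a) in place of seq(a), which is my substitution rather than part of the answer: the two agree for this vector, but seq(a) behaves differently when a has length one. The names groups and pieces are only illustrative.

```r
a <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)

# Bin the positions 1..10 (not the values) into 5 equal-width intervals;
# labels = FALSE returns plain integer codes instead of interval labels
groups <- cut(seq_along(a), 5, labels = FALSE)
print(groups)

# Split the values according to the position-based groups
pieces <- split(a, groups)
print(pieces)
```

Because the positions are all distinct, the breakpoints are distinct too, so the repeated values in a no longer matter.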
Another way, without cut, is:
split(a, rep(seq(5), each = length(a) / 5))
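Note that the rep version relies on length(a) being a multiple of 5; otherwise length(a) / 5 is not a whole number and the grouping vector will not line up with the data. One possible workaround, sketched here with a hypothetical helper named split_even (not a base R function), is to compute the group of each position arithmetically so that piece sizes differ by at most one.

```r
# Hypothetical helper: split x into k consecutive pieces of near-equal size.
# Assumes k is a positive whole number no larger than length(x).
split_even <- function(x, k) {
  n <- length(x)
  # Position i goes to group ceiling(i * k / n), which runs from 1 to k
  groups <- ceiling(seq_along(x) * k / n)
  split(x, groups)
}

x <- 1:11
pieces <- split_even(x, 5)
print(lengths(pieces))
```

For a length that divides evenly, such as the ten-element vector a above, this should give the same grouping as the rep call.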