Split a vector by percentile

Question

I need to split a sorted vector of unknown length in R into "top 10%, ..., bottom 10%". For example, if I have vector <- order(c(1:98928)), I want to split it into 10 different vectors, each representing approximately 10% of the total length.

I've tried using split <- split(vector, 1:10), but since I don't know the length of the vector, I get this message whenever the length is not a multiple of 10:

data length is not a multiple of split variable

And even when the length is a multiple and the function works, split() does not keep the order of my original vector. This is what split gives:

split(c(1:10) , 1:2)
$`1`
[1] 1 3 5 7 9

$`2`
[1]  2  4  6  8 10

This is what I want:

$`1`
[1] 1 2 3 4 5

$`2`
[1]  6  7  8  9 10

I'm a newbie in R and I've been trying lots of things without success. Does anyone know how to do this?

Recommended answer

Problem statement

Break a sorted vector x every 10% into 10 chunks.

Note that there are two interpretations of this:

  1. Cut by vector index:

    split(x, floor(10 * seq.int(0, length(x) - 1) / length(x)))

  2. Cut by vector value (say, quantile):

    split(x, cut(x, quantile(x, probs = 0:10 / 10, names = FALSE), include.lowest = TRUE))
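The index-based interpretation can be wrapped in a small function so the number of chunks is a parameter. This is only a sketch: the name split_by_index and its argument n are hypothetical and do not appear in the original answer. It uses the same floor() grouping as interpretation 1.

```r
# Hypothetical helper (not from the original answer): split x into n
# contiguous chunks of near-equal length, preserving the original order.
split_by_index <- function(x, n = 10) {
  len <- length(x)
  # Scale positions 0..len-1 into [0, n), then floor to chunk ids 0..n-1
  grp <- floor(n * (seq_len(len) - 1) / len)
  split(x, grp)
}

# The example from the question: two ordered halves
halves <- split_by_index(1:10, n = 2)
print(halves)

# A length that is not a multiple of 10 is handled as well
sizes <- lengths(split_by_index(seq_len(98928), n = 10))
print(sizes)
```

For 1:10 with n = 2 the chunks are 1 to 5 and 6 to 10, which is the layout the question asks for (the chunk names are 0 and 1 rather than 1 and 2). Because the grouping vector has exactly one entry per element, split() never needs to recycle it, so the "not a multiple" message does not arise.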
    

    In the following, I demonstrate both approaches using this data:

    set.seed(0); x <- sort(round(rnorm(23),1))
    

    In particular, our example data are normally distributed rather than uniformly distributed, so cutting by index and cutting by value give substantially different results.
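One quick way to see the difference is to compare the group sizes that each approach produces on the example data. The sketch below assumes nothing beyond base R; the variable names by_index, by_value and brks are chosen here for illustration.

```r
set.seed(0); x <- sort(round(rnorm(23), 1))

# Interpretation 1: groups defined by position in the sorted vector
by_index <- split(x, floor(10 * seq.int(0, length(x) - 1) / length(x)))

# Interpretation 2: groups defined by the decile breakpoints of the values
brks <- quantile(x, probs = 0:10 / 10, names = FALSE)
by_value <- split(x, cut(x, brks, include.lowest = TRUE))

# Index-based groups are near-equal in size; value-based groups need not be
print(lengths(by_index))
print(lengths(by_value))
```

Cutting by index always gives groups of 2 or 3 elements here, whereas cutting by value can place tied values in the same bin and leave other bins short or empty.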

    Cutting by index

    #$`0`
    #[1] -1.5 -1.2 -1.1
    #
    #$`1`
    #[1] -0.9 -0.9
    #
    #$`2`
    #[1] -0.8 -0.4
    #
    #$`3`
    #[1] -0.3 -0.3 -0.3
    #
    #$`4`
    #[1] -0.3 -0.2
    #
    #$`5`
    #[1] 0.0 0.1
    #
    #$`6`
    #[1] 0.3 0.4 0.4
    #
    #$`7`
    #[1] 0.4 0.8
    #
    #$`8`
    #[1] 1.3 1.3
    #
    #$`9`
    #[1] 1.3 2.4
    

    Cutting by quantile

    #$`[-1.5,-1.06]`
    #[1] -1.5 -1.2 -1.1
    #
    #$`(-1.06,-0.86]`
    #[1] -0.9 -0.9
    #
    #$`(-0.86,-0.34]`
    #[1] -0.8 -0.4
    #
    #$`(-0.34,-0.3]`
    #[1] -0.3 -0.3 -0.3 -0.3
    #
    #$`(-0.3,-0.2]`
    #[1] -0.2
    #
    #$`(-0.2,0.14]`
    #[1] 0.0 0.1
    #
    #$`(0.14,0.4]`
    #[1] 0.3 0.4 0.4 0.4
    #
    #$`(0.4,0.64]`
    #numeric(0)
    #
    #$`(0.64,1.3]`
    #[1] 0.8 1.3 1.3 1.3
    #
    #$`(1.3,2.4]`
    #[1] 2.4
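The empty bin above hints at a related caveat: with heavily tied data, several decile breakpoints can coincide, and cut() refuses duplicated breaks. The sketch below uses made-up data (y) to show this, and then one possible workaround, dropping repeated breaks with unique(). That workaround is an assumption added here, not part of the original answer, and it yields fewer than 10 groups.

```r
# Made-up, heavily tied data: 15 zeros followed by 1..5
y <- sort(c(rep(0, 15), 1:5))
brks <- quantile(y, probs = 0:10 / 10, names = FALSE)

# cut() rejects duplicated breaks; capture the error message instead of stopping
res <- tryCatch(cut(y, brks, include.lowest = TRUE),
                error = function(e) conditionMessage(e))
print(res)

# Possible workaround (assumption): keep only distinct breaks,
# accepting that fewer than 10 groups come out
by_value <- split(y, cut(y, unique(brks), include.lowest = TRUE))
print(lengths(by_value))
```

If exactly 10 groups are required regardless of ties, cutting by index is the safer choice.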
    
