Calculating percentiles of a dataset column


Problem description

A quick one for you, dearest R gurus:

I'm doing an assignment, and in this exercise I've been asked to get basic statistics out of the built-in infert dataset, specifically one of its columns, infert$age.

For anyone not familiar with the dataset:

> table_ages     # Which is just subset(infert, select=c("age"));
    age
1    26
2    42
3    39
4    34
5    35
6    36
7    23
8    32
9    21
10   28
11   29
...
246  35
247  29
248  23

I had to find the column's median, variance, skewness, and standard deviation, which were all fine, until I was asked to find the column's "percentiles".

I haven't been able to find anything so far, and maybe I translated the term incorrectly from Greek, the language of the assignment. The original term was "ποσοστημόρια", which Google Translate renders in English as "percentiles".

Any tutorials or ideas on finding those "percentiles" of infert$age?

Recommended answer

If you order a vector x and find the value that is halfway through the vector, you have found the median, or 50th percentile. The same logic applies to any percentage. Here are two examples.

x <- rnorm(100)  # 100 random draws from a standard normal distribution
quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))  # quartiles
quantile(x, probs = seq(0, 1, by = 0.1))       # deciles
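
Applied to the column from the question, a minimal sketch might look like the following. It assumes the built-in infert dataset and R's default quantile algorithm (type 7). The variable names ages, age_percentiles, selected, sorted_ages and manual_q1 are illustrative and do not come from the original post. The last few lines spell out the "order the vector and walk part of the way through it" idea by hand, which corresponds to quantile() with type = 1 (no interpolation) rather than the default.

```r
# infert ships with base R (datasets package), so nothing needs to be loaded.
ages <- infert$age

# All percentiles in 1% steps: probs = 0.00, 0.01, ..., 1.00 (101 values).
age_percentiles <- quantile(ages, probs = seq(0, 1, by = 0.01))

# A handful of commonly reported percentiles.
selected <- quantile(ages, probs = c(0.10, 0.25, 0.50, 0.75, 0.90))
print(selected)

# The 50th percentile is the median.
stopifnot(isTRUE(all.equal(unname(selected["50%"]), median(ages))))

# By hand: sort, then take the value a fraction p of the way through.
# Assumption: this mirrors type = 1 (no interpolation), not the default type 7.
p <- 0.25
sorted_ages <- sort(ages)
manual_q1 <- sorted_ages[max(1, ceiling(p * length(sorted_ages)))]
```

If the assignment actually wants quartiles or deciles rather than all 101 percentiles, passing a different probs vector to quantile() covers either case.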
