Counting values within levels


Question

I have a set of levels in R that I generate with cut, e.g. fractional values between 0 and 1, broken down into bins of width 0.1:

> frac <- cut(c(0, 1), breaks=10)
> levels(frac)
[1] "(-0.001,0.1]" "(0.1,0.2]"    "(0.2,0.3]"    "(0.3,0.4]"    "(0.4,0.5]"
[6] "(0.5,0.6]"    "(0.6,0.7]"    "(0.7,0.8]"    "(0.8,0.9]"    "(0.9,1]"

Given a vector v containing continuous values in [0.0, 1.0], how do I count the frequency of elements in v that fall within each level in levels(frac)?

I may want to customize the number of breaks and/or the interval from which I make the levels, so I'm looking for a way to do this with standard R commands. The aim is to build a two-column data frame: one column for the levels as factors, and a second column for the fraction or percentage of all elements in v that fall within each level.

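Note: the following does not work, because it only counts the two values (0 and 1) that were used to build frac: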

> table(frac)
frac
(-0.001,0.1]    (0.1,0.2]    (0.2,0.3]    (0.3,0.4]    (0.4,0.5]    (0.5,0.6]
           1            0            0            0            0            0
   (0.6,0.7]    (0.7,0.8]    (0.8,0.9]      (0.9,1]
           0            0            0            1

If I use cut on v directly, I do not get the same levels when I run cut on different vectors. The range of values (their minimum and maximum) differs between arbitrary vectors, so although I may have the same number of breaks, the level intervals will not be the same.
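The point about differing levels can be illustrated with a small sketch. The vectors a and b and the name brks are made up for this example and are not part of the original question; the sketch only shows that breaks=10 derives intervals from each vector's own range, while explicit break points do not.

```r
# Two hypothetical vectors with different ranges (illustrative values only)
a <- c(0.05, 0.20, 0.35)
b <- c(0.40, 0.75, 0.95)

# breaks = 10 derives the intervals from each vector's own range,
# so the two sets of levels differ
auto_levels_match <- identical(levels(cut(a, breaks = 10)),
                               levels(cut(b, breaks = 10)))

# Explicit break points give the same levels regardless of the data
brks <- seq(0, 1, by = 0.1)
fixed_levels_match <- identical(
  levels(cut(a, breaks = brks, include.lowest = TRUE)),
  levels(cut(b, breaks = brks, include.lowest = TRUE))
)

auto_levels_match   # FALSE
fixed_levels_match  # TRUE
```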

My goal is to take different vectors and bin them to the same set of levels. Hopefully this helps clarify my question. Thanks for any assistance.

Recommended answer

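# Fixed break points: 0, 0.1, ..., 1
frac = seq(0,1,by=0.1)

# Labels for each bin, e.g. "0 - 0.1"
ranges = paste(head(frac,-1), frac[-1], sep=" - ")
# Count the elements of v in each bin without drawing the histogram
freq   = hist(v, breaks=frac, include.lowest=TRUE, plot=FALSE)

data.frame(range = ranges, frequency = freq$counts)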
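The answer above returns raw counts with character labels, whereas the question asks for factor levels and a fraction of the total. One possible variant, sketched here with base R's cut, table and prop.table, is shown below. The sample vector v and the names brks, bins, counts and result are assumptions made for illustration; this is not the original answerer's method.

```r
# Hypothetical input; any numeric vector with values in [0, 1] would do
v <- c(0, 0.05, 0.15, 0.15, 0.5, 0.95, 1)

# Fixed break points, so every vector is binned into the same levels
brks <- seq(0, 1, by = 0.1)

# include.lowest = TRUE keeps an exact 0 in the first bin
bins <- cut(v, breaks = brks, include.lowest = TRUE)

# table() counts every level, including empty ones;
# prop.table() converts the counts to fractions of the total
counts <- table(bins)
result <- data.frame(
  level    = factor(names(counts), levels = levels(bins)),
  fraction = as.vector(prop.table(counts))
)
result
```

Multiplying the fraction column by 100 would give percentages instead. Because the break points are fixed rather than derived from v, running the same code on a different vector should yield the same ten levels.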
