Count the number of values per column above a range of thresholds in R

Question

How can I count, for each column, the number of values above each threshold in a sequence?

That is: for each column, calculate the number of values above 100, then above 150, then above ..., and store the results in a data frame?

# Reproducible data
# (Original data is daily streamflow values organized in columns per year)

set.seed(1234)
data = data.frame("1915" = runif(365, min = 60, max = 400),
                  "1916" = runif(365, min = 60, max = 400),
                  "1917" = runif(365, min = 60, max = 400))

# my code chunk

mymin = 75
mymax = 400
mystep = 25

apply(data, 2, function (x) {
  for(i in seq(mymin,mymax,mystep)) {
    res = (sum(x > i)) # or nrow(data[x > i,])
    return(res)
  }
})

This code works for a single iteration, but I can't store the result of each iteration in a data frame.

I also tried something like:

for (i in 1:n){
  seuil = seq(mymin, mymax, mystep)
  lapply(data, function(x) {
    res[[i]] = nrow(data[ x > seuil[i], ])
    return(res)})
}

That didn't work well either...

The output would look something like:

        n values above 75   n values above 100   n values above ...
1915    348                 329                  ...
1916    351                 325                  ...
...     ...                 ...                  ...

Thanks for your comments and suggestions :)

Recommended answer

myseq <- seq(75, 400, by=25)
as.data.frame(do.call(rbind, lapply(data, function(z) table(findInterval(z, myseq)))))
#        0  1  2  3  4  5  6  7  8  9 10 11 12 13
# X1915 17 19 26 27 41 23 26 33 27 22 30 25 21 28
# X1916 14 26 20 28 25 26 22 23 35 28 26 30 22 40
# X1917 20 30 24 31 24 28 22 25 28 34 18 21 26 34
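The column labels 0 to 13 in that table are the interval indices returned by findInterval. As a minimal sketch of how they map onto the thresholds, here is the same call on a small hand-made vector (the values in `z` are hypothetical and chosen only to hit the interval edges):

```r
myseq <- seq(75, 400, by = 25)        # 14 break points: 75, 100, ..., 400

# Hypothetical values chosen to sit on or near interval edges
z <- c(60, 75, 99, 100, 399, 400)

# findInterval() returns i such that myseq[i] <= z < myseq[i + 1];
# 0 means "below the first break", length(myseq) means "at or above the last"
idx <- findInterval(z, myseq)
idx
# [1]  0  1  1  2 13 14
```

So column 0 counts the values below 75, column 1 the values in [75, 100), and so on. Index 14 does not appear for the streamflow data because runif(365, min = 60, max = 400) does not return 400 itself.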

Or, if you prefer the factor levels that R generates with cut:

as.data.frame(do.call(rbind, lapply(data, function(z) table(cut(z, myseq)))))
#       (75,100] (100,125] (125,150] (150,175] (175,200] (200,225] (225,250] (250,275] (275,300] (300,325] (325,350] (350,375] (375,400]
# X1915       19        26        27        41        23        26        33        27        22        30        25        21        28
# X1916       26        20        28        25        26        22        23        35        28        26        30        22        40
# X1917       30        24        31        24        28        22        25        28        34        18        21        26        34
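Both tables above count values per interval, whereas the question asks for the number of values above each threshold. One possible way to get those cumulative counts directly is sketched below. It is an alternative to the answer's approach, not part of it, and the `above_` column prefix is an arbitrary naming choice:

```r
set.seed(1234)
data <- data.frame("1915" = runif(365, min = 60, max = 400),
                   "1916" = runif(365, min = 60, max = 400),
                   "1917" = runif(365, min = 60, max = 400))
myseq <- seq(75, 400, by = 25)

# For each column of `data`, count the values strictly above each threshold.
# The inner sapply() gives one count per threshold; the outer one loops over
# columns, so the result is transposed to get one row per year.
above <- t(sapply(data, function(x) sapply(myseq, function(s) sum(x > s))))
colnames(above) <- paste0("above_", myseq)   # hypothetical column names
above <- as.data.frame(above)

above[, 1:3]   # first three thresholds only, for a quick look
```

Each row is non-increasing from left to right, since raising the threshold can only exclude more values. If the seed reproduces the tables above, row X1915 should start with 348 and 329 (365 - 17, then 348 - 19), which agrees with the output sketched in the question.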
