Calculating Entropy

Question

I've spent several hours trying to calculate the entropy, and I know I'm missing something. Hopefully someone here can give me an idea!

I think my formula is wrong!

Code:

info <- function(CLASS.FREQ) {
  freq.class <- CLASS.FREQ
  info <- 0
  for (i in seq_along(freq.class)) {
    if (freq.class[[i]] != 0) { # zero check: log2(0) is -Inf
      # entropy contribution of class i: -p_i * log2(p_i)
      entropy <- -(freq.class[[i]] * log2(freq.class[[i]]))
    } else {
      entropy <- 0
    }
    info <- info + entropy # sum up the contributions of all classes
  }
  return(info)
}

I hope my post is clear, since this is the first time I've actually posted here.

This is my dataset:

buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")

credit <- c("fair", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "excellent")

student <- c("no", "no", "no","no", "yes", "yes", "yes", "no", "yes", "yes", "yes", "no", "yes", "no")

income <- c("high", "high", "high", "medium", "low", "low", "low", "medium", "low", "medium", "medium", "medium", "high", "medium")

age <- c(25, 27, 35, 41, 48, 42, 36, 29, 26, 45, 23, 33, 37, 44) # we change the age from categorical to numeric

Recommended answer

I find no error in your code: it runs cleanly and, given a vector of class frequencies, returns the entropy. The part I think you are missing is computing those class frequencies first; pass them to your function and you will get your answer. Looking through the objects you provide, I suspect the variable you are interested in is buys.

buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")
freqs <- table(buys)/length(buys)
info(freqs)
[1] 0.940286
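
The same two steps, tabulating the labels and then applying the formula, could be wrapped in one helper and reused for the other categorical vectors in the question. This is only a minimal sketch; `label_entropy` is a hypothetical name chosen for illustration, not part of the original code or of any package.

```r
# Hypothetical helper: Shannon entropy (in bits) of a vector of class labels.
label_entropy <- function(labels) {
  # Relative frequency of each class. For a character vector, table()
  # only lists classes that actually occur, so every entry of p is > 0.
  p <- table(labels) / length(labels)
  -sum(p * log2(p))
}

buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")
credit <- c("fair", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "excellent")

label_entropy(buys)    # matches info(freqs) above
label_entropy(credit)  # entropy of the credit column
```

A numeric column such as age would need to be binned into classes first, since every distinct value would otherwise count as its own class.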

To improve your code, you can simplify it dramatically: given a vector of class frequencies, you don't need a loop, because the arithmetic in R is vectorized.

For example:

# calculate shannon-entropy
-sum(freqs * log2(freqs))
[1] 0.940286
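
One caveat with the one-liner: unlike the loop in the question, it has no zero check. In R, `0 * log2(0)` evaluates to `NaN`, so a class with frequency 0 (for example, an unused factor level) would turn the whole sum into `NaN`. A minimal sketch of a vectorized version that keeps the zero check follows; `shannon_entropy` is a hypothetical name used here for illustration only.

```r
# Hypothetical helper: vectorized Shannon entropy (in bits) with a zero check.
shannon_entropy <- function(p) {
  p <- as.numeric(p)  # drop table attributes, keep the plain frequencies
  p <- p[p > 0]       # a zero-frequency class contributes 0, so leave it out
  -sum(p * log2(p))
}

# 9 "yes", 5 "no", plus one empty class: the empty class is ignored
shannon_entropy(c(9, 5, 0) / 14)
```

Filtering before taking the logarithm avoids the `NaN` without reintroducing a loop.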

As a side note, the entropy package provides the function entropy.empirical, whose unit argument lets you choose the logarithm base (here log2), which gives you some more flexibility. Example:

library(entropy)  # provides entropy.empirical
entropy.empirical(freqs, unit="log2")
[1] 0.940286
