Entropy of pure split calculated to NaN

Problem description

I have written a function to calculate the entropy of a vector in which each element is the number of items belonging to one class.

function x = Entropy(a)
    t = sum(a);
    t = repmat(t, [1, size(a, 2)]);
    x = sum(-a./t .* log2(a./t));
end

For example, with a = [4 0] the entropy is -(0/4)*log2(0/4) - (4/4)*log2(4/4).

With the function above, however, the entropy comes out as NaN whenever the split is pure, because of log2(0), as in this example. The entropy of a pure split should be zero.

How should I solve this with the least effect on performance, given that the data is very large? Thanks

Solution

I would suggest you create your own log2 function:

function res = mylog2(a)
   % log2 that returns 0 wherever the result is infinite (log2(0) = -Inf)
   res = log2(a);
   res(isinf(res)) = 0;
end

This function breaks the normal log2 behaviour, but it works in your specific example: call it in place of log2 inside Entropy, and because each logarithm is multiplied by its own argument, the zero-count terms become 0 * 0 = 0 instead of 0 * -Inf = NaN. It is not "mathematically correct", but I believe that's what you are looking for.
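If you would rather leave log2 untouched, one possible alternative is to drop the zero-count classes before taking the logarithm. This follows the usual convention that 0*log2(0) counts as 0, since p*log2(p) tends to 0 as p approaches 0 from above. The version below is only a sketch and is not part of the original answer; it assumes a is a single row vector of non-negative counts, as in the question, and the variable name p is introduced here for illustration.

```matlab
function x = Entropy(a)
    % a: row vector of non-negative class counts, e.g. [4 0]
    p = a ./ sum(a);         % class proportions
    p = p(p > 0);            % keep only non-empty classes (0*log2(0) treated as 0)
    x = -sum(p .* log2(p));  % Shannon entropy in bits
end
```

With this version, Entropy([4 0]) evaluates to 0 and Entropy([2 2]) evaluates to 1. The logical indexing is vectorized, so the extra cost over the original should be small, but it is worth timing on your own data.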
