How to normalize a Counter and combine 2 normalized Counters? - Python

Question

Firstly, I have two lists of strings:

['abc','abc','def','jkl']
['abc','def','def','pqr', 'pr', 'foo', 'bar']

Then I need Counters of the lists, normalized such that the sum of the values in each Counter equals 1:

Counter({'abc': 0.8164965809277261, 'jkl': 0.4082482904638631, 'def': 0.4082482904638631})
Counter({'abc': 1.1498299142610595, 'def': 1.0749149571305296, 'jkl': 0.4082482904638631, 'pr': 0.3333333333333333, 'bar': 0.3333333333333333, 'pqr': 0.3333333333333333, 'foo': 0.3333333333333333})

Each count is divided by the following normalization factor:

math.sqrt(sum(i*i for i in counter.values()))

I've tried the following by iterating through the Counter keys, but is there any other way of achieving, say, the x+y Counter?

>>> from collections import Counter
>>> import math
>>> x = Counter(['abc','abc','def','jkl'])
>>> denominator = 1/math.sqrt(sum(math.pow(i,2) for i in x.values()))
>>> for i in x:
...     x[i]*=denominator
... 
>>> x
Counter({'abc': 0.8164965809277261, 'jkl': 0.4082482904638631, 'def': 0.4082482904638631})
>>> y = Counter(['abc','def','def','pqr', 'pr', 'foo', 'bar'])
>>> denominator2 = 1/math.sqrt(sum(math.pow(i,2) for i in y.values()))
>>> for i in y:
...     y[i]*=denominator2
... 
>>> y
Counter({'def': 0.6666666666666666, 'pr': 0.3333333333333333, 'abc': 0.3333333333333333, 'bar': 0.3333333333333333, 'pqr': 0.3333333333333333, 'foo': 0.3333333333333333})
>>> x+y
Counter({'abc': 1.1498299142610595, 'def': 1.0749149571305296, 'jkl': 0.4082482904638631, 'pr': 0.3333333333333333, 'bar': 0.3333333333333333, 'pqr': 0.3333333333333333, 'foo': 0.3333333333333333})

Recommended answer

You need to sum the values, then divide each count by the sum:

total = sum(x.values(), 0.0)
for key in x:
    x[key] /= total

By starting the sum with 0.0, we make sure total is a floating-point value, which avoids the Python 2 floor-division behaviour of / with integer operands.
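
The loop above modifies x in place. Where the original counts are still needed, a non-mutating variant may be more convenient. The following is a minimal sketch, not part of the original answer: `normalized` is a hypothetical helper name, and returning an empty Counter for empty input is an assumption made here to avoid dividing by zero.

```python
from collections import Counter

def normalized(counter):
    """Return a new Counter whose values sum to 1 (hypothetical helper)."""
    # float() forces true division even where / would floor integers (Python 2).
    total = float(sum(counter.values()))
    if total == 0:
        # Assumption: an empty (or all-zero) counter normalizes to an empty one.
        return Counter()
    return Counter({key: count / total for key, count in counter.items()})

counts = Counter(['abc', 'abc', 'def', 'jkl'])
x = normalized(counts)   # counts itself is left untouched
print(x)
```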

Demo:

>>> from collections import Counter
>>> x = Counter(['abc','abc','def','jkl'])
>>> total = sum(x.values(), 0.0)
>>> for key in x:
...     x[key] /= total
... 
>>> x
Counter({'abc': 0.5, 'jkl': 0.25, 'def': 0.25})
>>> y = Counter(['abc','def','def','pqr', 'pr', 'foo', 'bar'])
>>> total = sum(y.values(), 0.0)
>>> for key in y:
...     y[key] /= total
... 
>>> y
Counter({'def': 0.2857142857142857, 'pr': 0.14285714285714285, 'abc': 0.14285714285714285, 'bar': 0.14285714285714285, 'pqr': 0.14285714285714285, 'foo': 0.14285714285714285})
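
Note that these results differ from the output in the question. The factor used there, math.sqrt(sum(i*i for i in counter.values())), scales a Counter so that the sum of its squared values is 1 (unit Euclidean length), not so that its values sum to 1. A small sketch of that variant, for comparison only; `l2_normalized` is a hypothetical name and the code assumes a non-empty Counter with positive counts:

```python
import math
from collections import Counter

def l2_normalized(counter):
    """Scale values so that the sum of their squares is 1 (hypothetical helper)."""
    # Assumption: counter is non-empty with positive counts, so norm > 0.
    norm = math.sqrt(sum(value * value for value in counter.values()))
    return Counter({key: value / norm for key, value in counter.items()})

x = l2_normalized(Counter(['abc', 'abc', 'def', 'jkl']))

sum_of_squares = sum(value * value for value in x.values())
plain_sum = sum(x.values())
print(sum_of_squares)  # approximately 1
print(plain_sum)       # greater than 1, so the values do not sum to 1
```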

If you need to sum the Counters, you'd need to re-normalize the resulting Counter separately; summing two normalized Counters gives you a new Counter whose values sum to 2, for example.
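
The re-normalization step can be sketched as follows. This is an illustration under the same sum-to-1 convention as the answer; `unit_sum` is a hypothetical helper name, and the code assumes non-empty Counters with positive values (Counter addition drops keys whose result is not positive).

```python
from collections import Counter

def unit_sum(counter):
    """Return a new Counter scaled so its values sum to 1 (hypothetical helper)."""
    # Assumption: counter is non-empty with positive values, so total > 0.
    total = float(sum(counter.values()))
    return Counter({key: value / total for key, value in counter.items()})

x = unit_sum(Counter(['abc', 'abc', 'def', 'jkl']))
y = unit_sum(Counter(['abc', 'def', 'def', 'pqr', 'pr', 'foo', 'bar']))

combined = x + y                 # values now sum to about 2
result = unit_sum(combined)      # re-normalize so they sum to 1 again
print(result)
```

Because each input is normalized first, both lists contribute equally to the result regardless of their lengths. Adding the raw counts and normalizing once would instead weight the longer list more heavily.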
