Python histogram one-liner
Problem description
There are many ways to write a Python program that computes a histogram.
By histogram, I mean a function that counts the occurrences of objects in an iterable and outputs the counts in a dictionary. For example:
>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}
One way to write this function is:
def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d
Are there more concise ways of writing this function?
If we had dictionary comprehensions in Python, we could write:
>>> { x: L.count(x) for x in set(L) }
but since Python 2.6 doesn't have them, we have to write:
>>> dict([(x, L.count(x)) for x in set(L)])
Although this approach may be readable, it is not efficient: L is traversed once per distinct item. Furthermore, it won't work for single-use generators; the function should work equally well for iterators and generators such as:
def gen(L):
    for x in L:
        yield x
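As a rough illustration of the single-pass requirement, the sketch below feeds a generator to a loop-based histogram. The `dict.get` shortcut is just one possible way to condense the if/else from the earlier function; it is not taken from the question itself.

```python
def gen(L):
    for x in L:
        yield x

def histogram(iterable):
    """Count items in one pass, so one-shot iterators work too."""
    d = {}
    for x in iterable:
        # get() returns 0 for unseen keys, replacing the if/else branch
        d[x] = d.get(x, 0) + 1
    return d

counts = histogram(gen('abracadabra'))
print(counts)
```

A generator object has no `count` method and can only be consumed once, which is why the `L.count(x)` approach cannot be applied to `gen(L)` directly.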
We might try to use the reduce function (R.I.P.):
>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!
Oops, this does not work: the key name is 'x', not x. :(
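The reason is that `x=...` inside a call is a keyword argument, so `dict` receives the literal name `'x'` as the key. A small sketch of the difference follows; the `**{...}` variant is shown only for comparison and, as far as I know, only accepts string keys, so it is not a general fix.

```python
d = {}
x = 'a'

# Keyword argument: the key is the literal string 'x', not the value of x.
wrong = dict(d, x=d.get(x, 0) + 1)

# Unpacking a one-item dict uses the value of x as the key,
# but keyword names must be strings, so this fails for e.g. integer items.
string_only = dict(d, **{x: d.get(x, 0) + 1})

print(wrong, string_only)
```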
I ended up with:
>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})
(In Python 3, we would have to write list(d.items()) instead of d.items(), but it's hypothetical, since there is no reduce there.)
Please beat me with a better, more readable one-liner! ;)
Recommended answer
Python 3.x does have reduce; you just have to do a from functools import reduce. It also has "dict comprehensions", which have exactly the syntax in your example.
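For reference, here is a sketch of how both one-liners from the question might look in Python 3. The `{**d, ...}` merge syntax is my substitution for the `d.items() + [...]` trick and assumes Python 3.5 or later.

```python
from functools import reduce

L = 'abracadabra'

# Dict comprehension: readable, but calls L.count once per distinct item.
by_comprehension = {x: L.count(x) for x in set(L)}

# reduce: a single pass over L, but it builds a new dict at every step.
by_reduce = reduce(lambda d, x: {**d, x: d.get(x, 0) + 1}, L, {})

print(by_comprehension == by_reduce)
```

Both produce the same counts; neither is efficient for long inputs, since the first rescans L repeatedly and the second copies the accumulated dict on each item.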
Python 2.7 and 3.x also have a Counter class which does exactly what you want:
from collections import Counter
cnt = Counter("abracadabra")
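A brief sketch of how the resulting Counter can be used. It is a dict subclass, so it can be read like the dictionary the question asks for, and it accepts any iterable, including a generator.

```python
from collections import Counter

# Counter consumes its argument in a single pass, so a generator works.
cnt = Counter(ch for ch in "abracadabra")

most_frequent = cnt.most_common(1)  # list of (item, count) pairs
missing = cnt['z']                  # absent keys report 0 instead of raising
as_plain_dict = dict(cnt)           # convert if a plain dict is required

print(most_frequent, missing, as_plain_dict)
```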
In Python 2.6 or earlier, I'd personally use a defaultdict and do it in 2 lines:
from collections import defaultdict
d = defaultdict(int)
for x in xs: d[x] += 1
That's clean, efficient, Pythonic, and much easier for most people to understand than anything involving reduce.
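If a drop-in replacement for the question's function is wanted, the defaultdict approach could be wrapped roughly as follows. The function name `histogram` and the final conversion to a plain dict are my assumptions, not part of the answer.

```python
from collections import defaultdict

def histogram(xs):
    """Single-pass count using defaultdict; works on any iterable."""
    d = defaultdict(int)  # missing keys start at int() == 0
    for x in xs:
        d[x] += 1
    return dict(d)  # plain dict, so later lookups of absent keys raise KeyError

result = histogram(iter('abracadabra'))
print(result)
```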