Python: Elegantly merge dictionaries with sum() of values

Views: 98 Posted: 2017/5/21 14:59:44 python dictionary


Problem description

 
I'm trying to merge logs from several servers. Each log is a list of tuples (date, count). date may appear more than once, and I want the resulting dictionary to hold the sum of all counts from all servers.

Here's my attempt, with some data for example:
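from collections import defaultdict

# One log per server: a list of (date, count) tuples.
a = [("13.5", 100)]
b = [("14.5", 100), ("15.5", 100)]
c = [("15.5", 100), ("16.5", 100)]
input = [a, b, c]

# Sum the counts for each date across all servers.
output = defaultdict(int)
for d in input:
    for item in d:
        output[item[0]] += item[1]
print(dict(output))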
Which gives:
{'14.5': 100, '16.5': 100, '13.5': 100, '15.5': 200}
As expected.

I'm about to go bananas because of a colleague who saw the code. She insists that there must be a more Pythonic and elegant way to do it, without these nested for loops. Any ideas?
Solution
Doesn't get simpler than this, I think:
from collections import Counter

a = [("13.5", 100)]
b = [("14.5", 100), ("15.5", 100)]
c = [("15.5", 100), ("16.5", 100)]
input = [a, b, c]

# Turn each log into a Counter and add them up, starting from an empty Counter.
print(sum(
    (Counter(dict(x)) for x in input),
    Counter()))
Note that Counter (also known as a multiset) is the most natural data structure for your data: a type of set to which elements can belong more than once, or equivalently, a map with semantics Element -> OccurrenceCount. You could have used it in the first place, instead of lists of tuples.
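
To make the multiset semantics concrete, here is a minimal sketch (the variable names are illustrative, not from the question). It also shows one caveat worth checking against your own data: dict(x) keeps only the last count when a date repeats within a single log, so the final lines fall back to adding the pairs one by one for that case.

```python
from collections import Counter

# A Counter maps each element to its count; "+" adds counts key by key.
server1 = Counter({"14.5": 100, "15.5": 100})
server2 = Counter({"15.5": 100, "16.5": 100})
merged = server1 + server2
print(merged["15.5"])  # 200

# Caveat: dict() keeps only the last value for a repeated key, so a date
# that occurs twice within one log would lose its first count.
log = [("15.5", 100), ("15.5", 50)]
collapsed = Counter(dict(log))
print(collapsed["15.5"])  # 50

# Adding the pairs one at a time keeps every count.
safe = Counter()
for date, count in log:
    safe[date] += count
print(safe["15.5"])  # 150
```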



Also possible:
from collections import Counter
from functools import reduce  # reduce is not a builtin in Python 3
from operator import add

print(reduce(add, (Counter(dict(x)) for x in input)))
Using reduce(add, seq) instead of sum(seq, initialValue) is generally more flexible and allows you to skip passing the redundant initial value.
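
One difference to keep in mind, sketched below with an illustrative empty input: without an initial value, reduce raises TypeError on an empty sequence, whereas sum(seq, Counter()) returns an empty Counter. reduce also accepts an optional third argument if you want the same behaviour.

```python
from collections import Counter
from functools import reduce
from operator import add

no_logs = []  # e.g. no server reported anything

# sum() with an explicit start value handles an empty input.
empty_sum = sum((Counter(dict(x)) for x in no_logs), Counter())
print(empty_sum)  # Counter()

# reduce() without an initial value has nothing to return.
raised = False
try:
    reduce(add, (Counter(dict(x)) for x in no_logs))
except TypeError:
    raised = True
print(raised)  # True

# Passing an initial value as the third argument avoids the error.
empty_reduce = reduce(add, (Counter(dict(x)) for x in no_logs), Counter())
print(empty_reduce)  # Counter()
```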

Note that you could also use operator.and_ to find the intersection of the multisets instead of the sum.
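
As a small sketch of that intersection (the sample logs are made up for illustration): for Counters, & keeps only the keys present on both sides, each with the smaller of the two counts.

```python
from collections import Counter
from functools import reduce
from operator import and_

logs = [
    [("14.5", 100), ("15.5", 80)],
    [("15.5", 100), ("16.5", 100)],
]

# Only "15.5" appears in both logs; the result keeps the minimum count.
common = reduce(and_, (Counter(dict(x)) for x in logs))
print(common)  # Counter({'15.5': 80})
```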



The above variant is terribly slow, because a new Counter is created on every step. Let's fix that.

We know that Counter+Counter returns a new Counter with merged data. This is OK, but we want to avoid extra creation. Let's use Counter.update instead:

  update(self, iterable=None, **kwds) unbound collections.Counter method
  
  Like dict.update() but add counts instead of replacing them.
  Source can be an iterable, a dictionary, or another Counter instance.
That's what we want. Let's wrap it with a function compatible with reduce and see what happens.
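def updateInPlace(a, b):
    # Counter.update adds b's counts into a and returns None,
    # so return a explicitly for reduce to carry forward.
    a.update(b)
    return a

print(reduce(updateInPlace, (Counter(dict(x)) for x in input)))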
This is only marginally slower than the OP's solution.
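
If you want to check the relative speeds on your own machine, a rough timeit sketch could look like the following. The synthetic data and the function names are assumptions made for illustration, not the code behind the linked benchmark, and the timings will vary with hardware and Python version.

```python
from collections import Counter, defaultdict
from functools import reduce
from operator import add
from timeit import timeit

# Synthetic data: 200 logs of 50 (date, count) pairs with overlapping dates.
logs = [[(str(day), 100) for day in range(n, n + 50)] for n in range(200)]

def nested_loops(logs):
    # The approach from the question.
    output = defaultdict(int)
    for log in logs:
        for date, count in log:
            output[date] += count
    return dict(output)

def reduce_add(logs):
    # Builds a new Counter at every step.
    return reduce(add, (Counter(dict(x)) for x in logs))

def update_in_place(a, b):
    a.update(b)
    return a

def reduce_update(logs):
    # Reuses the first Counter as the accumulator.
    return reduce(update_in_place, (Counter(dict(x)) for x in logs))

for func in (nested_loops, reduce_add, reduce_update):
    seconds = timeit(lambda: func(logs), number=20)
    print("%-14s %.4f s" % (func.__name__, seconds))
```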

Benchmark: http://ideone.com/7IzSx (Updated with yet another solution, thanks to astynax)

(Also: if you desperately want a one-liner, you can replace updateInPlace with lambda x,y: x.update(y) or x, which works the same way and even proves to be a split second faster, but fails at readability. Don't :-))
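
For the curious, here is a short sketch of why that lambda works (the sample data mirrors the question's lists): Counter.update mutates the Counter and returns None, so the expression x.update(y) or x always evaluates to the updated x.

```python
from collections import Counter
from functools import reduce

logs = [
    [("13.5", 100)],
    [("14.5", 100), ("15.5", 100)],
    [("15.5", 100), ("16.5", 100)],
]

# update() changes the Counter in place and returns None ...
first = Counter(a=1)
result = first.update(Counter(a=2))
print(result)      # None
print(first["a"])  # 3

# ... so "x.update(y) or x" falls through to x after the update.
total = reduce(lambda x, y: x.update(y) or x, (Counter(dict(x)) for x in logs))
print(total["15.5"])  # 200
```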
