Fastest way to merge n dictionaries and add values on Python 2.6

Question

I have a list of dictionaries that I would like to combine into one dictionary, summing the values for each key across all the dictionaries in the list. For example:

ds = [{1: 1, 2: 0, 3: 0}, {1: 2, 2: 1, 3: 0}, {1: 3, 2: 2, 3: 1, 4: 5}]

The final result should be a single dictionary:

merged = {1: 6, 2: 3, 3: 1, 4: 5}

I'm interested in performance and am looking for the fastest implementation that can merge a list of n-dictionaries into one dictionary and sum the values. An obvious implementation is:

from collections import defaultdict

merged = defaultdict(int)

for d in ds:
    for k, v in d.items():
        merged[k] += v

Is there a faster way to do this in Python 2.6?

Recommended answer

defaultdict is still the fastest. I found a few ways to speed it up by caching function names, and then found another change that sped it up significantly: iterating with for k in d instead of using d.items() or d.iteritems().
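To make those two tweaks concrete, here is a minimal sketch. The names merge_items and merge_keys are made up for illustration and are not part of the benchmark; the first loops over d.items(), the second binds defaultdict and int as default arguments and iterates over the keys directly. Both should return the same totals; how much faster the second one is will depend on the interpreter and the data.

```python
from collections import defaultdict

def merge_items(dicts):
    # Baseline: loop over (key, value) pairs from d.items().
    # On Python 2, d.items() builds a full list of tuples for every dict.
    merged = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            merged[k] += v
    return merged

def merge_keys(dicts, defaultdict=defaultdict, int=int):
    # defaultdict and int are bound as default arguments, so inside the
    # function they are resolved as local names instead of globals/builtins.
    merged = defaultdict(int)
    for d in dicts:
        for k in d:  # iterate over the keys directly, then index into d
            merged[k] += d[k]
    return merged

# Example data taken from the question.
ds = [{1: 1, 2: 0, 3: 0}, {1: 2, 2: 1, 3: 0}, {1: 3, 2: 2, 3: 1, 4: 5}]
result_items = dict(merge_items(ds))
result_keys = dict(merge_keys(ds))
```

On the example from the question, both calls produce the merged dictionary {1: 6, 2: 3, 3: 1, 4: 5}.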

Some timings so far:

from random import randrange
ds = [dict((randrange(1, 1000), randrange(1, 1000)) for i in xrange(500))
      for i in xrange(10000)]

# 10000 dictionaries; each draws 500 random keys from 1..999, so
# duplicate keys leave roughly 390 entries per dictionary

from collections import defaultdict

def merge1(dicts, defaultdict=defaultdict, int=int):
    merged = defaultdict(int)
    for d in dicts:
        for k in d:
            merged[k] += d[k]
    return merged

def merge2(dicts):
    merged = {}
    merged_get = merged.get
    for d in dicts:
        for k in d:
            merged[k] = merged_get(k, 0) + d[k]
    return merged


def merge3(dicts):
    merged = {}
    for d in dicts:
        for k in d:
            # Start from d[k] (not 0) when the key is new, so its first value is counted.
            merged[k] = merged[k] + d[k] if k in merged else d[k]
    return merged


from timeit import timeit
for func in ('merge1', 'merge2', 'merge3'):
    print func, timeit(stmt='{0}(ds)'.format(func),
                       setup='from __main__ import merge1, merge2, merge3, ds',
                       number=1)

Output (seconds, single run each):

merge1 0.992541510164
merge2 1.40478747997
merge3 1.23502204889
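Timings are only comparable if every variant computes the same totals. As a rough sanity check, assuming two hypothetical helpers named reference_merge and check_merge (neither appears in the original answer), any candidate merge function can be compared against the straightforward defaultdict version on the example from the question:

```python
from collections import defaultdict

def reference_merge(dicts):
    # Straightforward implementation used as the ground truth.
    merged = defaultdict(int)
    for d in dicts:
        for k, v in d.items():
            merged[k] += v
    return dict(merged)

def check_merge(func, dicts):
    # True when func returns the same key/value totals as the reference.
    return dict(func(dicts)) == reference_merge(dicts)

def get_based_merge(dicts):
    # A candidate to verify: plain dict plus a cached .get lookup.
    merged = {}
    merged_get = merged.get
    for d in dicts:
        for k in d:
            merged[k] = merged_get(k, 0) + d[k]
    return merged

def drops_first_value(dicts):
    # Deliberately wrong candidate: ignores the first value seen for each key.
    merged = {}
    for d in dicts:
        for k in d:
            merged[k] = merged[k] + d[k] if k in merged else 0
    return merged

ds = [{1: 1, 2: 0, 3: 0}, {1: 2, 2: 1, 3: 0}, {1: 3, 2: 2, 3: 1, 4: 5}]
good = check_merge(get_based_merge, ds)
bad = check_merge(drops_first_value, ds)
```

Here good is True and bad is False, which is the kind of discrepancy worth ruling out before reading anything into a benchmark.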
