优化与dict.update（）类似的功能，但添加了常用值 [英] Optimize function similiar to dict.update() but adds common values

查看：56 发布时间：2019/6/6 16:47:22 python

本文介绍了优化与dict.update（）类似的功能，但添加了常用值的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

嘿伙计们，

我以为我会把它扔出去，因为每个人都喜欢优化

问题（这是真的，结账对这类

问题的回复数量！）

现在我有一个如下所示的功能。我想结合两个

词典，并将值加在一起，哪里有重叠。

我认为这个功能会相当快，但我在每个字典中都有大约1

百万的键（约80％重叠）并且它只运行所有

晚上。

所以也许我的功能不是O（n）？我真的认为它可能是

字典查找是O（nlogn）？

无论如何这里是函数。任何关于完成这项任务的想法很快就会被贬值。

谢谢！

< code apology =抱歉，如果gmail杀死我的标签>

def add_freqs（freq1，freq2）：

"""添加两个单词freq dicts" ;"

newfreq = {}

为密钥，值为freq1.items（）：

newfreq [key] = value + freq2 .get（key，0）

表示密钥，值为freq2.items（）：

newfreq [key] = value + freq1.get（key，0）

返回newfreq

freq1 = {

''dog''：1，

''猫''：2，

''人类''：3

}

freq2 = {

' 'perro''：1，

''gato''：1，

''human''：2

}

print add_freqs（freq1，freq2）

answer_I_want = {

''dog''：1，

''猫''：2，

''perro''：1，

''gato''：1，

' '人''：5 <无线电通信/>
}

< / code>

-

Gregory Pi？ero

首席创新官

混合技术

（ www。 blendedtechnologies.com ）

解决方案

Gregory Pi？ero写道：

def add_freqs（ freq1，freq2）：
****""" Add * two * word * freq * dicts""
**** newfreq = {}
* *** for * key，value * in * freq1.items（）：
******** newfreq [key] = value + freq2.get（key，0）
** ** for * key，value * in * freq2.items（）：
******** newfreq [key] = value + freq1.get（key，0）
*** * return * newfreq
* * * * * * * * * * * * * * * * * * * $ * $ $ $ $ $ $ $ $ $ $ $ $ $
将整个字典放入元组列表中;

iteritems（）只需遍历现有字典并一次创建一个元组

。

在80％重叠的情况下，你正在查找并设置五个值中的四个

在你的for循环中两次。

转储对称并尝试其中之一：

def add_freqs2（freq1，freq2）：

total = dict（freq1）

for key，freq2.iteritems（）中的值：

如果键在freq1中：

总计[键] + =值

else：

总计[key] = value

返回总额

def add_freqs3（freq1，freq2）：

total = dict（freq1）

for key，value in freq2.iteritems（）：

try：

total [key] + = value

除了KeyError：

总计[key] = value

返回总额

我的猜测是add_freqs3（）表现最好。

Peter

谢谢Peter，这些都是非常好的想法。我等不及试试今晚他们出去了。

这是关于你的功能的问题。如果我只看看

freq2中的键，那么我不会错过freq1中的任何键而不是freq2中的键吗？

这就是为什么我的原始功能中有两个循环。

-Greg

12/14 / 05，Peter Otten< __ ******* @ web.de>写道：
Gregory Pi？ero写道：

def add_freqs（freq1，freq2）：
"""" Addtwowordfreqdicts""
newfreq = {}
forkey，valueinfreq1.items（）：
newfreq [key] = value + freq2.get（key，0）
forkey，valueinfreq2.items （）：
newfreq [key] = value + freq1.get（key，0）
returnnewfreq

Anyideasondoingthistaskalot更快将被appriciated。 / blockquote>

使用items（）将整个字典复制到元组列表中;
iteritems（）只是遍历现有字典并一次创建一个元组
。

80％重叠，你在for循环中查找并设置五个值中的四个

转储对称性并尝试其中一个：

def add_freqs2（freq1，freq2）：
total = dict（freq1）
for key，freq2.iteritems（）中的值：
如果键入f req1：
总[key] + = value
否则：
总[key] =值
返回总数

def add_freqs3（freq1，freq2）：
total = dict（freq1）
for key，value in freq2.iteritems（）：
尝试：
total [key] + = value
除了KeyError：
总共[key] =值
返回总数

我的猜测是add_freqs3（）表现最好。

彼得
-
http://mail.python.org/mailman/ listinfo / python-list

-

Gregory Pi？ero

首席创新官

混合技术

（ www.blendedtechnologies.com ）

Gregory Pi？ero写道：

[top-posting rearranged]
2005年12月14日，Peter Otten< __ ******* @ web.de>写道：

Gregory Pi？ero写道：

def add_freqs（freq1，freq2）：
" "" Addtwowordfreqdicts"""
newfreq = {}
forkey，valueinfreq1.items（）：
newfreq [key] = value + freq2.get（key，0）
forkey，valueinfreq2.items（）：
newfreq [key] = value + freq1.get（key，0）
returnnewfreq

Anyideasondoingthistaskalot更快将被appriciated。

使用items（）您将整个字典复制到元组列表;
iteritems（）只是遍历现有字典并创建一个元组
一次。

80％重叠，你正在查找并在你的for循环中设置两个五个值中的四个。

转储对称并尝试以下方法之一：

def add_freqs2（freq1，freq2）：
total = dict（freq1）
for key，freq2中的值.iteritems（）：
如果键在freq1：
总[键] + =值
否则：
总[键] =值
返回总数

def add_freqs3（freq1，freq2）：
total = dict（freq1）
for key，value in freq2.iteritems（）：
尝试：
total [key] + = value
除了KeyError：
总计[key] = value
返回总数

我的猜测是add_freqs3（）表现最好。

谢谢Peter，这些都是非常好的想法。我迫不及待地想在今晚试一试。

这是关于你的功能的问题。如果我只看了
freq2中的键，那么我不会错过freq1中的任何键而不是freq2中的键吗？
这就是为什么我在原始函数中有两个循环的原因。

不，因为声明

total = dict（freq1）将total创建为freq1的浅表副本。因此

剩下要做的就是从freq2添加物品。

问候

史蒂夫

-

Steve Holden +44 150 684 7255 +1 800 494 3119

Holden Web LLC www.holdenweb.com

PyCon TX 2006 www.python.org/pycon/

Hey guys,

I thought I''d throw this out there since everyone loves optimization
questions (it''s true, check out the number of replies to those type of
questions!)

Right now I have a function shown below. I want to combine two
dictionaries and add the values together where-ever there is overlap.
I figured this function would be reasonably fast, but I have around 1
million keys in each dictionary (~80% overlap) and it just runs all
night.

So maybe my function is not O(n)? I really figured it was but maybe
dictionary lookups are O(nlogn)?

Anyway here is the function. Any ideas on doing this task a lot
faster would be appriciated.

Thanks!

<code apology=sorry if gmail kills my tabs>

def add_freqs(freq1,freq2):
"""Add two word freq dicts"""
newfreq={}
for key,value in freq1.items():
newfreq[key]=value+freq2.get(key,0)
for key,value in freq2.items():
newfreq[key]=value+freq1.get(key,0)
return newfreq

freq1={
''dog'':1,
''cat'':2,
''human'':3
}
freq2={
''perro'':1,
''gato'':1,
''human'':2
}
print add_freqs(freq1,freq2)

answer_I_want={
''dog'':1,
''cat'':2,
''perro'':1,
''gato'':1,
''human'':5
}
</code>
--
Gregory Pi?ero
Chief Innovation Officer
Blended Technologies
(www.blendedtechnologies.com)
解决方案
Gregory Pi?ero wrote:

def add_freqs(freq1,freq2):
****"""Add*two*word*freq*dicts"""
****newfreq={}
****for*key,value*in*freq1.items():
********newfreq[key]=value+freq2.get(key,0)
****for*key,value*in*freq2.items():
********newfreq[key]=value+freq1.get(key,0)
****return*newfreq Any*ideas*on*doing*this*task*a*lot faster would be appriciated.

With items() you copy the whole dictionary into a list of tuples;
iteritems() just walks over the existing dictionary and creates one tuple
at a time.

With "80% overlap", you are looking up and setting four out of five values
twice in your for-loops.

Dump the symmetry and try one of these:

def add_freqs2(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
if key in freq1:
total[key] += value
else:
total[key] = value
return total

def add_freqs3(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
try:
total[key] += value
except KeyError:
total[key] = value
return total

My guess is that add_freqs3() will perform best.

Peter

Thanks Peter, those are some really good ideas. I can''t wait to try
them out tonight.

Here''s a question about your functions. if I only look at the keys in
freq2 then won''t I miss any keys that are in freq1 and not in freq2?
That''s why I have the two loops in my original function.

-Greg
On 12/14/05, Peter Otten <__*******@web.de> wrote:
Gregory Pi?ero wrote:
def add_freqs(freq1,freq2):
"""Addtwowordfreqdicts"""
newfreq={}
forkey,valueinfreq1.items():
newfreq[key]=value+freq2.get(key,0)
forkey,valueinfreq2.items():
newfreq[key]=value+freq1.get(key,0)
returnnewfreq

Anyideasondoingthistaskalot faster would be appriciated.

With items() you copy the whole dictionary into a list of tuples;
iteritems() just walks over the existing dictionary and creates one tuple
at a time.

With "80% overlap", you are looking up and setting four out of five values
twice in your for-loops.

Dump the symmetry and try one of these:

def add_freqs2(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
if key in freq1:
total[key] += value
else:
total[key] = value
return total

def add_freqs3(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
try:
total[key] += value
except KeyError:
total[key] = value
return total

My guess is that add_freqs3() will perform best.

Peter
--
http://mail.python.org/mailman/listinfo/python-list

--
Gregory Pi?ero
Chief Innovation Officer
Blended Technologies
(www.blendedtechnologies.com)

Gregory Pi?ero wrote:
[top-posting rearranged]
On 12/14/05, Peter Otten <__*******@web.de> wrote:
Gregory Pi?ero wrote:

def add_freqs(freq1,freq2):
"""Addtwowordfreqdicts"""
newfreq={}
forkey,valueinfreq1.items():
newfreq[key]=value+freq2.get(key,0)
forkey,valueinfreq2.items():
newfreq[key]=value+freq1.get(key,0)
returnnewfreq

Anyideasondoingthistaskalot faster would be appriciated.

With items() you copy the whole dictionary into a list of tuples;
iteritems() just walks over the existing dictionary and creates one tuple
at a time.

With "80% overlap", you are looking up and setting four out of five values
twice in your for-loops.

Dump the symmetry and try one of these:

def add_freqs2(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
if key in freq1:
total[key] += value
else:
total[key] = value
return total

def add_freqs3(freq1, freq2):
total = dict(freq1)
for key, value in freq2.iteritems():
try:
total[key] += value
except KeyError:
total[key] = value
return total

My guess is that add_freqs3() will perform best.

Thanks Peter, those are some really good ideas. I can''t wait to try
them out tonight.

Here''s a question about your functions. if I only look at the keys in
freq2 then won''t I miss any keys that are in freq1 and not in freq2?
That''s why I have the two loops in my original function.

No, because the statement

total = dict(freq1) creates total as a shallow copy of freq1. Thus
all that remains to be done is to add the items from freq2.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/

这篇关于优化与dict.update（）类似的功能，但添加了常用值的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

优化与dict.update（）类似的功能，但添加了常用值 [英] Optimize function similiar to dict.update() but adds common values

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录关闭

优化与dict.update（）类似的功能，但添加了常用值 [英] Optimize function similiar to dict.update() but adds common values

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭