Variant of defaultdict for assigning value only once
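Problem description
I am trying to create a dictionary of words, with each word mapped to an integer for further processing. Is there a variant of defaultdict that I can use to avoid the check "if word not in wordid"? This is a very large file, and I need a time-efficient way of doing this.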
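    from collections import defaultdict

    wordid = defaultdict(int)
    totaluniquewords = 0
    for word in sentencewords:
        # Assign the next ID only the first time a word is seen.
        if word not in wordid:
            totaluniquewords += 1
            wordid[word] = totaluniquewords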
Solution
Here's a simpler and faster way to get what you want:

    from itertools import count

    wordid = dict(zip(set(sentencewords), count(1)))
This uses a set to obtain the unique words in sentencewords, pairs each of those unique words with the next value from count() (which counts upwards from 1), and constructs a dictionary from the results.
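As a rough illustration, the sketch below runs the set-based approach on a small, made-up word list and compares it with one possible defaultdict variant. The sample list and the names wordid_set and next_id are assumptions for this example, not part of the original post. Because a set has no defined order, the set-based version gives unique IDs that need not follow the order in which words first appear. The defaultdict version relies on the default factory being called only for missing keys, so the explicit membership check is not needed and IDs follow first-seen order.

```python
from collections import defaultdict
from itertools import count

# Hypothetical sample input; the real sentencewords would come from the large file.
sentencewords = ["the", "cat", "saw", "the", "dog", "saw"]

# Approach from the answer: unique words paired with 1, 2, 3, ...
# The IDs are unique, but which word gets which ID depends on set order.
wordid_set = dict(zip(set(sentencewords), count(1)))

# defaultdict variant (an assumption of this sketch, not the answer's method):
# the factory runs only when a key is missing, so each word receives the
# next integer on its first lookup and keeps it afterwards.
next_id = count(1)
wordid = defaultdict(lambda: next(next_id))
for word in sentencewords:
    wordid[word]  # the lookup alone assigns an ID if the word is new

print(dict(wordid))  # {'the': 1, 'cat': 2, 'saw': 3, 'dog': 4}
```

One caveat with the defaultdict variant: any later lookup of an unseen word also assigns it a new ID, so it may be worth converting the result with dict(wordid) once the file has been processed.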