Variant of defaultdict for assigning value only once

Problem description

I am trying to create a dictionary of words, with each word mapped to an integer for further processing. Is there a variant of defaultdict that I can use to avoid the `if word not in wordid` check? The input is a very large file, so I need a time-efficient way of doing this.

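from collections import defaultdict

# sentencewords is the iterable of words read from the file
wordid = defaultdict(int)
totaluniquewords = 0
for word in sentencewords:
    # assign the next integer only the first time a word is seen
    if word not in wordid:
        totaluniquewords += 1
        wordid[word] = totaluniquewords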

Solution

Here's a simpler and faster way to get what you want:

from itertools import count

wordid = dict(zip(set(sentencewords), count(1)))

This uses a set to obtain the unique words in sentencewords, pairs each of those unique words with the next value from count() (which counts upwards), and constructs a dictionary from the results.
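Note that a set has no defined order, so the one-liner above does not hand out IDs in the order the words first appear. If that order matters, or if a defaultdict-style answer to the original question is preferred, one possible sketch (not part of the original answer) is to use a counter's `__next__` method as the `default_factory`. The function name `build_word_ids` and the sample sentence are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import count


def build_word_ids(words):
    """Map each distinct word to a 1-based integer ID in first-seen order."""
    # default_factory runs only when a key is missing, so every new word
    # receives the next counter value exactly once.
    ids = defaultdict(count(1).__next__)
    for word in words:
        ids[word]  # the lookup itself assigns an ID to an unseen word
    # Return a plain dict so later lookups of unknown words raise KeyError
    # instead of silently creating new IDs.
    return dict(ids)


sample = "the cat saw the other cat".split()
word_ids = build_word_ids(sample)
print(word_ids)  # {'the': 1, 'cat': 2, 'saw': 3, 'other': 4}
```

This keeps the loop free of an explicit membership test. Whether it is faster than the set-based version on a given file is something to measure rather than assume.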
