Dictionary shared between objects for no reason?

Question

The following code is supposed to create a new (modified) version of a frequency distribution (nltk.FreqDist). Both variables should then be the same length.

It works fine when a single instance of WebText is created. But when multiple WebText instances are created, the new variable seems to be shared by all the objects.

For example:

import nltk
from operator import itemgetter

class WebText:

    freq_dist_weighted = {}

    def __init__(self, text):
        tokens = nltk.wordpunct_tokenize(text) #tokenize
        word_count = len(tokens)
        freq_dist = nltk.FreqDist(tokens)


        for word,frequency in freq_dist.iteritems():
            self.freq_dist_weighted[word] = frequency/word_count*frequency
        print len(freq_dist), len(self.freq_dist_weighted)

text1 = WebText("this is a test")
text2 = WebText("this is another test")
text3 = WebText("a final sentence")

results in:

4 4
4 5
3 7

This is incorrect. Since I am just copying and modifying values, the two numbers on each line should be the same. If I reset freq_dist_weighted just before the loop, it works fine:

import nltk
from operator import itemgetter

class WebText:

    freq_dist_weighted = {} 

    def __init__(self, text):
        tokens = nltk.wordpunct_tokenize(text) #tokenize
        word_count = len(tokens)
        freq_dist = nltk.FreqDist(tokens)
        self.freq_dist_weighted = {}

        for word,frequency in freq_dist.iteritems():
            self.freq_dist_weighted[word] = frequency/word_count*frequency
        print len(freq_dist), len(self.freq_dist_weighted)

text1 = WebText("this is a test")
text2 = WebText("this is another test")
text3 = WebText("a final sentence")

results in (correctly):

4 4
4 4
3 3

This doesn't make sense to me.

I don't see why I would have to reset it, since it's isolated within the objects. Am I doing something wrong?

Recommended answer

Your comment is blatantly wrong. Objects in a class scope are initialized only once, when the class is created, so every instance shares the same dictionary. If you want a different object per instance, you need to create it in the initializer.

class WebText:
    def __init__(self, text):
        self.freq_dist_weighted = {} #### RESET the dictionary HERE ####
        ...
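
To see the difference in isolation, here is a minimal sketch for Python 3. It is not the asker's code: the class names SharedWeights and OwnWeights are made up for illustration, and str.split with collections.Counter stands in for nltk.wordpunct_tokenize and nltk.FreqDist so that the example runs without NLTK.

```python
from collections import Counter

TEXTS = ["this is a test", "this is another test", "a final sentence"]


class SharedWeights:
    # Class attribute: this dict is created once, when the class body runs,
    # and every instance reaches the same object through self.weights.
    weights = {}

    def __init__(self, text):
        for word, count in Counter(text.split()).items():
            # Item assignment mutates the shared class-level dict.
            self.weights[word] = count


class OwnWeights:
    def __init__(self, text):
        # Plain assignment creates a new instance attribute for each object.
        self.weights = {}
        for word, count in Counter(text.split()).items():
            self.weights[word] = count


# Record the dict size right after each object is constructed.
shared_sizes = [len(SharedWeights(t).weights) for t in TEXTS]
own_sizes = [len(OwnWeights(t).weights) for t in TEXTS]

print(shared_sizes)  # [4, 5, 7]
print(own_sizes)     # [4, 4, 3]
```

The shared sizes grow exactly like the second column of the incorrect output in the question (4, 5, 7). This is also why the reset in the question worked: assigning self.freq_dist_weighted = {} inside __init__ binds a new instance attribute that hides the class-level dictionary, whereas self.freq_dist_weighted[word] = ... only looks the attribute up and, finding none on the instance, modifies the one on the class.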
