How to create a frequency matrix?


Question

I just started using Python and came across the following problem:

Imagine I have the following list of lists:

list = [["Word1","Word2","Word2","Word4566"],["Word2", "Word3", "Word4"], ...]

The result (matrix) I want to get should look like this:

The displayed columns and rows are all the words that appear (no matter in which list).

What I want is a program that counts the occurrences of words in each list (list by list).

The picture is the result after the first list.

Is there an easy way to achieve this or something similar?


Basically I want a list/matrix that tells me how many times words 2-4566 appeared when word 1 was also in the list, and so on.

So for each word I would get a list that shows the absolute frequency of all 4565 other words in relation to that word.


So I need an algorithm that iterates through all these lists of words and builds the result lists.

Recommended answer

I managed to come up with the right answer to my own question:

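# Renamed from "list" so the built-in list type is not shadowed
word_lists = [["Word1","Word2","Word2"],["Word2", "Word3", "Word4"],["Word2","Word3"]]

# Names of all dicts: every distinct word, sorted
all_words = sorted(set(w for sublist in word_lists for w in sublist))

# Creating the dicts: one [word, {other_word: 0, ...}] entry per word
dicts = []
for i in all_words:
    dicts.append([i, dict.fromkeys([w for w in all_words if w != i], 0)])

# Updating the dicts: for each distinct word in a list, add how many times
# every other word occurs in that same list
for l in word_lists:
    for word in sorted(set(l)):
        ind = all_words.index(word)  # dicts is in the same order as all_words

        for w in dicts[ind][1]:
            dicts[ind][1][w] += l.count(w)

print(dicts)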

Result (the order of the keys inside each dict may differ depending on the Python version):

[['Word1', {'Word4': 0, 'Word3': 0, 'Word2': 2}], ['Word2', {'Word4': 1, 'Word1': 1, 'Word3': 2}], ['Word3', {'Word4': 1, 'Word1': 0, 'Word2': 2}], ['Word4', {'Word1': 0, 'Word3': 1, 'Word2': 1}]]
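For thousands of words, looking up the index and calling count() inside the loop can get slow. Below is a minimal sketch of the same counting rule using collections.Counter from the standard library. It is an alternative, not the code from the answer above; the function names cooccurrence_counts and as_matrix are made up for illustration, and the last step lays the counts out as rows and columns, which is my reading of the matrix asked for in the question.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(word_lists):
    """For each word, count how often every other word occurs in the lists containing it."""
    counts = defaultdict(Counter)
    for words in word_lists:
        occurrences = Counter(words)      # occurrences of each word in this list
        for word in occurrences:          # each distinct word in this list
            for other, n in occurrences.items():
                if other != word:         # a word is not counted against itself
                    counts[word][other] += n
    return counts


def as_matrix(counts, vocabulary):
    """Lay the counts out as rows and columns in vocabulary order (hypothetical helper)."""
    return [[counts[row][col] for col in vocabulary] for row in vocabulary]


word_lists = [["Word1", "Word2", "Word2"], ["Word2", "Word3", "Word4"], ["Word2", "Word3"]]
counts = cooccurrence_counts(word_lists)
vocabulary = sorted({w for words in word_lists for w in words})
matrix = as_matrix(counts, vocabulary)

for word, row in zip(vocabulary, matrix):
    print(word, row)
# Word1 [0, 2, 0, 0]
# Word2 [1, 0, 2, 1]
# Word3 [0, 2, 0, 1]
# Word4 [0, 1, 1, 0]
```

Note that the matrix is not symmetric: the row for Word1 has 2 under Word2 (Word2 occurs twice in the only list that contains Word1), while the row for Word2 has 1 under Word1.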
