Co-occurrence matrix from nested list of words

Views: 59 Published: 2021/5/30 18:46:40 Tags: python pandas list matrix networkx


Question

I have a list of names, for example:

names = ['A', 'B', 'C', 'D']

and a list of documents, each of which mentions some of these names:

document =[['A', 'B'], ['C', 'B', 'K'],['A', 'B', 'C', 'D', 'Z']]

I would like to get the output as a co-occurrence matrix, like:

  A  B  C  D
A 0  2  1  1
B 2  0  2  1
C 1  2  0  1
D 1  1  1  0

There is a solution to this problem in R (Creating co-occurrence matrix), but I couldn't reproduce it in Python. I am thinking of doing it in Pandas, but have made no progress yet!

Recommended answer

This can obviously be extended for your purposes, but it performs the general operation you have in mind:

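# For every ordered pair of names, count the documents that mention both.
for a in 'ABCD':
    for b in 'ABCD':
        count = 0

        for x in document:
            if a != b:
                # Different names: count the document once if it contains both.
                if a in x and b in x:
                    count += 1

            else:
                # Same name: count pairs of repeated mentions, n choose 2.
                n = x.count(a)
                if n >= 2:
                    count += n * (n - 1) // 2

        print('{} x {} = {}'.format(a, b, count))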
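The loop above prints one pair per line. As a minimal sketch of one way to collect the same counts into the matrix layout shown in the question, the version below uses only the standard library. The helper name `cooccurrence_matrix` is hypothetical and not part of the original answer. It also assumes each name should count at most once per document, so the diagonal stays at zero, unlike the loop above, which counts repeated mentions within a document.

```python
from itertools import combinations


def cooccurrence_matrix(names, documents):
    """Hypothetical helper: matrix[a][b] = number of documents mentioning both a and b."""
    wanted = set(names)
    matrix = {a: {b: 0 for b in names} for a in names}
    for doc in documents:
        # Keep only the tracked names; the set drops repeats within a document.
        present = sorted(wanted.intersection(doc))
        for a, b in combinations(present, 2):
            # The relation is symmetric, so update both cells.
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


names = ['A', 'B', 'C', 'D']
document = [['A', 'B'], ['C', 'B', 'K'], ['A', 'B', 'C', 'D', 'Z']]

matrix = cooccurrence_matrix(names, document)

# Print in the same layout as the table in the question.
print('  ' + '  '.join(names))
for a in names:
    print(a + ' ' + '  '.join(str(matrix[a][b]) for b in names))
```

Names that are not in `names` (here 'K' and 'Z') are ignored. If pandas is available, passing the resulting dict of dicts to `pandas.DataFrame` should give a labelled table, since the matrix is symmetric.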
