Clustering a list of words in Python


Question


I am a newbie in text mining; here is my situation. Suppose I have a list of words ['car', 'dog', 'puppy', 'vehicle'] and I would like to cluster the words into k groups, with the desired output being [['car', 'vehicle'], ['dog', 'puppy']]. I first calculate a similarity score for each pair of words to obtain a 4x4 matrix (in this case) M, where Mij is the similarity score of words i and j. After transforming the words into numeric data, I use a clustering library (such as sklearn), or implement the clustering myself, to get the word clusters.


I want to know whether this approach makes sense. Besides, how do I determine the value of k? More importantly, I know that different clustering techniques exist; should I use k-means or k-medoids for word clustering?

Answer


Following up on the answer by Brian O'Donnell: once you've computed the semantic similarity with word2vec (or FastText or GloVe, ...), you can then cluster the matrix using sklearn.cluster. I've found that for small matrices, spectral clustering gives the best results.
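For instance, a sketch with sklearn's SpectralClustering run directly on a precomputed similarity matrix (the toy vectors here are placeholders for real embeddings):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Placeholder vectors; real ones would come from word2vec/GloVe/FastText
words = ['car', 'dog', 'puppy', 'vehicle']
vectors = np.array([
    [0.9, 0.1],   # car
    [0.1, 0.9],   # dog
    [0.2, 0.8],   # puppy
    [0.8, 0.2],   # vehicle
])

# Cosine similarity matrix; all entries are non-negative here, so it
# can be fed to SpectralClustering directly as the affinity
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
M = unit @ unit.T

labels = SpectralClustering(
    n_clusters=2, affinity='precomputed', random_state=0
).fit_predict(M)
```

Note that `affinity='precomputed'` expects a similarity (higher = closer), not a distance, and the matrix must be non-negative; with embeddings whose cosine similarity can go negative, a shift or kernel transform is applied first.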


It's worth keeping in mind that the word vectors are often embedded on a high-dimensional sphere. K-means with a Euclidean distance matrix fails to capture this, and may lead to poor results for the similarity of words that aren't immediate neighbors.
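One common workaround, sketched below with made-up vectors: L2-normalize the embeddings before clustering, so that Euclidean k-means on the unit vectors behaves like clustering by cosine similarity (a cheap stand-in for spherical k-means):

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up embeddings where raw Euclidean distance is misleading:
# 'car' and 'vehicle' point in the same direction but differ in length
words = ['car', 'vehicle', 'dog', 'puppy']
vectors = np.array([
    [10.0, 1.0],   # car     (long vector)
    [1.0, 0.1],    # vehicle (same direction, short vector)
    [0.1, 1.0],    # dog
    [0.2, 2.0],    # puppy
])

# On unit vectors, squared Euclidean distance = 2 * (1 - cosine
# similarity), so plain k-means now groups by direction, not magnitude
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(unit)
```

After normalization, 'car' and 'vehicle' collapse onto nearly the same point on the unit circle and end up in one cluster, whereas unnormalized k-means would likely isolate the long 'car' vector on its own.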

