Visualize data and clustering

Views: 90 | Published: 2020/10/3 2:11:03 | Tags: python, cluster-analysis, visualization

Problem description

I am currently writing a Python script to find the similarity between documents. I have already calculated the similarity score for each pair of documents and stored the scores in a dictionary. It looks something like this:

{(8328, 8327): 1.0, (8313, 8306): 0.12405229825691289, (8329, 8328): 1.0, (8322, 8321): 0.99999999999999989, (8328, 8329): 1.0, (8306, 8316): 0.12405229825691289, (8320, 8319): 0.67999999999999989, (8337, 8336): 1.0000000000000002, (8319, 8320): 0.67999999999999989, (8313, 8316): 0.99999999999999989, (8321, 8322): 0.99999999999999989, (8330, 8328): 1.0}

My final goal is to cluster similar documents together. The data above can be viewed another way. Take the document pair (8313, 8306): its similarity score is 0.12405. I can define the inverse of the score as the distance between documents 8313 and 8306. Similar documents will then cluster closer together, while less similar documents will sit further apart according to their distance.
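To make the idea concrete, here is a minimal sketch of the conversion described above, using only the standard library. The helper names `to_distances` and `cluster`, the `max_distance` threshold of 1.5, and the choice of simple single-linkage grouping are all assumptions made for illustration; they are not part of my actual script.

```python
from collections import defaultdict

# Similarity scores copied from the dictionary above (document pair -> score).
similarities = {
    (8328, 8327): 1.0, (8313, 8306): 0.12405229825691289,
    (8329, 8328): 1.0, (8322, 8321): 0.99999999999999989,
    (8328, 8329): 1.0, (8306, 8316): 0.12405229825691289,
    (8320, 8319): 0.67999999999999989, (8337, 8336): 1.0000000000000002,
    (8319, 8320): 0.67999999999999989, (8313, 8316): 0.99999999999999989,
    (8321, 8322): 0.99999999999999989, (8330, 8328): 1.0,
}


def to_distances(sims):
    """Convert similarity scores to distances using the inverse (1 / score)."""
    distances = {}
    for (a, b), score in sims.items():
        key = tuple(sorted((a, b)))  # treat (a, b) and (b, a) as the same pair
        # A score of 0 has no finite inverse, so treat it as infinitely far.
        distances[key] = 1.0 / score if score > 0 else float("inf")
    return distances


def cluster(distances, max_distance):
    """Group documents connected by pairs no further apart than max_distance.

    This is a hypothetical single-linkage grouping built on union-find.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        root_a, root_b = find(a), find(b)  # registers both documents
        if dist <= max_distance and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for doc in list(parent):
        groups[find(doc)].add(doc)
    return sorted(sorted(group) for group in groups.values())


distances = to_distances(similarities)
clusters = cluster(distances, max_distance=1.5)
for group in clusters:
    print(group)
```

With this threshold, a pair scoring 0.68 (distance about 1.47) still lands in one group, while a pair scoring 0.124 (distance about 8.06) does not. The threshold itself is arbitrary and would need tuning on real data.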

My question is: is there any open source visualization tool that can help me achieve this?
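To show the kind of output I have in mind, here is a rough sketch that writes the pairs as an undirected graph in Graphviz DOT format, with the inverse score as the preferred edge length. Using Graphviz is only an assumption about one possible tool, not a settled answer, and the function name `to_dot` and the sample data subset are hypothetical.

```python
def to_dot(sims):
    """Render similarity pairs as a DOT graph; edge length is 1 / score."""
    lines = ["graph documents {"]
    seen = set()
    for (a, b), score in sims.items():
        key = tuple(sorted((a, b)))  # skip the mirrored (b, a) entry
        if key in seen or score <= 0:
            continue
        seen.add(key)
        # `len` is the preferred edge length honoured by the neato and fdp
        # layout engines; the label shows the original similarity score.
        lines.append(
            f'    {key[0]} -- {key[1]} '
            f'[len={1.0 / score:.2f}, label="{score:.2f}"];'
        )
    lines.append("}")
    return "\n".join(lines)


# A small subset of the scores from the question, for illustration only.
sample = {
    (8328, 8327): 1.0,
    (8313, 8306): 0.12405229825691289,
    (8320, 8319): 0.67999999999999989,
    (8319, 8320): 0.67999999999999989,
}

dot_source = to_dot(sample)
print(dot_source)
```

Saving the printed text as documents.dot and running `neato -Tpng documents.dot -o documents.png` should draw similar documents closer together, although the layout engine treats `len` as a preference rather than an exact distance.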
