How can I cluster text data with multiple columns?

Views: 68 Published: 2021/2/15 19:03:36 Tags: cluster-analysis k-means data-science tfidfvectorizer

Problem description

I'd like to do k-means clustering on book text data that has 'title', 'genre', 'review', and 'synopsis' columns.

I want to use the 'title' as the indicator, or primary key, for clustering, but I'm not sure how to use multiple columns for this.

I know that I first have to vectorize the data, but the vectorizer takes a Series as input, not a DataFrame, so here, again, I don't know how to use all the columns as I want to.
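
One possible approach, sketched here under assumptions rather than taken from the original thread, is to join the text columns into a single string per book, vectorize that combined Series with scikit-learn's TfidfVectorizer, cluster with KMeans, and then use 'title' as the index of the resulting labels. The toy DataFrame `books`, the choice of two clusters, and the variable names are all hypothetical and only there for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data standing in for the real book dataset.
books = pd.DataFrame({
    "title": ["Star Voyage", "Galaxy Wars", "Love in Paris", "Summer Hearts"],
    "genre": ["science fiction", "science fiction", "romance", "romance"],
    "review": [
        "thrilling space adventure with aliens",
        "epic space battles and starships",
        "sweet love story with a charming couple",
        "heartwarming romance about young love",
    ],
    "synopsis": [
        "a crew explores a distant planet in space",
        "rebels fight an empire across space",
        "two strangers fall in love in a city",
        "a couple finds love during a holiday",
    ],
})

# Join the text columns into one string per row, so the vectorizer
# receives a single Series. 'title' is left out and kept as the key.
text_columns = ["genre", "review", "synopsis"]
combined = books[text_columns].fillna("").agg(" ".join, axis=1)

# Turn the combined text into a sparse TF-IDF matrix (one row per book).
vectorizer = TfidfVectorizer(stop_words="english")
features = vectorizer.fit_transform(combined)

# Assumption: two clusters; choose n_clusters to suit the real data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

# Attach the cluster label to each title, using 'title' as the index.
clusters = pd.Series(labels, index=books["title"], name="cluster")
print(clusters)
```

Concatenating the columns gives every column the same weight in one shared vocabulary. If the columns should be vectorized separately, scikit-learn's ColumnTransformer with one TfidfVectorizer per column is an alternative worth considering.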
