Reveal k-modes cluster features

Question

I'm performing a cluster analysis on categorical data, hence I'm using the k-modes approach.

My data is shaped as a preference survey: How do you like hair and eyes?

The respondent can pick an answer from a fixed (multiple-choice) set of 4 possibilities.

I therefore get the dummies, apply k-modes, attach the clusters back to the initial df and then plot them in 2D with PCA.

My code is as follows:

import numpy as np
import pandas as pd
from kmodes import kmodes

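# df is the survey DataFrame of categorical answers
df_dummy = pd.get_dummies(df)

# Transform into a numpy array; reset_index() is dropped here because it
# would add the row index as an extra (meaningless) feature column
x = df_dummy.to_numpy()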

km = kmodes.KModes(n_clusters=3, init='Huang', n_init=5, verbose=0)
clusters = km.fit_predict(x)
df_dummy['clusters'] = clusters


import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
pca = PCA(2)

# Turn the dummified df into two columns with PCA
# (.iloc replaces .ix, which was removed in pandas 1.0)
plot_columns = pca.fit_transform(df_dummy.iloc[:, 0:12])

# Plot based on the two dimensions, and shade by cluster label
plt.scatter(x=plot_columns[:,1], y=plot_columns[:,0], c=df_dummy["clusters"], s=30)
plt.show()

I can see the resulting scatter plot, with the points colored by cluster.

Now my problem is: can I somehow reveal the distinctive features of each cluster? I.e., what are the main characteristics (maybe blond hair and blue eyes) of the group of green dots in the scatterplot?

I get that the clustering has happened, but I can't find a way to translate what the clustering actually means.

Should I play with the .labels_ object?

Recommended answer

Take a look at km.cluster_centroids_. This will give the mode of each variable for each cluster.
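To make "the mode of each variable for each cluster" concrete, here is a minimal standard-library sketch that computes it by hand from cluster labels, and that turns a one-hot centroid row back into readable column names. The survey rows, the labels, the dummy column names and the helper names cluster_modes and active_dummies are all hypothetical stand-ins; in the question's setup, labels would come from km.labels_ and the centroid row from km.cluster_centroids_, with the column names in the same order as the columns passed to fit_predict.

```python
from collections import Counter, defaultdict


def cluster_modes(rows, labels):
    """Return {cluster label: [most frequent value of each column]}."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    modes = {}
    for label, members in grouped.items():
        # zip(*members) walks the cluster column by column
        modes[label] = [Counter(col).most_common(1)[0][0] for col in zip(*members)]
    return modes


def active_dummies(centroid, columns):
    """Names of the dummy columns that are set to 1 in a one-hot centroid row."""
    return [name for name, value in zip(columns, centroid) if int(value) == 1]


# Hypothetical survey answers: (hair, eyes)
rows = [
    ("blond", "blue"),
    ("blond", "blue"),
    ("blond", "green"),
    ("brown", "brown"),
    ("black", "brown"),
    ("brown", "brown"),
]
labels = [0, 0, 0, 1, 1, 1]  # assumed stand-in for km.labels_

modes = cluster_modes(rows, labels)
for label in sorted(modes):
    print(label, modes[label])

# Hypothetical dummy columns and one centroid row (stand-in for a row of
# km.cluster_centroids_ when the model was fitted on dummified data)
columns = ["hair_blond", "hair_brown", "eyes_blue", "eyes_brown"]
centroid = [1, 0, 1, 0]
print(active_dummies(centroid, columns))
```

Since the question fits k-modes on dummy columns, each centroid there would be a row of 0/1 values, so a name lookup like the second helper is what turns it into something like "blond hair, blue eyes". Fitting k-modes directly on the original categorical columns instead should give centroids that already contain the category labels.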
