Scikit-learn: How to run KMeans on a one-dimensional array?
Question
I have an array of 13,876 values between 0 and 1. I would like to apply sklearn.cluster.KMeans to only this vector to find the different clusters into which the values are grouped. However, it seems KMeans works with multidimensional arrays and not with one-dimensional ones. I guess there is a trick to make it work, but I don't know how. I saw that KMeans.fit() accepts "X : array-like or sparse matrix, shape=(n_samples, n_features)", but it wants n_samples to be bigger than one.
I tried putting my array into an np.zeros() matrix and running KMeans, but it then puts all the non-null values in class 1 and the rest in class 0.
Can anyone help in running this algorithm on a one-dimensional array? Thanks a lot!
Recommended answer
You have many samples of a single feature, so you can reshape the array to shape (13876, 1) using NumPy's reshape:
from sklearn.cluster import KMeans
import numpy as np

x = np.random.random(13876)  # stand-in for the real data: 13,876 values in [0, 1)
km = KMeans()  # default settings
km.fit(x.reshape(-1, 1))  # -1 is inferred as 13876, giving shape (13876, 1)
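To see which cluster each value ends up in, the fitted model can be queried for labels and centers. The following is a minimal sketch, not part of the original answer: the choice of n_clusters=3, the fixed seeds, and the variable names are assumptions made for illustration, and the random data stands in for the real array.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data: 13,876 values in [0, 1). Replace with the real array.
rng = np.random.default_rng(0)
x = rng.random(13876)        # 1-D array, shape (13876,)

# KMeans expects shape (n_samples, n_features): one column, one row per value.
X = x.reshape(-1, 1)         # shape (13876, 1)

# n_clusters=3 is an arbitrary choice for this sketch.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)   # cluster index for each value, shape (13876,)

# Flatten the (3, 1) array of centers back to one number per cluster.
centers = km.cluster_centers_.ravel()

print(X.shape, labels.shape, centers.shape)
```

Each entry of labels gives the cluster index of the corresponding value in x, so x[labels == k] selects the values assigned to cluster k.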