sklearn tsne with sparse matrix

Question

I'm trying to run t-SNE on a very sparse matrix of precomputed distance values, but I'm having trouble with it.

It boils down to this:

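import numpy as np
from scipy.sparse import csc_matrix
from sklearn.manifold import TSNE

# Toy 3 x 3 sparse matrix of precomputed distances
row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
distances = np.array([.1, .2, .3, .4, .5, .6])
X = csc_matrix((distances, (row, col)), shape=(3, 3))
Y = TSNE(metric='precomputed').fit_transform(X)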

However, I get this error:

TypeError: A sparse matrix was passed, but dense data is required for method="barnes_hut". Use X.toarray() to convert to a dense numpy array if the array is small enough for it to fit in memory. Otherwise consider dimensionality reduction techniques (e.g. TruncatedSVD)

I don't want to perform TruncatedSVD since I already computed distances.

If I switch to method='exact', I get a different error (which seems somewhat questionable):

NotImplementedError: >= and <= don't work with 0.

NOTE: my distance matrix is about 100k x 100k with approximately 1M nonzero values.

Any ideas?
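For a sense of scale, the size mentioned in the note can be turned into a rough memory estimate. This is only a back-of-the-envelope sketch: it assumes 8-byte float64 values and 4-byte integer indices, and the helper names dense_bytes and csr_bytes are made up for illustration.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to store every entry of a dense matrix."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize=8, index_size=4):
    """Approximate bytes for a CSR layout: values, column indices, row pointers."""
    return nnz * itemsize + nnz * index_size + (n_rows + 1) * index_size


n = 100_000        # rows and columns, from the note above
nnz = 1_000_000    # stored nonzero values, from the note above

print(dense_bytes(n, n) / 1e9, "GB dense")   # 80.0 GB dense
print(csr_bytes(n, nnz) / 1e6, "MB sparse")  # about 12.4 MB sparse
```

Under these assumptions, a dense copy of the full matrix would need roughly 80 GB, while the sparse form needs on the order of 12 MB, so converting to a dense array is only practical for much smaller inputs.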

Recommended answer

I think this should solve your problem:

from scipy.sparse import csr_matrix
X = csr_matrix((distances, (row, col)), shape=(3, 3)).toarray()

That is, if you really meant csr_matrix instead of csc_matrix.
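As a rough illustration of the dense-conversion approach, here is a minimal sketch on hypothetical toy data (30 random points, not the asker's matrix). The choices init="random" and perplexity=5.0 are assumptions made for this small sample; in the scikit-learn versions I'm aware of, PCA initialization is not accepted together with metric="precomputed".

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.manifold import TSNE

# Hypothetical toy data: 30 random points in 5 dimensions.
rng = np.random.default_rng(0)
points = rng.normal(size=(30, 5))

# Full pairwise Euclidean distance matrix (symmetric, zero diagonal).
diff = points[:, None, :] - points[None, :, :]
dense_distances = np.sqrt((diff ** 2).sum(axis=-1))

# Store the distances sparsely, then convert back to a dense ndarray,
# which is what the error message in the question asks for.
X_sparse = csr_matrix(dense_distances)
X = X_sparse.toarray()

# init="random" and perplexity=5.0 are assumptions for this toy example.
tsne = TSNE(n_components=2, metric="precomputed", init="random",
            perplexity=5.0, random_state=0)
Y = tsne.fit_transform(X)
print(Y.shape)  # (30, 2)
```

One caveat with this approach: toarray() fills every entry that is not stored in the sparse matrix with 0, and with metric="precomputed" a 0 is read as a distance of zero. If the missing entries actually mean "unknown" or "far apart", the dense matrix will not represent the intended distances. Newer scikit-learn releases also document passing a sparse precomputed neighbor graph with method="barnes_hut", so it may be worth checking the TSNE documentation for the installed version.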
