Scipy: Sparse Matrix giving incorrect values

Question

Below is my code for generating my sparse matrix:

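import numpy as np
import scipy.sparse

def sparsemaker(X, Y, Z):
    """Build a CSR matrix from X, Y, and Z, which are 2D arrays of the same shape."""
    x_, row = np.unique(X, return_inverse=True)
    y_, col = np.unique(Y, return_inverse=True)
    # Flatten: newer NumPy versions return the inverse in the input's shape
    row, col = row.ravel(), col.ravel()
    return scipy.sparse.csr_matrix((Z.ravel(), (row, col)), shape=(x_.size, y_.size))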

>>> print(sparsemaker(A, B, C))  # A, B, and C are arrays of shape (220, 256)
(0, 0)  167064.269831
(0, 2)  56.6146564629
(0, 9)  53.8660340698
(0, 23) 80.6529717039
(0, 28) 0.0
(0, 33) 53.2379218326
(0, 40) 54.3868995375
 :          :

My input arrays are a bit large, so I don't know how to post them here (unless anyone has any ideas), but even from the first value I can already tell something is wrong:

>>> test = sparsemaker(A, B, C)
>>> np.max(test.toarray())
167064.26983076424

>>> np.where(C==np.max(test.toarray()))
(array([], dtype=int64), array([], dtype=int64))

Does anyone know why this would happen? Where did that value come from?

Recommended answer

You have repeated coordinates, and the constructor sums all values that share the same (row, col) pair. Run the following:

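x_, row = np.unique(X, return_inverse=True)
y_, col = np.unique(Y, return_inverse=True)
row, col = row.ravel(), col.ravel()  # newer NumPy may return the inverse in the input's shape
# Sum of every Z entry that maps to cell (0, 0)
print(Z.ravel()[(row == 0) & (col == 0)].sum())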

and you should get that mysterious 167064.26983076424 printed out.
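
The summing behavior can be reproduced in isolation. The following is a minimal sketch using made-up toy values (not the arrays from the question): coordinate (0, 0) appears twice, and the resulting matrix holds the sum of the two entries, a value that does not occur anywhere in the input data.

```python
import numpy as np
import scipy.sparse

# Hypothetical toy data: coordinate (0, 0) appears twice
row = np.array([0, 0, 1])
col = np.array([0, 0, 1])
data = np.array([1.5, 2.5, 7.0])

m = scipy.sparse.csr_matrix((data, (row, col)), shape=(2, 2))
dense = m.toarray()

# Cell (0, 0) holds 1.5 + 2.5, which is not one of the input values
print(dense[0, 0])
print(np.where(data == dense.max()))
```

This is the same situation as in the question: np.where finds no match because the maximum of the matrix is a sum of several entries of C rather than a single entry.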

EDIT: The ugly code below averages repeated entries instead of summing them. It works fine on small examples and borrows some code from another question. Give it a try:

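def sparsemaker(X, Y, Z):
    """Build a CSR matrix from same-shape 2D arrays X, Y, Z, averaging duplicates."""
    x_, row = np.unique(X, return_inverse=True)
    y_, col = np.unique(Y, return_inverse=True)
    row, col = row.ravel(), col.ravel()
    # Label each distinct (row, col) pair, then count how often each occurs
    indices = np.column_stack((row, col))
    _, repeats = np.unique(indices, axis=0, return_inverse=True)
    repeats = repeats.ravel()
    counts = 1. / np.bincount(repeats)
    factor = counts[repeats]  # weight 1/n for a coordinate seen n times

    # The constructor sums duplicates, so the weighted values add up to the mean
    return scipy.sparse.csr_matrix((Z.ravel() * factor, (row, col)),
                                   shape=(x_.size, y_.size))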
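
For comparison, here is an alternative sketch that is not the code from the answer: it computes the means explicitly with np.bincount before building the matrix, so no duplicate coordinates ever reach the constructor. The helper name sparse_mean and the toy arrays are assumptions made for illustration.

```python
import numpy as np
import scipy.sparse

def sparse_mean(X, Y, Z):
    """Hypothetical helper: average Z over repeated (X, Y) coordinates."""
    x_, row = np.unique(np.ravel(X), return_inverse=True)
    y_, col = np.unique(np.ravel(Y), return_inverse=True)
    z = np.ravel(Z).astype(float)

    # Encode each (row, col) pair as one linear index, then group by it
    flat = row * y_.size + col
    uniq, inv = np.unique(flat, return_inverse=True)
    means = np.bincount(inv, weights=z) / np.bincount(inv)

    # Decode the linear indices back into (row, col) pairs
    r, c = np.divmod(uniq, y_.size)
    return scipy.sparse.csr_matrix((means, (r, c)), shape=(x_.size, y_.size))

# Toy input: coordinate (0, 0) occurs three times with values 1, 3, and 8
X = np.array([[0, 0], [1, 0]])
Y = np.array([[0, 0], [1, 0]])
Z = np.array([[1.0, 3.0], [5.0, 8.0]])
result = sparse_mean(X, Y, Z).toarray()
print(result)
```

On this toy input, cell (0, 0) ends up holding the mean of 1, 3, and 8, and cell (1, 1) keeps its single value.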
