Populate a Pandas SparseDataFrame from a SciPy Sparse Coo Matrix

Question

(This question relates to "Populate a Pandas SparseDataFrame from a SciPy Sparse Matrix". I specifically want to populate a SparseDataFrame from a scipy.sparse.coo_matrix; the question mentioned above is about a different SciPy sparse matrix format (csr). So here it goes...)

I noticed Pandas now has support for Sparse Matrices and Arrays. Currently, I create DataFrame()s like this:

return DataFrame(matrix.toarray(), columns=features, index=observations)

Is there a way to create a SparseDataFrame() with a scipy.sparse.coo_matrix() or coo_matrix()? Converting to dense format kills RAM badly...!

Recommended answer

http://pandas.pydata.org/pandas-docs/stable/sparse.html#interaction-with-scipy-sparse

A convenience method SparseSeries.from_coo() is implemented for creating a SparseSeries from a scipy.sparse.coo_matrix.
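
As a rough sketch of what this looks like in practice: SparseSeries and SparseDataFrame were removed in pandas 1.0, so the example below assumes a recent pandas and uses the accessor-based equivalents, pd.Series.sparse.from_coo() and pd.DataFrame.sparse.from_spmatrix(). The matrix values and the observations/features labels are made up for illustration; only the variable names echo the question.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Small illustrative COO matrix (3 observations x 4 features); values are made up.
row = np.array([0, 1, 2, 2])
col = np.array([0, 3, 1, 2])
data = np.array([1.0, 2.0, 3.0, 4.0])
matrix = sparse.coo_matrix((data, (row, col)), shape=(3, 4))

observations = ["obs_a", "obs_b", "obs_c"]  # hypothetical row labels
features = ["f0", "f1", "f2", "f3"]         # hypothetical column labels

# Sparse Series indexed by (row, col), holding only the stored entries.
ss = pd.Series.sparse.from_coo(matrix)

# Sparse-backed DataFrame, built without calling matrix.toarray().
sdf = pd.DataFrame.sparse.from_spmatrix(
    matrix, index=observations, columns=features
)
```

The Series form keeps the coordinate layout of the COO matrix, while the DataFrame form keeps the observations-by-features layout that the question's DataFrame(...) call produces, with each column stored as a sparse array.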

Within scipy.sparse there are methods that convert the formats to each other: .tocoo, .tocsc, etc. So you can use whichever format is best for a particular operation.
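
A minimal sketch of those conversions, using a small made-up matrix. The comments on which format suits which operation reflect the usual guidance for scipy.sparse rather than anything stated in the answer.

```python
import numpy as np
from scipy import sparse

# Illustrative 2 x 3 COO matrix built from (data, (row, col)) triplets.
data = np.array([1.0, 2.0, 3.0])
row = np.array([0, 1, 1])
col = np.array([2, 0, 2])
coo = sparse.coo_matrix((data, (row, col)), shape=(2, 3))

csr = coo.tocsr()   # compressed rows: convenient for row slicing and products
csc = coo.tocsc()   # compressed columns: convenient for column slicing
back = csr.tocoo()  # back to coordinate triplets

# Row slicing works on CSR; COO matrices do not support indexing.
row_1 = csr[1, :]
```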

For going the other way, I've answered

Pandas sparse dataFrame to sparse matrix, without generating a dense matrix in memory

Your linked answer from 2013 iterates by row - using toarray to make the row dense. I haven't looked at what the pandas from_coo does.

A more recent SO question on pandas sparse

non-NDFFrame object error using pandas.SparseSeries.from_coo() function

From https://github.com/pydata/pandas/blob/master/pandas/sparse/scipy_sparse.py

def _coo_to_sparse_series(A, dense_index=False):
    """ Convert a scipy.sparse.coo_matrix to a SparseSeries.
    Use the defaults given in the SparseSeries constructor. """
    s = Series(A.data, MultiIndex.from_arrays((A.row, A.col)))
    s = s.sort_index()
    s = s.to_sparse()  # TODO: specify kind?
    # ...
    return s

In effect it takes the same data, i, j used to build a coo matrix, makes a series, sorts it, and turns it into a sparse series.
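
The same three steps can be sketched by hand. Series.to_sparse() no longer exists in pandas 1.0 and later, so this version assumes a recent pandas and swaps in astype(pd.SparseDtype(...)) for the final step; the sample i, j and data arrays are made up.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# The data, i, j triplets used to build the COO matrix (illustrative values).
i = np.array([2, 0, 1])
j = np.array([1, 0, 3])
data = np.array([3.0, 1.0, 2.0])
A = sparse.coo_matrix((data, (i, j)), shape=(3, 4))

# 1. Make a Series of the stored values, indexed by (row, col).
s = pd.Series(A.data, index=pd.MultiIndex.from_arrays((A.row, A.col)))

# 2. Sort by the (row, col) index.
s = s.sort_index()

# 3. Convert to a sparse-backed Series (stands in for the old to_sparse()).
s = s.astype(pd.SparseDtype(s.dtype))
```

Only the stored entries are ever materialised here, so the memory cost scales with the number of non-zeros rather than with the full matrix shape.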
