Matrix multiplication in pandas

Question

I have numeric data stored in two DataFrames x and y. The inner product from numpy works but the dot product from pandas does not.

In [63]: x.shape
Out[63]: (1062, 36)

In [64]: y.shape
Out[64]: (36, 36)

In [65]: np.inner(x, y).shape
Out[65]: (1062L, 36L)

In [66]: x.dot(y)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-66-76c015be254b> in <module>()
----> 1 x.dot(y)

C:\Programs\WinPython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
    888             if (len(common) > len(self.columns) or
    889                     len(common) > len(other.index)):
--> 890                 raise ValueError('matrices are not aligned')
    891 
    892             left = self.reindex(columns=common, copy=False)

ValueError: matrices are not aligned

Is this a bug or am I using pandas wrong?

Recommended answer

Not only must the shapes of x and y be correct, but also the column names of x must match the index names of y. Otherwise this code in pandas/core/frame.py will raise a ValueError:

if isinstance(other, (Series, DataFrame)):
    common = self.columns.union(other.index)
    if (len(common) > len(self.columns) or
        len(common) > len(other.index)):
        raise ValueError('matrices are not aligned')
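
The effect of that check can be sketched in isolation. The helper `labels_align` below is hypothetical (it is not part of pandas) and only mirrors the length comparison in the excerpt; it passes `sort=False` to `union` to avoid sorting mixed string and integer labels, which the excerpt itself does not do.

```python
import pandas as pd

def labels_align(left_columns, right_index):
    """Hypothetical helper mirroring the length check in the excerpt above."""
    # The union grows beyond either side whenever one side has a label
    # that the other side lacks.
    common = left_columns.union(right_index, sort=False)
    return (len(common) <= len(left_columns)
            and len(common) <= len(right_index))

# String column labels versus a default integer index: no labels in common.
mismatched = labels_align(pd.Index(['col0', 'col1', 'col2']), pd.RangeIndex(3))

# Same labels in a different order: the union has the same length as each side.
matched = labels_align(pd.Index(['a', 'b', 'c']), pd.Index(['c', 'a', 'b']))

print(mismatched, matched)
```

When the labels differ, the union is longer than either side, which is the condition that triggers the ValueError.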

If you just want to compute the matrix product without making the column names of x match the index names of y, then use the NumPy dot function:

np.dot(x, y)
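
As a rough sketch of the positional options, the snippet below uses small made-up frames (the shapes, labels and seed are assumptions for illustration only). It also shows that `np.inner`, which the question used, is not the same operation: for 2-D inputs it sums over the last axis of both arguments, so it corresponds to x times the transpose of y.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = pd.DataFrame(rng.random((5, 3)), columns=['a', 'b', 'c'])
y = pd.DataFrame(rng.random((3, 3)))  # default integer index 0, 1, 2

# Purely positional products: labels are ignored.
product = np.asarray(np.dot(x, y))
same = x.to_numpy() @ y.to_numpy()

# Passing a plain array to DataFrame.dot also skips label alignment,
# and the result keeps x's row index.
labeled = x.dot(y.to_numpy())

# np.inner contracts the last axis of both inputs, i.e. x @ y.T.
inner = np.inner(x.to_numpy(), y.to_numpy())

print(product.shape, labeled.shape, inner.shape)
```

Because y is square in the question, `np.inner` returns an array of the expected shape, but its values match the matrix product only if y happens to be symmetric.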

The column names of x must match the index names of y because the pandas dot method reindexes x and y: if the column order of x and the index order of y do not already match, they are made to match before the matrix product is performed:

left = self.reindex(columns=common, copy=False)
right = other.reindex(index=common, copy=False)

The NumPy dot function does no such thing. It will just compute the matrix product based on the values in the underlying arrays.
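
The difference shows up when the labels match but appear in a different order. The tiny frames below are invented for illustration:

```python
import numpy as np
import pandas as pd

x = pd.DataFrame([[1, 2]], columns=['a', 'b'])
# Same labels as x's columns, but listed in the opposite order.
y = pd.DataFrame([[10], [100]], index=['b', 'a'], columns=['out'])

# pandas pairs 'a' with 'a' and 'b' with 'b': 1*100 + 2*10
aligned = x.dot(y)

# NumPy pairs by position only: 1*10 + 2*100
positional = np.asarray(np.dot(x, y))

print(aligned.iloc[0, 0], positional[0, 0])
```

Which result is the intended one depends on whether the labels or the positions carry the meaning in your data.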

Here is an example which reproduces the error:

import pandas as pd
import numpy as np

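# x gets string column labels 'col0' ... 'col35'.
columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
# y keeps the default integer index 0 ... 35, so its index labels
# do not match x's column labels.
y = pd.DataFrame(np.random.random((36, 36)))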

print(np.dot(x, y).shape)
# (1062, 36)

print(x.dot(y).shape)
# ValueError: matrices are not aligned
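
One possible way to make this example succeed with the pandas dot method, assuming the rows of y are meant to correspond to the columns of x in the same order, is to give y's index the same labels as x's columns:

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
# Assumption: row i of y corresponds to column i of x, so reuse the labels.
y = pd.DataFrame(np.random.random((36, 36)), index=columns)

result = x.dot(y)
print(result.shape)
```

With matching labels the result is a DataFrame that keeps x's row index and y's column labels.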
