Matrix multiplication in pandas
Question
I have numeric data stored in two DataFrames x and y. The inner product from numpy works but the dot product from pandas does not.
In [63]: x.shape
Out[63]: (1062, 36)
In [64]: y.shape
Out[64]: (36, 36)
In [65]: np.inner(x, y).shape
Out[65]: (1062L, 36L)
In [66]: x.dot(y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-66-76c015be254b> in <module>()
----> 1 x.dot(y)
C:\Programs\WinPython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
888 if (len(common) > len(self.columns) or
889 len(common) > len(other.index)):
--> 890 raise ValueError('matrices are not aligned')
891
892 left = self.reindex(columns=common, copy=False)
ValueError: matrices are not aligned
Is this a bug or am I using pandas wrong?
Recommended answer
Not only must the shapes of x and y be correct, but the column names of x must also match the index names of y. Otherwise this code in pandas/core/frame.py will raise a ValueError:
if isinstance(other, (Series, DataFrame)):
    common = self.columns.union(other.index)
    if (len(common) > len(self.columns) or
            len(common) > len(other.index)):
        raise ValueError('matrices are not aligned')
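The length check in that excerpt can be reproduced by hand. The following is a minimal sketch, not pandas' own code: the labels `a`, `b`, `c`, `d` and the helper `is_aligned` are made up for illustration. It shows that the union of the left frame's columns and the right frame's index grows beyond either side as soon as a label is missing from one of them.

```python
import pandas as pd

# Hypothetical labels, chosen only to illustrate the check.
left_columns = pd.Index(["a", "b"])
matching_index = pd.Index(["b", "a"])    # same labels, different order
mismatched_index = pd.Index(["c", "d"])  # no labels in common


def is_aligned(columns, index):
    """Mirror the length check from the excerpt (illustrative helper)."""
    common = columns.union(index)
    return not (len(common) > len(columns) or len(common) > len(index))


print(is_aligned(left_columns, matching_index))    # True
print(is_aligned(left_columns, mismatched_index))  # False
```

Order does not matter for this check, only whether both sides carry the same set of labels.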
If you just want to compute the matrix product without making the column names of x match the index names of y, use the NumPy dot function:
np.dot(x, y)
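There are a few equivalent ways to bypass label alignment. The sketch below uses small made-up frames (`x_small` and `y_small` are illustrative names, not from the question). It assumes a pandas version that provides `DataFrame.to_numpy()` (0.24 or later); on older versions the `.values` attribute plays the same role.

```python
import numpy as np
import pandas as pd

# Small illustrative frames whose labels deliberately do not match.
x_small = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"])
y_small = pd.DataFrame(np.ones((2, 2)))  # default integer index 0, 1

# Plain ndarray result; the labels on both frames are ignored.
product = np.dot(x_small.to_numpy(), y_small.to_numpy())

# The @ operator on the underlying arrays computes the same product.
product_at = x_small.to_numpy() @ y_small.to_numpy()

# Passing an ndarray to DataFrame.dot skips alignment but keeps
# the row index of x_small on the result.
product_df = x_small.dot(y_small.to_numpy())

print(product.shape)  # (3, 2)
```

The last form is convenient when the row labels of x should survive the multiplication.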
The column names of x must match the index names of y because the pandas dot method reindexes x and y: if the column order of x and the index order of y do not naturally match, they are made to match before the matrix product is performed:
left = self.reindex(columns=common, copy=False)
right = other.reindex(index=common, copy=False)
The NumPy dot function does no such thing. It just computes the matrix product based on the values in the underlying arrays.
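The difference is easiest to see when the labels match but their order does not. This is a small sketch with made-up data (the labels `a`, `b`, `out` and the numbers are illustrative): pandas pairs values by label, while NumPy pairs them by position.

```python
import numpy as np
import pandas as pd

# Columns of x are in order a, b; the index of y is in order b, a.
x = pd.DataFrame([[1, 2]], columns=["a", "b"])
y = pd.DataFrame([[10], [100]], index=["b", "a"], columns=["out"])

# pandas aligns by label: a*a + b*b = 1*100 + 2*10
by_label = x.dot(y)
print(by_label.iloc[0, 0])  # 120

# NumPy multiplies by position: 1*10 + 2*100
by_position = np.dot(x.to_numpy(), y.to_numpy())
print(by_position[0, 0])  # 210
```

Which result is right depends on whether the labels or the positions carry the meaning in your data.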
Here is an example that reproduces the error:
import pandas as pd
import numpy as np
columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))
print(np.dot(x, y).shape)
# (1062, 36)
print(x.dot(y).shape)
# ValueError: matrices are not aligned
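If you would rather keep the result as a DataFrame, one option is to relabel y so that its index matches the columns of x. The sketch below reuses the setup from the example and assumes that row i of y really corresponds to column i of x. That is an assumption about the data, not something pandas can verify. The name `y_labeled` is illustrative.

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))

# Assumption: row i of y belongs to column i of x, so relabeling is safe.
y_labeled = y.copy()
y_labeled.index = x.columns

result = x.dot(y_labeled)
print(result.shape)
# (1062, 36)
```

With matching labels the pandas result agrees with the NumPy product on the underlying arrays.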