Pandas v0.20 returns NotImplemented when multiplying dataframe columns


Question

In an attempt to answer another question, I've been playing around with column-wise multiplication operations in pandas.

import pandas as pd

A = pd.DataFrame({'Col1' : [1, 2, 3], 'Col2' : [2, 3, 4]})
B = pd.DataFrame({'Col1' : [10, 20, 30]})

print(A)

   Col1  Col2
0     1     2
1     2     3
2     3     4

print(B)

   Col1
0    10
1    20
2    30

I tried to use df.apply to multiply Col1 of B with each column of A. So my desired output is:

   Col1  Col2
0    10    20
1    40    60
2    90   120

My first attempt was to use a lambda, and it worked fine.

df_new = A.apply(lambda x: B.Col1.values * x, 0) 
print(df_new)

   Col1  Col2
0    10    20
1    40    60
2    90   120

But lambdas tend to be slow, so I thought I could speed this up by passing B.Col1.values.__mul__ instead. This is what it gave:

print(A.apply(B.Col1.values.__mul__, 0))

Col1    NotImplemented
Col2    NotImplemented
dtype: object

I printed out __mul__; it is just the magic method for multiplication on numpy arrays:

print(B.Col1.values.__mul__)
<method-wrapper '__mul__' of numpy.ndarray object at 0x1154d9620>

Why does this happen?

Recommended answer

You can do this instead:

A.apply(B.Col1.__mul__,0)

This returns what you are after.
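As a point of comparison, here is a minimal sketch of the same multiplication without apply at all. It is not part of the answer above; it assumes that DataFrame.mul with axis=0 is acceptable for this use case, and it reuses the A and B frames from the question.

```python
import pandas as pd

A = pd.DataFrame({'Col1': [1, 2, 3], 'Col2': [2, 3, 4]})
B = pd.DataFrame({'Col1': [10, 20, 30]})

# Multiply every column of A by B.Col1, aligning on the row index (axis=0).
result = A.mul(B['Col1'], axis=0)
print(result)
```

Because the multiplication is done on whole columns at once, no Python-level function is called per column.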

The difference is that B.Col1.values.__mul__ calls the numpy slot function, whereas B.Col1.__mul__ calls a pandas method.
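One likely reason the lambda version works while the direct __mul__ call does not is Python's operator protocol: a * b first tries a.__mul__(b) and, if that returns NotImplemented, falls back to b.__rmul__(a). Calling __mul__ directly skips that fallback and hands back the raw sentinel. The sketch below illustrates the mechanism with two hypothetical classes, Low and High; they are stand-ins for illustration only, not the real numpy or pandas types.

```python
class High:
    """Hypothetical stand-in for a type that handles the operation (think Series)."""

    def __init__(self, value):
        self.value = value

    def __rmul__(self, other):
        # Reached only through the `*` operator, after other.__mul__ declined.
        return High(other.value * self.value)


class Low:
    """Hypothetical stand-in for a type that defers to High (think ndarray)."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, High):
            # Decline, so that Python can try High.__rmul__ instead.
            return NotImplemented
        return Low(self.value * other.value)


low, high = Low(10), High(3)

direct = low.__mul__(high)    # bypasses the fallback: the sentinel itself
via_operator = low * high     # Python falls back to High.__rmul__

print(direct is NotImplemented)   # True
print(via_operator.value)         # 30
```

Under this reading, B.Col1.values * x inside the lambda succeeds because the operator gives pandas a chance to handle the multiplication, whereas apply called the bound __mul__ directly and simply collected the NotImplemented values.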

The pandas method was likely written to avoid some low-level headaches from numpy:

>>> import inspect
>>> print(inspect.getsource(pd.Series.__mul__))

def wrapper(left, right, name=name, na_op=na_op):

    if isinstance(right, pd.DataFrame):
        return NotImplemented

    left, right = _align_method_SERIES(left, right)

    converted = _Op.get_op(left, right, name, na_op)

    left, right = converted.left, converted.right
    lvalues, rvalues = converted.lvalues, converted.rvalues
    dtype = converted.dtype
    wrap_results = converted.wrap_results
    na_op = converted.na_op

    if isinstance(rvalues, ABCSeries):
        name = _maybe_match_name(left, rvalues)
        lvalues = getattr(lvalues, 'values', lvalues)
        rvalues = getattr(rvalues, 'values', rvalues)
        # _Op aligns left and right
    else:
        name = left.name
        if (hasattr(lvalues, 'values') and
                not isinstance(lvalues, pd.DatetimeIndex)):
            lvalues = lvalues.values

    result = wrap_results(safe_na_op(lvalues, rvalues))
    return construct_result(
        left,
        result,
        index=left.index,
        name=name,
        dtype=dtype,
    )

I couldn't find the source for the numpy slot function, but it likely does something similar to this.
