How to calculate multiple columns from multiple columns in pandas


Problem description

I am trying to calculate multiple columns from multiple columns in a pandas dataframe using a function. The function takes three arguments -a-, -b- and -c- and returns three calculated values -sum-, -prod- and -quot-. In my pandas dataframe I have three columns -a-, -b- and -c- from which I want to calculate the columns -sum-, -prod- and -quot-.

The mapping that I do works only when I have exactly three rows. I do not know what is going wrong, although I expect that it has something to do with selecting the correct axis. Could someone explain what is happening and how I can calculate the values that I would like to have? Below are the situations that I have tested.

Initial values

import pandas as pd

def sum_prod_quot(a,b,c):
    sum  = a + b + c
    prod = a * b * c
    quot = a / b / c
    return (sum, prod, quot)

df = pd.DataFrame({ 'a': [20, 100, 18],
                    'b': [ 5,  10,  3],
                    'c': [ 2,  10,  6],
                    'd': [ 1,   2,  3]
                 })

df
     a   b   c  d
0   20   5   2  1
1  100  10  10  2
2   18   3   6  3

Calculation steps

With exactly three rows

When I calculate three columns from this dataframe using the function, I get:

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0
2   18   3   6  3    2.0      1.0    1.0

This is exactly the result that I want to have: the sum column has the sum of the elements in columns a, b and c; the prod column has the product of the elements in columns a, b and c; and the quot column has the quotient of the elements in columns a, b and c.

With more than three rows

When I expand the dataframe by one row, I get an error!

The dataframe is defined as:

df = pd.DataFrame({ 'a': [20, 100, 18, 40],
                    'b': [ 5,  10,  3, 10],
                    'c': [ 2,  10,  6,  4],
                    'd': [ 1,   2,  3,  4]
                 })
df
     a   b   c  d
0   20   5   2  1
1  100  10  10  2
2   18   3   6  3
3   40  10   4  4

The call is

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

The result is

...
    list( map(sum_prod_quot, df['a'], df['b'], df['c']))
ValueError: too many values to unpack (expected 3) 

while I would expect an extra row:

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0
2   18   3   6  3    2.0      1.0    1.0
3   40  10   4  4   54.0   1600.0    1.0

With fewer than three rows

When I reduce the dataframe by one row, I also get an error. The dataframe is defined as:

df = pd.DataFrame({ 'a': [20, 100],
                    'b': [ 5,  10],
                    'c': [ 2,  10],
                    'd': [ 1,   2]
                 })
df
     a   b   c  d
0   20   5   2  1
1  100  10  10  2

The call is

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

The result is

...
    list( map(sum_prod_quot, df['a'], df['b'], df['c']))
ValueError: need more than 2 values to unpack

while I would expect one row fewer:

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0

Questions

My questions:

1) Why do I get these errors?

2) How do I have to modify the call so that I get the desired dataframe?

Note

In this link a similar question is asked, but the given answer did not work for me.

Recommended answer

The result is not correct for three rows either. Check the values other than the first row and first column: the product 20*5*2 is 200, not 120, and in your output 200 appears one row down in the sum column. The list has to be arranged column-wise before it is assigned to the new columns. You can try the following to set the new columns:

df['sum'], df['prod'], df['quot'] = zip(*map(sum_prod_quot, df['a'], df['b'], df['c']))
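
As a hedged illustration of why the original call fails and what `zip(*...)` changes: `list(map(...))` yields one tuple per row, so unpacking it into three names only succeeds when there are exactly three rows, and even then each new column receives one row's tuple rather than a column of values. The sketch below uses the four-row frame from the question; the variable names `rows` and `cols` are introduced here for illustration only.

```python
import pandas as pd

def sum_prod_quot(a, b, c):
    return (a + b + c, a * b * c, a / b / c)

df = pd.DataFrame({'a': [20, 100, 18, 40],
                   'b': [ 5,  10,  3, 10],
                   'c': [ 2,  10,  6,  4],
                   'd': [ 1,   2,  3,  4]})

# One (sum, prod, quot) tuple per row: four rows give four tuples,
# which cannot be unpacked into three assignment targets.
rows = list(map(sum_prod_quot, df['a'], df['b'], df['c']))

# zip(*rows) transposes the row tuples into three column tuples,
# each holding one value per row.
cols = list(zip(*rows))

df['sum'], df['prod'], df['quot'] = cols
print(df)
```

With the transposition in place, the number of assignment targets matches the number of values the function returns, regardless of how many rows the frame has.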

For details, follow the link.
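
Because `sum_prod_quot` uses only arithmetic operators, it can also be called once on whole columns instead of once per row. This is an alternative to the answer's `zip(*map(...))` approach, not part of it, and it assumes the function contains no scalar-only logic such as `if` branches on its arguments. A minimal sketch using the two-row frame from the question:

```python
import pandas as pd

def sum_prod_quot(a, b, c):
    return (a + b + c, a * b * c, a / b / c)

df = pd.DataFrame({'a': [20, 100],
                   'b': [ 5,  10],
                   'c': [ 2,  10],
                   'd': [ 1,   2]})

# Passing Series makes each operator work element-wise, so the function
# returns three Series, one per new column, for any number of rows.
df['sum'], df['prod'], df['quot'] = sum_prod_quot(df['a'], df['b'], df['c'])
print(df)
```

Here the unpacking always sees exactly three objects, so the row count no longer matters.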
