将 pandas 数据框中的两列求和 [英] summing two columns in a pandas dataframe
本文介绍了将 pandas 数据框中的两列求和的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
当我使用这种语法时,它将创建一个序列,而不是将列添加到我的新数据帧(总和)中.请帮忙.
when I use this syntax it creates a series rather than adding a column to my new dataframe (sum). Please help.
我的代码:
sum = data['variance'] = data.budget + data.actual
我的数据(在数据框df中):(当前除了预算外,其他所有内容-实际的,我想创建差异列吗?
My Data (in dataframe df): (currently has everything except the budget - actual, I want to create a variance column?
cluster date budget actual | budget - actual
0 a 2014-01-01 00:00:00 11000 10000 1000
1 a 2014-02-01 00:00:00 1200 1000
2 a 2014-03-01 00:00:00 200 100
3 b 2014-04-01 00:00:00 200 300
4 b 2014-05-01 00:00:00 400 450
5 c 2014-06-01 00:00:00 700 1000
6 c 2014-07-01 00:00:00 1200 1000
7 c 2014-08-01 00:00:00 200 100
8 c 2014-09-01 00:00:00 200 300
推荐答案
我认为您误解了一些python语法,以下内容有两种分配方式:
I think you've misunderstood some python syntax, the following does two assignments:
In [11]: a = b = 1
In [12]: a
Out[12]: 1
In [13]: b
Out[13]: 1
所以在您的代码中,就好像您在做
So in your code it was as if you were doing:
sum = df['budget'] + df['actual'] # a Series
# and
df['variance'] = df['budget'] + df['actual'] # assigned to a column
后者为df创建了一个新列:
The latter creates a new column for df:
In [21]: df
Out[21]:
cluster date budget actual
0 a 2014-01-01 00:00:00 11000 10000
1 a 2014-02-01 00:00:00 1200 1000
2 a 2014-03-01 00:00:00 200 100
3 b 2014-04-01 00:00:00 200 300
4 b 2014-05-01 00:00:00 400 450
5 c 2014-06-01 00:00:00 700 1000
6 c 2014-07-01 00:00:00 1200 1000
7 c 2014-08-01 00:00:00 200 100
8 c 2014-09-01 00:00:00 200 300
In [22]: df['variance'] = df['budget'] + df['actual']
In [23]: df
Out[23]:
cluster date budget actual variance
0 a 2014-01-01 00:00:00 11000 10000 21000
1 a 2014-02-01 00:00:00 1200 1000 2200
2 a 2014-03-01 00:00:00 200 100 300
3 b 2014-04-01 00:00:00 200 300 500
4 b 2014-05-01 00:00:00 400 450 850
5 c 2014-06-01 00:00:00 700 1000 1700
6 c 2014-07-01 00:00:00 1200 1000 2200
7 c 2014-08-01 00:00:00 200 100 300
8 c 2014-09-01 00:00:00 200 300 500
顺便说一句,您不应将sum
用作变量名,因为它会覆盖内置的sum函数.
As an aside, you shouldn't use sum
as a variable name as the overrides the built-in sum function.
这篇关于将 pandas 数据框中的两列求和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文