Get total of Pandas column


Problem description

Target

I have a Pandas data frame, as shown below, with multiple columns, and would like to get the total of one column, MyColumn.

Data frame - df:

print df

   X  MyColumn      Y      Z
0  A        84   13.0   69.0
1  B        76   77.0  127.0
2  C        28   69.0   16.0
3  D        28   28.0   31.0
4  E        19   20.0   85.0
5  F        84  193.0   70.0


My attempt:

I have attempted to get the sum of the column using groupby and .sum():

Total = df.groupby['MyColumn'].sum()

print Total

This causes the following error:

TypeError: 'instancemethod' object has no attribute '__getitem__'


Expected Output

I would expect the output to be as follows:

319

Or alternatively, I would like df to be edited with a new row entitled TOTAL containing the total:

       X  MyColumn      Y      Z
0      A        84   13.0   69.0
1      B        76   77.0  127.0
2      C        28   69.0   16.0
3      D        28   28.0   31.0
4      E        19   20.0   85.0
5      F        84  193.0   70.0
TOTAL          319

Recommended answer
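As a starting point, here is a minimal self-contained sketch of the plain total. The DataFrame constructor call is an assumption, since the question only shows the printed frame. The sketch also illustrates why the attempt in the question fails: groupby is a method, so indexing it with square brackets raises a TypeError before any grouping happens (the exact message differs between Python 2 and Python 3).

```python
import pandas as pd

# Rebuild the frame from the question. The constructor call is an
# assumption; the original only shows the printed output.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})

# groupby is a method, so subscripting it with [] raises TypeError.
try:
    df.groupby["MyColumn"].sum()
    attempt_failed = False
except TypeError:
    attempt_failed = True

# Select the column first, then sum it. No grouping is needed.
total = df["MyColumn"].sum()
print(total)  # 319
```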

Use sum on the column to get the total. To add it as a new row, use loc with a Series whose index is set to the specific column you need to sum:

df.loc['Total'] = pd.Series(df['MyColumn'].sum(), index = ['MyColumn'])
print (df)
         X  MyColumn      Y      Z
0        A      84.0   13.0   69.0
1        B      76.0   77.0  127.0
2        C      28.0   69.0   16.0
3        D      28.0   28.0   31.0
4        E      19.0   20.0   85.0
5        F      84.0  193.0   70.0
Total  NaN     319.0    NaN    NaN

The Series is needed because if you pass a scalar instead, that value fills every column of the new row:

df.loc['Total'] = df['MyColumn'].sum()
print (df)
         X  MyColumn      Y      Z
0        A        84   13.0   69.0
1        B        76   77.0  127.0
2        C        28   69.0   16.0
3        D        28   28.0   31.0
4        E        19   20.0   85.0
5        F        84  193.0   70.0
Total  319       319  319.0  319.0
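The two behaviours can be compared side by side in one short sketch. It assumes the same reconstructed frame as the question (the constructor call is not from the original) and works on copies so that neither assignment affects the other; the names aligned and broadcast are illustrative only.

```python
import pandas as pd

# Reconstructed frame; the constructor call is an assumption.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})
total = df["MyColumn"].sum()

# Series: aligned on column labels, so only MyColumn receives the total
# and the remaining columns of the new row are NaN.
aligned = df.copy()
aligned.loc["Total"] = pd.Series(total, index=["MyColumn"])

# Scalar: the same value is written into every column of the new row.
broadcast = df.copy()
broadcast.loc["Total"] = total

print(aligned)
print(broadcast)
```

In the Series version MyColumn is printed as floats (84.0, 319.0) because the other cells of the new row hold NaN values; this matches the printed output shown earlier in the answer.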

Two other solutions use at and ix; see the examples below:

df.at['Total', 'MyColumn'] = df['MyColumn'].sum()
print (df)
         X  MyColumn      Y      Z
0        A      84.0   13.0   69.0
1        B      76.0   77.0  127.0
2        C      28.0   69.0   16.0
3        D      28.0   28.0   31.0
4        E      19.0   20.0   85.0
5        F      84.0  193.0   70.0
Total  NaN     319.0    NaN    NaN


df.ix['Total', 'MyColumn'] = df['MyColumn'].sum()
print (df)
         X  MyColumn      Y      Z
0        A      84.0   13.0   69.0
1        B      76.0   77.0  127.0
2        C      28.0   69.0   16.0
3        D      28.0   28.0   31.0
4        E      19.0   20.0   85.0
5        F      84.0  193.0   70.0
Total  NaN     319.0    NaN    NaN

Note: ix has been deprecated since Pandas v0.20 and was removed in pandas 1.0. Use loc or iloc instead.
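For readers on a pandas version without ix, the same assignment can be written with loc and a (row label, column label) pair. This is a minimal sketch on the reconstructed frame; the constructor call is an assumption, not part of the original answer.

```python
import pandas as pd

# Reconstructed frame; the constructor call is an assumption.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})

# Label-based equivalent of the ix example: the missing row label
# 'Total' is created, and only the MyColumn cell is filled.
df.loc["Total", "MyColumn"] = df["MyColumn"].sum()
print(df)
```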
