How do I create a sum row and sum column in pandas?
Question
I'm going through the Khan Academy course on Statistics as a bit of a refresher from my college days, and as a way to get up to speed on pandas and other scientific Python tools.
I've got a table from Khan Academy that looks like this:
             | Undergraduate | Graduate | Total
-------------+---------------+----------+------
Straight A's |           240 |       60 |   300
-------------+---------------+----------+------
Not          |         3,760 |      440 | 4,200
-------------+---------------+----------+------
Total        |         4,000 |      500 | 4,500
I would like to recreate this table using pandas. Of course I could create a DataFrame using something like
pd.DataFrame(
    {
        "Graduate": {...},
        "Undergraduate": {...},
        "Total": {...},
    }
)
But that seems like a naive approach that would fall over quickly and not really be extensible.
I've got the non-totals part of the table like this:
df = pd.DataFrame(
    {
        "Undergraduate": {"Straight A's": 240, "Not": 3_760},
        "Graduate": {"Straight A's": 60, "Not": 440},
    }
)
df
I've been looking and found a couple of promising things, like:
df['Total'] = df.sum(axis=1)
But I didn't find anything terribly elegant.
I did find the crosstab function, which looks like it should do what I want, but to use it I'd apparently have to create a DataFrame of 1/0 values for every observation, which seems silly because I've already got the aggregates.
I have found some approaches that manually build a new totals row, but it seems like there should be a better way, something like:
totals(df, rows=True, columns=True)
or something along those lines.
Does this exist in pandas, or do I have to cobble together my own approach?
Recommended answer
You can do this in two steps using the .sum() method, as you suggested (which may also be a bit more readable):
import pandas as pd

df = pd.DataFrame(
    {
        "Undergraduate": {"Straight A's": 240, "Not": 3_760},
        "Graduate": {"Straight A's": 60, "Not": 440},
    }
)

# Sum of each column, stored as a new 'Total' row:
df.loc['Total', :] = df.sum(axis=0)

# Sum of each row (including the new 'Total' row), stored as a new 'Total' column:
df.loc[:, 'Total'] = df.sum(axis=1)
Output:
              Graduate  Undergraduate  Total
Not                440           3760   4200
Straight A's        60            240    300
Total              500           4000   4500
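If you want something closer to the totals(df, rows=True, columns=True) call imagined in the question, the two steps can be wrapped in a small helper. The sketch below is only one possible way to do it: totals is a hypothetical function, not part of pandas, and its parameter names (rows, columns, label) are assumptions borrowed from the question. It uses pd.concat rather than .loc assignment so the original DataFrame is left untouched.

```python
import pandas as pd


def totals(df, rows=True, columns=True, label="Total"):
    """Return a copy of df with an optional sum row and/or sum column.

    Hypothetical helper (not a pandas API):
    rows=True appends a row holding the sum of each column,
    columns=True appends a column holding the sum of each row.
    """
    out = df.copy()
    if rows:
        # Sum down each column, turn the result into a one-row frame
        # labelled `label`, and append it below the existing rows.
        total_row = out.sum(axis=0).to_frame(label).T
        out = pd.concat([out, total_row])
    if columns:
        # Sum across each row; if the total row was added above,
        # this also produces the grand total in the bottom-right cell.
        out[label] = out.sum(axis=1)
    return out


df = pd.DataFrame(
    {
        "Undergraduate": {"Straight A's": 240, "Not": 3_760},
        "Graduate": {"Straight A's": 60, "Not": 440},
    }
)

result = totals(df)
print(result)
```

Calling totals(df, rows=False) or totals(df, columns=False) would add only one of the two margins. The row is added before the column so that the grand total comes out of the same row-wise sum.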