How to get the average of DataFrame column values
Question
A B
DATE
2013-05-01 473077 71333
2013-05-02 35131 62441
2013-05-03 727 27381
2013-05-04 481 1206
2013-05-05 226 1733
2013-05-06 NaN 4064
2013-05-07 NaN 41151
2013-05-08 NaN 8144
2013-05-09 NaN 23
2013-05-10 NaN 10
Say I have the DataFrame above. What is the easiest way to get a Series with the same index that holds the average of columns A and B? The average needs to ignore NaN values. The twist is that the solution needs to keep working when new columns are added to the DataFrame.
The closest I have come is:
df.sum(axis=1) / len(df.columns)
However, this does not seem to ignore the NaN values.
(Note: I am still a bit new to the pandas library, so I'm guessing there is an obvious way to do this that I am simply not seeing.)
Recommended answer
Simply using df.mean() will do the right thing with respect to NaNs, because it skips them by default. Pass axis=1 to average across the columns of each row:
>>> df
A B
DATE
2013-05-01 473077 71333
2013-05-02 35131 62441
2013-05-03 727 27381
2013-05-04 481 1206
2013-05-05 226 1733
2013-05-06 NaN 4064
2013-05-07 NaN 41151
2013-05-08 NaN 8144
2013-05-09 NaN 23
2013-05-10 NaN 10
>>> df.mean(axis=1)
DATE
2013-05-01 272205.0
2013-05-02 48786.0
2013-05-03 14054.0
2013-05-04 843.5
2013-05-05 979.5
2013-05-06 4064.0
2013-05-07 41151.0
2013-05-08 8144.0
2013-05-09 23.0
2013-05-10 10.0
dtype: float64
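To see why the attempt in the question falls short, here is a minimal sketch that rebuilds the frame above. Constructing it by hand with pd.date_range is an assumption about how the data is laid out, and the names naive, row_mean and manual are only illustrative. df.sum(axis=1) does skip NaNs, but dividing by len(df.columns) still counts the missing cells, so rows containing a NaN come out too small. Dividing by df.count(axis=1), the number of non-NaN cells per row, should match df.mean(axis=1).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (assumed layout: daily DATE index, columns A and B).
idx = pd.date_range("2013-05-01", periods=10, name="DATE")
df = pd.DataFrame(
    {
        "A": [473077, 35131, 727, 481, 226] + [np.nan] * 5,
        "B": [71333, 62441, 27381, 1206, 1733, 4064, 41151, 8144, 23, 10],
    },
    index=idx,
)

# Attempt from the question: the divisor is always 2, even when A is NaN.
naive = df.sum(axis=1) / len(df.columns)

# Row-wise mean: NaNs are skipped in both the sum and the count.
row_mean = df.mean(axis=1)

# Hand-rolled equivalent: divide by the number of non-NaN cells in each row.
manual = df.sum(axis=1) / df.count(axis=1)

print(pd.DataFrame({"naive": naive, "mean": row_mean, "manual": manual}))
```

For 2013-05-06, where A is NaN and B is 4064, the naive version gives 2032.0 while df.mean(axis=1) gives 4064.0.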
If there are other columns to ignore, you can use df[["A", "B"]].mean(axis=1).
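As a rough illustration of the difference, the sketch below uses a small made-up frame with a hypothetical extra column C that should stay out of the average. The column name and the values are assumptions for the example only, not part of the original data.

```python
import numpy as np
import pandas as pd

# Small made-up frame; column "C" is a hypothetical column to leave out.
df = pd.DataFrame(
    {
        "A": [1.0, np.nan, 3.0],
        "B": [3.0, 4.0, np.nan],
        "C": [100.0, 200.0, 300.0],
    }
)

# Averages every column present, so a newly added column is picked up automatically.
all_cols = df.mean(axis=1)

# Averages only the columns named explicitly.
only_ab = df[["A", "B"]].mean(axis=1)

# Alternative: name the columns to exclude, so future columns are still included.
without_c = df.drop(columns="C").mean(axis=1)

print(only_ab)
```

Selecting columns by name pins the result to A and B, while dropping the unwanted column keeps the calculation open to columns added later. Which one fits depends on whether new columns should count toward the average.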