Pandas: group by and Pivot table difference

Question

I just started learning Pandas and was wondering whether there is any difference between the pandas groupby and pivot_table functions. Can anyone help me understand the difference between them? Help would be appreciated.

Recommended answer

Both pivot_table and groupby are used to aggregate your dataframe. The difference is only with regard to the shape of the result.

Using pd.pivot_table(df, index=["a"], columns=["b"], values=["c"], aggfunc=np.sum), a table is created where a is on the row axis, b is on the column axis, and the values are the sums of c.

Example:

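import numpy as np
import pandas as pd

# Six rows: a takes the values 1-3, b takes the values 1-2, c is random
df = pd.DataFrame({"a": [1,2,3,1,2,3], "b":[1,1,1,2,2,2], "c":np.random.rand(6)})
pd.pivot_table(df, index=["a"], columns=["b"], values=["c"], aggfunc=np.sum)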

b         1         2
a                    
1  0.528470  0.484766
2  0.187277  0.144326
3  0.866832  0.650100

Using groupby, the dimensions given are placed side by side on the row axis (as levels of the index), and a row is created for each combination of those dimensions.

In this example, we create a series of the sums of c, grouped by all unique combinations of a and b.

df.groupby(['a','b'])['c'].sum()

a  b
1  1    0.528470
   2    0.484766
2  1    0.187277
   2    0.144326
3  1    0.866832
   2    0.650100
Name: c, dtype: float64
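
As the output above suggests, the grouping keys end up in the index of the result (a two-level index here) rather than as ordinary columns. The following is a minimal sketch of turning them back into regular columns with reset_index(). It assumes fixed values for c instead of the random ones used in the answer, purely so that the result is reproducible; the names grouped and flat are illustrative.

```python
import pandas as pd

# Fixed values for "c" (an assumption; the answer above uses random numbers)
df = pd.DataFrame({
    "a": [1, 2, 3, 1, 2, 3],
    "b": [1, 1, 1, 2, 2, 2],
    "c": [10, 20, 30, 40, 50, 60],
})

# Series whose index has two levels, a and b
grouped = df.groupby(["a", "b"])["c"].sum()

# Move the index levels back into ordinary columns: a, b, c
flat = grouped.reset_index()

print(grouped.index.names)
print(list(flat.columns))
```

Passing as_index=False to groupby is another way to keep the keys as columns when that layout is more convenient.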

groupby can be used in a similar way if we omit the ['c']. In this case, it creates a dataframe (not a series) of the sums of all remaining columns, grouped by the unique combinations of a and b.

print(df.groupby(["a","b"]).sum())
            c
a b          
1 1  0.528470
  2  0.484766
2 1  0.187277
  2  0.144326
3 1  0.866832
  2  0.650100
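
To make the "only the shape differs" point concrete, here is a small sketch showing that the groupby result can be reshaped into the pivot_table layout by unstacking the b level. It again assumes fixed values for c so the comparison is deterministic, and it passes values and the keys as plain strings rather than lists to keep the column index single-level; wide_form, long_form and reshaped are illustrative names.

```python
import pandas as pd

# Fixed values for "c" (an assumption made for reproducibility)
df = pd.DataFrame({
    "a": [1, 2, 3, 1, 2, 3],
    "b": [1, 1, 1, 2, 2, 2],
    "c": [10, 20, 30, 40, 50, 60],
})

# Wide layout: a on the rows, b on the columns
wide_form = pd.pivot_table(df, index="a", columns="b", values="c", aggfunc="sum")

# Long layout: one row per (a, b) combination
long_form = df.groupby(["a", "b"])["c"].sum()

# Moving the index level "b" to the column axis gives the wide layout
reshaped = long_form.unstack("b")

print(wide_form)
print(reshaped)
print((wide_form.values == reshaped.values).all())
```

Going the other way, wide_form.stack() returns to one row per combination of a and b, so which function to use is mostly a question of which layout is wanted at the end.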
