Difference between "as_index=False" and "reset_index()" in pandas groupby

Question

I just wanted to know what the difference is between the operations performed by these two.

Data:

import pandas as pd
df = pd.DataFrame({"ID":["A","B","A","C","A","A","C","B"], "value":[1,2,4,3,6,7,3,4]})

reset_index():

df_group1 = df.groupby("ID").sum().reset_index()

as_index=False:

df_group2 = df.groupby("ID", as_index=False).sum()

Both of them give exactly the same output.

  ID  value
0  A     18
1  B      6
2  C      6

Can anyone tell me what the difference is, with an example that illustrates it?

Recommended answer

When you use as_index=False, you tell groupby() that you don't want the column ID to be set as the index (duh!). When both implementations yield the same result, use as_index=False, because it saves you some typing and an unnecessary pandas operation ;)
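As a minimal sketch of that point, using the question's own data: the only thing that differs between the two variants is where ID ends up before reset_index() is called. The variable names indexed and flat are illustrative, not from the original post.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# Default (as_index=True): the group key ID becomes the index of the result.
indexed = df.groupby("ID").sum()

# as_index=False: ID stays a regular column and the result gets a default index.
flat = df.groupby("ID", as_index=False).sum()

print(indexed.index.name)   # ID
print(list(flat.columns))   # ['ID', 'value']

# reset_index() moves ID from the index back into a column.
restored = indexed.reset_index()
print(list(restored.columns))  # ['ID', 'value']
```

So reset_index() is an extra step that undoes what the default groupby() did, which is why as_index=False is the shorter route when the end result is the same.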

However, sometimes you want to apply more complicated operations to your groups. In those cases, you may find that one is better suited than the other.

Example 1: You want to sum the values of three variables (i.e. columns) in a group along both axes.

Using as_index=True lets you apply a sum over axis=1 without specifying the column names, and sum the values over axis 0 as well. When the operation is finished, you can use reset_index(drop=True/False) to get the DataFrame into the right shape.
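A small sketch of Example 1, under assumptions: the frame df3 and its columns x, y, z are made up for illustration, and the two sums are applied in the order axis 0 (within each group) and then axis 1 (across columns), which gives the same totals.

```python
import pandas as pd

# Hypothetical data: three numeric variables per row, grouped by ID.
df3 = pd.DataFrame({"ID": ["A", "B", "A", "C"],
                    "x": [1, 2, 3, 4],
                    "y": [10, 20, 30, 40],
                    "z": [100, 200, 300, 400]})

# With ID in the index, only x, y and z remain as columns,
# so sum(axis=1) needs no explicit list of column names.
per_group = df3.groupby("ID").sum()   # sum over axis 0 within each group
totals = per_group.sum(axis=1)        # sum over axis 1 across x, y, z

# Turn the result back into a regular table with ID as a column.
result = totals.reset_index(name="total")
print(result)
```

With as_index=False, ID would still be one of the columns at the sum(axis=1) step, so the numeric columns would have to be selected by name first.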

Example 2: You need to set a value for the group based on the columns used in the groupby().

Setting as_index=False allows you to check the condition on an ordinary column rather than on an index, which is often much easier.
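One way Example 2 might look, as a sketch only: the rule "set value to 0 for group A" is a hypothetical condition chosen for illustration, applied to the question's data.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# as_index=False: the condition is an ordinary boolean mask on the ID column.
flat = df.groupby("ID", as_index=False).sum()
flat.loc[flat["ID"] == "A", "value"] = 0

# as_index=True: the same rule has to be expressed against the index instead.
indexed = df.groupby("ID").sum()
indexed.loc[indexed.index == "A", "value"] = 0

print(flat)
print(indexed)
```

Both versions work; the column-based mask is simply the form most pandas code already uses, and it combines directly with conditions on other columns.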

At some point, you might come across a KeyError when applying operations on groups. In that case, it is often because you are trying to use a column in your aggregate function that is currently the index of your grouped result.
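The simplest form of this error can be reproduced with the question's data. This sketch only shows the plain column lookup case; errors raised from inside a custom aggregate function may look different.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

indexed = df.groupby("ID").sum()   # ID is now the index, not a column

try:
    indexed["ID"]                  # column lookup fails: there is no ID column
    raised = False
except KeyError:
    raised = True

print(raised)  # True

# Two ways around it: read the index directly, or move it back to a column.
ids_from_index = indexed.index.tolist()
ids_from_column = indexed.reset_index()["ID"].tolist()
```

Grouping with as_index=False avoids the situation altogether, since ID never leaves the columns.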
