Get all keys from GroupBy object in Pandas
Question
I'm looking for a way to get a list of all the keys in a GroupBy object, but I can't seem to find one in the docs or through Google.
There is definitely a way to access the groups through their keys, like so:
df_gb = df.groupby(['EmployeeNumber'])
df_gb.get_group(key)
...so I figure there's a way to access a list (or the like) of the keys in a GroupBy object. I'm looking for something like this:
df_gb.keys
Out: [1234, 2356, 6894, 9492]
I figure I could just loop through the GroupBy object and get the keys that way, but I think there's got to be a better way.
Recommended answer
You can access this via the .groups attribute on the groupby object. It returns a dict whose keys are the group keys:
In [40]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'group':[0,1,1,1,2,2,3,3,3], 'val':np.arange(9)})
gp = df.groupby('group')
gp.groups.keys()
Out[40]:
dict_keys([0, 1, 2, 3])
Here is the output of groups:
In [41]:
gp.groups
Out[41]:
{0: Int64Index([0], dtype='int64'),
1: Int64Index([1, 2, 3], dtype='int64'),
2: Int64Index([4, 5], dtype='int64'),
3: Int64Index([6, 7, 8], dtype='int64')}
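If a plain Python list is what you want (as in the question), the dict keys can be wrapped in list(). The following is a minimal sketch that reuses the example frame above; the variable names keys and rows_for_1 are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': [0, 1, 1, 1, 2, 2, 3, 3, 3], 'val': np.arange(9)})
gp = df.groupby('group')

# .groups is dict-like: each key is a group label,
# each value holds the row labels that belong to that group.
keys = list(gp.groups.keys())

# Look up the row labels of a single group by its key.
rows_for_1 = list(gp.groups[1])

print(keys)
print(rows_for_1)
```

The same key can then be passed to gp.get_group(...) to retrieve the rows of that group.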
Update
It looks like, because groups is a dict, the group order isn't necessarily maintained when you call keys:
In [65]:
df = pd.DataFrame({'group':list('bgaaabxeb'), 'val':np.arange(9)})
gp = df.groupby('group')
gp.groups.keys()
Out[65]:
dict_keys(['b', 'e', 'g', 'a', 'x'])
If you display groups itself, you can see the order is maintained:
In [79]:
gp.groups
Out[79]:
{'a': Int64Index([2, 3, 4], dtype='int64'),
'b': Int64Index([0, 5, 8], dtype='int64'),
'e': Int64Index([7], dtype='int64'),
'g': Int64Index([1], dtype='int64'),
'x': Int64Index([6], dtype='int64')}
To get the keys in that order, one hack is to access the .name attribute of each group:
In [78]:
gp.apply(lambda x: x.name)
Out[78]:
group
a a
b b
e e
g g
x x
dtype: object
This isn't great, because apply isn't vectorised. However, if you already have an aggregated object, you can just get the index values:
In [81]:
agg = gp.sum()
agg
Out[81]:
val
group
a 9
b 13
e 7
g 1
x 6
In [83]:
agg.index.get_level_values(0)
Out[83]:
Index(['a', 'b', 'e', 'g', 'x'], dtype='object', name='group')
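As a hedged sketch of a few other ways to get the keys in a predictable order, assuming the default sort=True on groupby: iterate over the GroupBy object (the loop the question mentions), sort the dict keys explicitly, or read the index of a cheap aggregate such as size(). The variable names below are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': list('bgaaabxeb'), 'val': np.arange(9)})
gp = df.groupby('group')

# Iterating a GroupBy yields (key, sub-frame) pairs;
# with the default sort=True the keys come out in sorted order.
keys_from_loop = [key for key, _ in gp]

# Sorting the dict keys makes the order explicit,
# independent of how the dict happens to be ordered.
keys_sorted = sorted(gp.groups)

# Any aggregate is indexed by the group keys; size() is a cheap one.
keys_from_size = gp.size().index.tolist()

print(keys_from_loop)
print(keys_sorted)
print(keys_from_size)
```

All three approaches should agree here; sorting gp.groups explicitly is the least dependent on pandas or Python version behaviour.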