Pandas: iterate over unique values of a column that is already in sorted order

Question

I have constructed a pandas data frame in sorted order and would like to iterate over groups having identical values of a particular column. It seems to me that the groupby functionality is useful for this, but as far as I can tell, performing groupby does not give any guarantee about the order of the keys. How can I extract the unique column values in sorted order?

Here is an example data frame:

Foo,1
Foo,2
Bar,2
Bar,1

I would like a list ["Foo","Bar"] where the order is guaranteed by the order of the original data frame. I can then use this list to extract the appropriate rows. In my case the sort is actually defined by columns that are also present in the data frame (not included in the example above), so a solution that re-sorts is acceptable if the information cannot be pulled out directly.

Recommended answer

As mentioned in the comments, you can call unique on the column, which preserves the order of first appearance (unlike numpy's unique, it doesn't sort):

In [11]: df
Out[11]: 
     0  1
0  Foo  1
1  Foo  2
2  Bar  2
3  Bar  1

In [12]: df[0].unique()
Out[12]: array(['Foo', 'Bar'], dtype=object)
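
To make the contrast with numpy concrete, here is a minimal, self-contained sketch. It rebuilds the example frame with integer column labels 0 and 1 (an assumption that matches the session output above) and compares the two calls; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question; columns are labelled 0 and 1.
df = pd.DataFrame([["Foo", 1], ["Foo", 2], ["Bar", 2], ["Bar", 1]])

# Series.unique() returns values in order of first appearance.
in_order = list(df[0].unique())

# np.unique() returns the unique values sorted, so the original order is lost.
sorted_values = np.unique(df[0].to_numpy()).tolist()

print(in_order)       # ['Foo', 'Bar']
print(sorted_values)  # ['Bar', 'Foo']
```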

Then you can access the relevant rows using groupby's get_group:

In [13]: g = df.groupby(0)

In [14]: g.get_group('Foo')
Out[14]: 
     0  1
0  Foo  1
1  Foo  2    
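
Putting the two steps together, the loop the question asks for might look like the sketch below. It is not part of the original answer: the first loop drives get_group from unique(), and the second is an alternative that relies on groupby's sort=False option, which keeps groups in order of first appearance rather than sorting the keys. The dictionary names are hypothetical.

```python
import pandas as pd

# Same example frame as above; columns are labelled 0 and 1.
df = pd.DataFrame([["Foo", 1], ["Foo", 2], ["Bar", 2], ["Bar", 1]])

# Option 1: walk the unique keys in order and fetch each group explicitly.
grouped = df.groupby(0)
via_unique = {}
for key in df[0].unique():
    via_unique[key] = grouped.get_group(key)[1].tolist()

# Option 2: let groupby iterate, with sort=False so keys are not reordered.
via_groupby = {}
for key, rows in df.groupby(0, sort=False):
    via_groupby[key] = rows[1].tolist()

print(via_unique)   # {'Foo': [1, 2], 'Bar': [2, 1]}
print(via_groupby)  # {'Foo': [1, 2], 'Bar': [2, 1]}
```

Both loops visit "Foo" before "Bar" and keep the rows inside each group in their original order. With the default sort=True, groupby would iterate the keys in sorted order ("Bar" first) instead.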
