Show first 10 rows of multi-index pandas dataframe
Problem description
I have a pandas DataFrame with a multilevel index, where the first level is year and the second level is username. It has only one column, which is already sorted in descending order. I want to show the first 2 rows for each value of index level 0.
What I have:
               count
year username
2010 b           677
     a           505
     c           400
     d           300
...
2014 a           100
     b            80
What I want:
               count
year username
2010 b           677
     a           505
2011 c           677
     d           505
2012 e           677
     f           505
2013 g           677
     i           505
2014 h           677
     j           505
Recommended answer
Here is an answer. Maybe there is a better way to do this (with indexing?), but I think it works. The principle seems complex but is quite simple:
- Index the DataFrame by year and username.
- Group the DataFrame by year, which is the first level (=0) of the index.
- Apply two operations to the sub-DataFrame that the groupby yields for each year:
  - Sort the rows by count in ascending order, so that the rows with the highest counts end up at the tail of the DataFrame. The original answer wrote this as sort_index(by='count'); the by argument of sort_index has since been removed from pandas, and sort_values('count') is the current equivalent.
  - Keep only the last top rows (2 in this case) by using negative slicing ([-top:]). The tail method (tail(top)) could be used instead to improve readability.
import pandas as pd

# Test data
df = pd.DataFrame({'year': [2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014],
                   'username': ['b', 'a', 'a', 'c', 'c', 'd', 'e', 'f', 'g', 'i', 'h', 'j'],
                   'count': [400, 505, 678, 677, 505, 505, 677, 505, 677, 505, 677, 505]})
df = df.set_index(['year', 'username'])

top = 2
# For each year: sort by count ascending, then keep the last `top` rows.
# sort_values('count') replaces the removed sort_index(by='count').
df = df.groupby(level=0).apply(lambda g: g.sort_values('count')[-top:])
# apply() prepends the group key as an extra index level; drop it again.
df.index = df.index.droplevel(0)
df

               count
year username
2010 a           505
     a           678
2011 d           505
     c           677
2012 f           505
     e           677
2013 i           505
     g           677
2014 j           505
     h           677
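The answer itself wonders whether there is a better way, and the question states that the column is already sorted in descending order within each year, so a shorter route may be possible. The sketch below is not part of the original answer. It assumes pandas is importable as pd, uses a small made-up sample frame, and relies on GroupBy.head, which keeps the first n rows of each group without adding an extra index level. The explicit sort_values call is only a safeguard for data that is not already ordered; it takes the index level name year alongside the column name count.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original post
df = pd.DataFrame({
    'year': [2010, 2010, 2010, 2011, 2011, 2011],
    'username': ['b', 'a', 'c', 'd', 'e', 'f'],
    'count': [400, 677, 505, 120, 300, 90],
}).set_index(['year', 'username'])

top = 2

# Order rows by year (ascending) and count (descending).
# This step can be skipped if the frame is already sorted that way.
ordered = df.sort_values(['year', 'count'], ascending=[True, False])

# Keep the first `top` rows of every year; the original
# (year, username) index is preserved as-is.
top_rows = ordered.groupby(level='year').head(top)

print(top_rows)
```

Because head returns rows with their original index, no droplevel step is needed afterwards. The rows also come out with the largest count first within each year, which matches the layout asked for in the question.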