排序数据透视表(多索引) [英] Sorting pivot table (multi index)
问题描述
在透视表上放置两个行标签"(Excel术语)后,我试图按降序对透视表的值进行排序.
I'm trying to sort a pivot table's values in descending order after putting two "row labels" (Excel term) on the pivot.
样本数据:
x = pd.DataFrame({'col1':['a','a','b','c','c', 'a','b','c', 'a','b','c'],
'col2':[ 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
'col3':[ 1,.67,0.5, 2,.65, .75,2.25,2.5, .5, 2,2.75]})
print(x)
col1 col2 col3
0 a 1 1.00
1 a 1 0.67
2 b 1 0.50
3 c 1 2.00
4 c 1 0.65
5 a 2 0.75
6 b 2 2.25
7 c 2 2.50
8 a 3 0.50
9 b 3 2.00
10 c 3 2.75
要创建枢轴,我正在使用以下功能:
To create the pivot, I'm using the following function:
pt = pd.pivot_table(x, index = ['col1', 'col2'], values = 'col3', aggfunc = np.sum)
print(pt)
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 1 0.50
2 2.25
3 2.00
c 1 2.65
2 2.50
3 2.75
换句话说,此变量pt
首先按col1
排序,然后按col1
内的col2
值排序,然后按所有这些范围内的col3
排序.很好,但是我想按col3
(值)进行排序,同时将在col2
中细分的组保持不变(此列可以是任意顺序,并且可以随机排列).
In words, this variable pt
is first sorted by col1
, then by values of col2
within col1
then by col3
within all of those. This is great, but I would like to sort by col3
(the values) while keeping the groups that were broken out in col2
(this column can be any order and shuffled around).
目标输出看起来像这样(col3
降序排列,在col2
中以任意顺序排列,并带有col1
组):
The target output would look something like this (col3
in descending order with any order in col2
with that group of col1
):
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 2 2.25
3 2.00
1 0.50
c 3 2.75
1 2.65
2 2.50
我尝试了下面的代码,但是这只是对整个数据透视表值进行排序,并且丢失了分组(我正在寻找组内的排序).
I have tried the code below, but this just sorts the entire pivot table values and loses the grouping (I'm looking for sorting within the group).
pt.sort_values(by = 'col3', ascending = False)
作为指导,在这里提出了一个类似的问题,但我无法通过提供的输出获得成功的输出:
For guidance, a similar question was asked (and answered) here, but I was unable to get a successful output with the provided output:
我从该答案中得到的错误是ValueError: all keys need to be the same shape
The error I get from that answer is ValueError: all keys need to be the same shape
推荐答案
您需要 reset_index
表示DataFrame
,然后 set_index
for MultiIndex
:
You need reset_index
for DataFrame
, then sort_values
by col1
and col3
and last set_index
for MultiIndex
:
df = df.reset_index()
.sort_values(['col1','col3'], ascending=[True, False])
.set_index(['col1','col2'])
print (df)
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 2 2.25
3 2.00
1 0.50
c 3 2.75
1 2.65
2 2.50
这篇关于排序数据透视表(多索引)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!