Sorting in a Pandas pivot_table


Question

I have been looking all over trying to figure out how to sort my pivot table correctly, and I haven't had any luck.

    client          unit    task                hours   month
0   A               DVADA   Account Management  6.50    January     
1   A               DVADA   Buying              1.25    January 
2   A               DVADA   Meeting / Call      0.50    January 
3   A               DVADA   Account Management  3.00    January 
4   A               DVADA   Billing             2.50    February    
5   A               DVADA   Account Management  6.50    February        
6   A               DVADA   Buying              1.25    February    
7   A               DVADA   Meeting / Call      0.50    February    
8   A               DVADA   Account Management  3.00    February    
9   A               DVADA   Billing             2.50    February
10  A               DVADA   Billing             2.50    December    
11  A               DVADA   Account Management  6.50    December        
12  A               DVADA   Buying              1.25    December    
13  A               DVADA   Meeting / Call      0.50    December    
14  A               DVADA   Account Management  3.00    December    
15  A               DVADA   Billing             2.50    December
16  A               DVADA   Account Management  6.50    August      
17  A               DVADA   Buying              1.25    August  
18  A               DVADA   Meeting / Call      0.50    August  
19  A               DVADA   Account Management  3.00    August
20  A               DVADA   Account Management  6.50    April       
21  A               DVADA   Buying              1.25    April   
22  A               DVADA   Meeting / Call      0.50    April   
23  A               DVADA   Account Management  3.00    April
24  B               DVADA   Account Management  6.50    January     
25  B               DVADA   Buying              1.25    January 
26  B               DVADA   Meeting / Call      0.50    January 
27  B               DVADA   Account Management  3.00    January 
28  B               DVADA   Billing             2.50    February    
29  B               DVADA   Account Management  6.50    February        
30  B               DVADA   Buying              1.25    February    
31  B               DVADA   Meeting / Call      0.50    February    
32  B               DVADA   Account Management  3.00    February    
33  B               DVADA   Billing             2.50    February
34  B               DVADA   Billing             2.50    December    
35  B               DVADA   Account Management  6.50    December        
36  B               DVADA   Buying              1.25    December    
37  B               DVADA   Meeting / Call      0.50    December    
38  B               DVADA   Account Management  3.00    December    
39  B               DVADA   Billing             2.50    December
40  B               DVADA   Account Management  6.50    August      
41  B               DVADA   Buying              1.25    August  
42  B               DVADA   Meeting / Call      0.50    August  
43  B               DVADA   Account Management  3.00    August
44  B               DVADA   Account Management  6.50    April       
45  B               DVADA   Buying              1.25    April   
46  B               DVADA   Meeting / Call      0.50    April   
47  C               DVADA   Account Management  3.00    April
48  C               DVADA   Account Management  6.50    January     
49  C               DVADA   Buying              1.25    January 
50  C               DVADA   Meeting / Call      0.50    January 
51  C               DVADA   Account Management  3.00    January 
52  C               DVADA   Billing             2.50    February    
53  C               DVADA   Account Management  6.50    February        
54  C               DVADA   Buying              1.25    February    
55  C               DVADA   Meeting / Call      0.50    February    
56  C               DVADA   Account Management  3.00    February    
57  C               DVADA   Billing             2.50    February
58  C               DVADA   Billing             2.50    December    
59  C               DVADA   Account Management  6.50    December        
60  C               DVADA   Buying              1.25    December    
61  C               DVADA   Meeting / Call      0.50    December    
62  C               DVADA   Account Management  3.00    December    
63  C               DVADA   Billing             2.50    December
64  C               DVADA   Account Management  6.50    August      
65  C               DVADA   Buying              1.25    August  
66  C               DVADA   Meeting / Call      0.50    August  
67  C               DVADA   Account Management  3.00    August
68  C               DVADA   Account Management  6.50    April       
69  C               DVADA   Buying              1.25    April   
70  C               DVADA   Meeting / Call      0.50    April   
71  C               DVADA   Account Management  3.00    April

df = pd.pivot_table(vp_clients, values='hours', index=['client', 'month'], aggfunc=sum)

This returns a pivot table with three columns (client, month, hours). Each client has 12 months (Jan-Dec), and each of those months has a total number of hours.

                        hours
client          month

A               April   203.50
                August  227.75
                December 159.75
                February 203.25
                January 199.25

B               April   203.50
                August  227.75
                December 159.75
                February 203.25
                January 199.25

C               April   203.50
                August  227.75
                December 159.75
                February 203.25
                January 199.25

I want to sort this pivot table by month in calendar order while keeping the client column intact.

                           hours
client           month

A               January 203.50
                February 227.75
                March    159.75
                April    203.25
                May     199.90

B               January 203.50
                February 227.75
                March    159.75
                April    203.25
                May     199.90

C               January 203.50
                February 227.75
                March    159.75
                April    203.25
                May     199.90

The sorting issue has been solved by Scott's answer below. Now I would like to add a row for each client containing the total hours used.

                           hours
client           month

A               January    203.50
                February   227.75
                March      159.75
                April      203.25
                May        199.90
                Total     1000.34

B               January    203.50
                February   227.75
                March      159.75
                April      203.25
                May       199.90
                Total     1000.34

C               January   203.50
                February   227.75
                March      159.75
                April      203.25
                May       199.90
                Total     1000.34

Any help would be greatly appreciated.

Recommended answer

Update: adding a Total row at the end of each client

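import pandas as pd

# Make month an ordered categorical so it sorts in calendar order.
# 'Total' is the last category, so the total row sorts after December.
vp_clients['month'] = pd.Categorical(vp_clients['month'],
                                     ordered=True,
                                     categories=['January','February','March',
                                                 'April','May','June','July',
                                                 'August','September','October',
                                                 'November','December','Total'])

# observed=True keeps only the client/month pairs that occur in the data
df = pd.pivot_table(vp_clients, values='hours', index=['client', 'month'],
                    aggfunc='sum', observed=True)

# Drop any client/month rows whose hours are NaN
df = df.dropna()

# Per-client totals, labelled 'Total' on the month level
# (df.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() replaces it)
totals = df.groupby(level=0).sum().assign(month='Total').set_index('month', append=True)
pd.concat([df, totals]).sort_index()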

Output:

                 hours
client month          
A      January   11.25
       February  16.25
       April     11.25
       August    11.25
       December  16.25
       Total     66.25
B      January   11.25
       February  16.25
       April      8.25
       August    11.25
       December  16.25
       Total     63.25
C      January   11.25
       February  16.25
       April     14.25
       August    11.25
       December  16.25
       Total     69.25
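
The same layout can also be reached without a categorical index. The sketch below is one possible alternative, not the answerer's method: it works on a flat frame, maps each month name to an integer rank, and sorts on that rank. The `sample` frame, the `MONTHS` list, and the `rank` column are hypothetical names and values made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of vp_clients; the values are made up for illustration.
sample = pd.DataFrame({
    "client": ["A", "A", "A", "B", "B", "B"],
    "month": ["January", "April", "February", "December", "January", "January"],
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hours per client and month, kept as ordinary columns
monthly = sample.groupby(["client", "month"], as_index=False)["hours"].sum()

# One total row per client, labelled 'Total' in the month column
totals = monthly.groupby("client", as_index=False)["hours"].sum().assign(month="Total")

combined = pd.concat([monthly, totals], ignore_index=True)

# Rank months in calendar order, with 'Total' ranked last
order = {name: position for position, name in enumerate(MONTHS + ["Total"])}
combined["rank"] = combined["month"].map(order)

result = (combined.sort_values(["client", "rank"])
                  .drop(columns="rank")
                  .set_index(["client", "month"]))
print(result)
```

Because `month` stays a plain string column here, appending the `Total` rows needs no change to any category list.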


Let's use pd.Categorical:

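import pandas as pd

# Make month an ordered categorical so it sorts in calendar order
vp_clients['month'] = pd.Categorical(vp_clients['month'],
                                     ordered=True,
                                     categories=['January','February','March',
                                                 'April','May','June','July',
                                                 'August','September','October',
                                                 'November','December'])

# observed=True keeps only the client/month pairs that occur in the data
df = pd.pivot_table(vp_clients, values='hours', index=['client', 'month'],
                    aggfunc='sum', observed=True)

# Drop any client/month rows whose hours are NaN
df.dropna()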

Output:

                 hours
client month          
A      January   11.25
       February  16.25
       April     11.25
       August    11.25
       December  16.25
B      January   11.25
       February  16.25
       April      8.25
       August    11.25
       December  16.25
C      January   11.25
       February  16.25
       April     14.25
       August    11.25
       December  16.25
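
As a self-contained check of the idea, here is a minimal sketch on a made-up miniature of `vp_clients` (the `sample` values and the `MONTHS` name are hypothetical). It assumes a pandas version where `pivot_table` accepts `observed=True`, which keeps only the client/month pairs that occur in the data.

```python
import pandas as pd

# Hypothetical miniature of vp_clients; the values are made up for illustration.
sample = pd.DataFrame({
    "client": ["A", "A", "A", "B", "B", "B"],
    "month": ["January", "April", "February", "December", "January", "January"],
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# An ordered categorical sorts by position in MONTHS rather than alphabetically
sample["month"] = pd.Categorical(sample["month"], categories=MONTHS, ordered=True)

pivot = pd.pivot_table(sample, values="hours", index=["client", "month"],
                       aggfunc="sum", observed=True)
print(pivot)
```

The months within each client come out in calendar order because the group keys are sorted by category position.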
