Pandas: Using sort_values to Sort 2 Dataframes, then Sub-Sort by Date


Question

I have two dataframes containing a similar type of information. I'm attempting to merge them together and reorganize them. Here is a sample of the dataframes:

df1 = 
Member Nbr    Name-First      Name-Last      Date-Join 
20                    Zoe        Soumas     2011-08-01   
3128               Julien        Bougie     2011-07-22   
3535               Michel        Bibeau     2015-02-18   
4116          Christopher        Duthie     2014-12-02   
4700                Manoj       Chauhan     2014-11-11   
4802                 Anna        Balian     2014-07-26   
5004             Abdullah         Cekic     2012-03-12   
5130             Raymonde        Girard     2011-01-04  



df2 =      
Member Nbr    Name-First      Name-Last      Date-Join 
3762              Robert        Ortopan     2010-01-31   
3762              Robert        Ortopan     2010-02-28   
3892           Christian         Burnet     2010-03-24   
3892           Christian         Burnet     2010-04-24   
5022              Robert      Ngabirano     2010-06-25   
5022              Robert      Ngabirano     2010-07-28 

What I would like to have is a dataframe that is sorted by Member Nbr, where a member who appears more than once is then sorted again by join date. So I would have:

df12 =  
Member Nbr    Name-First      Name-Last      Date-Join 
20                   Zoe         Soumas     2011-08-01   
3128              Julien         Bougie     2011-07-22   
3535              Michel         Bibeau     2015-02-18  
3762              Robert        Ortopan     2010-01-31   
3762              Robert        Ortopan     2010-02-28
3892           Christian         Burnet     2010-03-24   
3892           Christian         Burnet     2010-04-24     
4116         Christopher         Duthie     2014-12-02   
4700               Manoj        Chauhan     2014-11-11   
4802                Anna         Balian     2014-07-26   
5004            Abdullah          Cekic     2012-03-12
5022              Robert      Ngabirano     2010-06-25   
5022              Robert      Ngabirano     2010-07-28    
5130            Raymonde         Girard     2011-01-04 

I've managed to concatenate both dataframes using df12 = pd.concat([df1, df2], ignore_index=True), which places df2 at the bottom of df1. After running

df12.sort_values(by='Member Nbr', axis=0, inplace=True)

the members are arranged in ascending order, but those that appear more than once (with different join dates) are arranged in descending order by date. That is:

Member Nbr    Name-First      Name-Last      Date-Join 
20                   Zoe         Soumas     2011-08-01   
3128              Julien         Bougie     2011-07-22   
3535              Michel         Bibeau     2015-02-18  
3762              Robert        Ortopan     2010-02-28  # Wrongly sorted 
3762              Robert        Ortopan     2010-01-31
3892           Christian         Burnet     2010-04-24  # Wrongly sorted  
3892           Christian         Burnet     2010-03-24     
4116         Christopher         Duthie     2014-12-02   
4700               Manoj        Chauhan     2014-11-11   
4802                Anna         Balian     2014-07-26   
5004            Abdullah          Cekic     2012-03-12
5022              Robert      Ngabirano     2010-07-28 # Wrongly sorted   
5022              Robert      Ngabirano     2010-06-25    
5130            Raymonde         Girard     2011-01-04

Is there a way to have those members with more than one join date also be arranged in ascending order by date?

Recommended answer

The by parameter can be a list of columns, so that the dataframe is sorted by the first column, with ties broken by the second column, further ties broken by the third column, and so on.

df12.sort_values(by=['Member Nbr', 'Date-Join'], inplace=True)

yields

    Member Nbr   Name-First  Name-Last  Date-Join
0           20          Zoe     Soumas 2011-08-01
1         3128       Julien     Bougie 2011-07-22
2         3535       Michel     Bibeau 2015-02-18
4         3762       Robert    Ortopan 2010-01-31
3         3762       Robert    Ortopan 2010-02-28
6         3892    Christian     Burnet 2010-03-24
5         3892    Christian     Burnet 2010-04-24
7         4116  Christopher     Duthie 2014-12-02
8         4700        Manoj    Chauhan 2014-11-11
9         4802         Anna     Balian 2014-07-26
10        5004     Abdullah      Cekic 2012-03-12
12        5022       Robert  Ngabirano 2010-06-25
11        5022       Robert  Ngabirano 2010-07-28
13        5130     Raymonde     Girard 2011-01-04
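
For reference, the whole flow (concatenate, then sort on two keys) can be sketched end to end. This is a minimal example, not the asker's actual code: it uses only a few rows from the sample data, and the rows in df2 are deliberately placed out of date order so that the effect of the second sort key is visible. The reset_index(drop=True) call at the end is an optional addition that renumbers the rows after sorting.

```python
import pandas as pd

# Trimmed-down versions of the sample dataframes (only a few rows reproduced).
df1 = pd.DataFrame({
    "Member Nbr": [20, 3128, 5130],
    "Name-First": ["Zoe", "Julien", "Raymonde"],
    "Name-Last": ["Soumas", "Bougie", "Girard"],
    "Date-Join": pd.to_datetime(["2011-08-01", "2011-07-22", "2011-01-04"]),
})
df2 = pd.DataFrame({
    "Member Nbr": [3762, 3762, 3892, 3892],
    "Name-First": ["Robert", "Robert", "Christian", "Christian"],
    "Name-Last": ["Ortopan", "Ortopan", "Burnet", "Burnet"],
    # Deliberately out of date order within each member.
    "Date-Join": pd.to_datetime(
        ["2010-02-28", "2010-01-31", "2010-04-24", "2010-03-24"]
    ),
})

# Stack df2 under df1, then sort by member number, breaking ties by join date.
df12 = pd.concat([df1, df2], ignore_index=True)
df12 = df12.sort_values(by=["Member Nbr", "Date-Join"]).reset_index(drop=True)

print(df12)
```

Assigning the result back to df12 is equivalent to passing inplace=True as in the answer; either style works here.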

Note that for this to work correctly, the Date-Join column should be of type datetime.
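
If Date-Join was read in as text (for example from a CSV file), it can be converted with pd.to_datetime before sorting. The sketch below assumes a hypothetical day/month/year text format, which is not the format shown in the question, purely to illustrate a case where sorting the strings gives a different order than sorting the dates.

```python
import pandas as pd

# Hypothetical data: join dates stored as day/month/year strings.
df = pd.DataFrame({
    "Member Nbr": [3762, 3762],
    "Date-Join": ["31/01/2010", "28/02/2010"],
})

# Sorting the strings compares them character by character,
# so "28/02/2010" comes before "31/01/2010".
as_text = df.sort_values(by=["Member Nbr", "Date-Join"])

# Convert to datetime, then sort again: now the order is chronological.
df["Date-Join"] = pd.to_datetime(df["Date-Join"], format="%d/%m/%Y")
as_dates = df.sort_values(by=["Member Nbr", "Date-Join"])

print(as_text)
print(as_dates)
```

Calling df.dtypes is a quick way to check whether the column is already datetime64 before sorting.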
