How to sort a DataFrame by two columns, using a custom order?


Problem description

I have a pandas DataFrame that I need to sort in a particular order in one column, and just ascending in another. Both columns have repeated values. It looks more or less like this:

import pandas as pd

df = pd.DataFrame()
df[0] = pd.Series( [ 'a', 'aa', 'c' ] * 2 )
df[1] = pd.Series( [ 1, 2 ] * 3 )
df[2] = pd.Series( range(6) )
print( df )

    0  1  2
0   a  1  0
1  aa  2  1
2   c  1  2
3   a  2  3
4  aa  1  4
5   c  2  5

Now, say that I need to sort by columns 0 and 1, but not alphabetically: column 0 should follow this custom order:

order = [ 'a', 'c', 'aa' ]

How do I do that?

I would like to have it sorted like this:

print( sorted_df )

    0  1  2
0   a  1  0
1   a  2  3
2   c  1  2
3   c  2  5
4  aa  1  4
5  aa  2  1


Using Python 3.5.2, pandas 0.18.1.

Solution

You can convert column 0 to an ordered categorical Series, which lets you specify a custom sort order for its values:

df[0] = df[0].astype("category").cat.reorder_categories(order, ordered=True)
print(df.sort_values([0, 1]))

    0  1  2
0   a  1  0
3   a  2  3
2   c  1  2
5   c  2  5
4  aa  1  4
1  aa  2  1
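
The sorted result above keeps the original row labels (0, 3, 2, ...), while the output asked for in the question is numbered 0 to 5. A minimal, self-contained sketch of one way to get that, assuming a renumbered index is what is wanted: it uses the `pd.Categorical` constructor as a swapped-in alternative to `astype("category").cat.reorder_categories`, works on a copy so the original column is left untouched, and borrows the name `sorted_df` from the question.

```python
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    0: ["a", "aa", "c"] * 2,
    1: [1, 2] * 3,
    2: list(range(6)),
})
order = ["a", "c", "aa"]

# Work on a copy so the original column keeps its plain string dtype.
sorted_df = df.copy()
sorted_df[0] = pd.Categorical(sorted_df[0], categories=order, ordered=True)

# Sort by the custom order in column 0, then ascending in column 1,
# and renumber the rows from 0 (assumption: the old labels are not needed).
sorted_df = sorted_df.sort_values([0, 1]).reset_index(drop=True)
print(sorted_df)
```

One difference worth knowing: `reorder_categories` raises an error if `order` does not contain exactly the categories present in the column, whereas `pd.Categorical(..., categories=order)` turns any value missing from `order` into NaN. Which behavior is preferable depends on whether unexpected values should fail loudly or be tolerated.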
