pandas sort with capital letters


Question

Running this code:

df = pd.DataFrame(['ADc','Abc','AEc'],columns = ['Test'],index=[0,1,2])
df.sort(columns=['Test'],axis=0, ascending=False,inplace=True)

returns a DataFrame column ordered as [Abc, AEc, ADc]. I expected ADc to come before AEc. What's going on?

Recommended answer

I don't think this is a pandas bug. It seems to be just how Python's sorting works with mixed-case strings: the comparison is case-sensitive, and uppercase letters order before lowercase ones.

The same thing happens with a plain Python list:

In [1]: l1 = ['ADc','Abc','AEc']
In [2]: l1.sort(reverse=True)
In [3]: l1
Out[3]: ['Abc', 'AEc', 'ADc']
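
To make the ordering above concrete, here is a minimal sketch in plain Python (no pandas). It shows the code points that decide the comparison and how the built-in `key` argument gives a case-insensitive order. The variable names are illustrative only.

```python
words = ['ADc', 'Abc', 'AEc']

# Strings are compared character by character using Unicode code points.
# Uppercase 'D' (68) and 'E' (69) come before lowercase 'b' (98).
code_points = {ch: ord(ch) for ch in 'DEb'}

# Case-sensitive descending sort: 'b' is the largest second character,
# so 'Abc' comes first, followed by 'AEc' and 'ADc'.
case_sensitive_desc = sorted(words, reverse=True)

# Case-insensitive ascending sort: compare lowercased copies via `key`,
# while the original strings are what ends up in the result.
case_insensitive_asc = sorted(words, key=str.lower)

print(code_points)           # {'D': 68, 'E': 69, 'b': 98}
print(case_sensitive_desc)   # ['Abc', 'AEc', 'ADc']
print(case_insensitive_asc)  # ['Abc', 'ADc', 'AEc']
```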

So, since the pandas sort method apparently gives no control over how strings are compared, just sort on a lowercased copy of the column and drop it afterwards:

In [4]: df = pd.DataFrame(['ADc','Abc','AEc'], columns=['Test'], index=[0,1,2])
In [5]: df['test'] = df['Test'].str.lower()
In [6]: df.sort(columns=['test'], axis=0, ascending=True, inplace=True)
In [7]: df.drop('test', axis=1, inplace=True)
In [8]: df
Out[8]:
  Test
1  Abc
0  ADc
2  AEc

Note: If you want the column sorted alphabetically, the ascending argument must be set to True.
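
For readers on a current pandas, where `DataFrame.sort` is no longer available, the same helper-column idea can be sketched with `sort_values`. This is an adaptation rather than the original answer's code; the helper column name `_test_lower` is a hypothetical choice, and any unused name would do.

```python
import pandas as pd

df = pd.DataFrame({'Test': ['ADc', 'Abc', 'AEc']}, index=[0, 1, 2])

# Hypothetical helper column name, used only for the sort.
helper = '_test_lower'

result = (
    df.assign(**{helper: df['Test'].str.lower()})  # add lowercased copy
      .sort_values(by=helper, ascending=True)      # sort on the copy
      .drop(columns=helper)                        # remove the helper again
)

print(result)
```

Chaining `assign`, `sort_values` and `drop` leaves the original `df` unmodified, which avoids the `inplace=True` calls used in the session above.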

As DSM suggested, to avoid creating a new helper column, you can do:

df = df.loc[df["Test"].str.lower().order().index]

Update:

As pointed out by weatherfrog, for newer versions of pandas the correct method is .sort_values(), which replaces the older DataFrame.sort and Series.order. So the above one-liner becomes:

df = df.loc[df["Test"].str.lower().sort_values().index]
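
As a further option not mentioned in the original answer, `sort_values` also accepts a `key` argument. The sketch below assumes pandas 1.1 or later, where that argument was added, and compares it with the one-liner above.

```python
import pandas as pd

df = pd.DataFrame({'Test': ['ADc', 'Abc', 'AEc']}, index=[0, 1, 2])

# `key` receives the whole column as a Series and must return a Series
# of the same length; the rows are ordered by the returned values.
# Assumption: pandas >= 1.1.
by_key = df.sort_values(by='Test', key=lambda col: col.str.lower())

# The one-liner from the answer, for comparison.
by_index = df.loc[df['Test'].str.lower().sort_values().index]

print(by_key)
print(by_key.equals(by_index))
```

Both approaches sort on lowercased values while keeping the original strings and index labels in the result.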
