How to sort a pandas DataFrame non-lexically?

Problem description

To sort the credit column in the following DataFrame, I use the sort_values() function (I have also tried sort()):

df.sort_values('credit', ascending=False, inplace=True)

The problem is that the credits end up sorted like this:

                i    credit           m  reg_date          b      id
----------------------------------------------------------------------
238             0   4600000.00        0  2014-04-14      False  102214   
127             0   4600000.00        0  2014-12-30      False  159479   
13              0  16800000.00        0  2015-01-12      False  163503   
248             0  16720000.00        0  2012-11-11      False    5116

ascending is False, which is why 4600000.00 comes before the other credits: the values appear to be compared as text rather than as numbers. That is not what I want. I want to sort by numeric value, so in the sample above 16800000.00 and 16720000.00 should come before 4600000.00. How can I sort this DataFrame non-lexically?
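A quick way to confirm the symptom described above is to inspect the column's dtype and compare both orderings. The snippet below is only a sketch: it assumes the credit column was loaded as strings, and the three-row frame is a hypothetical miniature of the data in the question.

```python
import pandas as pd

# Hypothetical miniature of the question's data, with credit stored as text.
df = pd.DataFrame(
    {"credit": ["4600000.00", "16800000.00", "16720000.00"]},
    index=[238, 13, 248],
)

# A non-numeric dtype (object or string) signals that the values are text.
print(df["credit"].dtype)

# Text compares character by character, so "4..." outranks "1..." when descending.
lexical_order = df.sort_values("credit", ascending=False)["credit"].tolist()

# After casting to float, the comparison is by numeric value.
numeric_order = df["credit"].astype(float).sort_values(ascending=False).tolist()

print(lexical_order)
print(numeric_order)
```

If the dtype printed for the real data is already float64, the cause lies elsewhere and this diagnosis does not apply.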

EDIT-1:
The data contains more rows than shown above, for example:

120             0  16708000.00        0  2013-12-17      False   51433
248             0  16720000.00        0  2012-11-11      False    5116
13              0  16800000.00        0  2015-01-12      False  163503
21              0   4634000.00        0  2014-12-29      False  159239
136             0   4650000.00        0  2012-11-07      False    4701
..            ...          ...      ...         ...        ...     ...
231             0   7715000.00        0  2014-02-15      False   83936
182             0   7750000.00        0  2015-07-13      False  201584

Recommended answer

You could sort the column separately after casting it to float, then use the resulting sorted index to reorder the original DataFrame with .loc.

In your case:

import pandas as pd
from io import StringIO  # Python 3 location of StringIO

text = """136             0   4650000.00        0  2012-11-07      False    4701
231             0   7715000.00        0  2014-02-15      False   83936
13              0  16800000.00        0  2015-01-12      False  163503
120             0  16708000.00        0  2013-12-17      False   51433
248             0  16720000.00        0  2012-11-11      False    5116
21              0   4634000.00        0  2014-12-29      False  159239
182             0   7750000.00        0  2015-07-13      False  201584
"""

# Parse the whitespace-separated sample; the first column becomes the index.
df = pd.read_csv(StringIO(text), sep=r'\s+',
                 header=None, index_col=0,
                 names=['i', 'credit', 'm', 'reg_date', 'b', 'id'])

# Sort credit as float, then reorder the whole frame by the sorted index.
print(df.loc[df.credit.astype(float).sort_values(ascending=False).index])

     i      credit  m    reg_date      b      id
13   0  16800000.0  0  2015-01-12  False  163503
248  0  16720000.0  0  2012-11-11  False    5116
120  0  16708000.0  0  2013-12-17  False   51433
182  0   7750000.0  0  2015-07-13  False  201584
231  0   7715000.0  0  2014-02-15  False   83936
136  0   4650000.0  0  2012-11-07  False    4701
21   0   4634000.0  0  2014-12-29  False  159239
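Two alternatives to the index-based reordering above are sketched here. They are not part of the original answer. Both assume the credit column holds numeric text, and the small frame is hypothetical sample data. The first converts the column once with pd.to_numeric, so that later sorts are numeric. The second leaves the column as text and passes a key function to sort_values, which requires pandas 1.1 or newer.

```python
import pandas as pd

# Hypothetical sample; credit is deliberately stored as text.
df = pd.DataFrame(
    {
        "credit": ["4650000.00", "16800000.00", "7715000.00"],
        "id": [4701, 163503, 83936],
    },
    index=[136, 13, 231],
)

# Option 1: convert the column once, then sort by numeric value.
converted = df.assign(credit=pd.to_numeric(df["credit"]))
by_value = converted.sort_values("credit", ascending=False)

# Option 2 (pandas >= 1.1): keep the text column and sort through a numeric key.
by_key = df.sort_values("credit", ascending=False, key=pd.to_numeric)

print(by_value)
print(by_key)
```

Option 1 is usually the better choice when the column will also be used for arithmetic. Option 2 fits when the original text formatting has to be preserved.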
