How to re-order a pandas DataFrame based on a dictionary condition

Question

I have a df like this:

    case        step    deep                   value
0   case 1      1       ram in India           ram,cricket
1   NaN         2       ram plays cricket       NaN
2   case 2      1       ravi played football   ravi
3   NaN         2       ravi works welll        NaN
4   case 3      1       Sri bought a car       sri
5   NaN         2       sri went out            NaN

and a dictionary, my_dict = {'ram':1,'cricket':1,'ravi':2.5,'sri':1}

I am trying to re-order the dataframe according to the values in the dictionary, which I obtained using the tf-idf method. I am having difficulty with the re-ordering because the rows of each case have to move together with the row that holds the values.

My expected output is:

    case        step    deep                   value
2   case 2      1       ravi played football   ravi
3   NaN         2       ravi works welll        NaN
0   case 1      1       ram in India           ram,cricket
1   NaN         2       ram plays cricket       NaN
4   case 3      1       Sri bought a car       sri
5   NaN         2       sri went out            NaN

Please help, thanks!

Recommended answer

You can create a MultiIndex for sorting. The only requirement is that every value in the value column is a key in my_dict:

my_dict = {'ram':1,'cricket':1,'ravi':2.5,'sri':1}

# split value into one column per word, map each word to its weight, sum per row
a = df['value'].str.split(',', expand=True).replace(my_dict).sum(axis=1)
# label each case: a new group starts whenever step stops increasing
b = df['step'].diff().le(0).cumsum()
# broadcast each group's total score to all of its rows
c = a.groupby(b).transform('sum')
# use (score, group) as a temporary MultiIndex
df.index = [c,b]
print (df)
            case  step                  deep        value
    step                                                 
2.0 0     case 1     1          ram in India  ram,cricket
    0        NaN     2     ram plays cricket          NaN
2.5 1     case 2     1  ravi played football         ravi
    1        NaN     2      ravi works welll          NaN
1.0 2     case 3     1      Sri bought a car          sri
    2        NaN     2          sri went out          NaN
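
To make the intermediate steps concrete, here is a self-contained sketch that rebuilds the sample frame from the question and computes the three helper Series. The frame construction and the `row_score` helper are assumptions added for illustration and are not part of the original answer; `row_score` swaps the `replace(...).sum(axis=1)` chain for a plain dictionary lookup so that the result does not depend on how `replace` infers dtypes in a given pandas version.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    'case': ['case 1', np.nan, 'case 2', np.nan, 'case 3', np.nan],
    'step': [1, 2, 1, 2, 1, 2],
    'deep': ['ram in India', 'ram plays cricket', 'ravi played football',
             'ravi works welll', 'Sri bought a car', 'sri went out'],
    'value': ['ram,cricket', np.nan, 'ravi', np.nan, 'sri', np.nan],
})
my_dict = {'ram': 1, 'cricket': 1, 'ravi': 2.5, 'sri': 1}


def row_score(cell):
    """Hypothetical helper: sum the weights of the comma-separated words."""
    if pd.isna(cell):
        return 0.0
    return float(sum(my_dict[word.strip()] for word in cell.split(',')))


# Per-row score: 2.0, 0.0, 2.5, 0.0, 1.0, 0.0
a = df['value'].apply(row_score)
# Group label per case: 0, 0, 1, 1, 2, 2
b = df['step'].diff().le(0).cumsum()
# Group total repeated on every row of the group: 2.0, 2.0, 2.5, 2.5, 1.0, 1.0
c = a.groupby(b).transform('sum')

print(pd.DataFrame({'a': a, 'b': b, 'c': c}))
```

The group label `b` relies on `step` restarting at a lower or equal number for each new case, which holds for the sample data.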

# sort by the MultiIndex in descending order, then drop it
df = df.sort_index(ascending=False).reset_index(drop=True)
print (df)
     case  step                  deep        value
0  case 2     1  ravi played football         ravi
1     NaN     2      ravi works welll          NaN
2  case 1     1          ram in India  ram,cricket
3     NaN     2     ram plays cricket          NaN
4  case 3     1      Sri bought a car          sri
5     NaN     2          sri went out          NaN
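
As an alternative sketch, the same ordering can be expressed with explicit sort keys instead of a temporary MultiIndex. The function name `reorder_by_weights`, the sample-frame construction, and the choice to score words missing from the dictionary as 0 are all assumptions for illustration, not part of the original answer. Sorting on score (descending), then group and step (ascending), spells out the tie-breaking rather than relying on how `sort_index` orders rows that share an index value.

```python
import numpy as np
import pandas as pd


def reorder_by_weights(df, weights):
    """Hypothetical helper: move whole cases so the highest total weight comes first."""
    def row_score(cell):
        if pd.isna(cell):
            return 0.0
        # Assumption: a word that is not in `weights` contributes 0.
        return float(sum(weights.get(word.strip(), 0.0) for word in cell.split(',')))

    # A new group starts whenever step stops increasing.
    group = df['step'].diff().le(0).cumsum()
    # Total score of each group, repeated on every row of that group.
    score = df['value'].apply(row_score).groupby(group).transform('sum')

    keys = pd.DataFrame({
        'score': score.to_numpy(),
        'group': group.to_numpy(),
        'step': df['step'].to_numpy(),
    })
    order = keys.sort_values(
        ['score', 'group', 'step'], ascending=[False, True, True]
    ).index.to_numpy()
    # Positional selection keeps the original index labels.
    return df.iloc[order]


# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    'case': ['case 1', np.nan, 'case 2', np.nan, 'case 3', np.nan],
    'step': [1, 2, 1, 2, 1, 2],
    'deep': ['ram in India', 'ram plays cricket', 'ravi played football',
             'ravi works welll', 'Sri bought a car', 'sri went out'],
    'value': ['ram,cricket', np.nan, 'ravi', np.nan, 'sri', np.nan],
})
my_dict = {'ram': 1, 'cricket': 1, 'ravi': 2.5, 'sri': 1}

result = reorder_by_weights(df, my_dict)
print(result)
```

This version keeps the original row labels (2, 3, 0, 1, 4, 5), as in the expected output in the question; append `.reset_index(drop=True)` if a fresh 0..n-1 index is preferred.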
