pandas MultiIndex: sort and keep specific fields

Views: 122 Published: 2017/3/26 0:51:39 Tags: python sorting pandas dataframe multi-index

Problem description

I obtained a MultiIndex in pandas by running series.describe() on a grouped DataFrame. How can I sort these series by each modelName's mean and keep only specific fields? This

summary.sortlevel(1)['kappa']

sorts them but retains all the other fields, such as count. How can I keep only mean and std?

Edit

This is a textual representation of the DataFrame:

                                             kappa
modelName                                         
biasTotal                          count  5.000000
                                   mean   0.526183
                                   std    0.013429
                                   min    0.507536
                                   25%    0.519706
                                   50%    0.525565
                                   75%    0.538931
                                   max    0.539175
biasTotalWithDistanceMetricAccount count  5.000000
                                   mean   0.527275
                                   std    0.014218
                                   min    0.506428
                                   25%    0.520438
                                   50%    0.529771
                                   75%    0.538475
                                   max    0.541262
lightGBMbiasTotal                  count  5.000000
                                   mean   0.531639
                                   std    0.013819
                                   min    0.513363

Solution

You can do it this way:

Data:

In [77]: df
Out[77]:
                        0
level_1 level_0
a       25%      2.000000
        50%      4.000000
        75%      7.000000
        count    5.000000
        max      7.000000
        mean     4.400000
        min      2.000000
        std      2.509980
b       25%      2.000000
        50%      6.000000
        75%      8.000000
        count    5.000000
        max      8.000000
        mean     5.000000
        min      1.000000
        std      3.316625
c       25%      3.000000
        50%      4.000000
        75%      5.000000
        count    5.000000
        max      8.000000
        mean     4.000000
        min      0.000000
        std      2.915476
d       25%      4.000000
        50%      8.000000
        75%      8.000000
        count    5.000000
        max      9.000000
        mean     6.000000
        min      1.000000
        std      3.391165

Solution:

In [78]: df.loc[pd.IndexSlice[:, ['mean','std']], :]
Out[78]:
                        0
level_1 level_0
a       mean     4.400000
        std      2.509980
b       mean     5.000000
        std      3.316625
c       mean     4.000000
        std      2.915476
d       mean     6.000000
        std      3.391165
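
The selection above keeps only the mean and std rows but leaves the groups in index order, while the question also asks for ordering by mean. One possible way to do both, offered as a sketch rather than as part of the original answer, is to pivot the statistics into columns, sort on the mean column, and stack back if the long layout is needed. The seeded generator and the names wide, wide_sorted and long_form are assumptions introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like `df` above. The seed is an addition so that
# repeated runs give the same numbers; it is not part of the original answer.
rng = np.random.default_rng(0)
df = (pd.DataFrame(rng.integers(0, 10, (5, 4)), columns=list('abcd'))
        .describe()
        .stack()
        .reset_index()
        .set_index(['level_1', 'level_0'])
        .sort_index())

# Move the statistic level into columns: one row per group (a, b, c, d),
# then keep only the two statistics of interest.
wide = df[0].unstack('level_0')[['mean', 'std']]

# Order the groups by their mean.
wide_sorted = wide.sort_values('mean')
print(wide_sorted)

# Optional: return to the long (group, statistic) layout.
long_form = wide_sorted.stack()
print(long_form)
```

The wide table is often the more convenient result, since each group occupies a single row and further sorting or filtering is an ordinary column operation.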

Setup:

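import numpy as np
import pandas as pd

# describe() each column, then stack into a (column, statistic) MultiIndex.
# The values are random, so they will differ from the output shown above.
df = (pd.DataFrame(np.random.randint(0, 10, (5, 4)), columns=list('abcd'))
        .describe()
        .stack()
        .reset_index()
        .set_index(['level_1', 'level_0'])
        .sort_index()
)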
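
If the raw, ungrouped data is still available, it may be simpler to skip describe() and compute only the two statistics. The sketch below assumes a raw table with one kappa value per row and a modelName column; the frame raw and its values are hypothetical and are not taken from the question.

```python
import pandas as pd

# Hypothetical raw scores: one kappa value per (model, run).
raw = pd.DataFrame({
    'modelName': ['m1', 'm1', 'm2', 'm2', 'm3', 'm3'],
    'kappa':     [0.52, 0.54, 0.50, 0.51, 0.55, 0.53],
})

# Compute just mean and std per model, then order the models by mean.
summary = (raw.groupby('modelName')['kappa']
              .agg(['mean', 'std'])
              .sort_values('mean'))
print(summary)
```

This avoids the MultiIndex altogether: the result has one row per model and one column per statistic.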
