pandas multi index sort specific fields
Problem description
I obtained a multi index in pandas by running series.describe() for a grouped dataframe. How can I sort these series by modelName.mean and only keep specific fields?
This
summary.sortlevel(1)['kappa']
sorts them but retains all the other fields like count. How can I keep only mean and std?
Edit: this is a textual representation of the df.
                                             kappa
modelName
biasTotal                          count  5.000000
                                   mean   0.526183
                                   std    0.013429
                                   min    0.507536
                                   25%    0.519706
                                   50%    0.525565
                                   75%    0.538931
                                   max    0.539175
biasTotalWithDistanceMetricAccount count  5.000000
                                   mean   0.527275
                                   std    0.014218
                                   min    0.506428
                                   25%    0.520438
                                   50%    0.529771
                                   75%    0.538475
                                   max    0.541262
lightGBMbiasTotal                  count  5.000000
                                   mean   0.531639
                                   std    0.013819
                                   min    0.513363
Answer
You can do it this way:
Data:
In [77]: df
Out[77]:
                        0
level_1 level_0
a       25%      2.000000
        50%      4.000000
        75%      7.000000
        count    5.000000
        max      7.000000
        mean     4.400000
        min      2.000000
        std      2.509980
b       25%      2.000000
        50%      6.000000
        75%      8.000000
        count    5.000000
        max      8.000000
        mean     5.000000
        min      1.000000
        std      3.316625
c       25%      3.000000
        50%      4.000000
        75%      5.000000
        count    5.000000
        max      8.000000
        mean     4.000000
        min      0.000000
        std      2.915476
d       25%      4.000000
        50%      8.000000
        75%      8.000000
        count    5.000000
        max      9.000000
        mean     6.000000
        min      1.000000
        std      3.391165
Solution:
In [78]: df.loc[pd.IndexSlice[:, ['mean','std']], :]
Out[78]:
                        0
level_1 level_0
a       mean     4.400000
        std      2.509980
b       mean     5.000000
        std      3.316625
c       mean     4.000000
        std      2.915476
d       mean     6.000000
        std      3.391165
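For readers who want to try the selection without random data, here is a minimal sketch. The two input columns are made-up values, and the boolean-mask variant is an alternative I am adding for comparison, not part of the original answer.

```python
import pandas as pd

# Hypothetical input; only the column names echo the answer's toy frame.
raw = pd.DataFrame({"a": [2, 4, 7, 2, 7], "b": [1, 2, 6, 8, 8]})

# Build the same layout as the answer's df:
# outer index level = column name, inner level = statistic, single value column 0.
df = raw.describe().unstack().sort_index().to_frame(name=0)

# Selection from the answer: every outer label, only 'mean' and 'std' on the inner level.
idx = pd.IndexSlice
picked = df.loc[idx[:, ["mean", "std"]], :]

# Alternative (an assumption of this sketch): filter with a boolean mask on the inner level.
mask = df.index.get_level_values(1).isin(["mean", "std"])
picked_mask = df[mask]

print(picked)
```

Both forms keep only the mean and std rows for each outer label; the mask form can be easier to read when the list of statistics is built at runtime.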
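Setup:
import numpy as np
import pandas as pd

# Describe four random integer columns, then index the result by (column, statistic).
df = (pd.DataFrame(np.random.randint(0, 10, (5, 4)), columns=list('abcd'))
        .describe()
        .stack()
        .reset_index()
        .set_index(['level_1', 'level_0'])
        .sort_index()
     )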
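The answer covers the filtering half of the question. The sorting half (ordering models by their mean) is not shown, so the following is one possible sketch, not the answerer's method. It rebuilds a small part of the question's summary from the printed values (count, mean and std only) and pivots the statistic level into columns before sorting. The variable names wide and ranked are illustrative. As far as I know, sortlevel is no longer available in recent pandas releases, where sort_index(level=1) is the usual replacement.

```python
import pandas as pd

# Values copied from the question's printout; only three statistics per model are kept.
models = ["biasTotal", "biasTotalWithDistanceMetricAccount", "lightGBMbiasTotal"]
stats = ["count", "mean", "std"]
summary = pd.DataFrame(
    {"kappa": [5.0, 0.526183, 0.013429,
               5.0, 0.527275, 0.014218,
               5.0, 0.531639, 0.013819]},
    index=pd.MultiIndex.from_product([models, stats], names=["modelName", None]),
)

# Pivot the inner (statistic) level into columns and keep only mean and std.
wide = summary["kappa"].unstack()[["mean", "std"]]

# Order the models by their mean kappa, highest first.
ranked = wide.sort_values("mean", ascending=False)

print(ranked)
```

Working in the wide layout makes the sort a plain sort_values call; whether descending order is wanted is an assumption, so flip ascending as needed.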