如何按中值对 pandas 中的箱线图进行排序? [英] How can I sort a boxplot in pandas by the median values?

查看:58
本文介绍了如何按中值对 pandas 中的箱线图进行排序?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想按类别XY 在数据框df 中绘制列Z 的箱线图.如何按中位数降序对箱线图进行排序?

将pandas导入为pd随机导入n = 100# 这可能是一种生成随机数据的奇怪方式;请随时纠正它df = pd.DataFrame({"X": [random.choice(["A","B","C"]) for i in range(n)],"Y": [random.choice(["a","b","c"]) for i in range(n)],"Z": [random.gauss(0,1) for i in range(n)]})df.boxplot(column="Z", by=["X", "Y"])

请注意

I want to draw a boxplot of column Z in dataframe df by the categories X and Y. How can I sort the boxplot by the median, in descending order?

import pandas as pd
import random
n = 100
# this is probably a strange way to generate random data; please feel free to correct it
df = pd.DataFrame({"X": [random.choice(["A","B","C"]) for i in range(n)], 
                   "Y": [random.choice(["a","b","c"]) for i in range(n)],
                   "Z": [random.gauss(0,1) for i in range(n)]})
df.boxplot(column="Z", by=["X", "Y"])

Note that this question is very similar, but they use a different data structure. I'm relatively new to pandas (and have only done some tutorials on python in general), so I couldn't figure out how to make my data work with the answer posted there. This may well be more of a reshaping than a plotting question. Maybe there is a solution using groupby?

解决方案

You can use the answer in How to sort a boxplot by the median values in pandas but first you need to group your data and create a new data frame:

import pandas as pd
import random
import matplotlib.pyplot as plt

n = 100
# this is probably a strange way to generate random data; please feel free to correct it
df = pd.DataFrame({"X": [random.choice(["A","B","C"]) for i in range(n)], 
                   "Y": [random.choice(["a","b","c"]) for i in range(n)],
                   "Z": [random.gauss(0,1) for i in range(n)]})
grouped = df.groupby(["X", "Y"])

df2 = pd.DataFrame({col:vals['Z'] for col,vals in grouped})

meds = df2.median()
meds.sort_values(ascending=False, inplace=True)
df2 = df2[meds.index]
df2.boxplot()

plt.show()

这篇关于如何按中值对 pandas 中的箱线图进行排序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆