Grouped boxplot with 2 y axes, 2 plotted variables per x tick
Problem description
I am trying to make a boxplot of an 18-year record of monthly rainfall and flood frequency. Each x tick is a month, and each x tick has two boxplots associated with it: one for rainfall and one for flood frequency. So far I have managed to plot these using seaborn (see the code and image below), but I do not know how to create the boxplot with two y axes, which I need because the two variables have different scales.
The data looks like this (the largest value of Flood_freq in the dataset is 7, not shown here):
Group Rainfall Flood_freq
0 Jan 115.679997 0
1 Jan 72.929999 0
2 Jan 39.719999 0
3 Jan 46.799999 1
4 Jan 54.989998 0
...
212 Dec 51.599998 0
213 Dec 45.359999 0
214 Dec 10.260000 0
215 Dec 52.709998 0
This is the code I have used:
import pandas as pd
import seaborn as sns
dd = pd.melt(FBPdf, id_vars=['Group'], value_vars=['Rainfall', 'Flood_freq'], var_name='Data')
sns.boxplot(x='Group', y='value', data=dd, hue='Data')
Which results in this:
I have since looked at the seaborn documentation, and it seems seaborn does not permit 2 y axes (see "Seaborn boxplot with 2 y-axes"). Can anyone suggest alternatives for what I am trying to achieve? The solutions at the link above do not address the combination of a double y axis and grouped boxplots that I have.
Thank you very much in advance!
With some fake data and a little help from this tutorial and this answer, here is a minimal example of how to achieve what you want using only numpy and matplotlib:
from matplotlib import pyplot as plt
import numpy as np

# fake data: 18 years x 12 months of values
rainfall = np.random.rand(12 * 18) * 300
floods = np.random.rand(12 * 18) * 2

fig, ax1 = plt.subplots()

months = [
    'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
    'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec',
]

ax1.set_xlabel('month')
ax1.set_ylabel('rainfall', color='tab:blue')

# reshape(-1, 12) gives one column (one box) per month, assuming the
# values are stored in chronological order; shift the boxes to the left
res1 = ax1.boxplot(
    rainfall.reshape(-1, 12), positions=np.arange(12) - 0.25, widths=0.4,
    patch_artist=True,
)

# styling approach from https://stackoverflow.com/a/41997865/2454357
for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
    plt.setp(res1[element], color='k')
for patch in res1['boxes']:
    patch.set_facecolor('tab:blue')

ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
ax2.set_ylabel('floods', color='tab:orange')

# same layout for the second variable, shifted to the right
res2 = ax2.boxplot(
    floods.reshape(-1, 12), positions=np.arange(12) + 0.25, widths=0.4,
    patch_artist=True,
)

for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
    plt.setp(res2[element], color='k')
for patch in res2['boxes']:
    patch.set_facecolor('tab:orange')

# set limits and ticks last, after both boxplot calls
ax1.set_xlim([-0.55, 11.55])
ax1.set_xticks(np.arange(12))
ax1.set_xticklabels(months)

fig.tight_layout()  # otherwise the right y-label is slightly clipped
plt.show()
The result looks something like this:
I think with a little fine tuning this can actually look quite nice.
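To carry this over to a DataFrame shaped like the one in the question, something along the following lines might work. It is only a sketch: `twin_boxplot` is a hypothetical helper name, the stand-in `FBPdf` is made-up random data, and the month abbreviations are assumed to match the `Group` values shown in the question. Because the rows in the question are grouped by month rather than stored chronologically, the sketch selects the values for each month by label instead of relying on `reshape(-1, 12)`.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def twin_boxplot(df, left='Rainfall', right='Flood_freq', group='Group'):
    """Hypothetical helper: paired boxplots of two columns on twin y-axes."""
    # Keep only the months that occur in the data, in calendar order.
    present = set(df[group])
    labels = [m for m in MONTHS if m in present]
    pos = np.arange(len(labels))

    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis

    for ax, col, shift, color in [(ax1, left, -0.2, 'tab:blue'),
                                  (ax2, right, 0.2, 'tab:orange')]:
        # One array of values per month; group sizes may differ.
        samples = [df.loc[df[group] == m, col].to_numpy() for m in labels]
        res = ax.boxplot(samples, positions=pos + shift, widths=0.35,
                         patch_artist=True)
        for patch in res['boxes']:
            patch.set_facecolor(color)
        plt.setp(res['medians'], color='k')
        ax.set_ylabel(col, color=color)

    # Set limits and ticks last, after both boxplot calls have run.
    ax1.set_xlim(-0.6, len(labels) - 0.4)
    ax1.set_xticks(pos)
    ax1.set_xticklabels(labels)
    ax1.set_xlabel('month')
    fig.tight_layout()
    return fig, ax1, ax2


# Made-up stand-in for FBPdf: 18 rows per month, grouped by month.
rng = np.random.default_rng(0)
FBPdf = pd.DataFrame({
    'Group': np.repeat(MONTHS, 18),
    'Rainfall': rng.random(12 * 18) * 300,
    'Flood_freq': rng.integers(0, 8, 12 * 18),
})
fig, ax1, ax2 = twin_boxplot(FBPdf)
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or saves the figure. Passing a list of arrays to `boxplot` rather than a 2-D array also means the months do not need to have the same number of rows.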