How to make a stacked line chart with different y-axes in matplotlib?

Problem description

I am wondering how to make a stacked line chart in matplotlib that draws on different columns. I need to aggregate the data on two different columns, so I think I need to build one big dataframe for plotting. I didn't find a cleaner, more convenient way to do this in pandas and matplotlib. Can anyone suggest possible tweaks? Any ideas?

My attempt

This is the first aggregation I need to do:

import pandas as pd
import matplotlib.pyplot as plt

url = "https://gist.githubusercontent.com/adamFlyn/4657714653398e9269263a7c8ad4bb8a/raw/fa6709a0c41888503509e569ace63606d2e5c2ff/mydf.csv"
df = pd.read_csv(url, parse_dates=['date'])

# .copy() avoids SettingWithCopyWarning when a column is added to df_re later
df_re = df[df['retail_item'].str.contains("GROUND BEEF")].copy()
df_rei = df_re.groupby(['date', 'retail_item']).agg({'number_of_ads': 'sum'})
df_rei = df_rei.reset_index(level=[0,1])
df_rei['year'] = pd.DatetimeIndex(df_rei['date']).year
# week of the year, counting Monday as the first day of the week (%W)
df_rei['week'] = df_rei['date'].dt.strftime('%W').astype('uint8')

# long format: one row per (retail_item, week, statistic)
df_ret_df1 = df_rei.groupby(['retail_item', 'week'])['number_of_ads'].agg(['max', 'min', 'mean']).stack().reset_index(level=[2]).rename(columns={'level_2': 'mm', 0: 'vals'}).reset_index()

This is the second aggregation I need to do. It is similar to the first one, except that it uses a different column:

df_re['price_gap'] = df_re['high_price'] - df_re['low_price']
dff_rei1 = df_re.groupby(['date', 'retail_item']).agg({'price_gap': 'mean'})
dff_rei1 = dff_rei1.reset_index(level=[0,1])
dff_rei1['year'] = pd.DatetimeIndex(dff_rei1['date']).year
# week of the year, counting Monday as the first day of the week (%W)
dff_rei1['week'] = dff_rei1['date'].dt.strftime('%W').astype('uint8')

# long format: one row per (retail_item, week, statistic)
dff_ret_df2 = dff_rei1.groupby(['retail_item', 'week'])['price_gap'].agg(['max', 'min', 'mean']).stack().reset_index(level=[2]).rename(columns={'level_2': 'mm', 0: 'vals'}).reset_index()

Now I am struggling with how to combine the output of the first and second aggregations into one dataframe for making a stacked line chart. Is that possible?

Goal:

I want to make stacked line charts whose y-axes show different columns: one y-axis should show the number of ads and the other the price range, while the x-axis shows the 52-week period. This is the partial code I wrote to make the line chart:

import seaborn as sns

for g, d in df_ret_df1.groupby('retail_item'):
    fig, ax = plt.subplots(figsize=(7, 4), dpi=144)
    sns.lineplot(x='week', y='vals', hue='mm', data=d, alpha=.8)
    y1 = d[d.mm == 'max']
    y2 = d[d.mm == 'min']
    plt.fill_between(x=y1.week, y1=y1.vals, y2=y2.vals)

    # the year and price_gap columns live in dff_rei1 (second aggregation), not in df
    for year in dff_rei1['year'].unique():
        data = dff_rei1[(dff_rei1.date.dt.year == year) & (dff_rei1.retail_item == g)]
        sns.lineplot(x='week', y='price_gap', ci=None, data=data, label=year, alpha=.8)

Is there an elegant way to construct the plotting data so that aggregation on different columns can be done easily in pandas? Is there another way to make this happen? Any thoughts?

Desired output:

Here is the desired output that I want to get (image not available):

How should I prepare the plotting data to get a plot like this? Any ideas?

Recommended answer

The pandas groupby feature is very versatile, and you can considerably reduce the lines of code needed to build the final dataframe for plotting.

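# .dt.week was removed in pandas 2.0; .dt.isocalendar().week gives the same ISO week number
plotdf = df_re.groupby(['retail_item', df_re['date'].dt.year, df_re['date'].dt.isocalendar().week]).agg({'number_of_ads': 'sum', 'price_gap': 'mean'}).unstack().T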
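
To make the shape of that result easier to see, here is a minimal sketch on synthetic data. The frame `demo`, its values, and the item names are made up for illustration and are not taken from the original CSV; only the column names mirror the question. The week and year are cast to plain integers, which is a choice made for this sketch rather than something the answer requires.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df_re (hypothetical values, not the original data).
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-07", periods=104, freq="W-MON")
demo = pd.DataFrame({
    "date": np.tile(dates, 2),
    "retail_item": np.repeat(["GROUND BEEF 80%", "GROUND BEEF 90%"], len(dates)),
    "number_of_ads": rng.integers(100, 1000, size=2 * len(dates)),
    "low_price": rng.uniform(2.0, 4.0, size=2 * len(dates)),
})
demo["high_price"] = demo["low_price"] + rng.uniform(0.5, 2.0, size=len(demo))
demo["price_gap"] = demo["high_price"] - demo["low_price"]

# ISO year and week as plain integers; the Series keep the names "year" and "week".
iso = demo["date"].dt.isocalendar()
year = iso["year"].astype(int)
week = iso["week"].astype(int)

# One groupby aggregates both columns, each with its own function.
plot_demo = (
    demo.groupby(["retail_item", year, week])
        .agg({"number_of_ads": "sum", "price_gap": "mean"})
        .unstack("week")   # weeks become columns
        .T                 # rows: (measure, week); columns: (retail_item, year)
)

# Selecting one measure yields a week-indexed table with one column per (item, year).
ads_by_week = plot_demo.loc["number_of_ads"]
print(ads_by_week.head())
```

Each measure can then be pulled out with `plot_demo.loc[...]` and plotted on its own axes, which is what the loop in this answer does.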

Once the aggregation is done the right way, use a for loop to show each measure in its own plot. Plot a shaded range by using the pandas describe feature to compute the min and max on the fly:

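# one subplot per measure, stacked vertically
f, axs = plt.subplots(2, 1, figsize=(20, 14))
axs = axs.ravel()

for i, x in enumerate(['number_of_ads', 'price_gap']):
    # one line per (retail_item, year) across the weeks
    plotdf.loc[x].plot(rot=90, grid=True, ax=axs[i])
    # per-week min and max across all lines, drawn as an area plot
    plotdf.loc[x].T.describe().T[['min', 'max']].plot(kind='area', color=['w', 'grey'], alpha=0.3, ax=axs[i], title=x)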
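
One caveat: pandas area plots are stacked by default, so the grey area above sits on top of the min area and reaches min + max rather than max. If the band should cover exactly the range between the weekly minimum and maximum, a possible alternative is matplotlib's `fill_between`. The sketch below assumes hypothetical week-by-year tables (`measures`, with made-up values) in place of `plotdf.loc[x]`, and uses `sharex=True` so both panels share the week axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop to view plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical weekly tables: rows are weeks 1..52, columns are years.
rng = np.random.default_rng(1)
weeks = pd.Index(range(1, 53), name="week")
years = [2017, 2018, 2019]
measures = {
    "number_of_ads": pd.DataFrame(rng.integers(100, 1000, size=(52, 3)), index=weeks, columns=years),
    "price_gap": pd.DataFrame(rng.uniform(0.5, 2.0, size=(52, 3)), index=weeks, columns=years),
}

# Two vertically stacked panels, each with its own y-axis and a shared x-axis.
fig, axs = plt.subplots(len(measures), 1, figsize=(10, 7), sharex=True)
for ax, (name, table) in zip(axs, measures.items()):
    table.plot(ax=ax, grid=True)                      # one line per year
    lo, hi = table.min(axis=1), table.max(axis=1)     # per-week range across years
    ax.fill_between(table.index, lo, hi, color="grey", alpha=0.3, label="min-max range")
    ax.set_ylabel(name)
    ax.legend(title="year")
axs[-1].set_xlabel("week")
fig.tight_layout()
```

Computing `min` and `max` directly along `axis=1` gives the same two columns that `describe()` would, without the extra statistics.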

With the updated code:

# .dt.week was removed in pandas 2.0; .dt.isocalendar().week gives the same ISO week number
plotdf = df_re.groupby(['retail_item', df_re['date'].dt.year, df_re['date'].dt.isocalendar().week]).agg({'number_of_ads': 'sum', 'weighted_avg': 'mean'}).unstack().T
f, axs = plt.subplots(3, 2, figsize=(20, 14))
axs = axs.ravel()
i = 0
# one panel per (retail_item, measure) pair
for col in plotdf.columns.get_level_values(0).unique():
    for x in ['number_of_ads', 'weighted_avg']:
        plotdf.loc[x, col].plot(rot=90, grid=True, ax=axs[i])
        plotdf.loc[x, col].T.describe().T[['min', 'max']].plot(kind='area', color=['w', 'grey'], alpha=0.3, ax=axs[i], title=col + ', ' + x)
        i += 1
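
The code above gives each measure its own panel. If the two measures should instead share a single panel with separate y-axes, one option not used in this answer is matplotlib's `Axes.twinx`. The following is a rough sketch under that assumption; the `weekly` frame and its values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop to view plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical weekly aggregates for a single retail item.
rng = np.random.default_rng(2)
weekly = pd.DataFrame({
    "week": range(1, 53),
    "number_of_ads": rng.integers(100, 1000, size=52),
    "price_gap": rng.uniform(0.5, 2.0, size=52),
})

fig, ax_ads = plt.subplots(figsize=(10, 4))
ax_gap = ax_ads.twinx()  # second y-axis that shares the same x-axis

ax_ads.plot(weekly["week"], weekly["number_of_ads"], color="tab:blue", label="number_of_ads")
ax_gap.plot(weekly["week"], weekly["price_gap"], color="tab:orange", label="price_gap")

ax_ads.set_xlabel("week")
ax_ads.set_ylabel("number_of_ads")
ax_gap.set_ylabel("price_gap")

# Combine the lines from both axes into one legend.
handles = list(ax_ads.get_lines()) + list(ax_gap.get_lines())
ax_ads.legend(handles, [h.get_label() for h in handles], loc="upper left")
fig.tight_layout()
```

Separate panels are usually easier to read when several years are drawn per measure; a twin axis is best kept to one or two lines per side.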
