How to create groupby subplots in Pandas?

Problem description

I've got a dataframe of crime time series data, faceted by offence (in the format shown below). I'd like to produce a groupby plot of the dataframe so that it's possible to explore trends in crime over time.

    Offence                     Rolling year total number of offences   Month
0   Criminal damage and arson   1001                                    2003-03-31
1   Drug offences               66                                      2003-03-31
2   All other theft offences    617                                     2003-03-31
3   Bicycle theft               92                                      2003-03-31
4   Domestic burglary           282                                     2003-03-31

I've got some code which does the job, but it's a bit clumsy and it loses the time series formatting that Pandas delivers on a single plot. (I've included an image to illustrate). Can anyone suggest an idiom for such plots that I can use?

I would turn to Seaborn, but I can't work out how to format the x-axis labels as a time series.

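import matplotlib.pyplot as plt

# Assumes df["Month"] is already a datetime column.
col = "Rolling year total number of offences"

# One quarterly-resampled series per offence, from 2010 onwards.
# (resample(..., how="sum") and .ix were removed from later pandas versions.)
subs = []
for i, g in df.groupby("Offence"):
    subs.append({"data": g.set_index("Month")[col].resample("QS-APR").sum().loc["2010":],
                 "title": i})

fig = plt.figure(figsize=(25, 15))
for i, g in enumerate(subs):
    plt.subplot(5, 5, i + 1)  # subplot positions are numbered from 1
    plt.plot(g['data'])
    plt.title(g['title'])
    plt.xlabel("Time")
    plt.ylabel("No. of crimes")
plt.tight_layout()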

Recommended answer

This is a reproducible example of six scatter plots in pandas, one per year for six consecutive years, obtained by grouping the data with groupby(). The x-axis shows the oil price (Brent) for the year, and the y-axis shows the value of the S&P 500 for the same year.

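import matplotlib.pyplot as plt
import pandas as pd
import Quandl as ql  # later releases of the package are imported as lowercase `quandl`
%matplotlib inline
# (the line above is an IPython/Jupyter magic, not plain Python)

brent = ql.get('FRED/DCOILBRENTEU')
sp500 = ql.get('YAHOO/INDEX_GSPC')
# Align the two series on their shared dates and keep 2009 through 2015.
values = pd.DataFrame({'brent': brent.VALUE, 'sp500': sp500.Close}).dropna().loc["2009":"2015"]

# A 2x3 grid gives six axes; zip() stops once they are used up,
# so only the first six yearly groups are plotted.
fig, axes = plt.subplots(2, 3, figsize=(15, 5))
for (year, group), ax in zip(values.groupby(values.index.year), axes.flatten()):
    group.plot(x='brent', y='sp500', kind='scatter', ax=ax, title=str(year))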

This produces the following plots:

(As an aside, these plots suggest a strong correlation between oil and the S&P 500 in 2010 but not in the other years.)
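
A visual impression of correlation like this can be checked numerically by computing one coefficient per group. The following is a minimal sketch, assuming synthetic data in place of the Quandl download: the `values` frame here is made up, so it will not reproduce the 2010 observation, and `yearly_corr` is a name introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the `values` frame (hypothetical data, not real prices).
rng = np.random.default_rng(0)
idx = pd.bdate_range("2009-01-01", "2014-12-31")
brent = pd.Series(rng.normal(size=len(idx)).cumsum() + 80, index=idx)
noise = pd.Series(rng.normal(size=len(idx)), index=idx)
values = pd.DataFrame({"brent": brent, "sp500": 10 * brent + noise})

# One Pearson correlation coefficient per calendar year.
yearly_corr = pd.Series({
    year: group["brent"].corr(group["sp500"])
    for year, group in values.groupby(values.index.year)
})
print(yearly_corr)
```

The same grouping key, `values.index.year`, drives both the subplots and the per-year statistic, so the two stay in step.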

You can change kind in group.plot() to suit your particular data. I expect that pandas will preserve the date formatting on the x-axis if your data contains dates.
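
Applied to the crime data in the question, the same idiom might look like the sketch below. It is not the original poster's data: the frame is synthetic, the column name `Total` is a shortened, hypothetical stand-in for "Rolling year total number of offences", and the two-column grid is an arbitrary choice. Plotting each resampled series through pandas, rather than through plt.plot, is what should keep the time series formatting on the x-axis.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the crime data (hypothetical numbers).
rng = np.random.default_rng(1)
months = pd.date_range("2010-01-01", periods=48, freq="MS")
offences = ["Bicycle theft", "Drug offences",
            "Domestic burglary", "Criminal damage and arson"]
df = pd.DataFrame({
    "Offence": np.repeat(offences, len(months)),
    "Month": list(months) * len(offences),
    "Total": rng.integers(50, 1000, size=len(offences) * len(months)),
})

groups = df.groupby("Offence")
ncols = 2
nrows = math.ceil(groups.ngroups / ncols)
fig, axes = plt.subplots(nrows, ncols, figsize=(12, 3 * nrows), squeeze=False)

# One subplot per offence; each series is plotted by pandas on its own axes.
for (offence, group), ax in zip(groups, axes.flatten()):
    quarterly = group.set_index("Month")["Total"].resample("QS-APR").sum()
    quarterly.plot(ax=ax, title=offence)
    ax.set_xlabel("Time")
    ax.set_ylabel("No. of crimes")

# Hide any axes left over when the grid has more cells than groups.
for ax in axes.flatten()[groups.ngroups:]:
    ax.set_visible(False)

fig.tight_layout()
```

Sizing the grid from `groups.ngroups` avoids hard-coding the number of panels, which the 5x5 grid in the question had to guess.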
