How to plot multiple line charts from a Pandas data frame


Question

I'm trying to make an array of line charts from a data frame like this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({ 'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
                    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], 10000),
                    'TIME_BIN': np.random.randint(1, 86400, size=10000),
                    'COUNT': np.random.randint(1, 700, size=10000)})

df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min').dt.strftime('%H:%M:%S')
print(df)

         CITY  COUNT        DAY  TIME_BIN
0     ATLANTA    270  Wednesday  10:50:00
1     CHICAGO    375  Wednesday  12:20:00
2       MIAMI    490   Thursday  11:30:00
3       MIAMI    571     Sunday  23:30:00
4      DENVER    379   Saturday  07:30:00
...       ...    ...        ...       ...
9995  ATLANTA    107   Saturday  21:10:00
9996   DENVER    127    Tuesday  15:00:00
9997   DENVER    330     Friday  06:20:00
9998  PHOENIX    379   Saturday  19:50:00
9999  CHICAGO    628   Saturday  01:30:00

This is what I have right now:

piv = df.pivot(columns="DAY").plot(x='TIME_BIN', kind="Line", subplots=True)
plt.show()

But the x-axis formatting is messed up, and I need each city to be its own line. How do I fix that? I think I need to loop through each day of the week instead of trying to build the whole array of charts in a single call. I've tried seaborn with no luck. To summarize, this is what I'm trying to achieve:

  • TIME_BIN on the x-axis
  • COUNT on the y-axis
  • a differently colored line for each city
  • one chart for each day

Recommended answer

I don't see how pivoting helps here, since in the end you need to divide your data twice: once by day of the week, with each day going into its own subplot, and again by city, with each city getting its own colored line. At this point we're at the limit of what pandas can do with its plotting wrapper.

Using matplotlib, one can loop through the two categories, days and cities, and simply plot the data.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates

df = pd.DataFrame({ 
    'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday', 
                             'Friday', 'Saturday', 'Sunday'], 10000),
    'TIME_BIN': np.random.randint(1, 86400, size=10000),
    'COUNT': np.random.randint(1, 700, size=10000)})

df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')


days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])
fig, axes = plt.subplots(nrows=len(days), figsize=(13,8), sharex=True)

# loop over days (one could use groupby here, but that would lead to days unsorted)
for i, day in enumerate(days):
    ddf = df[df["DAY"] == day].sort_values("TIME_BIN")
    # loop over cities
    for city in cities:
        dddf = ddf[ddf["CITY"] == city]
        axes[i].plot(dddf["TIME_BIN"], dddf["COUNT"], label=city)
    axes[i].margins(x=0)
    axes[i].set_title(day)


fmt = matplotlib.dates.DateFormatter("%H:%M") 
axes[-1].xaxis.set_major_formatter(fmt)   
axes[0].legend(bbox_to_anchor=(1.02,1))
fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,right=0.85, hspace=0.8)    
plt.show()
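
The comment in the loop above notes that groupby would leave the days unsorted. One possible way around that, sketched here as an assumption rather than as part of the original answer, is to turn DAY into an ordered categorical first, so that groupby yields the groups in weekday order. The helper names `make_demo_frame` and `plot_by_day` are hypothetical, and the seeded random generator is only there to make the demo data reproducible.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']


def make_demo_frame(n=10000, seed=0):
    """Hypothetical helper: random data shaped like the frame in the question."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        'CITY': rng.choice(['PHOENIX', 'ATLANTA', 'CHICAGO', 'MIAMI', 'DENVER'], n),
        'DAY': rng.choice(DAYS, n),
        'TIME_BIN': rng.integers(1, 86400, size=n),
        'COUNT': rng.integers(1, 700, size=n),
    })
    # Keep TIME_BIN as datetimes (rounded to 10 minutes) so matplotlib can format them.
    df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
    return df


def plot_by_day(df):
    """Hypothetical helper: one subplot per weekday, one line per city."""
    df = df.copy()
    # An ordered categorical makes groupby sort the groups Monday..Sunday
    # instead of alphabetically.
    df['DAY'] = pd.Categorical(df['DAY'], categories=DAYS, ordered=True)

    fig, axes = plt.subplots(nrows=len(DAYS), figsize=(13, 8), sharex=True)
    ax_for_day = dict(zip(DAYS, axes))
    for day, title_ax in ax_for_day.items():
        title_ax.set_title(day)

    # Outer groupby splits by day, inner groupby splits by city.
    for day, ddf in df.groupby('DAY', observed=True):
        ax = ax_for_day[day]
        for city, cdf in ddf.groupby('CITY'):
            cdf = cdf.sort_values('TIME_BIN')
            ax.plot(cdf['TIME_BIN'], cdf['COUNT'], label=city)
        ax.margins(x=0)

    axes[-1].xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
    axes[0].legend(bbox_to_anchor=(1.02, 1))
    fig.subplots_adjust(left=0.05, bottom=0.05, top=0.95, right=0.85, hspace=0.8)
    return fig, axes
```

To try it, call `fig, axes = plot_by_day(make_demo_frame())` and then `plt.show()`. Looking up each axis by day name, rather than zipping axes with the groups, keeps the subplots aligned even if a weekday happens to be missing from the data.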

Roughly the same can be achieved with a seaborn FacetGrid.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates
import seaborn as sns

df = pd.DataFrame({ 
    'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday', 
                             'Friday', 'Saturday', 'Sunday'], 10000),
    'TIME_BIN': np.random.randint(1, 86400, size=10000),
    'COUNT': np.random.randint(1, 700, size=10000)})

df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')

days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])

g = sns.FacetGrid(data=df.sort_values('TIME_BIN'), 
                  row="DAY", row_order=days, 
                  hue="CITY", hue_order=cities, sharex=True, aspect=5)
g.map(plt.plot, "TIME_BIN", "COUNT")

g.add_legend()
g.fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,hspace=0.8)
fmt = matplotlib.dates.DateFormatter("%H:%M")
g.axes[-1,-1].xaxis.set_major_formatter(fmt)
plt.show()
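
Both versions above plot the raw rows, and the random demo data contains several rows for the same city, day and time bin, so the lines jump up and down within a bin. If the real data behaves the same way, one option is to aggregate before plotting. The following is a minimal sketch under assumptions that are not in the original answer: that summing COUNT per bin is the right aggregation, and that a bin with no rows for a city means a count of 0. The helper names are hypothetical. Once each day is reduced to one column per city, the pandas plotting wrapper can draw one axis at a time.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']


def make_demo_frame(n=10000, seed=0):
    """Hypothetical helper: random data shaped like the frame in the question."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        'CITY': rng.choice(['PHOENIX', 'ATLANTA', 'CHICAGO', 'MIAMI', 'DENVER'], n),
        'DAY': rng.choice(DAYS, n),
        'TIME_BIN': rng.integers(1, 86400, size=n),
        'COUNT': rng.integers(1, 700, size=n),
    })
    df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
    return df


def aggregate_counts(df):
    """Sum COUNT per (DAY, TIME_BIN) row, with one column per city.

    Assumption: a missing (day, bin, city) combination is treated as 0.
    """
    return df.pivot_table(index=['DAY', 'TIME_BIN'], columns='CITY',
                          values='COUNT', aggfunc='sum', fill_value=0)


def plot_aggregated(df):
    """Hypothetical helper: one subplot per weekday from the aggregated table."""
    wide = aggregate_counts(df)
    present_days = set(wide.index.get_level_values('DAY'))
    fig, axes = plt.subplots(nrows=len(DAYS), figsize=(13, 8), sharex=True)
    for ax, day in zip(axes, DAYS):
        ax.set_title(day)
        if day not in present_days:
            continue  # leave the subplot empty if the day has no data
        # wide.loc[day] is indexed by TIME_BIN with one column per city;
        # x_compat=True makes pandas use plain matplotlib date units.
        wide.loc[day].plot(ax=ax, legend=False, x_compat=True)
        ax.set_xlabel('')
        ax.margins(x=0)
    axes[-1].xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
    axes[0].legend(bbox_to_anchor=(1.02, 1))
    fig.subplots_adjust(left=0.05, bottom=0.05, top=0.95, right=0.85, hspace=0.8)
    return fig, axes
```

Calling `fig, axes = plot_aggregated(make_demo_frame())` followed by `plt.show()` should give the same grid layout as before, with a single value per city and time bin. Swapping `aggfunc='sum'` for `'mean'` is a reasonable alternative if COUNT is a rate rather than a tally.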
