Pandas DataFrame multicolor line plot

Question

I have a pandas DataFrame with a DateTime index and two columns representing wind speed and ambient temperature. Here is the data for half a day:

                        temp        winds

2014-06-01 00:00:00     8.754545    0.263636
2014-06-01 01:00:00     8.025000    0.291667
2014-06-01 02:00:00     7.375000    0.391667
2014-06-01 03:00:00     6.850000    0.308333
2014-06-01 04:00:00     7.150000    0.258333
2014-06-01 05:00:00     7.708333    0.375000
2014-06-01 06:00:00     9.008333    0.391667
2014-06-01 07:00:00     10.858333   0.300000
2014-06-01 08:00:00     12.616667   0.341667
2014-06-01 09:00:00     15.008333   0.308333
2014-06-01 10:00:00     17.991667   0.491667
2014-06-01 11:00:00     21.108333   0.491667
2014-06-01 12:00:00     21.866667   0.395238
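
For anyone who wants to reproduce the plots, the sample above could be rebuilt roughly as follows. This is only a sketch: the name `df` is chosen to match the code in the answer, and the index is assumed to be exactly hourly, as the table suggests.

```python
import pandas as pd

# Values copied from the table above: 13 samples from 00:00 to 12:00.
temp = [8.754545, 8.025000, 7.375000, 6.850000, 7.150000, 7.708333, 9.008333,
        10.858333, 12.616667, 15.008333, 17.991667, 21.108333, 21.866667]
winds = [0.263636, 0.291667, 0.391667, 0.308333, 0.258333, 0.375000, 0.391667,
         0.300000, 0.341667, 0.308333, 0.491667, 0.491667, 0.395238]

# 13 evenly spaced timestamps between midnight and noon, i.e. one per hour.
index = pd.date_range("2014-06-01 00:00:00", "2014-06-01 12:00:00", periods=13)
df = pd.DataFrame({"temp": temp, "winds": winds}, index=index)
```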

I would like to plot this data as a single line whose color changes with temperature, for example from light red to dark red as the temperature rises.

I found this example of multicolored lines with matplotlib, but I have no idea how to use it with a pandas DataFrame. Does anyone have an idea what I could do? If this is possible, would it also be possible, as an additional feature, to vary the line width according to wind speed, so that the faster the wind, the wider the line?

Thanks for your help!

Recommended answer

The built-in plot method in pandas probably won't be able to do this. You need to extract the data and plot it with matplotlib.

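import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mpd
from matplotlib.collections import LineCollection

# Convert the DatetimeIndex to matplotlib's float date representation
x = mpd.date2num(df.index.to_pydatetime())
y = df.winds.values
c = df['temp'].values

# Build one two-point segment for each pair of consecutive samples
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Color the segments by temperature; Normalize sets the range mapped onto the colormap
lc = LineCollection(segments, cmap=plt.get_cmap('copper'), norm=plt.Normalize(0, 10))
lc.set_array(c)
lc.set_linewidth(3)

ax = plt.gca()
ax.add_collection(lc)
plt.xlim(min(x), max(x))

# Turn the float x values back into date-time tick labels
ax.xaxis.set_major_locator(mpd.HourLocator())
ax.xaxis.set_major_formatter(mpd.DateFormatter('%Y-%m-%d:%H:%M:%S'))
_ = plt.setp(ax.xaxis.get_majorticklabels(), rotation=70)
plt.savefig('temp.png')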

There are two issues worth mentioning:

  • The range of the color gradient is controlled by norm=plt.Normalize(0, 10).
  • pandas and matplotlib plot time series differently, which requires df.index to be converted to float before plotting. By setting the major locator and formatter of the x-axis, we get its major tick labels back into date-time format.
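
As an illustration of the first point: the sample temperatures run from about 6.9 to 21.9, so with a fixed range of (0, 10) every reading above 10 lands on the same color at the top of the colormap. One possible alternative, sketched below, is to derive the range from the data itself. The array `c` here is a hypothetical stand-in for `df['temp'].values`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical temperature values standing in for df['temp'].values
c = np.array([8.75, 6.85, 12.62, 17.99, 21.87])

# Span the colormap over the observed range instead of a fixed (0, 10)
norm = plt.Normalize(vmin=c.min(), vmax=c.max())
cmap = plt.get_cmap("copper")

# Each value is first scaled into [0, 1], then looked up as an RGBA color
scaled = norm(c)
colors = cmap(scaled)
```

Passing this `norm` to `LineCollection` in place of `plt.Normalize(0, 10)` would spread the gradient across the whole temperature range.
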
The second issue may cause a problem when we want to plot more than one line, because the data will be plotted in two separate x ranges:

    # continuing from the figure plotted above:
    df['another'] = np.random.random(13)
    print(ax.get_xticks())
    df.another.plot(ax=ax, secondary_y=True)
    print(ax.get_xticks(minor=True))
    
    [ 735385.          735385.04166667  735385.08333333  735385.125
      735385.16666667  735385.20833333  735385.25        735385.29166667
      735385.33333333  735385.375       735385.41666667  735385.45833333
      735385.5       ]
    [389328 389330 389332 389334 389336 389338 389340]
    

Therefore we need to do it without the pandas .plot() method:

    ax.twinx().plot(x, df.another)
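
The question also asked about varying the line width with wind speed, which the answer does not cover. One way this might be done is to pass a sequence of widths to `set_linewidth`, one per segment. The sketch below is not the answerer's code. It uses hypothetical stand-in data, averages the two endpoints of each segment to get one value per segment, and picks an arbitrary width range of 1 to 6 points.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mpd
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection

# Hypothetical stand-in data with the same shape as the question's DataFrame
index = pd.date_range("2014-06-01 00:00", "2014-06-01 12:00", periods=13)
rng = np.random.default_rng(0)
df = pd.DataFrame({"temp": np.linspace(7, 22, 13),
                   "winds": rng.uniform(0.25, 0.5, 13)}, index=index)

x = mpd.date2num(df.index.to_pydatetime())
y = df["winds"].values
c = df["temp"].values

# One two-point segment per pair of consecutive samples
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# One value per segment: the mean of its two endpoints
seg_temp = (c[:-1] + c[1:]) / 2
seg_wind = (y[:-1] + y[1:]) / 2

# Linearly map wind speed onto line widths between 1 and 6 points (arbitrary choice)
widths = np.interp(seg_wind, (seg_wind.min(), seg_wind.max()), (1.0, 6.0))

lc = LineCollection(segments, cmap=plt.get_cmap("copper"),
                    norm=plt.Normalize(seg_temp.min(), seg_temp.max()))
lc.set_array(seg_temp)    # color by temperature
lc.set_linewidth(widths)  # width by wind speed

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min(), y.max())
ax.xaxis.set_major_formatter(mpd.DateFormatter("%H:%M"))
fig.colorbar(lc, ax=ax, label="temp")
fig.savefig("wind_temp.png")
```

Because each segment has a single width, the line changes thickness in steps at the sample points. With hourly data this is quite visible, and interpolating the series onto a finer time grid first would smooth it out.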
    
