Pandas DataFrame multicolor line plot
Question
I have a pandas DataFrame with a DateTime index and two columns representing wind speed and ambient temperature. Here is the data for half a day:
temp winds
2014-06-01 00:00:00 8.754545 0.263636
2014-06-01 01:00:00 8.025000 0.291667
2014-06-01 02:00:00 7.375000 0.391667
2014-06-01 03:00:00 6.850000 0.308333
2014-06-01 04:00:00 7.150000 0.258333
2014-06-01 05:00:00 7.708333 0.375000
2014-06-01 06:00:00 9.008333 0.391667
2014-06-01 07:00:00 10.858333 0.300000
2014-06-01 08:00:00 12.616667 0.341667
2014-06-01 09:00:00 15.008333 0.308333
2014-06-01 10:00:00 17.991667 0.491667
2014-06-01 11:00:00 21.108333 0.491667
2014-06-01 12:00:00 21.866667 0.395238
I would like to plot this data as one line whose color changes according to temperature: for example, going from light red to dark red as the temperature rises.
I found this example of multicolored lines with matplotlib, but I have no idea how to use it with a pandas DataFrame. Does anyone have an idea what I could do? If this is possible, would it also be possible, as an additional feature, to change the width of the line according to wind speed, so that the faster the wind, the wider the line?
Thanks for your help!
Recommended answer
The built-in plot method in pandas probably won't be able to do this. You need to extract the data and plot it with matplotlib.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mpd
from matplotlib.collections import LineCollection

# df is the DataFrame from the question
# matplotlib needs numeric x values, so convert the DateTime index to date numbers
x = mpd.date2num(df.index.to_pydatetime())
y = df.winds.values
c = df['temp'].values  # values that drive the color
# build one segment per pair of consecutive points
points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap=plt.get_cmap('copper'), norm=plt.Normalize(0, 10))
lc.set_array(c)
lc.set_linewidth(3)
ax = plt.gca()
ax.add_collection(lc)
plt.xlim(min(x), max(x))
# show the x ticks as date-times again
ax.xaxis.set_major_locator(mpd.HourLocator())
ax.xaxis.set_major_formatter(mpd.DateFormatter('%Y-%m-%d:%H:%M:%S'))
_ = plt.setp(ax.xaxis.get_majorticklabels(), rotation=70)
plt.savefig('temp.png')
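To see what the points and segments lines are doing, here is a small standalone sketch with made-up numbers (the x and y arrays below are illustrative only, not the data from the question). Each pair of consecutive points becomes one segment, so N points give N-1 segments, and the LineCollection can then color each segment separately.

```python
import numpy as np

# Illustrative data: four points along a line (not the question's data).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([10.0, 11.0, 12.0, 13.0])

# Shape (N, 1, 2): one (x, y) pair per point.
points = np.array([x, y]).T.reshape(-1, 1, 2)

# Shape (N-1, 2, 2): each segment is [start point, end point].
segments = np.concatenate([points[:-1], points[1:]], axis=1)

print(points.shape)    # (4, 1, 2)
print(segments.shape)  # (3, 2, 2)
print(segments[0])     # first segment, from (0, 10) to (1, 11)
```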
There are two issues worth mentioning:
1. The range of the color gradient is controlled by norm=plt.Normalize(0, 10).
2. pandas and matplotlib plot time series differently, which requires df.index to be converted to float before plotting. By setting the major locator and formatter, we get the x-axis major tick labels back into date-time format.
The second issue may cause problems when we want to plot more than one line (the data will be plotted in two separate x ranges):
# continuing from the plot above:
df['another'] = np.random.random(13)
print(ax.get_xticks())
df.another.plot(ax=ax, secondary_y=True)
print(ax.get_xticks(minor=True))
[ 735385. 735385.04166667 735385.08333333 735385.125
735385.16666667 735385.20833333 735385.25 735385.29166667
735385.33333333 735385.375 735385.41666667 735385.45833333
735385.5 ]
[389328 389330 389332 389334 389336 389338 389340]
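The two arrays above differ because they count time in different units. As far as I can tell, the first holds matplotlib date numbers (days, counted from year 1 in the matplotlib version used here), while the second holds what pandas uses for an hourly series (hours since 1970-01-01). A quick check using NumPy only; the constant 719163 is my assumption for the date number of 1970-01-01 under that older matplotlib convention:

```python
import numpy as np

start = np.datetime64("2014-06-01T00:00")
epoch = np.datetime64("1970-01-01T00:00")

# Dividing two timedeltas gives a plain float.
days_since_1970 = (start - epoch) / np.timedelta64(1, "D")
hours_since_1970 = (start - epoch) / np.timedelta64(1, "h")

# Assumed offset: date number of 1970-01-01 under the old matplotlib epoch.
OLD_MPL_OFFSET = 719163

print(days_since_1970 + OLD_MPL_OFFSET)  # 735385.0, first tick of the first array
print(hours_since_1970)                  # 389328.0, first tick of the second array
```

Newer matplotlib releases (3.3 and later) count days from 1970-01-01 by default, so the first array would start at 16222 there, but the mismatch with the hourly values from pandas remains.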
Therefore we need to do it without the .plot() method of pandas:
ax.twinx().plot(x, df.another)
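Putting the pieces together, and also attempting the line-width part of the question, here is a self-contained sketch. Several choices are my own assumptions rather than part of the answer: temperature is plotted on the y axis, the 'Reds' colormap is used, the color range is taken from the data instead of being fixed at 0 to 10, each segment takes the mean of its two endpoints, and wind speed is scaled linearly to widths between 1 and 6 points.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.dates as mpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection

# Rebuild the sample from the question (13 hourly rows).
index = pd.date_range("2014-06-01 00:00", periods=13, freq="60min")
df = pd.DataFrame({
    "temp": [8.754545, 8.025, 7.375, 6.85, 7.15, 7.708333, 9.008333,
             10.858333, 12.616667, 15.008333, 17.991667, 21.108333, 21.866667],
    "winds": [0.263636, 0.291667, 0.391667, 0.308333, 0.258333, 0.375,
              0.391667, 0.3, 0.341667, 0.308333, 0.491667, 0.491667, 0.395238],
}, index=index)

x = mpd.date2num(df.index.to_pydatetime())
temp = df["temp"].values
wind = df["winds"].values
y = temp  # assumption: temperature on the y axis

points = np.array([x, y]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# One color value and one width per segment: average the two endpoints.
seg_temp = (temp[:-1] + temp[1:]) / 2
seg_wind = (wind[:-1] + wind[1:]) / 2

# Hypothetical scaling: map wind speed linearly to widths from 1 to 6 points.
widths = 1 + 5 * (seg_wind - seg_wind.min()) / (seg_wind.max() - seg_wind.min())

fig, ax = plt.subplots()
lc = LineCollection(segments, cmap="Reds",
                    norm=plt.Normalize(temp.min(), temp.max()))
lc.set_array(seg_temp)    # color follows temperature
lc.set_linewidth(widths)  # width follows wind speed
ax.add_collection(lc)

# Set the limits explicitly so the whole collection is visible.
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min() - 1, y.max() + 1)
ax.xaxis.set_major_formatter(mpd.DateFormatter("%H:%M"))
fig.colorbar(lc, ax=ax, label="temp")
fig.savefig("temp_wind.png")
```

Passing a sequence to set_linewidth gives each segment its own width; the 1 to 6 range is arbitrary and can be tuned to taste.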