How to plot a regression line on a timeseries line plot


Question

      I have a question about the angle of the slope, in degrees, which I calculate below:

      import pandas as pd
      import yfinance as yf
      import matplotlib.pyplot as plt
      import datetime as dt
      import numpy as np
      
      df = yf.download('aapl', '2015-01-01', '2021-01-01')
      df.rename(columns = {'Adj Close' : 'Adj_close'}, inplace= True)
      
      x1 = pd.Timestamp('2019-01-02')
      x2 = df.index[-1]
      y1 = df[df.index == x1].Adj_close[0]
      y2 = df[df.index == x2].Adj_close[0]
      
      slope = (y2 - y1)/ (x2 - x1).days
      angle = round(np.rad2deg(np.arctan2(y2 - y1, (x2 - x1).days)), 1)
      
      fig, ax1 = plt.subplots(figsize= (15, 6))
      ax1.grid(True, linestyle= ':')
      ax1.set_zorder(1)
      ax1.set_frame_on(False)
      ax1.plot(df.index, df.Adj_close, c= 'k', lw= 0.8)
      ax1.plot([x1, x2], [y1, y2], c= 'k')
      
      
      ax1.set_xlim(df.index[0], df.index[-1])
      plt.show()
      

      It returns the angle of the slope as 7.3 degrees, which doesn't look right judging by the chart:

      It looks close to 45 degrees. What is wrong here?

      Here is the line for which I need to calculate the angle:

      Solution

      • The implementation in the OP is not the correct way to determine or plot a linear model. As such, the question about determining the angle to plot the line is bypassed, and a more rigorous approach to plotting the regression line is shown.
      • A regression line can be added by converting the datetime dates to ordinal values. The model can be calculated with sklearn, or added to the plot with seaborn.regplot, as shown below.
      • Plot the full data with pandas.DataFrame.plot.
      • Tested in Python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2, sklearn 0.24.2

      Imports and Data

      import yfinance as yf
      import pandas as pd
      import seaborn as sns
      import matplotlib.pyplot as plt
      import numpy as np
      from sklearn.linear_model import LinearRegression
      
      # download the data
      df = yf.download('aapl', '2015-01-01', '2021-01-01')
      
      # convert the datetime index to ordinal values, which can be used to plot a regression line
      df.index = df.index.map(pd.Timestamp.toordinal)
      
      # display(df.iloc[:5, [4]])
      #         Adj Close
      # Date
      # 735600  24.782110
      # 735603  24.083958
      # 735604  24.086227
      # 735605  24.423975
      # 735606  25.362394
      
      # convert the regression line start date to ordinal
      x1 = pd.to_datetime('2019-01-02').toordinal()
      
      # data slice for the regression line
      data=df.loc[x1:].reset_index()
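
As a minimal sketch of the ordinal conversion used above, the following uses a small synthetic date range (not the AAPL data) to show that pd.Timestamp.toordinal maps each date to an integer day count, and that pd.Timestamp.fromordinal reverses it. The variable names are illustrative assumptions only.

```python
import pandas as pd

# synthetic business-day index standing in for the downloaded price index
dates = pd.date_range('2019-01-02', periods=5, freq='B')

# datetime -> integer ordinal (day count), usable as a numeric x value
ordinals = dates.map(pd.Timestamp.toordinal)

# integer ordinal -> datetime, for labelling the axis afterwards
restored = pd.DatetimeIndex([pd.Timestamp.fromordinal(int(o)) for o in ordinals])

print(list(ordinals))
print((restored == dates).all())
```

Consecutive trading days differ by 1 and a weekend shows up as a gap of 3, so the spacing on the x-axis stays proportional to elapsed calendar time.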
      

      Plot a Regression Line with seaborn

      • Using seaborn.regplot, no calculations are required to add the regression line to the line plot of the data.
      • Convert the x-axis labels to datetime format
      • Play around with the xticks and labels if you need the endpoints adjusted.

      # plot the Adj Close data
      ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
                    title='Adjusted Close with Regression Line from 2019-01-02')
      
      # add a regression line
      sns.regplot(data=data, x='Date', y='Adj Close', ax=ax1, color='magenta', scatter_kws={'s': 7}, label='Linear Model', scatter=False)
      
      ax1.set_xlim(df.index[0], df.index[-1])
      
      # convert the axis back to datetime
      xticks = ax1.get_xticks()
      labels = [pd.Timestamp.fromordinal(int(label)).date() for label in xticks]
      ax1.set_xticks(xticks)
      ax1.set_xticklabels(labels)
      
      ax1.legend()
      
      plt.show()
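
An alternative to setting the tick labels by hand is to let matplotlib format each ordinal tick as a date. The sketch below assumes this can be done with matplotlib.ticker.FuncFormatter; it uses synthetic data and a hypothetical helper named ordinal_to_date, and is not part of the original answer.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove this line to view the plot interactively
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import numpy as np
import pandas as pd


def ordinal_to_date(value, pos=None):
    """Hypothetical helper: format an ordinal tick value as YYYY-MM-DD."""
    return pd.Timestamp.fromordinal(int(value)).strftime('%Y-%m-%d')


# synthetic series indexed by ordinal days, standing in for the price data
start = pd.Timestamp('2019-01-02').toordinal()
x = np.arange(start, start + 100)
y = np.linspace(30.0, 40.0, x.size)

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(x, y, c='k')

# the formatter is applied to whichever ticks matplotlib chooses
ax.xaxis.set_major_formatter(FuncFormatter(ordinal_to_date))
fig.autofmt_xdate()
```

Because the formatter is attached to the axis rather than to a fixed list of labels, the dates should stay correct if the x limits are changed later.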
      

      Calculate the Linear Model

      # create the model
      model = LinearRegression()
      
      # extract x and y from dataframe data
      x = data[['Date']]
      y = data[['Adj Close']]
      
      # fit the model
      model.fit(x, y)
      
      # print the slope and intercept if desired
      print('intercept:', model.intercept_)
      print('slope:', model.coef_)
      
      # intercept: [-90078.45713565]
      # slope: [[0.1222514]]
      
      # calculate y1, given x1
      y1 = model.predict(np.array([[x1]]))
      
      print(y1)
      # [[28.27904095]]
      
      # calculate y2, given the last date in data
      x2 = data.Date.iloc[-1]
      y2 = model.predict(np.array([[x2]]))
      
      print(y2)
      # [[117.40030862]]
      
      # this can be added to `ax1` with
      ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
                    title='Adjusted Close with Regression Line from 2019-01-02')
      ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
      ax1.legend()
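
For a single predictor, the same kind of least-squares line can also be obtained without sklearn. The following is a sketch using numpy.polyfit on synthetic data with a known slope (an assumption for illustration, not the AAPL prices), showing how the two endpoints of the line follow from the fitted slope and intercept.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic ordinal x values and noisy y values with a known slope of 0.12
x = np.arange(737061, 737061 + 500)
y = 28.0 + 0.12 * (x - x[0]) + rng.normal(scale=2.0, size=x.size)

# shift x before fitting: raw ordinals are large, which hurts numerical conditioning
x0 = x[0]
slope, intercept = np.polyfit(x - x0, y, deg=1)

# endpoints of the fitted line, in the original ordinal coordinates
x1, x2 = x[0], x[-1]
y1 = intercept + slope * (x1 - x0)
y2 = intercept + slope * (x2 - x0)

print(round(slope, 3))
```

The points (x1, y1) and (x2, y2) can then be passed to ax1.plot in the same way as the sklearn predictions above.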
      

      Angle of the Slope

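      • The steep angle seen in the chart is an artifact of the aspect of the axes: one unit of x (a day) and one unit of y (a dollar) are not drawn at the same length. When the aspect is equal, the line is drawn at the computed angle of 7.0 degrees.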

      x = x2 - x1
      y = y2[0][0] - y1[0][0]
      slope = y / x
      
      print(round(slope, 7) == round(model.coef_[0][0], 7))
      # True
      
      angle = round(np.rad2deg(np.arctan2(y, x)), 1)
      print(angle)
      # 7.0
      
      # given the existing plot
      ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
                    title='Adjusted Close with Regression Line from 2019-01-02')
      ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
      
      # make the aspect equal
      ax1.set_aspect('equal', adjustable='box')
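
To make the difference between the angle in data units and the angle as drawn concrete, the sketch below measures the on-screen angle of a segment by transforming its endpoints to display coordinates with ax.transData. The helper screen_angle is hypothetical, and the endpoints are rounded stand-ins for the fitted line (729 days, slope of about 0.122 per day), not the exact values above.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove this line to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np


def screen_angle(ax, p1, p2):
    """Hypothetical helper: angle in degrees of the segment p1 -> p2 as drawn."""
    ax.figure.canvas.draw()  # make sure limits and aspect have been applied
    (px1, py1), (px2, py2) = ax.transData.transform([p1, p2])
    return np.rad2deg(np.arctan2(py2 - py1, px2 - px1))


# rounded stand-ins for the regression line endpoints
p1, p2 = (0.0, 28.3), (729.0, 117.4)
data_angle = np.rad2deg(np.arctan2(p2[1] - p1[1], p2[0] - p1[0]))

fig, ax = plt.subplots(figsize=(15, 6))
ax.plot([p1[0], p2[0]], [p1[1], p2[1]], c='magenta')

# default aspect: the axes are stretched to fill the figure
angle_auto = screen_angle(ax, p1, p2)

# equal aspect: one x unit is drawn as long as one y unit
ax.set_aspect('equal', adjustable='box')
angle_equal = screen_angle(ax, p1, p2)

print(round(data_angle, 1), round(angle_auto, 1), round(angle_equal, 1))
```

With the default aspect the drawn angle depends on the figure size and axis limits, so it generally differs from the angle computed from the data; with an equal aspect the two should agree.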
      
