How to Spread Plot's Date Axis According To Years When Plotting With Seaborn?


Problem description


I'm trying to train a linear regression model in Python using the Google stock prices dataset, which can be found here: https://www.kaggle.com/medharawat/google-stock-price. The goal is to predict future stock prices from the given features. After that, I plan to plot the predictions together with the values in the current dataset.

First, I read the two CSV files, parsing the date column with a date parser, and concatenated the two dataframes into one so that I could split it myself:

import pandas as pd
from datetime import datetime

# pd.datetime is not available in recent pandas versions; use datetime.strptime directly
parser = lambda date: datetime.strptime(date, '%m/%d/%Y')
df_test = pd.read_csv("/kaggle/input/google-stock-price/Google_Stock_Price_Test.csv", parse_dates=[0], date_parser=parser)
df_train = pd.read_csv("/kaggle/input/google-stock-price/Google_Stock_Price_Train.csv", parse_dates=[0], date_parser=parser)
df = pd.concat([df_train, df_test])

Then I changed the type of the Close column to "float64" and plotted Close against Date using seaborn:

import seaborn as sns
sns.relplot(x='Date', y='Close', data=df,kind="line")

The output is:

I've handled the necessary column conversions up to this point in the code. In this part I split the dataframe, create and train the model, and predict values.

from sklearn.model_selection import train_test_split

X=df[["Open","High","Low","pc"]]
y=df["Close"]     
X_train,X_test,y_train,y_test = train_test_split(X,y)

from sklearn.linear_model import LinearRegression
model=LinearRegression()
model.fit(X_train,y_train)
model.score(X_test,y_test)
y_pred=model.predict(X_test)

What I want to do next is assign future dates to these predictions so that I can combine them with my dataframe and plot everything together. I managed to create two dataframes, one for the real data and one for the predicted data, and then concatenated and melted them into a new dataframe for plotting.

dates=(df[-320:]["Date"]).values
df_plot=pd.DataFrame(columns=["Date","Close"])
df_plot["Date"]=dates
df_plot["Close"]=y_test.values.transpose()

df_predd=pd.DataFrame(columns=["Predicted","Date"])
df_predd["Predicted"]=y_pred.transpose()
df_predd["Date"]=dates
df_predd["Date"]=df_predd["Date"]+pd.offsets.DateOffset(years=8) #I want to plot it as future predictions

concatenated = pd.concat([df_predd.assign(dataset='df_predd'), df_plot.assign(dataset='df_plot')],axis=0)
melted_df=pd.melt(concatenated,id_vars=["Date"],value_vars=["Predicted","Close"])

sns.relplot(x='Date', y='value', data=melted_df,hue="variable",style='variable',kind="line",height=10)

Here's the undesired output:

I want an output something like this:

What am I missing? I checked the type of the Date column, and it is datetime. I can't spread the x-axis the way it appears in the first plot shown above. Any help will be appreciated. Thanks in advance.

Solution

To simplify your example, consider these two toy dataframes:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1)

df_actual = pd.DataFrame(data={
    "date"  : pd.date_range(start="2020-01-01", periods=8, freq="MS"),
    "value" : np.random.randint(10, 30, 8),
})

df_forecast = pd.DataFrame(data={
    "date"  : pd.date_range(start="2020-08-01", periods=4, freq="MS"),
    "value" : np.random.randint(10, 30, 4)
})

If you want to plot the actual and forecast values together on a shared x axis, the easiest way I can think of is to differentiate them by adding a type column and feeding it to the hue parameter of seaborn's lineplot.

Remember to "connect" the two lines by making the first value of the forecast dataframe the same as the last value of the actual dataframe:

#first forecast value == last actual value
df_forecast.iloc[0, :] = df_actual.iloc[-1, :]

df_forecast["type"] = "forecast"
df_actual["type"] = "actual"

df = pd.concat([df_actual, df_forecast])
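One thing that may matter when applying this to the data in the question: scikit-learn's train_test_split shuffles rows by default, so the values in y_test are not in date order, and pairing them with the last 320 dates can mismatch dates and values. Below is a minimal sketch of a chronological split that keeps each value with its own date. The prices dataframe, the 75/25 split, and the noisy "forecast" are all hypothetical stand-ins, not the Kaggle data or a real model.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's dataframe (not the Kaggle data)
rng = np.random.default_rng(0)
prices = pd.DataFrame({
    "Date": pd.date_range("2012-01-02", periods=100, freq="B"),
    "Close": rng.normal(0, 1, 100).cumsum() + 100,
})

# Chronological split: the last 25% of rows form the test set, still in date order
split = int(len(prices) * 0.75)
train, test = prices.iloc[:split], prices.iloc[split:]

# Each test value stays paired with its own date
actual = (
    test[["Date", "Close"]]
    .rename(columns={"Close": "value"})
    .assign(type="actual")
)

# Placeholder for model.predict(...): actual values plus noise, for illustration only
forecast = actual.assign(
    value=actual["value"] + rng.normal(0, 1, len(actual)),
    type="forecast",
)

# Long-format dataframe, ready for hue="type"
plot_df = pd.concat([actual, forecast], ignore_index=True)
```

If you prefer to keep train_test_split, passing shuffle=False should give the same kind of ordered split.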

Finally, you create your plot like so:

plt.figure(figsize=(10,5))
sns.lineplot(x="date", y="value", hue="type", data=df)
