Plotly: How to plot a regression line using plotly?
Question
I have a dataframe, df, with the columns pm1 and pm25. I want to show a graph (with Plotly) of how correlated these two signals are. So far, I have managed to show the scatter plot, but I have not managed to draw the line of best fit between the signals. This is what I have tried:
denominator=df.pm1**2-df.pm1.mean()*df.pm1.sum()
print('denominator',denominator)
m=(df.pm1.dot(df.pm25)-df.pm25.mean()*df.pm1.sum())/denominator
b=(df.pm25.mean()*df.pm1.dot(df.pm1)-df.pm1.mean()*df.pm1.dot(df.pm25))/denominator
y_pred=m*df.pm1+b
lineOfBestFit = go.Scattergl(
    x=df.pm1,
    y=y_pred,
    name='Line of best fit',
    line=dict(
        color='red',
    )
)
data = [dataPoints, lineOfBestFit]
figure = go.Figure(data=data)
figure.show()
How can I get lineOfBestFit to be drawn properly?
Recommended answer
For regression analysis I like to use statsmodels.api or sklearn.linear_model. I also like to organize both the data and the regression results in a pandas dataframe. Here's one way to do what you're looking for in a clean and organized way:
Plot using sklearn or statsmodels:
Code using sklearn:
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go
import pandas as pd
import numpy as np
# data
np.random.seed(123)
numdays=20
X = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
Y = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
df = pd.DataFrame({'X': X, 'Y':Y})
# regression
# sklearn expects a 2D feature matrix, so select X as a one-column DataFrame
reg = LinearRegression().fit(df[['X']], df['Y'])
df['bestfit'] = reg.predict(df[['X']])
# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Scatter(name='X vs Y', x=df['X'], y=df['Y'].values, mode='markers'))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))
# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')
fig.show()
Code using statsmodels:
import plotly.graph_objects as go
import statsmodels.api as sm
import pandas as pd
import numpy as np
# data
np.random.seed(123)
numdays=20
X = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
Y = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
df = pd.DataFrame({'X': X, 'Y':Y})
# regression
df['bestfit'] = sm.OLS(df['Y'],sm.add_constant(df['X'])).fit().fittedvalues
# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Scatter(name='X vs Y', x=df['X'], y=df['Y'].values, mode='markers'))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))
# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')
fig.show()
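As for the code in the question, one likely reason the line is not drawn as expected is that df.pm1**2 is an element-wise square, so denominator ends up as a Series rather than a single number, and m and b become Series as well. Below is a minimal sketch of the same closed-form fit with a scalar denominator. It uses plain NumPy arrays with made-up sample values in place of df.pm1 and df.pm25, and the helper name least_squares_line is hypothetical.

```python
import numpy as np

# Made-up sample values standing in for df.pm1 and df.pm25 (assumption)
pm1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pm25 = np.array([2.1, 3.9, 6.2, 7.8, 10.1])


def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    # Scalar denominator: sum(x^2) - mean(x) * sum(x)
    denominator = x.dot(x) - x.mean() * x.sum()
    # Slope and intercept, using the same formulas as in the question
    m = (x.dot(y) - y.mean() * x.sum()) / denominator
    b = (y.mean() * x.dot(x) - x.mean() * x.dot(y)) / denominator
    return m, b


m, b = least_squares_line(pm1, pm25)
y_pred = m * pm1 + b  # one predicted value per x, all on a straight line
```

With the dataframe from the question, the equivalent change would be to compute the denominator as df.pm1.dot(df.pm1) - df.pm1.mean()*df.pm1.sum(). The resulting y_pred can then be passed to go.Scattergl exactly as before.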