How to plot multiple linear regressions in the same figure
Given the following:
import numpy as np
import pandas as pd
import seaborn as sns
np.random.seed(365)
x1 = np.random.randn(50)
y1 = np.random.randn(50) * 100
x2 = np.random.randn(50)
y2 = np.random.randn(50) * 100
df1 = pd.DataFrame({'x1':x1, 'y1': y1})
df2 = pd.DataFrame({'x2':x2, 'y2': y2})
# Recent seaborn releases require x, y and data as keyword arguments
sns.lmplot(x='x1', y='y1', data=df1, fit_reg=True, ci=None)
sns.lmplot(x='x2', y='y2', data=df2, fit_reg=True, ci=None)
This will create 2 separate plots. How can I add the data from df2 onto the SAME graph? All the seaborn examples I have found online seem to focus on how you can create adjacent graphs (say, via the 'hue' and 'col_wrap' options). Also, I prefer not to use the dataset examples where an additional column might be present as this does not have a natural meaning in the project I am working on.
If there is a mixture of matplotlib/seaborn functions that are required to achieve this, I would be grateful if someone could help illustrate.
You could use seaborn's FacetGrid class to get the desired result.
You would need to replace your plotting calls with these lines:
import matplotlib.pyplot as plt  # needed for plt.scatter below

# These lines replace the two sns.lmplot(...) calls from the question.
# Stack both frames under shared column names and tag each row with its source.
df = pd.concat([df1.rename(columns={'x1':'x','y1':'y'})
                .join(pd.Series(['df1']*len(df1), name='df')),
                df2.rename(columns={'x2':'x','y2':'y'})
                .join(pd.Series(['df2']*len(df2), name='df'))],
               ignore_index=True)

pal = dict(df1="red", df2="blue")
# 'size' was renamed to 'height' in seaborn 0.9
g = sns.FacetGrid(df, hue='df', palette=pal, height=5);
g.map(plt.scatter, "x", "y", s=50, alpha=.7, linewidth=.5, edgecolor="white")
# robust=1 requests a robust fit, which needs statsmodels installed
g.map(sns.regplot, "x", "y", ci=None, robust=1)
g.add_legend();
This yields a single plot containing both data sets, each with its own regression line, which, if I understand correctly, is what you need.
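As a side note, the concatenation step can be written more compactly with DataFrame.assign, and the combined frame can then be passed to a single lmplot call with hue. The following is only a sketch of that variation, not the code from the answer; the names combined and source are illustrative choices.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Same sample data as in the question
np.random.seed(365)
df1 = pd.DataFrame({'x1': np.random.randn(50), 'y1': np.random.randn(50) * 100})
df2 = pd.DataFrame({'x2': np.random.randn(50), 'y2': np.random.randn(50) * 100})

# Rename to shared column names and tag each row with the frame it came from.
# 'source' is a hypothetical column name used only for this sketch.
combined = pd.concat(
    [df1.rename(columns={'x1': 'x', 'y1': 'y'}).assign(source='df1'),
     df2.rename(columns={'x2': 'x', 'y2': 'y'}).assign(source='df2')],
    ignore_index=True,
)

# A single lmplot call: hue gives each source its own colour and fit line
# on the same axes instead of creating separate figures.
g = sns.lmplot(x='x', y='y', hue='source', data=combined, ci=None,
               palette={'df1': 'red', 'df2': 'blue'})
```

Outside a notebook, call matplotlib.pyplot.show() afterwards to display the figure.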
Note that you will need to pay attention to the .regplot parameters and may want to change the values I have put in as an example.
- The ";" at the end of a line suppresses the output of the command (I use an IPython notebook, where it is visible).
- The docs give some explanation of the .map() method. In essence, it does just that: it maps a plotting command onto the data. However, it works with 'low-level' plotting commands like regplot, and not lmplot, which actually calls regplot behind the scenes.
- Normally plt.scatter would take the parameters c='none', edgecolor='r' to make non-filled markers. But seaborn interferes with the process and enforces a color on the markers, so I don't see an easy/straightforward way to fix this other than manipulating the ax elements after seaborn has produced the plot, which is best addressed as part of a different question.
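Since regplot is the lower-level function and accepts an ax argument, one possible alternative that avoids the extra label column is to call it twice on the same matplotlib Axes. The sketch below is not part of the original answer. It also shows one way of getting unfilled markers by editing the scatter collections on the Axes after seaborn has drawn them; the colours and figure size are arbitrary.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Same sample data as in the question
np.random.seed(365)
df1 = pd.DataFrame({'x1': np.random.randn(50), 'y1': np.random.randn(50) * 100})
df2 = pd.DataFrame({'x2': np.random.randn(50), 'y2': np.random.randn(50) * 100})

fig, ax = plt.subplots(figsize=(6, 5))

# regplot draws onto the Axes it is given, so both fits share one plot
sns.regplot(x='x1', y='y1', data=df1, ci=None, color='red', label='df1', ax=ax)
sns.regplot(x='x2', y='y2', data=df2, ci=None, color='blue', label='df2', ax=ax)

# Hollow markers: with ci=None the only collections on the Axes are the two
# scatter layers, so restyle them after seaborn has produced the plot.
for points, edge in zip(ax.collections, ['red', 'blue']):
    points.set_facecolor('none')
    points.set_edgecolor(edge)

# The two frames use different column names, so set neutral axis labels
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()
```

Because each frame is plotted directly, no renaming or concatenation is needed. The trade-off is that colours and legend labels have to be supplied by hand for each call.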