Python Statsmodels Mixedlm (Mixed Linear Model) random effects

Question

I am a bit confused about the output of Statsmodels Mixedlm and am hoping someone could explain.

I have a large dataset of single family homes, including the previous two sale prices/sale dates for each property. I have geocoded this entire dataset and fetched the elevation for each property. I am trying to understand the way in which the relationship between elevation and property price appreciation varies between different cities.

I have used statsmodels mixed linear model to regress price appreciation on elevation, holding a number of other factors constant, with cities as my groups category.

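import statsmodels.formula.api as smf

# Miami_SF is the DataFrame of home sales; City is used as the grouping column.
md = smf.mixedlm('price_relative_ind ~ Elevation + YearBuilt + Sale_Amount_1 + LivingSqFt', data=Miami_SF, groups=Miami_SF['City'])

mdf = md.fit()

mdf.random_effects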

Entering mdf.random_effects returns a list of coefficients. Can I interpret this list as, essentially, the slope for each individual city (i.e., the individual regression coefficient relating Elevation to sale price appreciation)? Or are these results the intercepts for each City?

Recommended answer

I'm currently trying to get my head around random effects in MixedLM as well. Looking at the docs, it seems that using just the groups parameter, without exog_re or re_formula, simply adds a random intercept for each group. An example from the docs:

# A basic mixed model with fixed effects for the columns of exog and a random intercept for each distinct value of group:

model = sm.MixedLM(endog, exog, groups)
result = model.fit()

As such, you would expect random_effects to return each city's random intercept in this case (its estimated offset from the overall fixed intercept), not the coefficients/slopes.
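
To make this concrete, here is a minimal sketch on synthetic data. The city labels, column names and per-city offsets are invented for illustration and are not taken from the asker's dataset. It fits a model with only groups specified and inspects random_effects, assuming (as in current statsmodels releases) that it is a dict of pandas Series keyed by group label.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: six hypothetical cities, each with its own intercept offset
# but the same Elevation slope (0.5). All numbers are invented.
rng = np.random.default_rng(0)
offsets = {"A": -3.0, "B": -2.0, "C": -1.0, "D": 1.0, "E": 2.0, "F": 3.0}
n_per_city = 50

city = np.repeat(list(offsets), n_per_city)
elevation = rng.uniform(0, 10, size=city.size)
noise = rng.normal(0, 1, size=city.size)
appreciation = 1.0 + 0.5 * elevation + np.array([offsets[c] for c in city]) + noise

df = pd.DataFrame({"City": city, "Elevation": elevation, "appreciation": appreciation})

# Only `groups` is given, so the model gets a random intercept per city.
result = smf.mixedlm("appreciation ~ Elevation", data=df, groups=df["City"]).fit()

# One entry per city; each entry holds a single value (the intercept offset).
random_effects = result.random_effects
for name in sorted(random_effects):
    print(name, random_effects[name].to_dict())

# City-specific intercept = fixed intercept + that city's random effect.
city_intercepts = {
    name: result.params["Intercept"] + effect.iloc[0]
    for name, effect in random_effects.items()
}
print(city_intercepts)
```

Each Series has exactly one element here because the only random effect is the intercept. The Elevation slope is shared by all cities and lives in the fixed-effect parameters, not in random_effects.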

To add a random slope with respect to one of your other features, you can do something similar to this example from statsmodels' Jupyter tutorial, either with a slope and an intercept:

model = sm.MixedLM.from_formula(
    "Y ~ X", data, re_formula="X", groups=data["C"])

or with only a slope:

model = sm.MixedLM.from_formula(
    "Y ~ X", data, re_formula="0 + X", groups=data["C"])
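
As a rough illustration of how these two re_formula variants change what random_effects contains, the sketch below fits both on synthetic data. The city labels, column names and offsets are hypothetical, and it assumes random_effects is a dict of pandas Series whose index carries the random-effect names.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
# (intercept offset, slope offset) per hypothetical city; invented values.
city_effects = {
    "A": (-2.0, 0.5), "B": (1.5, -0.5), "C": (0.5, 0.3), "D": (-1.0, -0.2),
    "E": (2.0, 0.1), "F": (-0.5, -0.4), "G": (1.0, 0.4), "H": (-1.5, -0.2),
}
n_per_city = 40

city = np.repeat(list(city_effects), n_per_city)
elevation = rng.uniform(0, 10, size=city.size)
intercept_shift = np.array([city_effects[c][0] for c in city])
slope_shift = np.array([city_effects[c][1] for c in city])
appreciation = (1.0 + intercept_shift
                + (0.5 + slope_shift) * elevation
                + rng.normal(0, 1, size=city.size))

df = pd.DataFrame({"City": city, "Elevation": elevation, "appreciation": appreciation})

# Random intercept and random Elevation slope for each city.
both = smf.mixedlm("appreciation ~ Elevation", df,
                   re_formula="Elevation", groups=df["City"]).fit()

# Random Elevation slope only (no random intercept).
slope_only = smf.mixedlm("appreciation ~ Elevation", df,
                         re_formula="0 + Elevation", groups=df["City"]).fit()

print(both.random_effects["A"])        # two values: intercept and slope offsets
print(slope_only.random_effects["A"])  # one value: slope offset

# City-specific slope = fixed Elevation slope + that city's random slope.
city_slopes = {
    name: both.params["Elevation"] + effect["Elevation"]
    for name, effect in both.random_effects.items()
}
print(city_slopes)
```

With a random slope in the model, the per-city Elevation coefficient the asker is after is the fixed Elevation coefficient plus the city's Elevation entry in random_effects.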

Looking at the docs for random_effects, it says that it returns the conditional mean of each group's random effects. However, as the only random effect here is the intercept, this should just be equal to the random intercept itself.

MixedLMResults.random_effects()[source]
    The conditional means of random effects given the data.

    Returns:    
        random_effects : dict
        A dictionary mapping the distinct group values to the means of the random effects for the group.

Some useful resources to look further at include:

  • Docs for the formula version of MixedLM
  • Docs for the results of MixedLM
  • This Jupyter notebook with examples for using MixedLM (Python)
  • Stanford tutorial on mixed models (R)
  • Tutorial on fixed and random effects (R)
