Does statsmodels WLS have a get_influence() function?

Question

How do I get leverage (get_influence) from a WLS model fit in Python statsmodels?

Taking the example from http://statsmodels.sourceforge.net/stable/index.html:

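import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Load data
dat = sm.datasets.get_rdataset("Guerry", "HistData").data

# Fit regression models (using the natural log of one of the regressors)
results_ols = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()
# No weights are passed here, so WLS uses its default of unit weights
results_wls = smf.wls('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()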

I can call

results_ols.get_influence()

but not results_wls.get_influence().

Is there an equivalent for WLS?

I would also be interested in any solutions outside of statsmodels.

Recommended answer

You can get the influence and outlier measures for the weighted variables by using OLS on the weighted variables.

For example, if mod_wls is your WLS model (the model instance, not the results instance), then

res = sm.OLS(mod_wls.wendog, mod_wls.wexog).fit()
infl = res.get_influence()

AFAIK, most or all influence measures will be correct, but they are in terms of the weighted variables and observations. Some influence measures also have definitions in terms of the original variables, but those will not be available. For example, there are two ways to define the hat matrix for WLS: one corresponds to using the weighted variables as above, and the other expresses the influence in terms of the original variables.
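
The answer does not spell out the two hat matrices. As a hedged sketch, assume they are the symmetric matrix built from the weighted variables, W^(1/2) X (X'WX)^(-1) X' W^(1/2), and the matrix that maps the original y to the fitted values, X (X'WX)^(-1) X' W. The data and names below are made up for illustration.

```python
import numpy as np

# Small synthetic design with arbitrary positive weights (assumed).
rng = np.random.default_rng(0)
n = 6
X = np.column_stack([np.ones(n), rng.uniform(1, 10, size=n)])
y = rng.normal(size=n)
w = rng.uniform(0.5, 2.0, size=n)

W = np.diag(w)
sqrt_W = np.diag(np.sqrt(w))
XtWX_inv = np.linalg.inv(X.T @ W @ X)

# Hat matrix in terms of the weighted variables:
# maps sqrt(w) * y to sqrt(w) * fitted values, and is symmetric.
H_weighted = sqrt_W @ X @ XtWX_inv @ X.T @ sqrt_W

# Hat matrix in terms of the original variables:
# maps y directly to the fitted values, and is generally not symmetric.
H_original = X @ XtWX_inv @ X.T @ W

# WLS fit computed directly, for comparison.
beta = XtWX_inv @ X.T @ W @ y
fitted = X @ beta
```

Under these two assumed definitions the diagonals (the leverages) coincide, while the off-diagonal entries differ, so measures that depend only on the leverage are unaffected by the choice.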

(A similar issue, e.g. https://github.com/statsmodels/statsmodels/issues/808, shows up in GLM and RLM, which are both based on iteratively reweighted least squares.

The influence and outlier statistics have not been extended to other models, mostly for lack of references in the statistical literature that explicitly handle this case, and because no reference implementation in another package is known that could be used for the unit tests.

Update: GLM now has some outlier and influence measures, see https://www.statsmodels.org/dev/generated/statsmodels.genmod.generalized_linear_model.GLMResults.get_influence.html

but there is still nothing explicitly for WLS.)
