Interpreting Alias table testing multicollinearity of model in R

Problem description

Could someone help me interpret the output of the alias function when testing for multicollinearity in a multiple regression model? I know some predictor variables in my model are highly correlated, and I want to identify them using the alias table.

Model :
Score ~ Comments + Pros + Cons + Advice + Response + Value + Recommendation +
    6Months + 12Months + 2Years + 3Years + Daily + Weekly + Monthly

Complete :
            (Intercept) Comments Pros Cons Advice Response Value1
UseMonthly1  0           0        0    0    0      0        0
            Recommendation1 6Months1 12Months1 2Years1
UseMonthly1  0               1        1         1
            3Years1 Daily1 Weekly1
UseMonthly1  1      -1     -1

Value, Recommendation, 6Months, 12Months, 2Years, 3Years, Daily, Weekly, and Monthly are binary categorical variables.
Score, Comments, Pros, Cons, Advice, and Response are numeric variables.

Can I assume UseMonthly is highly correlated with 6Months, 12Months, 2Years, 3Years, Daily, and Weekly? What is the difference between the 1 and -1 values in the alias output? Do they indicate positive and negative correlation?

Recommended answer

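Nonzero entries in the "Complete" matrix show that UseMonthly is an exact linear combination of those terms. The entries are the coefficients of that combination, not correlations: here UseMonthly1 = 6Months1 + 12Months1 + 2Years1 + 3Years1 - Daily1 - Weekly1 in every row of the data. So the 1 and -1 values say how each term enters the combination, not whether its correlation with UseMonthly is positive or negative. This is perfect multicollinearity, which is stronger than high correlation: terms can be highly correlated without being linearly dependent, and alias will not flag those.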
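To make the reading of the "Complete" matrix concrete, here is a minimal sketch on made-up data (not the asker's). It assumes that every row is either a non-user, with all dummies 0, or belongs to exactly one tenure group and one usage group, which is one way such a dependency can arise. The column names M6, M12, Y2 and Y3 are hypothetical stand-ins for 6Months, 12Months, 2Years and 3Years.

```r
# Toy data, NOT the asker's: each row is either a non-user (all dummies 0)
# or belongs to exactly one tenure group and exactly one usage group.
grid <- expand.grid(
  tenure = c("6Months", "12Months", "2Years", "3Years"),
  usage  = c("Daily", "Weekly", "Monthly"),
  stringsAsFactors = FALSE
)
grid <- rbind(grid, data.frame(tenure = "none", usage = "none",
                               stringsAsFactors = FALSE))
d <- grid[rep(seq_len(nrow(grid)), times = 10), ]

set.seed(1)
d$Score <- rnorm(nrow(d))  # arbitrary response, only needed to fit a model

# Hypothetical 0/1 columns (R names cannot start with a digit without backticks).
d$M6      <- as.integer(d$tenure == "6Months")
d$M12     <- as.integer(d$tenure == "12Months")
d$Y2      <- as.integer(d$tenure == "2Years")
d$Y3      <- as.integer(d$tenure == "3Years")
d$Daily   <- as.integer(d$usage == "Daily")
d$Weekly  <- as.integer(d$usage == "Weekly")
d$Monthly <- as.integer(d$usage == "Monthly")

fit <- lm(Score ~ M6 + M12 + Y2 + Y3 + Daily + Weekly + Monthly, data = d)

# lm() cannot estimate a redundant term, so its coefficient comes back as NA.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# One row per aliased term; the entries are the coefficients that rebuild
# that column from the estimable columns.
print(alias(fit)$Complete)

# Check the implied identity directly on the data.
identity_holds <- with(d, all(Monthly == M6 + M12 + Y2 + Y3 - Daily - Weekly))
print(identity_holds)
```

In this sketch the usage dummies and the tenure dummies both add up to the same "is a user" indicator, so the last usage dummy carries no new information. The same reasoning would explain the 1 and -1 pattern in the question, although only the asker's data can confirm that.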

If your purpose is to identify and remove correlated variables, you should remove UseMonthly, but you'll probably want to remove others as well. A common way to identify variables that are problematic with respect to multicollinearity is to look for large variance inflation factors (calculated by e.g. car::vif). Note that car::vif stops with an error while the model still has aliased coefficients, so drop UseMonthly before computing them.
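As a rough illustration of what a variance inflation factor measures, the sketch below computes it by hand in base R: each predictor is regressed on the others, and the VIF is 1 / (1 - R^2) of that regression. The data and the helper name vif_by_hand are made up for this example. For a model with only single-column predictors, car::vif should report the same quantities.

```r
set.seed(42)
n <- 200

# Made-up predictors: x2 is almost a copy of x1, x3 is unrelated to both.
sim <- data.frame(x1 = rnorm(n))
sim$x2 <- sim$x1 + rnorm(n, sd = 0.1)
sim$x3 <- rnorm(n)

# Hypothetical helper: VIF of each predictor from its R^2 on the others.
vif_by_hand <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    aux <- lm(reformulate(others, response = p), data = data)
    1 / (1 - summary(aux)$r.squared)
  })
}

vifs <- vif_by_hand(sim, c("x1", "x2", "x3"))
print(round(vifs, 2))

# Common rules of thumb treat values above about 5 or 10 as a warning sign;
# the cutoff is a convention, not a test.
print(names(vifs)[vifs > 10])
```

Here x1 and x2 get large values because each is nearly determined by the other, while x3 stays close to 1. A perfectly aliased term such as UseMonthly would give R^2 = 1 and an infinite VIF, which is why it has to be removed first.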
