Calculation of DFFITS as a diagnostic for leverage and influence in regression
Question
I am trying to calculate DFFITS by hand. The value I obtain should equal the first value returned by the dffits function, but something must be wrong with my own calculation.
x1 <- lm(speed ~ dist, data = cars) # all observations
x2 <- lm(speed ~ dist, data = cars[-1,]) # without first obs
x <- model.matrix(x1) # x matrix
h <- diag(x%*%solve(crossprod(x))%*%t(x)) # hat values
num_dffits <- x1$fitted.values[1] - x2$fitted.values[1] #Numerator
denom_dffits <- sqrt(anova(x2)$`Mean Sq`[2]*h[1]) #Denominator
df_fits <- num_dffits/denom_dffits #DFFITS
dffits(x1)[1] # DFFITS function
Recommended answer
Your numerator is wrong. Because the first datum was removed when fitting the second model, its predicted value is not in fitted(x2); fitted(x2)[1] is actually the fitted value for the second observation. Use predict(x2, cars[1, ]) in place of fitted(x2)[1].
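As a minimal sketch of the corrected calculation, the block below recomputes DFFITS for the first observation as the change in its fitted value when it is dropped, divided by the leave-one-out residual standard deviation times the square root of its hat value. It reuses the names x1 and x2 from the question; num, mse_loo and manual are illustrative names, and hatvalues() is used here only for brevity.

```r
# Fit with all observations and with the first observation left out
x1 <- lm(speed ~ dist, data = cars)
x2 <- lm(speed ~ dist, data = cars[-1, ])

# Hat value (leverage) of each observation in the full model
h <- hatvalues(x1)

# Numerator: full-model fit at obs 1 minus the leave-one-out prediction at obs 1
num <- fitted(x1)[1] - predict(x2, newdata = cars[1, ])

# Denominator: leave-one-out residual variance times the leverage of obs 1
mse_loo <- sum(residuals(x2)^2) / df.residual(x2)
manual <- num / sqrt(mse_loo * h[1])

# Compare with the built-in function
all.equal(unname(manual), unname(dffits(x1)[1]))
```

If the comparison returns TRUE, the hand calculation and dffits() agree up to numerical tolerance.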
Hat values can be computed efficiently by
h <- rowSums(qr.Q(x1$qr) ^ 2)
or by using its R-level wrapper function
h <- hat(x1$qr, FALSE)
R also has a generic function for getting hat values:
h <- lm.influence(x1, FALSE)$hat
or its wrapper function
h <- hatvalues(x1)
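The four approaches above should give the same leverages. The following sketch checks this on the cars data, including the explicit matrix formula from the question; the names h_qr, h_hat, h_infl, h_generic and h_explicit are illustrative only.

```r
x1 <- lm(speed ~ dist, data = cars)

# Four ways of obtaining the hat values from the fitted model
h_qr      <- rowSums(qr.Q(x1$qr)^2)       # squared row norms of Q
h_hat     <- hat(x1$qr, FALSE)            # wrapper taking the QR object
h_infl    <- lm.influence(x1, FALSE)$hat  # do.coef = FALSE skips coefficient changes
h_generic <- hatvalues(x1)                # wrapper around lm.influence

# Explicit formula from the question, for reference
X <- model.matrix(x1)
h_explicit <- diag(X %*% solve(crossprod(X)) %*% t(X))

# All should agree once names are dropped
all.equal(unname(h_qr), unname(h_generic))
all.equal(unname(h_hat), unname(h_generic))
all.equal(unname(h_infl), unname(h_generic))
all.equal(unname(h_explicit), unname(h_generic))
```

The QR-based versions avoid forming the full n-by-n hat matrix, which is why they are preferable for larger data sets.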
You also don't need to call anova to get the MSE:
c(crossprod(x2$residuals)) / x2$df.residual
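As a quick check, this residual-based MSE can be compared with the value taken from the ANOVA table in the question and with the squared residual standard error reported by summary(). The names mse_direct, mse_anova and mse_summary are illustrative.

```r
x2 <- lm(speed ~ dist, data = cars[-1, ])

# Residual sum of squares divided by residual degrees of freedom
mse_direct <- c(crossprod(x2$residuals)) / x2$df.residual

# Residual mean square from the ANOVA table, as used in the question
mse_anova <- anova(x2)$`Mean Sq`[2]

# Squared residual standard error from the model summary
mse_summary <- summary(x2)$sigma^2

all.equal(mse_direct, mse_anova)
all.equal(mse_direct, mse_summary)
```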