为什么"Varimax"和"Varimax"之间在心理::原理上有区别?和"varimax"? [英] Why are there differences in psych::principal between "Varimax" and "varimax"?
问题描述
在相关问题中,我问为什么stats::varimax
和GPArotation::Varimax
之间存在差异,psych::principal
都会调用这两个差异,具体取决于为rotate =
设置的选项.
这两个之间的差异(请参见另一个问题)说明了与psych::principal
的部分差异,但不是全部差异.
看来这些差异以某种方式使
library(GPArotation)
library(psych)
data("Thurstone")
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none") # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)
# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings) # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05
principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ)) # principal has MUCH bigger differences
#> [1] -0.02115263
对于浮点制品或其他东西来说,这似乎太大了.
我的假设是(其他)差异是由于 GPArotation::Varimax
默认为(Kaiser)normalize == FALSE
,而** stats::varimax
默认为(Kaiser)normalize == TRUE
,而 的设置与`principal :: psych``内的设置不同.
stats::varimax
手册:
## varimax with normalize = TRUE is the default
GPArotation::Varimax
/GPArotation::GPForth
手册:
参数normalize给出了是否应该以及如何在旋转之前进行任何归一化,然后在旋转之后撤消归一化的指示.如果normalize为FALSE(默认设置),则不执行规范化.如果normalize为TRUE,则完成Kaiser规范化. (因此将归一化的A sum等于1.0的平方行条目.这有时称为Horst归一化.)
此外,他们psych::Kaiser
手册警告:
默认情况下,GPArotation包不进行规范化,fa函数也不进行规范化.然后,更令人困惑的是,统计信息中的varimax确实出现了,GPArotation中的Varmax却没有.
任何人都可以确认差异实际上是归一化选项造成的吗?
由于程序(当然,因为psych :: principal中的normalize
选项不是评估后,所有其他程序都必须将该选项切换为TRUE).虽然可以配置stats :: varimax和GPArotation :: Varimax的精度(参数eps
),但在psych :: principal中将忽略此精度,并且似乎与eps = 1e-5隐含等于stats :: varimax.
如果我们将stats :: varimax和GPArotations :: Varimax的精度提高到eps=1e-15
,那么我们得到的结果(至少8位数字)与我测试过的MatMate程序中的并发实现相同也非常精确地等于SPSS计算.
psych :: principal中缺少对显式选项eps
的处理似乎是一个错误,其较差的隐式默认值肯定令人不满意.
eps=1e-15
进行很多旋转,请参见最后的输出.因此,如果可以停止迭代,则可能是实施了另一个内部过程,或者在决策中对eps参数进行了不同的评估.在此答案结尾处看到的一个示例,可能暗示了这种效果.
下面的比较协议;仅显示加载的前两行.
The first three computations with low accuracy (eps=1e-5)
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings *1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
The second three computations with (attempted) high accuracy (eps=1e-15)
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000
尝试使用大的eps
或不使用大的eps
很少/默认准确结果
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871717.3 216618.7 198176.3 175204.47
Vocabulary 855663.1 294146.3 152930.7 180517.21
# =============================================================================
现在尝试使用较小的eps
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings *1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.69 198174.31 175207.86
Vocabulary 855662.58 294147.47 152928.77 180519.37
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.7 198174.31 175207.86
Vocabulary 855662.6 294147.5 152928.77 180519.37
Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize, :
convergence not obtained in GPForth. 1000 iterations used.
# Result by MatMate: --------------------------------------------------------
lad = cholesky(Thurstone)
pc = rot(lad,"pca")
pc4 = pc[*,1..4] // arrive at the first four pc's
t = gettrans( normzl(pc4),"varimax") // get rotation-matrix for row-normalized pc's
vmx = pc4 * t // rotate pc4 by rotation-matrix
display = vmx * 1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.68 198174.31 175207.87
Vocabulary 855662.58 294147.46 152928.77 180519.37
# ===============================================================================>
通过将stats :: varimax和GPArotation :: Varimax更好地匹配结果,可以将stat ::中的eps设置为1e-12,将GPArotation中的eps设置为1e-6,其中一个值是其他.然后,我们得到
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.8 198174.49 175207.63
Vocabulary 855662.5 294147.6 152928.94 180519.21
> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.80 216619.74 198174.40 175207.85
Vocabulary 855662.55 294147.52 152928.86 180519.36
In a related question, I have asked why there are differences between stats::varimax
and GPArotation::Varimax
, both of which psych::principal
calls, depending on the option set for rotate =
.
The differences between these two (see other question) account for some, but not all of the differences from psych::principal
.
It appears that these differences somehow get exacerbated by psych::principal
.
(I have a simple theory why, and I'd like to get that confirmed).
library(GPArotation)
library(psych)
data("Thurstone")
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none") # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)
# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question
diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings) # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05
principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")
diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233
mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ)) # principal has MUCH bigger differences
#> [1] -0.02115263
This seems way too big for just a floating point artefact or something.
My hypothesis would be that the (additional) difference stems from the fact that GPArotation::Varimax
defaults to (Kaiser) normalize == FALSE
, whereas **stats::varimax
defaults to (Kaiser) normalize == TRUE
, which cannot be set differently from within `principal::psych``.
stats::varimax
manual:
## varimax with normalize = TRUE is the default
GPArotation::Varimax
/ GPArotation::GPForth
manual:
The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)
Also, they psych::Kaiser
manual warns:
The GPArotation package does not (by default) normalize, nor does the fa function. Then, to make it more confusing, varimax in stats does,Varimax in GPArotation does not.
Can anyone confirm that the difference is, in fact, accounted for by the normalization options?
The problem of different loadings occurs because of different accuracy in the procedures (of course, since in psych::principal the normalize
option is not evaluated, all the other procedures must used with that option switched to TRUE). While the accuracy in stats::varimax and GPArotation::Varimax can be configured (parameter eps
) this is ignored in psych::principal, and seems to be implicitely equal to stats::varimax with eps=1e-5 .
If we increase accuracy in stats::varimax and GPArotations::Varimax to eps=1e-15
then we get the same result (at least up to 8 digits) as in a concurring implementation in my MatMate-program which I've tested to be very accurately equal to the SPSS computations too.
The missing handling of the explicite option eps
in psych::principal seems to be a bug, and its poor implicite default value is surely unsatisfying.
Interestingly, the GPArotation::Varimax needs very many rotations with eps=1e-15
, see the output at the end; so either there is another internal procedure implemented or the eps-parameter is evaluated differently in the decision when the iterations can be stopped. One example, see at the end of this answer, might suggest such an effect.
Below the protocol of the comparision; only the first two rows of the loadings are displayed.
The first three computations with low accuracy (eps=1e-5)
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings *1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
The second three computations with (attempted) high accuracy (eps=1e-15)
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000
The attempts to little/default accurate results with large eps
or without
# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871717.3 216618.7 198176.3 175204.47
Vocabulary 855663.1 294146.3 152930.7 180517.21
# =============================================================================
Now the attempts to more accurate results with smaller eps
> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings *1000000
RC1 RC2 RC3 RC4
Sentences 871655.72 216638.46 198427.07 175202.57
Vocabulary 855609.28 294166.99 153181.45 180525.99
> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.69 198174.31 175207.86
Vocabulary 855662.58 294147.47 152928.77 180519.37
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings *1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.7 198174.31 175207.86
Vocabulary 855662.6 294147.5 152928.77 180519.37
Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize, :
convergence not obtained in GPForth. 1000 iterations used.
# Result by MatMate: --------------------------------------------------------
lad = cholesky(Thurstone)
pc = rot(lad,"pca")
pc4 = pc[*,1..4] // arrive at the first four pc's
t = gettrans( normzl(pc4),"varimax") // get rotation-matrix for row-normalized pc's
vmx = pc4 * t // rotate pc4 by rotation-matrix
display = vmx * 1000000
PC1 PC2 PC3 PC4
Sentences 871716.83 216619.68 198174.31 175207.87
Vocabulary 855662.58 294147.46 152928.77 180519.37
# ===============================================================================>
A much better matching of the results by stats::varimax and GPArotation::Varimax could be achieved by setting eps to 1e-12 in stat:: and to 1e-6 in GPArotation , where one value is the square of the other. We get then
> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.8 216619.8 198174.49 175207.63
Vocabulary 855662.5 294147.6 152928.94 180519.21
> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
PC1 PC2 PC3 PC4
Sentences 871716.80 216619.74 198174.40 175207.85
Vocabulary 855662.55 294147.52 152928.86 180519.36
这篇关于为什么"Varimax"和"Varimax"之间在心理::原理上有区别?和"varimax"?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!