为什么"Varimax"和"Varimax"之间在心理::原理上有区别?和"varimax"? [英] Why are there differences in psych::principal between "Varimax" and "varimax"?

查看:92
本文介绍了为什么"Varimax"和"Varimax"之间在心理::原理上有区别?和"varimax"?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

相关问题中,我问为什么stats::varimaxGPArotation::Varimax之间存在差异,psych::principal都会调用这两个差异,具体取决于为rotate =设置的选项.

这两个之间的差异(请参见另一个问题)说明了与psych::principal的部分差异,但不是全部差异. 看来这些差异以某种方式使加剧了 . (为什么我有一个简单的理论,我想得到证实).

library(GPArotation)
library(psych)
data("Thurstone")

principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")  # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)

# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question

diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings)  # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05

principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")

diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat  # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233

mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ))  # principal has MUCH bigger differences
#> [1] -0.02115263

对于浮点制品或其他东西来说,这似乎太大了.

我的假设是(其他)差异是由于 GPArotation::Varimax默认为(Kaiser)normalize == FALSE,而** stats::varimax默认为(Kaiser)normalize == TRUE ,而 的设置与`principal :: psych``内的设置不同.

stats::varimax手册:

## varimax with normalize = TRUE is the default

GPArotation::Varimax/GPArotation::GPForth手册:

参数normalize给出了是否应该以及如何在旋转之前进行任何归一化,然后在旋转之后撤消归一化的指示.如果normalize为FALSE(默认设置),则不执行规范化.如果normalize为TRUE,则完成Kaiser规范化. (因此将归一化的A sum等于1.0的平方行条目.这有时称为Horst归一化.)

此外,他们psych::Kaiser手册警告:

默认情况下,GPArotation包不进行规范化,fa函数也不进行规范化.然后,更令人困惑的是,统计信息中的varimax确实出现了,GPArotation中的Varmax却没有.

任何人都可以确认差异实际上是归一化选项造成的吗?

解决方案

由于程序(当然,因为psych :: principal中的normalize选项不是评估后,所有其他程序都必须将该选项切换为TRUE).虽然可以配置stats :: varimax和GPArotation :: Varimax的精度(参数eps),但在psych :: principal中将忽略此精度,并且似乎与eps = 1e-5隐含等于stats :: varimax.

如果我们将stats :: varimax和GPArotations :: Varimax的精度提高到eps=1e-15,那么我们得到的结果(至少8位数字)与我测试过的MatMate程序中的并发实现相同也非常精确地等于SPSS计算.

psych :: principal中缺少对显式选项eps的处理似乎是一个错误,其较差的隐式默认值肯定令人不满意.

有趣的是,GPArotation :: Varimax需要使用eps=1e-15进行很多旋转,请参见最后的输出.因此,如果可以停止迭代,则可能是实施了另一个内部过程,或者在决策中对eps参数进行了不同的评估.在此答案结尾处看到的一个示例,可能暗示了这种效果.

下面的比较协议;仅显示加载的前两行.

The first three computations with low accuracy (eps=1e-5) 
all procedures give equal or comparable results, 
except GPArotation, which is already near the exact answer
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings    *1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings            *1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings      *1000000

The second three computations with (attempted) high accuracy (eps=1e-15) 
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000

尝试使用大的eps或不使用大的eps很少/默认准确结果

# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99


> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings         *1000000
                PC1       PC2       PC3       PC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings   *1000000
                     PC1      PC2      PC3       PC4
Sentences       871717.3 216618.7 198176.3 175204.47
Vocabulary      855663.1 294146.3 152930.7 180517.21
# =============================================================================

现在尝试使用较小的eps

获得更准确的结果

> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings  *1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings                   *1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.83 216619.69 198174.31 175207.86
Vocabulary      855662.58 294147.47 152928.77 180519.37

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings             *1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.7 198174.31 175207.86
Vocabulary      855662.6 294147.5 152928.77 180519.37

Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize,  :
  convergence not obtained in GPForth. 1000 iterations used.

# Result by MatMate: --------------------------------------------------------
 lad = cholesky(Thurstone) 
 pc = rot(lad,"pca")
 pc4 = pc[*,1..4]                           // arrive at the first four pc's
     t = gettrans( normzl(pc4),"varimax")   // get rotation-matrix for row-normalized pc's
 vmx = pc4 * t                              // rotate pc4 by rotation-matrix 
 display = vmx     * 1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.83    216619.68   198174.31   175207.87
Vocabulary      855662.58    294147.46   152928.77   180519.37


# ===============================================================================> 

通过将stats :: varimax和GPArotation :: Varimax更好地匹配结果,可以将stat ::中的eps设置为1e-12,将GPArotation中的eps设置为1e-6,其中一个值是其他.然后,我们得到

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.8 198174.49 175207.63
Vocabulary      855662.5 294147.6 152928.94 180519.21

> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.80 216619.74 198174.40 175207.85
Vocabulary      855662.55 294147.52 152928.86 180519.36

In a related question, I have asked why there are differences between stats::varimax and GPArotation::Varimax, both of which psych::principal calls, depending on the option set for rotate =.

The differences between these two (see other question) account for some, but not all of the differences from psych::principal. It appears that these differences somehow get exacerbated by psych::principal. (I have a simple theory why, and I'd like to get that confirmed).

library(GPArotation)
library(psych)
data("Thurstone")

principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")  # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)

# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question

diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings)  # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05

principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")

diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat  # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233

mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ))  # principal has MUCH bigger differences
#> [1] -0.02115263

This seems way too big for just a floating point artefact or something.

My hypothesis would be that the (additional) difference stems from the fact that GPArotation::Varimax defaults to (Kaiser) normalize == FALSE, whereas **stats::varimax defaults to (Kaiser) normalize == TRUE, which cannot be set differently from within `principal::psych``.

stats::varimax manual:

## varimax with normalize = TRUE is the default

GPArotation::Varimax / GPArotation::GPForth manual:

The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)

Also, they psych::Kaiser manual warns:

The GPArotation package does not (by default) normalize, nor does the fa function. Then, to make it more confusing, varimax in stats does,Varimax in GPArotation does not.

Can anyone confirm that the difference is, in fact, accounted for by the normalization options?

解决方案

The problem of different loadings occurs because of different accuracy in the procedures (of course, since in psych::principal the normalize option is not evaluated, all the other procedures must used with that option switched to TRUE). While the accuracy in stats::varimax and GPArotation::Varimax can be configured (parameter eps) this is ignored in psych::principal, and seems to be implicitely equal to stats::varimax with eps=1e-5 .

If we increase accuracy in stats::varimax and GPArotations::Varimax to eps=1e-15 then we get the same result (at least up to 8 digits) as in a concurring implementation in my MatMate-program which I've tested to be very accurately equal to the SPSS computations too.

The missing handling of the explicite option eps in psych::principal seems to be a bug, and its poor implicite default value is surely unsatisfying.

Interestingly, the GPArotation::Varimax needs very many rotations with eps=1e-15, see the output at the end; so either there is another internal procedure implemented or the eps-parameter is evaluated differently in the decision when the iterations can be stopped. One example, see at the end of this answer, might suggest such an effect.

Below the protocol of the comparision; only the first two rows of the loadings are displayed.

The first three computations with low accuracy (eps=1e-5) 
all procedures give equal or comparable results, 
except GPArotation, which is already near the exact answer
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings    *1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings            *1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings      *1000000

The second three computations with (attempted) high accuracy (eps=1e-15) 
all procedures except psych::principal give equal results, and they
agree also with external crosscheck using MatMate
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000

The attempts to little/default accurate results with large eps or without

# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99


> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings         *1000000
                PC1       PC2       PC3       PC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings   *1000000
                     PC1      PC2      PC3       PC4
Sentences       871717.3 216618.7 198176.3 175204.47
Vocabulary      855663.1 294146.3 152930.7 180517.21
# =============================================================================

Now the attempts to more accurate results with smaller eps

> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings  *1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings                   *1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.83 216619.69 198174.31 175207.86
Vocabulary      855662.58 294147.47 152928.77 180519.37

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings             *1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.7 198174.31 175207.86
Vocabulary      855662.6 294147.5 152928.77 180519.37

Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize,  :
  convergence not obtained in GPForth. 1000 iterations used.

# Result by MatMate: --------------------------------------------------------
 lad = cholesky(Thurstone) 
 pc = rot(lad,"pca")
 pc4 = pc[*,1..4]                           // arrive at the first four pc's
     t = gettrans( normzl(pc4),"varimax")   // get rotation-matrix for row-normalized pc's
 vmx = pc4 * t                              // rotate pc4 by rotation-matrix 
 display = vmx     * 1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.83    216619.68   198174.31   175207.87
Vocabulary      855662.58    294147.46   152928.77   180519.37


# ===============================================================================> 

A much better matching of the results by stats::varimax and GPArotation::Varimax could be achieved by setting eps to 1e-12 in stat:: and to 1e-6 in GPArotation , where one value is the square of the other. We get then

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.8 198174.49 175207.63
Vocabulary      855662.5 294147.6 152928.94 180519.21

> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.80 216619.74 198174.40 175207.85
Vocabulary      855662.55 294147.52 152928.86 180519.36

这篇关于为什么"Varimax"和"Varimax"之间在心理::原理上有区别?和"varimax"?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆