R, Coefficients: 3 not defined because of singularities

Problem description

I'm currently learning regression analysis and R. This is the first time I have encountered this problem, and I've referred to another post about it. However, I still can't find my mistake.
Sorry for posting twice.

# Multiple regression with all interaction terms
Multi.interact.reg<-lm(Y~X1+X2+X3+X4+X5+X6+I(X1*X2)+I(X1*X3)+I(X1*X4)+I(X1*X5)+I(X1*X6)+I(X2*X3)+I(X2*X4)+I(X2*X5)+I(X2*X6)+I(X3*X4)+I(X3*X5)+I(X3*X6)+I(X4*X5)+I(X4*X6)+I(X5*X6),data = assignment.data)
summary(Multi.interact.reg)
Multi.interact.anova<-anova(Multi.interact.reg)
Multi.interact.anova

Then I get the following output, which reports "3 not defined because of singularities":

Call:
lm(formula = Y ~ X1 + X2 + X3 + X4 + X5 + X6 + I(X1 * X2) + I(X1 * 
    X3) + I(X1 * X4) + I(X1 * X5) + I(X1 * X6) + I(X2 * X3) + 
    I(X2 * X4) + I(X2 * X5) + I(X2 * X6) + I(X3 * X4) + I(X3 * 
    X5) + I(X3 * X6) + I(X4 * X5) + I(X4 * X6) + I(X5 * X6), 
    data = assignment.data)

Residuals:
    Min      1Q  Median      3Q     Max 
-36.471  -4.328  -0.738   3.290  70.526 

Coefficients: (3 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -5.039e+04  1.724e+04  -2.922  0.00367 ** 
X1           6.777e+00  4.415e+00   1.535  0.12560    
X2           3.325e+02  7.528e+02   0.442  0.65895    
X3           8.762e+00  4.172e+00   2.100  0.03634 *  
X4           5.420e+03  4.136e+03   1.311  0.19077    
X5           9.939e+02  1.631e+02   6.092 2.65e-09 ***
X6           9.853e+01  1.178e+02   0.837  0.40337    
I(X1 * X2)   1.042e-01  1.260e-01   0.827  0.40879    
I(X1 * X3)  -8.575e-04  1.352e-03  -0.634  0.52619    
I(X1 * X4)  -3.882e-01  6.075e-01  -0.639  0.52321    
I(X1 * X5)          NA         NA      NA       NA    
I(X1 * X6)          NA         NA      NA       NA    
I(X2 * X3)   8.612e-06  7.877e-05   0.109  0.91300    
I(X2 * X4)   9.445e-03  1.384e-02   0.682  0.49552    
I(X2 * X5)  -2.897e+00  4.374e+00  -0.662  0.50819    
I(X2 * X6)  -3.869e+00  5.762e+00  -0.671  0.50233    
I(X3 * X4)  -1.687e-03  3.852e-04  -4.378 1.54e-05 ***
I(X3 * X5)  -2.229e-01  4.396e-02  -5.072 6.08e-07 ***
I(X3 * X6)  -1.212e-02  2.702e-02  -0.449  0.65393    
I(X4 * X5)  -1.071e+02  2.150e+01  -4.981 9.49e-07 ***
I(X4 * X6)  -1.615e+01  3.119e+01  -0.518  0.60474    
I(X5 * X6)          NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.903 on 395 degrees of freedom
Multiple R-squared:  0.6773,    Adjusted R-squared:  0.6626 
F-statistic: 46.06 on 18 and 395 DF,  p-value: < 2.2e-16


Analysis of Variance Table

Response: Y
            Df Sum Sq Mean Sq  F value    Pr(>F)    
X1           1    585     585   9.3699 0.0023565 ** 
X2           1   3441    3441  55.0854 7.148e-13 ***
X3           1  34857   34857 558.0314 < 2.2e-16 ***
X4           1   3576    3576  57.2444 2.733e-13 ***
X5           1   2065    2065  33.0577 1.793e-08 ***
X6           1      5       5   0.0821 0.7745688    
I(X1 * X2)   1    217     217   3.4746 0.0630583 .  
I(X1 * X3)   1      0       0   0.0016 0.9680901    
I(X1 * X4)   1      4       4   0.0704 0.7908357    
I(X2 * X3)   1    801     801  12.8267 0.0003843 ***
I(X2 * X4)   1     94      94   1.5072 0.2202955    
I(X2 * X5)   1      1       1   0.0101 0.9198052    
I(X2 * X6)   1      1       1   0.0085 0.9265045    
I(X3 * X4)   1   3805    3805  60.9172 5.390e-14 ***
I(X3 * X5)   1    481     481   7.6932 0.0058058 ** 
I(X3 * X6)   1    297     297   4.7533 0.0298321 *  
I(X4 * X5)   1   1542    1542  24.6828 1.008e-06 ***
I(X4 * X6)   1     17      17   0.2683 0.6047441    
Residuals  395  24673      62                       
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Thank you very much to anyone who can solve my problem.

Original post: R-3 not defined because of singularities

dput(head(assignment.data, 10))
Output:
structure(list(X1 = c(2012.917, 2012.917, 2013.583, 2013.5, 2012.833, 
2012.667, 2012.667, 2013.417, 2013.5, 2013.417), X2 = c(32, 19.5, 
13.3, 13.3, 5, 7.1, 34.5, 20.3, 31.7, 17.9), X3 = c(84.87882, 
306.5947, 561.9845, 561.9845, 390.5684, 2175.03, 623.4731, 287.6025, 
5512.038, 1783.18), X4 = c(10L, 9L, 5L, 5L, 5L, 3L, 7L, 6L, 1L, 
3L), X5 = c(24.98298, 24.98034, 24.98746, 24.98746, 24.97937, 
24.96305, 24.97933, 24.98042, 24.95095, 24.96731), X6 = c(121.54024, 
121.53951, 121.54391, 121.54391, 121.54245, 121.51254, 121.53642, 
121.54228, 121.48458, 121.51486), Y = c(37.9, 42.2, 47.3, 54.8, 
43.1, 32.1, 40.3, 46.7, 18.8, 22.1)), row.names = c(NA, 10L), class = "data.frame")

Solution

You only gave us 10 rows of data, so this answer is guesswork, but seeing that it's the X5 and X6 terms that are missing parameters, the answer to the NAs is this:

You have too few independent data rows to fit that many parameters. You need at the very least one data row for each parameter.
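To see the "one row per parameter" limit in action, here is a minimal sketch that fits the full interaction model to only the ten rows posted in the question. The names d10 and fit10 are made up for this illustration, and ten rows are of course not the full data set, so this only shows the mechanism, not the actual cause in the 414-row fit.

```r
# The ten rows from the dput() output in the question (d10 is a hypothetical name)
d10 <- data.frame(
  X1 = c(2012.917, 2012.917, 2013.583, 2013.5, 2012.833,
         2012.667, 2012.667, 2013.417, 2013.5, 2013.417),
  X2 = c(32, 19.5, 13.3, 13.3, 5, 7.1, 34.5, 20.3, 31.7, 17.9),
  X3 = c(84.87882, 306.5947, 561.9845, 561.9845, 390.5684,
         2175.03, 623.4731, 287.6025, 5512.038, 1783.18),
  X4 = c(10L, 9L, 5L, 5L, 5L, 3L, 7L, 6L, 1L, 3L),
  X5 = c(24.98298, 24.98034, 24.98746, 24.98746, 24.97937,
         24.96305, 24.97933, 24.98042, 24.95095, 24.96731),
  X6 = c(121.54024, 121.53951, 121.54391, 121.54391, 121.54245,
         121.51254, 121.53642, 121.54228, 121.48458, 121.51486),
  Y  = c(37.9, 42.2, 47.3, 54.8, 43.1, 32.1, 40.3, 46.7, 18.8, 22.1)
)

# Intercept + 6 main effects + 15 pairwise interactions = 22 coefficients
fit10 <- lm(Y ~ . * ., data = d10)

length(coef(fit10))       # number of coefficients the formula asks for
fit10$rank                # number lm() could actually estimate (at most 10 here)
sum(is.na(coef(fit10)))   # coefficients reported as NA
```

With ten rows the design matrix cannot have rank above ten, so at least twelve of the 22 coefficients must come back as NA.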

Furthermore, based on your formula, I am guessing you mean to include all second-degree interactions in the model. To that effect, the formula should have been:

Multi.interact.reg<-lm(Y~X1*X2+X1*X3+X1*X4+X1*X5+X1*X6+X2*X3+X2*X4+X2*X5+X2*X6+X3*X4+X3*X5+X3*X6+X4*X5+X4*X6+X5*X6,data = assignment.data)

Or easier:

Multi.interact.reg <- lm( Y ~ (X1 + X2 + X3 + X4 + X5 + X6) * (X1 + X2 + X3 + X4 + X5 + X6) , data = assignment.data )

Or even shorter:

Multi.interact.reg <- lm( Y ~ .*. , data=assignment.data )

Note that in a formula, X1*X2 is equivalent to X1 + X2 + X1:X2, where X1:X2 is the interaction term between X1 and X2.
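If you want to check this expansion yourself, you can ask R which terms a formula produces without fitting anything. A small sketch (the labels_* names are just for illustration):

```r
# terms() expands a formula; "term.labels" lists the resulting model terms
labels_star  <- attr(terms(Y ~ X1 * X2), "term.labels")
labels_long  <- attr(terms(Y ~ X1 + X2 + X1:X2), "term.labels")

# (a + b + c)^2 is another way to request all main effects
# plus all pairwise interactions
labels_power <- attr(terms(Y ~ (X1 + X2 + X3)^2), "term.labels")

print(labels_star)
print(labels_long)
print(labels_power)
```

The first two calls return the same three terms, and the third returns the three main effects followed by the three pairwise interactions.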

On the topic of X1*X2, X1:X2 and I(X1*X2):

Try this:


# add a variable X1mX2 that is X1 multiplied by X2
assignment.data$X1mX2 <-
    assignment.data$X1 * assignment.data$X2

# try these three models:
lm( Y ~ I(X1*X2), data=assignment.data )
lm( Y ~ X1mX2 , data=assignment.data )
lm( Y ~ X1:X2 , data=assignment.data )

They all give the same result. X1:X2 is arguably better to write than I(X1*X2), but it achieves the same outcome here. (That being said, it's good to know about the I(...) syntax for things like quadratic terms, e.g. I(X^2).)
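To illustrate why I(...) matters for quadratic terms, here is a short sketch on made-up data (toy, fit_plain and fit_quad are hypothetical names). Inside a formula, ^ is a formula operator, so x^2 without I() does not square x:

```r
set.seed(42)

# Made-up data with a genuinely quadratic relationship
toy <- data.frame(x = runif(30, 1, 5))
toy$y <- 2 + 0.5 * toy$x^2 + rnorm(30, sd = 0.1)

# Without I(), x^2 is read as "x crossed with itself", which is just x
plain_terms <- attr(terms(y ~ x^2), "term.labels")
fit_plain   <- lm(y ~ x^2, data = toy)

# With I(), the arithmetic square of x enters the model as its own term
fit_quad <- lm(y ~ x + I(x^2), data = toy)

print(plain_terms)
print(names(coef(fit_plain)))
print(names(coef(fit_quad)))
```

Only the second model has a separate coefficient for the squared term.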

Nevertheless, your NAs are due to not having enough independent data to fit more coefficients.
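If you want to find out which terms are the dependent ones, one possible approach is to compare the rank of the design matrix with its number of columns and then call alias() on the fit. The sketch below uses simulated data (sim, fit_sim and the x1..x3 columns are invented for this example) where x3 is deliberately built as an exact linear combination of x1 and x2:

```r
set.seed(1)
n <- 50

# Simulated predictors; x3 carries no information beyond x1 and x2
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$x3 <- 2 * sim$x1 - sim$x2
sim$y  <- 1 + sim$x1 + sim$x2 + rnorm(n)

fit_sim <- lm(y ~ x1 + x2 + x3, data = sim)

coef(fit_sim)                      # the x3 coefficient is NA

X <- model.matrix(fit_sim)         # design matrix used by lm()
n_cols <- ncol(X)                  # columns requested by the formula
x_rank <- qr(X)$rank               # linearly independent columns

print(c(columns = n_cols, rank = x_rank))

# alias() reports how each dropped term depends on the kept ones
alias(fit_sim)
```

Running the same three checks on your own fitted model should point to the terms that lm() dropped and the columns they depend on.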

But what is your actual problem here? The last thing I want to do is your homework for you.

EDIT: A demonstration of how your formula can work perfectly well for 414 data records.

Here is some randomly generated data with 414 rows and your 6 X variables and a Y:



library(magrittr)  # provides the %>% pipe used below

ncol <- 7
nrow <- 414

## random assignment data with 414 rows:
assignment.data <-
    matrix( rnorm( nrow*ncol ), nrow=nrow ) %>%
    as.data.frame %>%
    setNames( c( "Y", paste0( "X",1:6) ))

Multi.interact.reg <- lm( Y ~ .*. , data=assignment.data )
summary(Multi.interact.reg)

The summary of it looks like this:


> summary(Multi.interact.reg)

Call:
lm(formula = Y ~ . * ., data = assignment.data)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1748 -0.6586  0.0088  0.6075  3.4569 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.007730   0.048957   0.158   0.8746  
X1          -0.020282   0.050103  -0.405   0.6858  
X2          -0.027244   0.046135  -0.591   0.5552  
X3          -0.033377   0.051385  -0.650   0.5164  
X4           0.058613   0.053728   1.091   0.2760  
X5           0.001110   0.047508   0.023   0.9814  
X6          -0.033624   0.050272  -0.669   0.5040  
X1:X2        0.062613   0.051556   1.214   0.2253  
X1:X3       -0.108269   0.051451  -2.104   0.0360 *
X1:X4       -0.106597   0.053761  -1.983   0.0481 *
X1:X5       -0.012042   0.051551  -0.234   0.8154  
X1:X6       -0.036547   0.049531  -0.738   0.4610  
X2:X3        0.083803   0.045765   1.831   0.0678 .
X2:X4       -0.024941   0.050207  -0.497   0.6196  
X2:X5        0.047340   0.041194   1.149   0.2512  
X2:X6       -0.075696   0.048885  -1.548   0.1223  
X3:X4       -0.018825   0.057152  -0.329   0.7420  
X3:X5       -0.024407   0.052880  -0.462   0.6447  
X3:X6       -0.110122   0.056965  -1.933   0.0539 .
X4:X5        0.023246   0.056187   0.414   0.6793  
X4:X6       -0.009225   0.050770  -0.182   0.8559  
X5:X6        0.016736   0.048907   0.342   0.7324  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.976 on 392 degrees of freedom
Multiple R-squared:  0.05964,   Adjusted R-squared:  0.009264 
F-statistic: 1.184 on 21 and 392 DF,  p-value: 0.2611

Again - it's not the method you use that is the problem, it's the data you are working with.
