missing value where TRUE/FALSE needed in R


Question

When I run the following code with the line gr.ascent(MMSE, 0.5, verbose=TRUE) left uncommented, I get the error Error in b1 * x : 'b1' is missing. When I comment that line out and test MMSE with the arguments MMSE(2,1,farmland$farm,farmland$area), I get the following error instead. Do you know where my problem lies?

Error in if (abs(t[i]) <= k) { : missing value where TRUE/FALSE needed

Here is my code:

farmland <- read.csv("FarmLandArea.csv")
str(farmland)
fit=lm(farm~land,data=farmland)
mean.squared.residuals <- sum((lm(farm~land,data=farmland)$residuals)^2)/(length(farmland$farm)-2)

#gradient descent

#things I should possibly use: solve(t(X)%*%X, t(X)%*%y)
gr.ascent<- function(df, x0, alpha=0.2, eps=0.001, max.it = 50, verbose = FALSE){
  X1 <- x0
  cond <- TRUE
  iteration <- 0
  if(verbose) cat("X0 =",X1,"\n")
  while(cond){
    iteration <- iteration + 1
    X0 <- X1
    X1 <- X0 + alpha * df(X0)
    cond <- sum((X1 - X0)^2) > eps & iteration < max.it
    if(verbose) cat(paste(sep="","X",iteration," ="), X1, "\n")
  }
  return(X1)
}


k=19000

#rho <- function(t, k=19000){
#  for (i in seq(1,length(t))){
#    if (abs(t[i]) <= k)
#      return(t[i]^2)
#    else 
#      return(2*k*abs(t[i])-k^2)

#  }

#}

#nicer implementation of rho. ifelse works on vector
rho<-function(t,k) ifelse(abs(t)<=k,t^2,(2*k*abs(t))-k^2)
rho.prime <- function(t, k=19000){
  out <- rep(NA, length(t))
  for (i in seq(1,length(t))){
    if (abs(t[i]) <= k)
    { print(2*t[i])
      out[i] <- 2*t[i] 
    }
    else 
    {
      print(2*k*sign(t[i]))
      out[i] <- 2*k*sign(t[i])
    }
  }
  return(out)
}
MMSE <- function(b0, b1, y=farmland$farm, x=farmland$land){
   # Calls rho.prime() here with argument y-b0-b1*x


   #Why should we call rho.prime? in the html page you have used rho!?
  n = length(y)
  total = 0
  for (i in seq(1,n)) {
    #total = total + rho(t,k)*(y[i]-b0-b1*x[i])
    total = total + rho.prime(y-b0-b1*x,k)*(y[i]-b0-b1*x[i])
  }
  return(total/n)
}

gr.ascent(MMSE(1,2), 0.5, verbose=TRUE)

The FarmLandArea.csv data looks like this:

state,land,farm
Alabama,50744,14062
Alaska,567400,1375
Arizona,113635,40781
Arkansas,52068,21406
California,155959,39688
Colorado,103718,48750
Connecticut,4845,625
Delaware,1954,766
Florida,53927,14453
Georgia,57906,16094
Hawaii,6423,1734
Idaho,82747,17812
Illinois,55584,41719
Indiana,35867,23125
Iowa,55869,48125
Kansas,81815,72188
Kentucky,39728,21875
Louisiana,43562,12578
Maine,30862,2109
Maryland,9774,3203
Massachusetts,7840,812
Michigan,58110,15625
Minnesota,79610,42031
Mississippi,46907,17422
...

Here's the result of dput(farmland):

> dput(farmland)
structure(list(state = structure(1:50, .Label = c("Alabama", 
"Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", 
"Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", 
"Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", 
"Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi", 
"Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", 
"New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", 
"Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", 
"South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", 
"Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin", 
"Wyoming"), class = "factor"), land = c(50744L, 567400L, 113635L, 
52068L, 155959L, 103718L, 4845L, 1954L, 53927L, 57906L, 6423L, 
82747L, 55584L, 35867L, 55869L, 81815L, 39728L, 43562L, 30862L, 
9774L, 7840L, 58110L, 79610L, 46907L, 68886L, 145552L, 76872L, 
109826L, 8968L, 7417L, 121356L, 47214L, 48711L, 68976L, 40948L, 
68667L, 95997L, 44817L, 1045L, 30109L, 75885L, 41217L, 261797L, 
82144L, 9250L, 39594L, 66544L, 24230L, 54310L, 97105L), farm = c(14062L, 
1375L, 40781L, 21406L, 39688L, 48750L, 625L, 766L, 14453L, 16094L, 
1734L, 17812L, 41719L, 23125L, 48125L, 72188L, 21875L, 12578L, 
2109L, 3203L, 812L, 15625L, 42031L, 17422L, 45469L, 95000L, 71250L, 
9219L, 734L, 1141L, 67500L, 10938L, 13438L, 61875L, 21406L, 55000L, 
25625L, 12109L, 109L, 7656L, 68281L, 17031L, 203750L, 17344L, 
1906L, 12578L, 23125L, 5703L, 23750L, 47188L)), .Names = c("state", 
"land", "farm"), class = "data.frame", row.names = c(NA, -50L
))

Recommended answer

OK, by the numbers:

  1. In your call to gr.ascent(...) you pass a function, MMSE, as the first argument. Inside gr.ascent(...) you refer to this function as df(...).
  2. The function MMSE(...) has two arguments, b0 and b1, for which there are no defaults - so these must be specified or there will be an error, but
  3. When you call the function df(...) inside gr.ascent(...), in the line X1 <- X0 + alpha * df(X0), you pass only one argument, which becomes b0.
  4. So the second argument, b1, is missing, hence the error.
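
The same failure can be reproduced in isolation. This is a minimal sketch with hypothetical names (two_args and caller stand in for MMSE and gr.ascent); the exact wording of the message may differ between R versions:

```r
# Hypothetical two-argument function with no defaults, standing in for MMSE.
two_args <- function(b0, b1) b0 + b1 * 2

# Hypothetical caller that, like gr.ascent, invokes its argument with one value.
caller <- function(df, x0) df(x0)

# tryCatch captures the error message so the script keeps running.
msg <- tryCatch(caller(two_args, 0.5), error = function(e) conditionMessage(e))
print(msg)
```

The single value 0.5 is matched to b0, so the error only appears when the body first touches b1.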

When you call MMSE(...) directly, as in:

MMSE(2,1,farmland$farm,farmland$area)

you pass farmland$area as the 4th argument. But there is no column area in the farmland data frame! So x is passed as NULL, and in

total = total + rho.prime(y-b0-b1*x,k)*(y[i]-b0-b1*x[i])

the expression y-b0-b1*x evaluates to a zero-length vector. Inside rho.prime(...), seq(1,length(t)) then yields 1 0 rather than an empty sequence, t[1] is NA, and the if condition becomes NA, hence the second error.
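
The chain from a missing column to the error can be checked step by step. This is a small sketch using a three-row stand-in for farmland (the values are the first rows of the CSV above), not the full data set:

```r
# Small stand-in for the farmland data frame.
farmland <- data.frame(land = c(50744, 567400, 113635),
                       farm = c(14062, 1375, 40781))

x <- farmland$area               # no such column, so this is NULL
t <- farmland$farm - 2 - 1 * x   # same shape as y - b0 - b1 * x
length(t)                        # 0: a zero-length operand gives a zero-length result

seq(1, length(t))                # 1 0, so a for loop over it still runs
seq_along(t)                     # integer(0), so a for loop over it is skipped

is.na(t[1])                      # TRUE: indexing past the end yields NA

# The comparison is NA, and if() cannot branch on NA.
msg <- tryCatch(if (abs(t[1]) <= 19000) "small" else "large",
                error = function(e) conditionMessage(e))
print(msg)
```

Looping with seq_along(t) instead of seq(1,length(t)) would avoid this particular error, but it would also hide the real mistake, which is the wrong column name.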

I can't suggest a solution because I have no clue what you are trying to accomplish here.

EDIT (Response to OP's comment).

Notwithstanding @thelatemail's comment, which I agree with wholeheartedly, your new error is rather obscure.

In your earlier version, you were passing a function, MMSE(...), to gr.ascent(...), and using it incorrectly. This time, you are passing a value to gr.ascent(...), that value being the return value of the call MMSE(1,2). So what happens when you try to treat this value as a function, as in:

X1 <- X0 + alpha * df(X0)

Well, normally this would throw an error, because the df you passed in is not a function. In this case, it's just your bad luck that a function named df already exists. It is the probability density function for the F distribution, which has a required argument df1, among others (type ?df to see the documentation). R finds that function instead, and that's why you are getting the error.
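
The lookup behaviour can be seen with two small wrappers. The names call_df, call_g and g_hypothetical are made up for this sketch, and the messages are assumed to be in English (they depend on the session's locale):

```r
# Hypothetical wrapper: the parameter df holds a number, not a function.
call_df <- function(df, x0) df(x0)

# In call position R skips the numeric binding and finds stats::df instead,
# which then complains about its own missing arguments.
msg <- tryCatch(call_df(3.14, 0.5), error = function(e) conditionMessage(e))
print(msg)

# With a parameter name that matches no existing function, the lookup fails.
call_g <- function(g_hypothetical, x0) g_hypothetical(x0)
msg2 <- tryCatch(call_g(3.14, 0.5), error = function(e) conditionMessage(e))
print(msg2)
```

Renaming the parameter in gr.ascent to something that does not clash with an existing function would at least make this kind of mistake fail with a clearer message.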

To "fix" this you need to go back to passing the function, as in:

gr.ascent(MMSE,...)

and then use it correctly inside gr.ascent(...), as in:

X1 <- X0 + alpha * df(X0, <some other argument>)
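
One way to supply the extra argument is to forward it through `...`. The sketch below only illustrates that calling pattern: gr.ascent.dots and toy.df are hypothetical names, the toy derivative is not the MMSE function from the question, and nothing here settles what MMSE itself should compute:

```r
# Variant of gr.ascent that forwards extra arguments to df via `...`.
# The name gr.ascent.dots is hypothetical.
gr.ascent.dots <- function(df, x0, ..., alpha = 0.2, eps = 0.001, max.it = 50) {
  X1 <- x0
  iteration <- 0
  repeat {
    iteration <- iteration + 1
    X0 <- X1
    X1 <- X0 + alpha * df(X0, ...)   # extra arguments reach df here
    if (sum((X1 - X0)^2) <= eps || iteration >= max.it) break
  }
  X1
}

# Toy derivative of -(x - target)^2, used only to show the call pattern.
toy.df <- function(x, target) -2 * (x - target)

result <- gr.ascent.dots(toy.df, 0.5, target = 3)
print(result)
```

Because target is not a formal argument of gr.ascent.dots, it falls into `...` and is handed to toy.df on every iteration; the ascent then moves x0 toward the maximum at target.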
