Classification functions in linear discriminant analysis in R


Question

After completing a linear discriminant analysis in R using lda(), is there a convenient way to extract the classification functions for each group?

The linked source describes them as follows:

These are not to be confused with the discriminant functions. The classification functions can be used to determine to which group each case most likely belongs. There are as many classification functions as there are groups. Each function allows us to compute classification scores for each case for each group, by applying the formula:



Si = ci + wi1*x1 + wi2*x2 + ... + wim*xm




In this formula, the subscript i denotes the respective group; the subscripts 1, 2, ..., m denote the m variables; ci is a constant for the i'th group; wij is the weight for the j'th variable in the computation of the classification score for the i'th group; xj is the observed value for the respective case for the j'th variable. Si is the resultant classification score.

We can use the classification functions to directly compute classification scores for some new observations.
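As a minimal sketch of how the formula is applied, the snippet below scores two new cases against three groups. The coefficient values, the group labels A, B, C, and the variable names x1, x2 are all made up for illustration; they do not come from any fitted model.

```r
# Hypothetical classification-function coefficients (illustration only).
# Rows: constant c_i, then weights w_i1, w_i2. Columns: one per group.
class_funs <- matrix(
  c(-10, 2.0, 1.0,    # group A: constant, weight for x1, weight for x2
     -8, 1.0, 2.0,    # group B
    -15, 3.0, 0.5),   # group C
  nrow = 3,
  dimnames = list(c("constant", "x1", "x2"), c("A", "B", "C"))
)

# New observations: one row per case, one column per variable (made-up values).
new_obs <- matrix(c(4, 1,
                    1, 5),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(NULL, c("x1", "x2")))

# S_i = c_i + w_i1*x1 + ... + w_im*xm for every case and group at once.
# Prepending a column of 1s lets the matrix product pick up the constant row.
scores <- cbind(1, new_obs) %*% class_funs

# Each case goes to the group with the largest classification score.
assigned <- colnames(class_funs)[max.col(scores, ties.method = "first")]
```

Storing the constant as the first row of the coefficient matrix keeps the scoring step to a single matrix product, which is also the layout used by the function in the answer.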

I can build them from scratch using textbook formulas, but that requires rebuilding a number of intermediate steps from the lda analysis. Is there a way to get them after the fact from the lda object?

Added:

Unless I'm still misunderstanding something in Brandon's answer (sorry for the confusion!), it appears the answer is no. Presumably the majority of users can get the information they need from predict(), which provides classifications based on lda().
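For readers who only need the classifications, the predict() route might look like the sketch below. It assumes the MASS package and uses the built-in iris data purely as a stand-in; the object names fit, new_flowers, and pred are illustrative.

```r
library(MASS)  # provides lda() and its predict() method

# Fit the discriminant analysis on the built-in iris data.
fit <- lda(Species ~ ., data = iris)

# Pretend a few rows are new, unlabeled observations.
new_flowers <- iris[c(1, 51, 101), 1:4]

pred <- predict(fit, newdata = new_flowers)

pred$class      # predicted group for each new case
pred$posterior  # posterior probability of each group, one row per case
```

This gives group assignments and posterior probabilities, but not the per-group constants and weights that the question asks about.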

Recommended answer

There isn't a built-in way to get the information I needed, so I wrote a function to do it:

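library(MASS)  # provides lda()

ty.lda <- function(x, groups){
  x.lda <- lda(groups ~ ., as.data.frame(x))

  gr <- length(unique(groups))   ## number of groups (labels may be factor or numeric)
  v <- ncol(x)                   ## number of variables
  m <- x.lda$means               ## group means, one row per group

  ## within-group sums of squares and cross-products, one v x v slice per group
  w <- array(NA, dim = c(v, v, gr))

  for(i in 1:gr){
    tmp <- scale(subset(x, groups == unique(groups)[i]), scale = FALSE)
    w[,,i] <- t(tmp) %*% tmp
  }

  ## sum the slices to get the pooled within-group matrix
  W <- w[,,1]
  for(i in 2:gr)
    W <- W + w[,,i]

  V <- W/(nrow(x) - gr)          ## pooled within-group covariance
  iV <- solve(V)

  ## one column per group, in the row order of x.lda$means:
  ## constant in row 1, then one weight per variable
  class.funs <- matrix(NA, nrow = v + 1, ncol = gr)
  colnames(class.funs) <- paste("group", 1:gr, sep=".")
  rownames(class.funs) <- c("constant", paste("var", 1:v, sep = "."))

  for(i in 1:gr) {
    class.funs[1, i] <- -0.5 * t(m[i,]) %*% iV %*% (m[i,])
    class.funs[2:(v+1) ,i] <- iV %*% (m[i,])
  }

  ## attach the coefficients to the fitted lda object
  x.lda$class.funs <- class.funs

  return(x.lda)
}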

This code follows the formulas in Legendre and Legendre's Numerical Ecology (1998), page 625, and matches the results of the worked example starting on page 626.
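One way to sanity-check coefficients of this kind is to confirm that picking the largest classification score reproduces what predict() returns. The sketch below recomputes the same quantities in a compact form on the iris data rather than calling the function above, so that it runs on its own. It assumes equal prior probabilities, because the constant used here has no prior term; with unequal priors, log(prior_i) would need to be added to each group's constant. All object names are illustrative.

```r
library(MASS)  # provides lda()

x <- as.matrix(iris[, 1:4])
g <- iris$Species

# Equal priors, so the prior term drops out of the comparison between groups.
fit <- lda(Species ~ ., data = iris, prior = rep(1/3, 3))
m <- fit$means                       # one row of variable means per group

# Pooled within-group covariance: center each case on its own group mean.
centered <- x - m[as.character(g), ]
V <- crossprod(centered) / (nrow(x) - nlevels(g))

# Weights (variables x groups) and constants, as in the answer's formulas.
w <- solve(V, t(m))                  # V^-1 %*% m_i for every group
const <- -0.5 * colSums(t(m) * w)    # -0.5 * m_i' V^-1 m_i for every group

# Classification scores for every case and group, then pick the largest.
scores <- sweep(x %*% w, 2, const, "+")
by_scores <- factor(colnames(scores)[max.col(scores, ties.method = "first")],
                    levels = levels(g))

# Proportion of cases where the two rules agree.
agree <- mean(by_scores == predict(fit, iris)$class)
```

If the two rules coincide, agree should come out as 1; anything lower would suggest a mismatch in priors or in how the pooled covariance was formed.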
