如何使用R包Quadprog解决SVM? [英] How to use R package Quadprog to solve SVM?

查看:399
本文介绍了如何使用R包Quadprog解决SVM?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想知道实现Quadprog解决二次编程的正确方法是什么.

我有以下问题(从互联网上窃听),并且还在查看以下 http://www.r-bloggers.com/solvent-quadratic-progams-with-rs-quadprog-package/

解决方案

这是基于原始优化问题的线性C-SVM的实现:

min_{beta_0, beta, zeta} 1/2 w^T w + C sum_{i = 1}^N zeta_i
subject to:
    y_i (w^T x_i + b) >= 1 - zeta_i, for all i = 1, 2, ..., N
    zeta_i >= 0, for all i = 1, 2, ..., N

其中,N是数据点的数量.

请注意,使用quadprog解决此问题在某种程度上是一种教学法练习,因为quadprog依赖于内点算法,而实际上将使用专门的算法,例如Platt的SMO,这利用了SVM优化问题的特殊属性.

为了使用quadprog,给定上面的方程,所有这些都归结为设置指定优化问题的矩阵和向量.

但是,一个问题是quadprog要求出现在二次函数中的矩阵必须是正定的(例如,请参见http://cbio.ensmp.fr/~thocking/mines-course/2011-04-01-svm/svm-qp.pdf

What would be the proper way to solve this issue? Would this tutorial be useful to solve if i was given a question like above? http://www.r-bloggers.com/solving-quadratic-progams-with-rs-quadprog-package/

解决方案

Here is an implementation, for linear C-SVM, which is based on the primal optimization problem:

min_{beta_0, beta, zeta} 1/2 w^T w + C sum_{i = 1}^N zeta_i
subject to:
    y_i (w^T x_i + b) >= 1 - zeta_i, for all i = 1, 2, ..., N
    zeta_i >= 0, for all i = 1, 2, ..., N

where N is the number of data points.

Note that using quadprog to solve this is, to some degree, more of a pedagogical exercise, as quadprog relies on an interior point algorithm, while in practice a specialized algorithm would be used, such as Platt's SMO, which takes advantage of particular properties of the SVM optimization problem.

In order to use quadprog, given the equations above, it all boils down to setting up the matrices and vectors that specify the optimization problem.

One issue, however, is that quadprog requires the matrix appearing in the quadratic function to be positive definite (see, for example, http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/), while the implementation used here leads to it being positive semi-definite, since the intercept beta_0 and the zeta_i do not appear in the quadratic function. To go around this issue, I set the diagonal elements corresponding to these values in the matrix to a very small value.

To setup the example code, using the spam dataset, a binary classification problem:

library(kernlab) # for the spam data
# Load the input data to be used
data(spam)

# Use only a subset of the data (20%)
spam <- spam[sample(nrow(spam), round(0.2 * nrow(spam)), replace = FALSE), ]

# Retrieve the features and data
X <- spam[, 1:(ncol(spam) - 1)]
Y_f <- spam[, ncol(spam)]
Y <- 2 * (as.numeric(Y_f) - 1.5) # {-1, 1}

# Sample size
N <- nrow(X)
# Number of dimensions
n_d <- ncol(X)

# Value of the regularization parameter
C <- 1

In order to setup the optimization problem, keep in mind the format employed by package quadprog:

#
# Formulation: min(−d^T * b + 0.5 * b^T * D * b) with the constraints A^T * b >= b_0
#
# solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE)
#
# Arguments
#   Dmat: matrix appearing in the quadratic function to be minimized.
#   dvec: vector appearing in the quadratic function to be minimized.
#   Amat: matrix defining the constraints under which we want to minimize the quadratic function.
#   bvec: vector holding the values of b0 (defaults to zero).
#   meq: the first meq constraints are treated as equality constraints, all further as inequality 
#   constraints (defaults to 0).
#   factorized logical flag: if TRUE, then we are passing R−1 (where D = RT R) instead of the
#   matrix D in the argument Dmat.
#

Then, organizing the parameter vector as:

# b = (beta_0, beta, zeta), 
# where: beta_0 in R, beta in Re^n_d, zeta in Re^N

such that:

d <- c(0, rep(0, n_d), rep(-C, N)) # -C * sum(zeta)

# Need a work-around for the matrix D, which must be positive definite (being 
# positive semi-definite is not enough...)
# See http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/
eps <- 1e-10 # this will ultimately be the lowest eigenvalue of matrix D (with multiplicity N + 1)
D <- diag(c(eps, rep(1, n_d), rep(eps, N))) # beta^T * beta

#
# Matrix specifying the constraints
# For zeta_i > 0: 
#          beta_0 |    beta       |     zeta
#   A_1 = [      0,   0, 0, ..., 0, 1, 0, 0, ..., 0]
#         [      0,   0, 0, ..., 0, 0, 1, 0, ..., 0]
#         [      0,   0, 0, ..., 0, 0, 0, 1, ..., 0]  
#          ...
#         [      0,   0, 0, ..., 0, 0, 0, 0, ..., 1]
# where matrix A_1 has N rows, and N + n_d + 1 columns
#
# For beta_0 * y_i + beta^T * x_i * y_i + zeta_i >= 1:
#          beta_0 |                         beta                        |        zeta
#   A_2 = [    y_1, y_1 * x_{1, 1}, y_1 * x_{2, 2}, ..., y_1 * x{i, n_d}, 1, 0, 0, ..., 0]
#         [    y_2, y_2 * x_{2, 1}, y_2 * x_{2, 2}, ..., y_2 * x{i, n_d}, 0, 1, 0, ..., 0]
#          ...        
#         [    y_N, y_N * x_{N, 1}, y_2 * x_{N, 2}, ..., y_N * x{N, n_d}, 0, 0, 0, ..., 1]
#
I_N <- diag(N) # N x N identity matrix

A_1 <- cbind(matrix(0, ncol = n_d + 1, nrow = N), I_N) # zeta_i > 0, for all i; N rows
A_2 <- as.matrix(cbind(as.matrix(Y), X * as.matrix(Y)[, rep(1, n_d)], I_N)) # zeta_i + beta_0 * y_i + beta^T * x_i * y_i >= 1, for all i; N rows
rownames(A_1) <- NULL; rownames(A_2) <- NULL
colnames(A_1) <- NULL; colnames(A_2) <- NULL

A <- t(rbind(A_1, A_2))
b_0 <- c(rep(0, N), rep(1, N))

Finally, solve the optimization problem and retrieve the parameter values:

library(quadprog)
results <- solve.QP(D, d, A, b_0)

# Retrieve the results
b_optim <- results$solution

beta_0 <- b_optim[1]
beta <- b_optim[1 + (1:n_d)]
zeta <- b_optim[(n_d + 1) + (1:N)]

Afterwards, given a matrix X_test, the model can be used to predict via:

Y_pred <- sign(apply(X_test, 1, function(x) beta_0 + sum(beta * as.vector(x))))

这篇关于如何使用R包Quadprog解决SVM?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆