How to use NNLS for non-negative multiple linear regression?

Question

I am trying to solve a non-negative multiple linear regression problem in Java. I found a solver class, org.apache.spark.mllib.optimization.NNLS, written in Scala, but I don't know how to use it.

What confuses me is that the interface of the following method seems strange. I assumed that A is an MxN matrix and b is an M-vector, so the arguments ata and atb should be an NxN matrix and an N-vector, respectively. However, the actual type of ata is double[].

public static double[] solve(double[] ata, double[] atb, NNLS.Workspace ws)

I searched for example code but couldn't find any. Can anyone give me a sample? The library is written in Scala, but I would like Java code if possible.

Recommended answer

DISCLAIMER: I've never used NNLS and have no background in non-negative multiple linear regression.

You are looking at Spark 2.1.1's NNLS, which does what you want, but it is not the way to go, since in the latest Spark 2.2.1 it is marked private[spark]:

private[spark] object NNLS {

More importantly, as of Spark 2.0, the org.apache.spark.mllib package (including org.apache.spark.mllib.optimization, which NNLS belongs to) is in maintenance mode:

The MLlib RDD-based API is now in maintenance mode.

As of Spark 2.0, the RDD-based APIs in the spark.mllib package have entered maintenance mode. The primary Machine Learning API for Spark is now the DataFrame-based API in the spark.ml package.

In other words, you should stay away from that package, and from NNLS in particular.

So what are the alternatives?

You could look at the tests for NNLS, i.e. NNLSSuite, where you may find some answers.

However, the actual type of ata is double[].

It is still a matrix: its elements are doubles, stored in a flat array. As a matter of fact, ata is passed directly to BLAS's dgemv, which is described in the LAPACK docs:

DGEMV performs one of the matrix-vector operations

y := alpha*A*x + beta*y,   or   y := alpha*A**T*x + beta*y,

where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.
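To make the quoted operation concrete, here is a minimal plain-Java sketch of the non-transposed case, y := alpha*A*x + beta*y, on a matrix stored as a flat double[]. It assumes column-major storage, which is the BLAS convention. The class and method names are hypothetical; this is an illustration, not the BLAS binding Spark calls.

```java
// Hypothetical illustration of DGEMV's non-transposed case on a flat array.
// Assumption: A is m-by-n in column-major order, so A(i, j) == a[i + j * m].
final class DgemvSketch {

    /** Overwrites y with alpha*A*x + beta*y. */
    static void dgemv(int m, int n, double alpha, double[] a,
                      double[] x, double beta, double[] y) {
        // Scale the existing contents of y by beta.
        for (int i = 0; i < m; i++) {
            y[i] *= beta;
        }
        // Accumulate alpha * A * x one column at a time.
        for (int j = 0; j < n; j++) {
            double scaled = alpha * x[j];
            for (int i = 0; i < m; i++) {
                y[i] += scaled * a[i + j * m];
            }
        }
    }

    public static void main(String[] args) {
        // A = [[1, 2], [3, 4]], stored column by column.
        double[] a = {1, 3, 2, 4};
        double[] x = {1, 1};
        double[] y = {10, 20};
        dgemv(2, 2, 1.0, a, x, 0.5, y);
        System.out.println(java.util.Arrays.toString(y)); // prints [8.0, 17.0]
    }
}
```

Since A^T A is symmetric, row-major and column-major flattening give the same array for ata, so the storage order only matters for non-symmetric inputs.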

That should give you enough answers.
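Since the question asked for Java, here is a rough sketch of how the two arguments could be built from an MxN matrix A and an M-vector b: ata is the NxN matrix A^T A flattened into a double[] of length N*N, and atb is the N-vector A^T b. The class and method names below are made up for illustration and are not part of Spark.

```java
// Hypothetical helper (not part of Spark): builds the normal-equation inputs
// A^T A and A^T b from a dense matrix given as double[rows][cols].
final class NormalEquations {

    /** Returns A^T A as a flat array of length n*n; entry (i, j) sits at i + j * n. */
    static double[] ata(double[][] a) {
        int m = a.length;
        int n = a[0].length;
        double[] out = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    sum += a[k][i] * a[k][j];
                }
                out[i + j * n] = sum;
            }
        }
        return out;
    }

    /** Returns A^T b as an array of length n. */
    static double[] atb(double[][] a, double[] b) {
        int m = a.length;
        int n = a[0].length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < m; k++) {
                out[i] += a[k][i] * b[k];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 0}, {1, 1}, {0, 1}}; // M = 3 rows, N = 2 columns
        double[] b = {1, 2, 3};
        System.out.println(java.util.Arrays.toString(ata(a)));    // prints [2.0, 1.0, 1.0, 2.0]
        System.out.println(java.util.Arrays.toString(atb(a, b))); // prints [3.0, 5.0]
    }
}
```

With Spark 2.1.1 on the classpath, these two arrays are what would be handed to NNLS.solve(ata, atb, ws); as far as I can tell from the 2.1.1 source, the workspace comes from NNLS.createWorkspace(n), but I have not run that call from Java myself.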

Another question is: what is the recommended way to do NNLS-like computations in Spark MLlib?

It looks like Spark MLlib's ALS algorithm uses NNLS under the covers (which may not be that surprising to machine learning practitioners).

That part of the code is used when ALS is configured to train a model with the nonnegative parameter turned on, i.e. set to true (it is disabled by default).

nonnegative: Param for whether to apply nonnegativity constraints.

Default: false

whether to use nonnegative constraint for least squares

I would recommend reviewing that part of Spark MLlib to get a deeper understanding of how NNLS is used to solve non-negative linear regression problems.
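If depending on a private Spark class is not an option, a small dependency-free solver is one possible fallback. The sketch below uses plain projected gradient descent on the same normal-equation inputs (ata, atb), minimizing 1/2 x^T (A^T A) x - x^T (A^T b) subject to x >= 0. It is deliberately simple and is not the algorithm Spark's NNLS implements. The class name, the fixed iteration count, and the use of the trace as a step-size bound are all my own assumptions.

```java
// Hypothetical, dependency-free NNLS sketch using projected gradient descent.
// This is NOT Spark's implementation; it only accepts the same kind of inputs.
final class ProjectedGradientNnls {

    /**
     * Approximately minimizes 1/2 x^T H x - c^T x subject to x >= 0,
     * where H = A^T A is flattened into ata (length n*n) and c = A^T b is atb.
     */
    static double[] solve(double[] ata, double[] atb, int iterations) {
        int n = atb.length;
        double[] x = new double[n];

        // H is positive semidefinite, so its trace bounds its largest
        // eigenvalue from above; 1/trace is therefore a safe step size.
        double trace = 0.0;
        for (int i = 0; i < n; i++) {
            trace += ata[i + i * n];
        }
        if (trace <= 0.0) {
            return x; // degenerate input: H is the zero matrix
        }
        double step = 1.0 / trace;

        double[] grad = new double[n];
        for (int it = 0; it < iterations; it++) {
            // Gradient of the objective: H x - c.
            for (int i = 0; i < n; i++) {
                double g = -atb[i];
                for (int j = 0; j < n; j++) {
                    g += ata[i + j * n] * x[j];
                }
                grad[i] = g;
            }
            // Gradient step followed by projection onto x >= 0.
            for (int i = 0; i < n; i++) {
                x[i] = Math.max(0.0, x[i] - step * grad[i]);
            }
        }
        return x;
    }

    public static void main(String[] args) {
        double[] ata = {2, 1, 1, 2};
        // The unconstrained solution for this right-hand side is (-2, 3),
        // so the non-negativity constraint on the first coefficient is active.
        double[] x = solve(ata, new double[]{-1, 4}, 2000);
        System.out.println(java.util.Arrays.toString(x)); // close to [0.0, 2.0]
    }
}
```

Convergence of this method can be slow on ill-conditioned problems, so for anything beyond small inputs a dedicated solver from a numerical library is probably a better choice.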
