How to use NNLS for non-negative multiple linear regression?
Question
I am trying to solve a non-negative multiple linear regression problem in Java. I found a solver class, org.apache.spark.mllib.optimization.NNLS, written in Scala. However, I don't know how to use it.
What confuses me is that the interface of the following method seems strange. I thought that A is an MxN matrix and b is an M-vector, so the arguments ata and atb should be an NxN matrix and an N-vector, respectively. However, the actual type of ata is double[].
public static double[] solve(double[] ata, double[] atb, NNLS.Workspace ws)
I searched for example code but couldn't find any. Can anyone give me sample code? The library is written in Scala, but I want Java code if possible.
Recommended answer
DISCLAIMER: I've never used NNLS and have no idea about non-negative multiple linear regression.
You are looking at Spark 2.1.1's NNLS, which does what you want, but it is not the way to go, since the latest Spark 2.2.1 marks it as private[spark]:
private[spark] object NNLS {
More importantly, as of Spark 2.0, the org.apache.spark.mllib package (including org.apache.spark.mllib.optimization, which NNLS belongs to) is in maintenance mode:
The MLlib RDD-based API is now in maintenance mode.
As of Spark 2.0, the RDD-based APIs in the spark.mllib package have entered maintenance mode. The primary Machine Learning API for Spark is now the DataFrame-based API in the spark.ml package.
In other words, you should stay away from the package, and from NNLS in particular.
So what are the alternatives?
You could look at the tests of NNLS, i.e. NNLSSuite, where you could find some answers.
However, the actual type of ata is double[].
It is still a matrix: its elements are doubles, stored in a flat array. As a matter of fact, ata is passed directly to BLAS's dgemv, which the LAPACK docs describe as follows:
DGEMV performs one of the matrix-vector operations
y := alpha*A*x + beta*y, or y := alpha*A**T*x + beta*y,
where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.
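To make the quoted definition concrete, here is a minimal plain-Java sketch of the first form, y := alpha*A*x + beta*y, operating on a matrix stored in a flat double[]. The class and method names (FlatGemv, gemv) are hypothetical and are not part of BLAS or Spark. The sketch assumes column-major storage, which is the usual BLAS convention. For a symmetric matrix such as A^T A, row-major and column-major storage give the same array.

```java
/** Hypothetical helper: a plain-Java picture of y := alpha*A*x + beta*y. */
final class FlatGemv {
    /**
     * a holds an m-by-n matrix in column-major order with leading dimension m,
     * so element (i, j) lives at a[i + j * m]. The result overwrites y.
     */
    static void gemv(int m, int n, double alpha, double[] a,
                     double[] x, double beta, double[] y) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {
                sum += a[i + j * m] * x[j];
            }
            // When beta is zero, ignore whatever was in y before the call.
            double previous = (beta == 0.0) ? 0.0 : beta * y[i];
            y[i] = alpha * sum + previous;
        }
    }
}
```

For example, the 2x2 matrix with rows (1, 2) and (3, 4) is passed as the array {1, 3, 2, 4}.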
That should give you enough answers.
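As a rough illustration of how the two arguments from the question could be prepared, the sketch below computes A^T A and A^T b from an M x N matrix A and an M-vector b, and flattens A^T A into a double[] of length N*N. The names NormalEquations, ata and atb are hypothetical helpers written for this answer, not Spark API. The column-major layout is an assumption based on the BLAS convention, and since A^T A is symmetric the choice does not change the result.

```java
/** Hypothetical helpers that build the inputs named ata and atb in the question. */
final class NormalEquations {
    /** Returns A^T A (n-by-n), flattened so that element (i, j) is at out[i + j * n]. */
    static double[] ata(double[][] a) {
        int m = a.length;
        int n = a[0].length;
        double[] out = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    sum += a[k][i] * a[k][j];
                }
                out[i + j * n] = sum;
            }
        }
        return out;
    }

    /** Returns A^T b as a vector of length n. */
    static double[] atb(double[][] a, double[] b) {
        int m = a.length;
        int n = a[0].length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = 0; k < m; k++) {
                sum += a[k][i] * b[k];
            }
            out[i] = sum;
        }
        return out;
    }
}
```

With A given as rows (1, 0), (0, 1), (1, 1) and b = (1, 2, 3), A^T A is the 2x2 matrix with rows (2, 1) and (1, 2), and A^T b is (4, 5).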
Another question would be: what is the recommended way to do NNLS-like computations in Spark MLlib?
It looks like Spark MLlib's ALS algorithm uses NNLS under the covers (which may not be that surprising for machine learning practitioners).
That part of the code is used when ALS is configured to train a model with the nonnegative parameter set to true (it is false by default).
nonnegative: Param for whether to apply nonnegativity constraints.
Default: false
whether to use nonnegative constraint for least squares
I would recommend reviewing that part of Spark MLlib to get deeper into how NNLS is used to solve non-negative linear regression problems.
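Since the question asked for Java, here is a small self-contained sketch of a solver with the same kind of inputs (a flattened A^T A and the vector A^T b). It is deliberately not Spark's implementation. It is plain projected gradient descent, a simpler swapped-in technique, and the names SimpleNnls, solve and maxIter are hypothetical. It minimizes 0.5 * x^T (A^T A) x - (A^T b)^T x subject to x >= 0, using a fixed step of 1 / trace(A^T A). That step is safe because the trace of a positive semidefinite matrix is an upper bound on its largest eigenvalue.

```java
/** Hypothetical plain-Java NNLS sketch using projected gradient descent. */
final class SimpleNnls {
    /**
     * ata is A^T A flattened so that element (i, j) is at ata[i + j * n];
     * atb is A^T b with length n. Returns x >= 0 approximately minimizing ||Ax - b||.
     */
    static double[] solve(double[] ata, double[] atb, int maxIter) {
        int n = atb.length;
        double trace = 0.0;
        for (int i = 0; i < n; i++) {
            trace += ata[i + i * n];
        }
        double[] x = new double[n];
        if (trace <= 0.0) {
            return x; // A is all zeros, so x = 0 is already optimal
        }
        double step = 1.0 / trace;
        double[] grad = new double[n];
        for (int iter = 0; iter < maxIter; iter++) {
            // Gradient of the objective: (A^T A) x - A^T b
            for (int i = 0; i < n; i++) {
                double g = -atb[i];
                for (int j = 0; j < n; j++) {
                    g += ata[i + j * n] * x[j];
                }
                grad[i] = g;
            }
            // Gradient step followed by projection onto the non-negative orthant
            for (int i = 0; i < n; i++) {
                x[i] = Math.max(0.0, x[i] - step * grad[i]);
            }
        }
        return x;
    }
}
```

A fixed iteration count keeps the sketch short. A real solver would stop on a convergence test and would likely use a faster method for ill-conditioned problems.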