Trend lines (regression, curve fitting) Java library

Question

I'm trying to develop an application that computes the same trend lines that Excel does, but for larger datasets.

But I can't find any Java library that calculates such regressions. For the linear model I'm using Apache Commons Math, and for the others there was a great numerical library from Michael Thomas Flanagan, but since January it is no longer available:

http://www.ee.ucl.ac.uk/~mflanaga/java/

Do you know of any other libraries or code repositories that calculate these regressions in Java?

Recommended answer

Since they're all based on linear fits (each model is linear in its coefficients once x, y, or both are log-transformed), OLSMultipleLinearRegression is all you need for linear, polynomial, exponential, logarithmic, and power trend lines.
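
To make the "linear fits" point concrete, here are the five model forms written so that the unknown coefficients enter linearly. The symbols a, b, and the betas are my own notation, not something from the Commons Math API:

```latex
\begin{aligned}
\text{linear / polynomial:} \quad & y = \beta_0 + \beta_1 x + \dots + \beta_d x^d \\
\text{exponential:} \quad & y = a e^{b x} \;\Longrightarrow\; \ln y = \ln a + b\,x \\
\text{power:} \quad & y = a x^{b} \;\Longrightarrow\; \ln y = \ln a + b \ln x \\
\text{logarithmic:} \quad & y = a + b \ln x
\end{aligned}
```

In every case the right-hand side is a constant plus coefficients times known functions of x, which is exactly what an ordinary least-squares regression estimates.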

Your question gave me an excuse to download and play with the Commons Math regression tools, and I put together some trend line tools:

An interface:

public interface TrendLine {
    public void setValues(double[] y, double[] x); // y ~ f(x)
    public double predict(double x); // get a predicted y for a given x
}

An abstract class for regression-based trend lines:

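import java.util.Arrays;
import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.regression.OLSMultipleLinearRegression;

public abstract class OLSTrendLine implements TrendLine {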

    RealMatrix coef = null; // will hold prediction coefs once we get values

    protected abstract double[] xVector(double x); // create vector of values from x
    protected abstract boolean logY(); // set true to predict log of y (note: y must be positive)

    @Override
    public void setValues(double[] y, double[] x) {
        if (x.length != y.length) {
            throw new IllegalArgumentException(String.format("The numbers of y and x values must be equal (%d != %d)",y.length,x.length));
        }
        double[][] xData = new double[x.length][]; 
        for (int i = 0; i < x.length; i++) {
            // the implementation determines how to produce a vector of predictors from a single x
            xData[i] = xVector(x[i]);
        }
        if(logY()) { // in some models we are predicting ln y, so we replace each y with ln y
            y = Arrays.copyOf(y, y.length); // user might not be finished with the array we were given
            for (int i = 0; i < x.length; i++) {
                y[i] = Math.log(y[i]);
            }
        }
        OLSMultipleLinearRegression ols = new OLSMultipleLinearRegression();
        ols.setNoIntercept(true); // let the implementation include a constant in xVector if desired
        ols.newSampleData(y, xData); // provide the data to the model
        coef = MatrixUtils.createColumnRealMatrix(ols.estimateRegressionParameters()); // get our coefs
    }

    @Override
    public double predict(double x) {
        double yhat = coef.preMultiply(xVector(x))[0]; // apply coefs to xVector
        if (logY()) yhat = (Math.exp(yhat)); // if we predicted ln y, we still need to get y
        return yhat;
    }
}

An implementation for polynomial or linear models:

(For linear models, just set the degree to 1 when calling the constructor.)

public class PolyTrendLine extends OLSTrendLine {
    final int degree;
    public PolyTrendLine(int degree) {
        if (degree < 0) throw new IllegalArgumentException("The degree of the polynomial must not be negative");
        this.degree = degree;
    }
    @Override
    protected double[] xVector(double x) { // {1, x, x*x, x*x*x, ...}
        double[] poly = new double[degree+1];
        double xi=1;
        for(int i=0; i<=degree; i++) {
            poly[i]=xi;
            xi*=x;
        }
        return poly;
    }
    @Override
    protected boolean logY() {return false;}
}

Exponential and power models are even simpler:

(Note: we're predicting log y now, and that's important. Both of these are only suitable for positive y, and the power model also needs positive x because it takes ln x.)

public class ExpTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,x};
    }

    @Override
    protected boolean logY() {return true;}
}

public class PowerTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,Math.log(x)};
    }

    @Override
    protected boolean logY() {return true;}

}
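
To show what the log transform is doing without pulling in Commons Math, here is a minimal dependency-free sketch that fits y = a·e^(bx) with the closed-form least-squares formulas applied to (x, ln y). The names `ExpFitSketch` and `fitExponential` are hypothetical and not part of the classes above; mathematically it solves the same least-squares problem as `ExpTrendLine`, assuming every y is positive.

```java
/** Hypothetical, dependency-free sketch: fit y = a * exp(b * x) by least squares on ln y. */
class ExpFitSketch {

    /** Returns {a, b}. Assumes x.length == y.length >= 2, every y > 0, and x not all equal. */
    static double[] fitExponential(double[] x, double[] y) {
        int n = x.length;
        double sumX = 0, sumL = 0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumL += Math.log(y[i]);
        }
        double meanX = sumX / n, meanL = sumL / n;

        double sxx = 0, sxl = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxx += dx * dx;
            sxl += dx * (Math.log(y[i]) - meanL);
        }
        double b = sxl / sxx;            // slope of ln y against x
        double lnA = meanL - b * meanX;  // intercept of ln y against x
        return new double[] {Math.exp(lnA), b};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 2.0 * Math.exp(0.5 * x[i]); // noise-free data generated with a = 2, b = 0.5
        }
        double[] ab = fitExponential(x, y);
        System.out.printf("a = %.6f, b = %.6f%n", ab[0], ab[1]);
    }
}
```

On the noise-free data in `main` the fit recovers a and b to within rounding error. Swapping `x[i]` for `Math.log(x[i])` inside the fit would give the power model instead.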

And a log model:

(This takes the log of x but predicts y, not ln y.)

public class LogTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,Math.log(x)};
    }

    @Override
    protected boolean logY() {return false;}
}

And you can use it like this:

// requires: import java.util.Random;
public static void main(String[] args) {
    TrendLine t = new PolyTrendLine(2);
    Random rand = new Random();
    double[] x = new double[1000*1000];
    double[] err = new double[x.length];
    double[] y = new double[x.length];
    for (int i=0; i<x.length; i++) { x[i] = 1000*rand.nextDouble(); }
    for (int i=0; i<x.length; i++) { err[i] = 100*rand.nextGaussian(); } 
    for (int i=0; i<x.length; i++) { y[i] = x[i]*x[i]+err[i]; } // quadratic model
    t.setValues(y,x);
    System.out.println(t.predict(12)); // when x = 12, y should be close to 144, e.g. 143.61380202745192
}

Since you just wanted trend lines, I discarded the OLS models when I was done with them, but you might want to keep some data on goodness of fit, etc.
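
One way to get a goodness-of-fit number without changing the classes above is to compute R² from the predictions. This is a minimal sketch, assuming the fitted model is passed in as a `DoubleUnaryOperator` (for example `t::predict`). `FitQuality` and `rSquared` are hypothetical names, not Commons Math API.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper: coefficient of determination for any x -> y predictor. */
class FitQuality {

    /** R^2 = 1 - SS_res / SS_tot on the original y scale. Assumes y is not constant. */
    static double rSquared(DoubleUnaryOperator model, double[] x, double[] y) {
        double meanY = 0;
        for (double v : y) {
            meanY += v;
        }
        meanY /= y.length;

        double ssRes = 0, ssTot = 0;
        for (int i = 0; i < y.length; i++) {
            double resid = y[i] - model.applyAsDouble(x[i]); // error of the model
            double dev = y[i] - meanY;                       // error of "always predict the mean"
            ssRes += resid * resid;
            ssTot += dev * dev;
        }
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {1, 4, 9, 16};
        System.out.println(rSquared(v -> v * v, x, y)); // perfect model: prints 1.0
        System.out.println(rSquared(v -> 7.5, x, y));   // always predicts the mean of y: prints 0.0
    }
}
```

With the classes above, the call would look like `FitQuality.rSquared(t::predict, x, y)`. Commons Math's `OLSMultipleLinearRegression` also has a `calculateRSquared()` method if you keep the `ols` object around; for the log-Y models it would describe the fit to ln y rather than y.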

For implementations using a moving average, moving median, etc., it looks like you can stick with Commons Math. Try DescriptiveStatistics and specify a window. You might also want to do some smoothing, using interpolation as suggested in another answer.
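
The windowed idea can be sketched in plain Java: keep the last `window` values and report their mean after each new point. `MovingAverageSketch` and `trailingMean` are hypothetical names, and this is a stand-in rather than the Commons Math code itself; a `DescriptiveStatistics` constructed with a window size would play the role of the deque here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical plain-Java sketch of a trailing moving average. */
class MovingAverageSketch {

    /** out[i] is the mean of the last min(i + 1, window) values ending at index i. */
    static double[] trailingMean(double[] values, int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be at least 1");
        }
        Deque<Double> recent = new ArrayDeque<>();
        double sum = 0;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            recent.addLast(values[i]);
            sum += values[i];
            if (recent.size() > window) {
                sum -= recent.removeFirst(); // drop the value that just left the window
            }
            out[i] = sum / recent.size();
        }
        return out;
    }

    public static void main(String[] args) {
        double[] smoothed = trailingMean(new double[] {1, 2, 3, 4, 5}, 3);
        System.out.println(Arrays.toString(smoothed)); // prints [1.0, 1.5, 2.0, 3.0, 4.0]
    }
}
```

The running sum keeps each step O(1), at the cost of some accumulated rounding error on very long series. A moving median would need the window contents sorted at each step instead.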
