How to get p-value for logistic regression in Spark MLlib using Java


Question


How can I get the p-value for logistic regression in Spark MLlib using Java? And how can I find the probability of the predicted class? The following is the code I have tried:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.LogisticRegressionModel;
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.util.MLUtils;
import scala.Tuple2;

SparkConf sparkConf = new SparkConf().setAppName("GRP").setMaster("local[*]");
SparkContext ctx = new SparkContext(sparkConf);

LabeledPoint pos = new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0));
String path = "dataSetnew.txt";

// Load the LIBSVM-format data set and split it 60/40 into training and test sets.
JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(ctx, path).toJavaRDD();
JavaRDD<LabeledPoint>[] splits = data.randomSplit(new double[] {0.6, 0.4}, 11L);
JavaRDD<LabeledPoint> training = splits[0].cache();
JavaRDD<LabeledPoint> test = splits[1];

// Train a binary logistic regression model with L-BFGS.
final LogisticRegressionModel model =
    new LogisticRegressionWithLBFGS()
        .setNumClasses(2)
        .setIntercept(true)
        .run(training.rdd());

// Pair each test point's prediction with its true label.
JavaRDD<Tuple2<Object, Object>> predictionAndLabels = test.map(
    new org.apache.spark.api.java.function.Function<LabeledPoint, Tuple2<Object, Object>>() {
        public Tuple2<Object, Object> call(LabeledPoint p) {
          Double prediction = model.predict(p.features());
         // System.out.println("prediction :"+prediction);
          return new Tuple2<Object, Object>(prediction, p.label());
        }
      }
    );

// Predict a single new observation and inspect the fitted model.
Vector denseVecnew = Vectors.dense(112,110,110,0,0,0,0,0,0,0,0);
Double prediction = model.predict(denseVecnew);
Vector weightVector = model.weights();
System.out.println("weights : " + weightVector);
System.out.println("intercept : " + model.intercept());
System.out.println("prediction : " + prediction);
ctx.stop();

Solution

For binary classification you can use the LogisticRegressionModel.clearThreshold method. After it is called, predict will return raw scores instead of labels. These are in the range [0, 1] and can be interpreted as probabilities.

See clearThreshold docs.
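To illustrate what that raw score is: after clearThreshold(), predict returns the logistic (sigmoid) of the linear combination of the model's weights, the feature vector, and the intercept. Here is a minimal plain-Java sketch of that computation; the weights, features, and intercept below are made-up values for illustration, not taken from the model above.

```java
// Sketch of the raw score predict() returns after clearThreshold():
// sigmoid(weights . features + intercept). All values here are hypothetical.
public class RawScoreSketch {

    // Logistic (sigmoid) function: maps any real z into (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Raw score = sigmoid of the dot product plus intercept.
    static double rawScore(double[] weights, double[] features, double intercept) {
        double z = intercept;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return sigmoid(z);
    }

    public static void main(String[] args) {
        double[] weights  = {0.5, -0.25};  // hypothetical model.weights()
        double[] features = {2.0, 4.0};    // hypothetical input vector
        double intercept  = 0.1;           // hypothetical model.intercept()

        double p = rawScore(weights, features, intercept);
        System.out.println("probability of class 1: " + p);
    }
}
```

Thresholding this score at 0.5 reproduces the default label that predict returns before clearThreshold() is called.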
