Get prediction percentage in WEKA using own Java code and a model


Question

    Overview

    I know that one can get the percentages of each prediction in a trained WEKA model through the GUI and command line options as conveniently explained and demonstrated in the documentation article "Making predictions".

    Predictions

    I know of three documented ways to get these predictions, plus a fourth that I am asking about:

    1. command line
    2. GUI
    3. Java code/using the WEKA API, which I was able to do in the answer to "Get risk predictions in WEKA using own Java code"
    4. a fourth way, which requires a previously generated WEKA .MODEL file

    I have a trained .MODEL file and now I want to classify new instances using this together with the prediction percentages similar to the one below (an output of the GUI's Explorer, in CSV format):

    inst#,actual,predicted,error,distribution,
    1,1:0,2:1,+,0.399409,*0.7811
    2,1:0,2:1,+,0.3932409,*0.8191
    3,1:0,2:1,+,0.399409,*0.600591
    4,1:0,2:1,+,0.139409,*0.64
    5,1:0,2:1,+,0.399409,*0.600593
    6,1:0,2:1,+,0.3993209,*0.600594
    7,1:0,2:1,+,0.500129,*0.600594
    8,1:0,2:1,+,0.399409,*0.90011
    9,1:0,2:1,+,0.211409,*0.60182
    10,1:0,2:1,+,0.21909,*0.11101
    

    The predicted column is what I want to get from a .MODEL file.
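To make the target format concrete, here is a minimal sketch in plain Java (no WEKA dependency) that assembles one row in the same layout as the sample above. The class `PredictionRow` and its `format` method are hypothetical helpers, not part of the WEKA API, and the layout is inferred only from the sample rows: 1-based `index:label` pairs, a `+` in the error column when the prediction is wrong, and a `*` in front of the predicted class's probability.

```java
/** Hypothetical helper (not part of WEKA): builds one Explorer-style CSV prediction row. */
final class PredictionRow {

    private PredictionRow() {}

    /**
     * @param instNumber   1-based instance number
     * @param classLabels  class labels in attribute order, e.g. {"0", "1"}
     * @param actual       0-based index of the actual class
     * @param predicted    0-based index of the predicted class
     * @param distribution per-class probabilities, in the same order as classLabels
     */
    static String format(int instNumber, String[] classLabels, int actual,
                         int predicted, double[] distribution) {
        StringBuilder row = new StringBuilder();
        row.append(instNumber).append(',');
        // actual and predicted are shown as "1-based index:label"
        row.append(actual + 1).append(':').append(classLabels[actual]).append(',');
        row.append(predicted + 1).append(':').append(classLabels[predicted]).append(',');
        // the error column holds "+" only for a misclassified instance
        row.append(actual == predicted ? "" : "+").append(',');
        for (int i = 0; i < distribution.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            if (i == predicted) {
                row.append('*'); // mark the predicted class
            }
            row.append(distribution[i]);
        }
        return row.toString();
    }

    public static void main(String[] args) {
        System.out.println("inst#,actual,predicted,error,distribution,");
        System.out.println(format(3, new String[] {"0", "1"}, 0, 1,
                new double[] {0.399409, 0.600591}));
    }
}
```

With a real model, `predicted` and `distribution` would come from the classifier rather than being hard-coded as they are in `main`.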


    What I know

    Based on my experience with the WEKA API approach, one can get these predictions using the following code (a PlainText output passed to an Evaluation object), BUT I do not want to do the k-fold cross-validation that the Evaluation object performs.

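    // imports needed by this snippet:
    //   weka.classifiers.Evaluation
    //   weka.classifiers.evaluation.output.prediction.PlainText
    //   weka.core.Range
    // data, j48Model, numberOfFolds and randomNumber are defined elsewhere

    StringBuffer predictionSB = new StringBuffer();
    Range attributesToShow = null;
    Boolean outputDistributions = Boolean.TRUE;

    // collect the per-instance predictions, including class distributions
    PlainText predictionOutput = new PlainText();
    predictionOutput.setBuffer(predictionSB);
    predictionOutput.setOutputDistribution(true);

    Evaluation evaluation = new Evaluation(data);
    evaluation.crossValidateModel(j48Model, data, numberOfFolds,
            randomNumber, predictionOutput, attributesToShow,
            outputDistributions);

    System.out.println(predictionOutput.getBuffer());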
    

    From the WEKA documentation

    How a .MODEL file classifies data from an .ARFF or related input is discussed in "Use Weka in your Java code" and "Serialization", a.k.a. "How to use a .MODEL file in your own Java code to classify new instances" (why the vague title, smfh).

    Using own Java code to classify

    Loading a .MODEL file is through "Deserialization" and the following is for versions > 3.5.5:

    // deserialize model
    Classifier cls = (Classifier) weka.core.SerializationHelper.read("/some/where/j48.model");
    

    The data is an Instance object, which is passed to the classifyInstance method. The returned value depends on the data type of the outcome attribute:

    // classify an Instance object (testData)
    cls.classifyInstance(testData.instance(0));
    

    The question "How to reuse saved classifier created from explorer(in weka) in eclipse java" has a great answer too!

    Javadocs

    I have already checked the Javadocs for Classifier (the trained model) and Evaluation (just in case) but none directly and explicitly addresses this issue.

    The closest thing to what I want is the classifyInstance method of the Classifier:

    Classifies the given test instance. The instance has to belong to a dataset when it's being classified. Note that a classifier MUST implement either this or distributionForInstance().
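The Javadoc excerpt above suggests the two methods are closely related. As a rough illustration, and assuming the common case where the predicted class is simply the one with the highest probability, the class index can be derived from a distribution array as sketched below. `DistributionArgMax` is a hypothetical name, not a WEKA class, and individual classifiers may handle ties or all-zero distributions differently.

```java
/** Hypothetical helper (not part of WEKA): picks the most probable class from a distribution. */
final class DistributionArgMax {

    private DistributionArgMax() {}

    /** Returns the 0-based index of the largest value; the first index wins on ties. */
    static int mostLikelyClass(double[] distribution) {
        if (distribution.length == 0) {
            throw new IllegalArgumentException("distribution must not be empty");
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // values taken from row 3 of the sample output in the question
        double[] distribution = {0.399409, 0.600591};
        System.out.println("Most likely class index: " + mostLikelyClass(distribution));
    }
}
```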


    How can I simultaneously use a WEKA .MODEL file to classify and get predictions of a new instance using my own Java code (aka using the WEKA API)?

    Answer

    This answer simply updates my answer to "How to reuse saved classifier created from explorer(in weka) in eclipse java".

    I will show how to obtain the predicted instance value and the prediction percentage (or distribution). The example model is a J48 decision tree created and saved in the Weka Explorer. It was built from the nominal weather data provided with Weka. It is called "tree.model".

    import weka.classifiers.Classifier;
    import weka.core.Instances;
    import weka.core.SerializationHelper;
    import weka.core.converters.ConverterUtils.DataSource;

    public class Main {

        public static void main(String[] args) throws Exception
        {
            String rootPath = "/some/where/";

            // load the saved model
            Classifier cls = (Classifier) SerializationHelper.read(rootPath + "tree.model");

            // load or create the Instances to predict; the ARFF file name is an
            // assumption, substitute your own data source
            Instances originalTrain = DataSource.read(rootPath + "weather.nominal.arff");
            // the class attribute must be set; assumed here to be the last attribute
            originalTrain.setClassIndex(originalTrain.numAttributes() - 1);

            // which instance to predict the class value for
            int s1 = 0;

            // perform the prediction (returns the index of the predicted class)
            double value = cls.classifyInstance(originalTrain.instance(s1));

            // get the prediction percentage or distribution
            double[] percentage = cls.distributionForInstance(originalTrain.instance(s1));

            // get the name of the class value
            String prediction = originalTrain.classAttribute().value((int) value);

            System.out.println("The predicted value of instance " + s1 + ": " + prediction);

            // format the distribution, marking the predicted class with '*'
            String distribution = "";
            for (int i = 0; i < percentage.length; i++)
            {
                if (i == (int) value)
                {
                    distribution = distribution + "*" + Double.toString(percentage[i]) + ",";
                }
                else
                {
                    distribution = distribution + Double.toString(percentage[i]) + ",";
                }
            }
            // drop the trailing comma
            distribution = distribution.substring(0, distribution.length() - 1);

            System.out.println("Distribution:" + distribution);
        }

    }

    The output from this is:

    The predicted value of instance 0: no  
    Distribution: *1, 0
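The distribution values are probabilities between 0 and 1. If literal percentages are wanted, a small conversion step can be added. The sketch below is plain Java with a hypothetical class name (`DistributionPercent`, not part of WEKA), and the one-decimal format is an arbitrary choice.

```java
import java.util.Locale;

/** Hypothetical helper (not part of WEKA): renders a class distribution as percentages. */
final class DistributionPercent {

    private DistributionPercent() {}

    /** Converts each probability in [0, 1] to a string such as "60.1%". */
    static String[] toPercent(double[] distribution) {
        String[] result = new String[distribution.length];
        for (int i = 0; i < distribution.length; i++) {
            // Locale.ROOT keeps the decimal separator a dot regardless of system locale
            result[i] = String.format(Locale.ROOT, "%.1f%%", distribution[i] * 100.0);
        }
        return result;
    }

    public static void main(String[] args) {
        // the distribution reported in the output above
        for (String p : toPercent(new double[] {1.0, 0.0})) {
            System.out.println(p);
        }
    }
}
```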
    
