Get risk predictions in WEKA using own Java code


Question

I have already checked WEKA's "Making predictions" documentation, and it contains explicit instructions for command-line and GUI predictions.

I want to know how to obtain, in my own Java code, prediction values like the ones below, which I got from the GUI using the Agrawal dataset (weka.datagenerators.classifiers.classification.Agrawal):

inst#,  actual,     predicted,  error,  prediction
1,      1:0,        2:1,        +,      0.941
2,      1:0,        1:0,        ,       1
3,      1:0,        1:0,        ,       1
4,      1:0,        1:0,        ,       1
5,      1:0,        1:0,        ,       1
6,      1:0,        1:0,        ,       1
7,      1:0,        2:1,        +,      0.941
8,      2:1,        2:1,        ,       0.941
9,      2:1,        2:1,        ,       0.941
10,     2:1,        2:1,        ,       0.941
1,      1:0,        1:0,        ,       1
2,      1:0,        1:0,        ,       1
3,      1:0,        1:0,        ,       1


It even says:

Java

If you want to perform the classification within your own code, see the classifying instances section of this article, explaining the Weka API in general.

I followed the link, and it says:

Classifying instances

In case you have an unlabeled dataset that you want to classify with your newly trained classifier, you can use the following code snippet. It loads the file /some/where/unlabeled.arff, uses the previously built classifier tree to label the instances, and saves the labeled data as /some/where/labeled.arff.

This is not my case, because I only want the k-fold cross-validation predictions of the model on my current dataset.

predictions

public FastVector predictions()

Returns the predictions that have been collected.

Returns:

a reference to the FastVector containing the predictions that have been collected. This should be null if no predictions have been collected.

I found the predictions() method for objects of type Evaluation and used the following code:

Object[] preds = evaluation.predictions().toArray();
for(Object pred : preds) {
    System.out.println(pred);
}

The results were:

...
NOM: 0.0 0.0 1.0 0.9466666666666667 0.05333333333333334
NOM: 0.0 0.0 1.0 0.8947368421052632 0.10526315789473684
NOM: 0.0 0.0 1.0 0.9934883720930232 0.0065116279069767444
NOM: 0.0 0.0 1.0 0.9466666666666667 0.05333333333333334
NOM: 0.0 0.0 1.0 0.9912575655682583 0.008742434431741762
NOM: 0.0 0.0 1.0 0.9934883720930232 0.0065116279069767444
...

Is this the same thing as the GUI output above?

Recommended answer

After deep Google searches (and because the documentation provides minimal help), I finally found the answer.

I hope this explicit answer helps others in the future.

  • For sample code, I looked at the question "How to print out the predicted class after cross-validation in WEKA", and I'm glad I was able to decode its incomplete answer, parts of which are hard to understand.

Here is my code, which produces output similar to the GUI's:

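import weka.classifiers.Evaluation;
import weka.classifiers.evaluation.output.prediction.PlainText;
import weka.core.Range;

// Buffer that will receive the GUI-style prediction table.
StringBuffer predictionSB = new StringBuffer();
Range attributesToShow = null;               // no extra attribute columns
Boolean outputDistributions = Boolean.TRUE;  // also print the class distributions

// crossValidateModel expects an AbstractOutput (here PlainText), not a bare StringBuffer.
PlainText predictionOutput = new PlainText();
predictionOutput.setBuffer(predictionSB);
predictionOutput.setOutputDistribution(true);

// data, j48Model, numberOfFolds and randomNumber (a java.util.Random) are defined elsewhere.
Evaluation evaluation = new Evaluation(data);
evaluation.crossValidateModel(j48Model, data, numberOfFolds,
        randomNumber, predictionOutput, attributesToShow,
        outputDistributions);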

To help you understand: the StringBuffer has to be wrapped in an AbstractOutput object (here a PlainText) so that crossValidateModel can recognize it.

Passing only a StringBuffer causes a java.lang.ClassCastException similar to the one in the question, while using a PlainText without a StringBuffer throws a java.lang.IllegalStateException.
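The two failure modes can be reproduced without WEKA. The sketch below is not WEKA code: BufferOutput and evaluate are hypothetical stand-ins, written on the assumption that crossValidateModel receives its extra arguments as Object varargs and casts the first one to its output type at run time. Under that assumption, the compiler accepts a bare StringBuffer and the mistake only surfaces when the cast executes.

```java
/** Stand-in types only; none of this is WEKA code. */
final class VarargsCastDemo {

    /** Hypothetical stand-in for an output handler that wraps a buffer. */
    static final class BufferOutput {
        private StringBuffer buffer;

        void setBuffer(StringBuffer buffer) { this.buffer = buffer; }

        StringBuffer getBuffer() { return buffer; }
    }

    /** Hypothetical method: accepts anything at compile time, casts at run time. */
    static String evaluate(Object... forPredictionsPrinting) {
        // Throws ClassCastException when the first argument is not a BufferOutput.
        BufferOutput output = (BufferOutput) forPredictionsPrinting[0];
        if (output.getBuffer() == null) {
            throw new IllegalStateException("Buffer is not set");
        }
        output.getBuffer().append("inst#, actual, predicted, error, prediction\n");
        return output.getBuffer().toString();
    }

    public static void main(String[] args) {
        BufferOutput output = new BufferOutput();
        output.setBuffer(new StringBuffer());
        System.out.print(evaluate(output)); // wrapper with a buffer: works

        try {
            evaluate(new StringBuffer()); // compiles, but fails at run time
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for a bare StringBuffer");
        }

        try {
            evaluate(new BufferOutput()); // wrapper without a buffer
        } catch (IllegalStateException e) {
            System.out.println("IllegalStateException: " + e.getMessage());
        }
    }
}
```

The real classes may check their state differently; the point is only that an Object varargs parameter defers type errors from compile time to run time.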

I would like to thank ManChon U (Kevin) and their question "How to identify the cross-evaluation result to its corresponding instance in the input data set?" for giving me a clue about what this meant:

... you just need a single additional argument that is a concrete subclass of weka.classifiers.evaluation.output.prediction.AbstractOutput. weka.classifiers.evaluation.output.prediction.PlainText is probably the one you want to use. Source

... Try creating a PlainText object, which extends AbstractOutput (called output for example) instance and calling output.setBuffer(forPredictionsPrinting) and passing that in instead of the buffer. Source

What these actually mean is: create a PlainText object, put a StringBuffer in it, and use it to tweak the output with methods such as setOutputDistribution(boolean).

Finally, to get the desired predictions, just use:

System.out.println(predictionOutput.getBuffer());

Here, predictionOutput is the PlainText object created above.

Additionally, the results of evaluation.predictions() differ from the ones shown in the WEKA GUI. Fortunately, Mark Hall explained this in the question "Print out the predict class after cross-validation":

Evaluation.predictions() returns a FastVector containing either NominalPrediction or NumericPrediction objects from the weka.classifiers.evaluation package. Calling Evaluation.crossValidateModel() with the additional AbstractOutput object results in the evaluation object printing the prediction/distribution information from Nominal/NumericPrediction objects to the StringBuffer in the format that you see in the Explorer or from the command line.
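To make the relationship between the two outputs concrete, here is a minimal sketch in plain Java (no WEKA dependency) that turns one of the printed "NOM:" lines from the question into a GUI-style row. It rests on assumptions: the fields after "NOM:" are read as actual class index, predicted class index, instance weight, and then one probability per class, and the GUI's "prediction" column is taken to be the probability of the predicted class. The class NomLineFormatter and its method are hypothetical names introduced only for this illustration.

```java
import java.util.Locale;

/** Sketch: convert a printed nominal prediction into a GUI-style row. */
final class NomLineFormatter {

    /**
     * Assumed field order after "NOM:": actual index, predicted index,
     * weight, then one probability per class label.
     */
    static String toGuiRow(int instanceNumber, String nomLine, String[] classLabels) {
        String[] tokens = nomLine.trim().split("\\s+");
        if (tokens.length != 4 + classLabels.length || !tokens[0].equals("NOM:")) {
            throw new IllegalArgumentException("Unexpected line: " + nomLine);
        }
        int actual = (int) Double.parseDouble(tokens[1]);
        int predicted = (int) Double.parseDouble(tokens[2]);
        // tokens[3] is the instance weight, which the GUI table does not show.
        double[] distribution = new double[classLabels.length];
        for (int i = 0; i < distribution.length; i++) {
            distribution[i] = Double.parseDouble(tokens[4 + i]);
        }
        String error = (actual == predicted) ? "" : "+";
        // GUI cells use the 1-based class position followed by the label, e.g. "2:1".
        return String.format(Locale.ROOT, "%d,%s,%s,%s,%.3f",
                instanceNumber,
                (actual + 1) + ":" + classLabels[actual],
                (predicted + 1) + ":" + classLabels[predicted],
                error,
                distribution[predicted]);
    }

    public static void main(String[] args) {
        String[] labels = {"0", "1"}; // class labels as shown in the question's GUI output
        System.out.println("inst#,actual,predicted,error,prediction");
        System.out.println(toGuiRow(1,
                "NOM: 0.0 0.0 1.0 0.9466666666666667 0.05333333333333334", labels));
    }
}
```

Read this way, the line "NOM: 0.0 0.0 1.0 0.9466666666666667 0.05333333333333334" corresponds to a correctly classified instance of class 0 with a prediction value of about 0.947. The sketch always prints three decimals, whereas the GUI drops trailing zeros.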
