Why does the WEKA Evaluation class need training instances?


Question

I do not understand why the Weka Evaluation class constructor needs the training instances to work.

Can anyone explain this to me?

In theory, the evaluation depends only on the trained model (cls in the code below) and the test data (TestingSet).

Thanks!

Here is an example:

// TrainingSet is the training Instances
// TestingSet is the testing Instances

// Build the classifier
Classifier cls = new NaiveBayes();
cls.buildClassifier(TrainingSet);

// Test the model
Evaluation eTest = new Evaluation(TrainingSet);
eTest.evaluateModel(cls, TestingSet);

Recommended answer

It is used to map the results.


Most algorithms work on numeric data, so every non-numeric value of a feature has to be converted into a numeric form. This mapping has to be consistent: every occurrence of a given non-numeric value is mapped to the same number.
During training, the data pre-processor sees the data for the very first time. While converting the non-numeric data, it uses maps to remember the mapping.

For example, if the possible values of a feature are {yes, no, maybe}, they could be mapped like this:
{"yes":1, "no":2, "maybe":3}

An input feature/column that looked like [yes,yes,no,yes,maybe,yes] would then be converted into the internal form [1,1,2,1,3,1]. These numeric values are what the algorithms use.
In Weka this information is stored in the training Instances. So when the evaluator predicts a numeric value for a feature, it needs to convert that number back to its actual value.
That is, if the algorithm outputs 2, it needs the map to figure out that 2 corresponds to 'no'. To do this, it needs the mapping that was created before training, which is why it asks for the training Instances.
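
The mapping idea described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class LabelMapping and its methods buildMapping, encode and decode are hypothetical names, not part of Weka's API, and the numbering starts at 1 only to match the example in this answer (Weka itself keeps nominal values in the attribute definitions of an Instances object and indexes them from 0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of Weka: shows a label <-> number mapping
// that is learned from training data and reused when decoding predictions.
class LabelMapping {

    // Give each distinct label the next free number (starting at 1),
    // in order of first appearance in the training column.
    public static Map<String, Integer> buildMapping(List<String> trainingValues) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (String value : trainingValues) {
            if (!mapping.containsKey(value)) {
                mapping.put(value, mapping.size() + 1);
            }
        }
        return mapping;
    }

    // Convert labels to their numeric codes using an existing mapping.
    public static List<Integer> encode(List<String> values, Map<String, Integer> mapping) {
        List<Integer> encoded = new ArrayList<>();
        for (String value : values) {
            Integer code = mapping.get(value);
            if (code == null) {
                throw new IllegalArgumentException("Unseen label: " + value);
            }
            encoded.add(code);
        }
        return encoded;
    }

    // Convert a numeric prediction back to its label; impossible without the mapping.
    public static String decode(int code, Map<String, Integer> mapping) {
        for (Map.Entry<String, Integer> entry : mapping.entrySet()) {
            if (entry.getValue() == code) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("Unknown code: " + code);
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("yes", "yes", "no", "yes", "maybe", "yes");
        Map<String, Integer> mapping = buildMapping(column);
        System.out.println(mapping);                 // {yes=1, no=2, maybe=3}
        System.out.println(encode(column, mapping)); // [1, 1, 2, 1, 3, 1]
        System.out.println(decode(2, mapping));      // no
    }
}
```

The point of the sketch is that decode cannot work on a bare number: whoever interprets the prediction needs the same mapping that was built from the training data.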

Note: AFAIK the same logic applies in all ML frameworks, such as Weka, dl4j, etc.
