How to test on a test set using RapidMiner?


Problem description

I'm using RapidMiner for an analysis. I ran cross-validation on several models to find the best-performing one. Now I want to estimate this model's performance on a separate test set that I created with Split Data.

How do I use the test set? As far as I can tell, all the validation modules evaluate on the training set that the model was built on. Which performance measure can I use that takes a model and my test set as input?

Recommended answer

Use the "Apply Model" operator with your model as the first input and your test set as the second input. The operator returns a labelled data set: your input data plus some additional special attributes, e.g. the prediction and the confidence. The "Performance" operator needs these attributes to measure the performance of the model on your test set.
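To make the data flow concrete, here is a minimal conceptual sketch in plain Python. It is not RapidMiner code and does not use any RapidMiner API: `apply_model`, `accuracy`, `toy_model`, and the four-row test set are all hypothetical stand-ins invented for illustration. It only mirrors the idea described above, namely that applying a model adds a prediction attribute to each test example and that the performance step then compares that prediction with the true label.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def apply_model(model: Callable[[Example], str],
                examples: List[Example]) -> List[Example]:
    """Return copies of the examples with an added 'prediction' attribute."""
    return [{**ex, "prediction": model(ex)} for ex in examples]


def accuracy(labelled: List[Example], label: str = "Play") -> float:
    """Fraction of examples whose prediction equals the true label."""
    if not labelled:
        raise ValueError("cannot compute accuracy on an empty data set")
    correct = sum(1 for ex in labelled if ex["prediction"] == ex[label])
    return correct / len(labelled)


def toy_model(ex: Example) -> str:
    """Hypothetical stand-in for a model trained elsewhere."""
    return "no" if ex["Outlook"] == "rain" else "yes"


# Made-up held-out examples; the model never saw these during training.
test_set = [
    {"Outlook": "sunny", "Play": "yes"},
    {"Outlook": "rain", "Play": "no"},
    {"Outlook": "overcast", "Play": "yes"},
    {"Outlook": "rain", "Play": "yes"},
]

labelled = apply_model(toy_model, test_set)
print(accuracy(labelled))  # 3 of 4 predictions match the label: 0.75
```

The point of the sketch is that the performance figure is computed only from the held-out examples, so it is not biased by the data the model was trained on.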

Here is a small example that uses a training set and a test set from the "Samples" repository.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.007">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.007" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.007" expanded="true" height="60" name="Golf" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="decision_tree" compatibility="5.3.007" expanded="true" height="76" name="Decision Tree" width="90" x="179" y="30"/>
      <operator activated="true" class="retrieve" compatibility="5.3.007" expanded="true" height="60" name="Golf-Testset" width="90" x="179" y="120">
        <parameter key="repository_entry" value="//Samples/data/Golf-Testset"/>
      </operator>
      <operator activated="true" breakpoints="before,after" class="apply_model" compatibility="5.3.007" expanded="true" height="76" name="Apply Model" width="90" x="313" y="30">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="performance" compatibility="5.3.007" expanded="true" height="76" name="Performance" width="90" x="447" y="30"/>
      <connect from_op="Golf" from_port="output" to_op="Decision Tree" to_port="training set"/>
      <connect from_op="Decision Tree" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Golf-Testset" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
