Weka output predictions


Question

I've used the Weka GUI to train a classifier and test it on a file (making predictions), but I can't do the same with the API. The error I get says that the training and test files have a different number of attributes. In the GUI, this can be solved by checking "Output predictions".

How can I do something similar using the API? Do you know of any samples out there?

import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.NominalToBinary;
import weka.filters.unsupervised.attribute.Remove;

public class WekaTutorial
{

  public static void main(String[] args) throws Exception
  {
    DataSource trainSource = new DataSource("/tmp/classes - edited.arff"); // training
    Instances trainData = trainSource.getDataSet();

    DataSource testSource = new DataSource("/tmp/classes_testing.arff");
    Instances testData = testSource.getDataSet();

    if (trainData.classIndex() == -1)
    {
      trainData.setClassIndex(trainData.numAttributes() - 1);
    }

    if (testData.classIndex() == -1)
    {
      testData.setClassIndex(testData.numAttributes() - 1);
    }    

    String[] options = weka.core.Utils.splitOptions("weka.filters.unsupervised.attribute.StringToWordVector -R first-last -W 1000 -prune-rate -1.0 -N 0 -stemmer weka.core.stemmers.NullStemmer -M 1 "
            + "-tokenizer \"weka.core.tokenizers.WordTokenizer -delimiters \" \\r\\n\\t.,;:\\\'\\\"()?!\"");

    Remove remove = new Remove();
    remove.setOptions(options);
    remove.setInputFormat(trainData);

    NominalToBinary filter = new NominalToBinary(); 

    NaiveBayes nb = new NaiveBayes();

    FilteredClassifier fc = new FilteredClassifier();
    fc.setFilter(filter);
    fc.setClassifier(nb);
    // train and make predictions
    fc.buildClassifier(trainData);

    for (int i = 0; i < testData.numInstances(); i++)
    {
      double pred = fc.classifyInstance(testData.instance(i));
      System.out.print("ID: " + testData.instance(i).value(0));
      System.out.print(", actual: " + testData.classAttribute().value((int) testData.instance(i).classValue()));
      System.out.println(", predicted: " + testData.classAttribute().value((int) pred));
    }

  }

}

Error:
Exception in thread "main" java.lang.IllegalArgumentException: Src and Dest differ in # of attributes: 2 != 17152

This was not an issue in the GUI.

Recommended answer

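You need to ensure that the categories (the attribute declarations) in the training and test sets are compatible. Try the following:

  • Combine the training and test sets
  • Preprocess them
  • Save them as ARFF
  • Open two empty files
  • Copy the header, from the top of the file down to the "@data" line, into both files
  • Copy the training set into the first file and the test set into the second file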
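The last three steps can also be scripted instead of done by hand. The following is a minimal sketch using only the Java standard library, not Weka itself. The class name `ArffSplitSketch`, the method `split`, and the parameter `trainRows` are hypothetical names introduced for illustration. It assumes the combined, preprocessed ARFF file lists all training rows before the test rows, with one instance per line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits one combined ARFF file into a training file
// and a test file that share an identical header.
class ArffSplitSketch {

    /**
     * Returns two line lists, {train, test}. Each starts with the header
     * (every line up to and including "@data"), followed by its data rows.
     * Assumes the first trainRows data rows are the training instances.
     */
    static List<List<String>> split(List<String> combinedLines, int trainRows) {
        List<String> header = new ArrayList<>();
        List<String> rows = new ArrayList<>();
        boolean inData = false;

        for (String line : combinedLines) {
            String trimmed = line.trim();
            if (!inData) {
                header.add(line);
                if (trimmed.equalsIgnoreCase("@data")) {
                    inData = true;
                }
            } else if (!trimmed.isEmpty() && !trimmed.startsWith("%")) {
                // Skip blank lines and ARFF comments inside the data section.
                rows.add(line);
            }
        }

        if (!inData) {
            throw new IllegalArgumentException("No @data line found");
        }
        if (trainRows < 0 || trainRows > rows.size()) {
            throw new IllegalArgumentException("trainRows out of range: " + trainRows);
        }

        List<String> train = new ArrayList<>(header);
        train.addAll(rows.subList(0, trainRows));
        List<String> test = new ArrayList<>(header);
        test.addAll(rows.subList(trainRows, rows.size()));
        return Arrays.asList(train, test);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: ArffSplitSketch <combined.arff> <trainRows> <train.arff> <test.arff>");
            return;
        }
        List<String> combined = Files.readAllLines(Paths.get(args[0]));
        List<List<String>> parts = split(combined, Integer.parseInt(args[1]));
        Files.write(Paths.get(args[2]), parts.get(0));
        Files.write(Paths.get(args[3]), parts.get(1));
    }
}
```

Because both output files are written with the same header, loading them through `DataSource` should give two `Instances` objects with the same number of attributes, which is the condition the exception in the question complains about.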
