Java, Weka: How to predict a numeric attribute?

Question

I was trying to use the NaiveBayesUpdateable classifier from Weka. My data contains both nominal and numeric attributes:

  @relation cars
  @attribute country {FR, UK, ...}
  @attribute city {London, Paris, ...}
  @attribute car_make {Toyota, BMW, ...}
  @attribute price numeric   %% car price 
  @attribute sales numeric   %% number of cars sold

I need to predict the number of sales (numeric!) based on other attributes.

I understand that I cannot use a numeric attribute as the class for Bayes classification in Weka. One technique is to split the values of the numeric attribute into N intervals of length k and use a nominal attribute instead, where each interval number is a class name, like this: @attribute class {1,2,3,...N}.

Yet the numeric attribute that I need to predict ranges from 0 to 1,000,000. Creating 1,000,000 classes makes no sense at all. How can I predict a numeric attribute with Weka, or what algorithms should I look for in case Weka has no tools for this task?

Recommended answer

What you want to do is regression, not classification. The difference is exactly what you describe/want:


  • Classification has discrete classes/labels; any nominal attribute could be used as the class here.
  • Regression has continuous labels; "classes" would be the wrong term here.

Most regression-based techniques can be turned into a binary classifier by defining a threshold: the class is determined by whether the predicted value is above or below this threshold.
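
As a rough illustration of that thresholding idea, here is a minimal plain-Java sketch. It does not use Weka; the class name, the 500,000 cut-off, and the predicted values are all made up for the example, and the predictions stand in for the output of whatever regression model you train.

```java
final class ThresholdDemo {
    /**
     * Turns a regression output into a binary label.
     * Returns true when the predicted value is strictly above the threshold.
     */
    static boolean aboveThreshold(double predictedValue, double threshold) {
        return predictedValue > threshold;
    }

    public static void main(String[] args) {
        double threshold = 500_000.0;                      // assumed cut-off for "high sales"
        double[] predictedSales = {120_000.0, 730_500.0};  // made-up regression outputs
        for (double p : predictedSales) {
            String label = aboveThreshold(p, threshold) ? "high" : "low";
            System.out.println(p + " -> " + label);
        }
    }
}
```

Whether a value exactly equal to the threshold counts as "above" is a design choice; this sketch treats it as not above.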

I don't know all of WEKA's classifiers that offer regression, but you can start by looking at these two:

  • MultilayerPerceptron: Basically a neural network.
  • LinearRegression: As the name says, linear regression.
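
To show what fitting a linear regression amounts to, here is a minimal sketch of ordinary least squares with a single numeric predictor, written in plain Java. It is not Weka's LinearRegression and makes no claim about how that class is implemented; the class name, method names, and the price/sales numbers are invented for illustration.

```java
final class SimpleLeastSquares {
    /** Returns {intercept, slope} minimizing squared error for y ~ intercept + slope * x. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs of equal length");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0; // co-variation of x and y, and variation of x
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new double[] {meanY - slope * meanX, slope};
    }

    /** Predicts a continuous value from fitted coefficients. */
    static double predict(double[] coefficients, double x) {
        return coefficients[0] + coefficients[1] * x;
    }

    public static void main(String[] args) {
        double[] price = {10_000, 20_000, 30_000}; // made-up car prices
        double[] sales = {900, 700, 500};          // made-up sales counts
        double[] coef = fit(price, sales);
        System.out.println("predicted sales at price 25000: " + predict(coef, 25_000));
    }
}
```

The output is a continuous number rather than a class label, which is the point of using regression for the sales attribute.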

You might have to use the NominalToBinary filter to convert your nominal attributes to numerical (binary) ones.
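
The general idea behind such a conversion is indicator encoding: each nominal value becomes its own 0/1 column. The plain-Java sketch below illustrates only that idea. It is hand-rolled, does not call Weka, and is not a description of which columns the NominalToBinary filter actually emits; the class and method names are hypothetical, and only the two car_make values visible in the question are used because the original list is truncated.

```java
import java.util.Arrays;
import java.util.List;

final class NominalToBinarySketch {
    /**
     * Encodes one nominal value as 0/1 indicators, one per allowed value.
     * Exactly one position is 1.0; all others are 0.0.
     */
    static double[] encode(List<String> allowedValues, String value) {
        int index = allowedValues.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown nominal value: " + value);
        }
        double[] indicators = new double[allowedValues.size()];
        indicators[index] = 1.0;
        return indicators;
    }

    public static void main(String[] args) {
        // Only the values shown in the question; the full list is elided there.
        List<String> carMakes = Arrays.asList("Toyota", "BMW");
        System.out.println(Arrays.toString(encode(carMakes, "BMW")));
    }
}
```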
