Different results of K-means algorithm in Weka

Question

If I use any of the classification algorithms in Weka, I get results in the following format:

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         302               63.3124 %
Incorrectly Classified Instances       175               36.6876 %
Kappa statistic                          0.3536
Mean absolute error                      0.3464
Root mean squared error                  0.4176
Relative absolute error                 85.5832 %
Root relative squared error             92.8684 %
Total Number of Instances              477     

=== Detailed Accuracy By Class ===

           TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
             0.801     0.407      0.686     0.801     0.739      0.659    1
             0.748     0.243      0.549     0.748     0.633      0.718    2
             0         0          0         0         0          0.478    3
Weighted Avg.    0.633     0.283      0.516     0.633     0.568      0.641

=== Confusion Matrix ===

     a   b   c   <-- classified as
   201  50   0 |   a = 1
    34 101   0 |   b = 2
    58  33   0 |   c = 3

But if I use k-means, my results are in the following format:

=== Model and evaluation on training set ===


kMeans
======

Number of iterations: 9
Within cluster sum of squared errors: 297.46622082142716
Missing values globally replaced with mean/mode

Cluster centroids:
                            Cluster#
Attribute        Full Data         0         1         2
                     (477)     (136)     (172)     (169)
========================================================
Religion            8.6939    7.6691    8.9709    9.2367
Vote_Criterion      2.7736    2.8971    2.4942    2.9586
Sex                 1.4906    1.4559         2         1
DateBirth        1930.7652 1937.5147 1920.2965 1935.9882
Educ                3.2201    3.2721    3.2209    3.1775
Immigrant           1.6415    1.6838    1.5872    1.6627
Income              2.4675       2.5    2.5523     2.355
Occupation          3.6184    3.8162    3.2907    3.7929
Vote2013                 1         2         1         1




Time taken to build model (full training data) : 0.06 seconds

=== Model and evaluation on training set ===

    Clustered Instances

    0       136 ( 29%)
    1      172 ( 36%)
    2      169 ( 35%)

But I want to see the correctly classified instances, the precision, the recall, etc., as the other algorithms show them. Why is this happening, and how can I make Weka show the k-means results in the first format?

Recommended answer

K-Means is by itself a clustering algorithm:

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

So it has no notion of a "class" and is therefore not normally used for classification (it could be made to, of course, but the performance might not be very good). That is why Weka reports centroids and cluster sizes instead of accuracy, precision, and recall. Are you sure you are using it correctly here?

Also, see here (bold is mine):

You could use the meta-classifier ClassificationViaClustering in order to use the clusterers in a supervised environment.
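To make the idea of using a clusterer in a supervised setting concrete, here is a minimal plain-Java sketch. It assumes one simple mapping rule (label each cluster with the majority class of its members) and then derives accuracy, precision, and recall from the resulting confusion matrix. This is an illustration only and not necessarily how ClassificationViaClustering maps clusters to classes internally. The class name, method names, and toy data are all made up for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: turn cluster assignments into class predictions.
class ClusterToClassSketch {

    // For each cluster, pick the class that occurs most often among its members.
    // Ties go to the lower class index.
    static int[] majorityClassPerCluster(int[] clusterOf, int[] classOf,
                                         int numClusters, int numClasses) {
        int[][] counts = new int[numClusters][numClasses];
        for (int i = 0; i < clusterOf.length; i++) {
            counts[clusterOf[i]][classOf[i]]++;
        }
        int[] mapping = new int[numClusters];
        for (int c = 0; c < numClusters; c++) {
            int best = 0;
            for (int k = 1; k < numClasses; k++) {
                if (counts[c][k] > counts[c][best]) {
                    best = k;
                }
            }
            mapping[c] = best;
        }
        return mapping;
    }

    // Confusion matrix indexed as m[actual][predicted], like Weka's layout.
    static int[][] confusionMatrix(int[] clusterOf, int[] classOf,
                                   int[] mapping, int numClasses) {
        int[][] m = new int[numClasses][numClasses];
        for (int i = 0; i < clusterOf.length; i++) {
            m[classOf[i]][mapping[clusterOf[i]]]++;
        }
        return m;
    }

    // Fraction of instances on the diagonal ("Correctly Classified Instances").
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int a = 0; a < m.length; a++) {
            for (int p = 0; p < m.length; p++) {
                total += m[a][p];
                if (a == p) {
                    correct += m[a][p];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // Precision for one class; 0 if nothing was predicted as that class.
    static double precision(int[][] m, int cls) {
        int predicted = 0;
        for (int a = 0; a < m.length; a++) {
            predicted += m[a][cls];
        }
        return predicted == 0 ? 0.0 : (double) m[cls][cls] / predicted;
    }

    // Recall (TP rate) for one class; 0 if the class has no instances.
    static double recall(int[][] m, int cls) {
        int actual = 0;
        for (int p = 0; p < m.length; p++) {
            actual += m[cls][p];
        }
        return actual == 0 ? 0.0 : (double) m[cls][cls] / actual;
    }

    public static void main(String[] args) {
        // Made-up toy data: cluster index and true class index per instance.
        int[] clusterOf = {0, 0, 0, 1, 1, 1, 2, 2};
        int[] classOf   = {0, 0, 1, 1, 1, 0, 2, 2};

        int[] mapping = majorityClassPerCluster(clusterOf, classOf, 3, 3);
        int[][] m = confusionMatrix(clusterOf, classOf, mapping, 3);

        System.out.println("cluster -> class: " + Arrays.toString(mapping));
        System.out.println("accuracy:  " + accuracy(m));
        System.out.println("precision(class 0): " + precision(m, 0));
        System.out.println("recall(class 0):    " + recall(m, 0));
    }
}
```

Once every cluster carries a class label, the clusterer behaves like a classifier, and the usual summary (correctly classified instances, per-class precision and recall, confusion matrix) can be computed. With k-means this is only meaningful if the number of clusters is chosen sensibly relative to the number of classes, and the class attribute must not be used as an input when building the clusters.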
