Implementation of k-means clustering algorithm


Problem description

In my program I'm using k = 2 for the k-means algorithm, i.e. I want only 2 clusters. I have implemented it in a very simple and straightforward way, but I still can't understand why my program gets into an infinite loop. Can anyone please point out where I'm making a mistake?

For simplicity, I have hard-coded the input in the program itself. Here is my code:

import java.io.*;
import java.lang.*;
class Kmean
{
public static void main(String args[])
{
int N=9;
int arr[]={2,4,10,12,3,20,30,11,25};    // initial data
int i,m1,m2,a,b,n=0;
boolean flag=true;
float sum1=0,sum2=0;
a=arr[0];b=arr[1];
m1=a; m2=b;
int cluster1[]=new int[9],cluster2[]=new int[9];
for(i=0;i<9;i++)
    System.out.print(arr[i]+ "\t");
System.out.println();

do
{
 n++;
 int k=0,j=0;
 for(i=0;i<9;i++)
 {
    if(Math.abs(arr[i]-m1)<=Math.abs(arr[i]-m2))
    {   cluster1[k]=arr[i];
        k++;
    }
    else
    {   cluster2[j]=arr[i];
        j++;
    }
 }
    System.out.println();
    for(i=0;i<9;i++)
        sum1=sum1+cluster1[i];
    for(i=0;i<9;i++)
        sum2=sum1+cluster2[i];
    a=m1;
    b=m2;
    m1=Math.round(sum1/k);
    m2=Math.round(sum2/j);
    if(m1==a && m2==b)
        flag=false;
    else
        flag=true;

    System.out.println("After iteration "+ n +" , cluster 1 :\n");    //printing the clusters of each iteration
    for(i=0;i<9;i++)
        System.out.print(cluster1[i]+ "\t");

    System.out.println("\n");
    System.out.println("After iteration "+ n +" , cluster 2 :\n");
    for(i=0;i<9;i++)
        System.out.print(cluster2[i]+ "\t");

}while(flag);

    System.out.println("Final cluster 1 :\n");            // final clusters
    for(i=0;i<9;i++)
        System.out.print(cluster1[i]+ "\t");

    System.out.println();
    System.out.println("Final cluster 2 :\n");
    for(i=0;i<9;i++)
        System.out.print(cluster2[i]+ "\t");
 }
}

Solution

You have a bunch of errors:

  1. At the start of your do loop you should reset sum1 and sum2 to 0.

  2. You should loop only up to k and j respectively when calculating sum1 and sum2 (or clear cluster1 and cluster2 at the start of your do loop). Otherwise stale values left over from earlier iterations are added to the sums.

  3. In the calculation of sum2 you accidentally use sum1.

When I make those fixes the code runs fine, yielding the output:

Final cluster 1 :   
2   4   10   12  3   11  0   0   0

Final cluster 2 :
20  30  25   0   0   0   0   0   0
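
The three fixes above can be sketched roughly as follows. This is a minimal illustration rather than the exact corrected program: the class name `KMeans1D`, the `cluster` method, the use of lists instead of fixed-size arrays, the empty-cluster guard and the `MAX_ITERATIONS` cap are all assumptions added for the example. Because lists are used, the output has no trailing zero padding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KMeans1D {
    // Safety cap (an assumption, not in the original code) so the loop always ends.
    static final int MAX_ITERATIONS = 100;

    /** Splits 1-D data into two clusters, starting from the means m1 and m2. */
    static List<List<Integer>> cluster(int[] data, int m1, int m2) {
        List<Integer> c1 = new ArrayList<>();
        List<Integer> c2 = new ArrayList<>();
        boolean changed;
        int iterations = 0;
        do {
            // Fix 2: start every pass with empty clusters, so no stale values remain.
            c1 = new ArrayList<>();
            c2 = new ArrayList<>();
            // Fix 1: the sums are reset on every pass.
            float sum1 = 0, sum2 = 0;
            for (int x : data) {
                if (Math.abs(x - m1) <= Math.abs(x - m2)) {
                    c1.add(x);
                    sum1 += x;
                } else {
                    c2.add(x);
                    sum2 += x; // Fix 3: sum2 accumulates into sum2, not sum1.
                }
            }
            int old1 = m1, old2 = m2;
            // Guard (an assumption): keep the old mean if a cluster ends up empty.
            if (!c1.isEmpty()) m1 = Math.round(sum1 / c1.size());
            if (!c2.isEmpty()) m2 = Math.round(sum2 / c2.size());
            changed = (m1 != old1) || (m2 != old2);
            iterations++;
        } while (changed && iterations < MAX_ITERATIONS);
        return Arrays.asList(c1, c2);
    }

    public static void main(String[] args) {
        int[] arr = {2, 4, 10, 12, 3, 20, 30, 11, 25};
        List<List<Integer>> result = cluster(arr, arr[0], arr[1]);
        System.out.println("Final cluster 1 : " + result.get(0));
        System.out.println("Final cluster 2 : " + result.get(1));
    }
}
```

With the data from the question and the first two elements as initial means, this sketch settles on the same grouping as the output above: 2, 4, 10, 12, 3, 11 in the first cluster and 20, 30, 25 in the second.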

My general advice: learn how to use a debugger. Stack Overflow is not meant for questions like this: you are expected to find your own bugs and only come here when everything else fails...
