Clustering of images to evaluate diversity (Weka?)


Problem description

For a university course I have some features of images (as text files). I have to rank those images according to their diversity.

The idea I have in mind is to feed the images to a k-means clusterer and then compute the Euclidean distance from each image in a cluster to that cluster's centroid. Then rotate between the clusters, always taking the (next) closest image to the centroid. I.e., return the image closest to centroid 1, then the one closest to centroid 2, then 3, ... then the second closest to centroids 1, 2, 3, and so on.
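To make the rotation concrete, here is a minimal sketch in plain Python. It assumes the clustering has already been done elsewhere (for example in Weka) and that you have exported the cluster label of each image and the centroids. The function name `rank_by_rotation` and the toy data are made up for illustration.

```python
import math
from collections import defaultdict
from itertools import zip_longest

def rank_by_rotation(features, labels, centroids):
    """Order image ids by rotating through the clusters and taking the
    next-closest image to each centroid in turn."""
    per_cluster = defaultdict(list)
    for image_id, vector in features.items():
        cluster = labels[image_id]
        distance = math.dist(vector, centroids[cluster])  # Euclidean distance
        per_cluster[cluster].append((distance, image_id))
    # One queue per cluster, sorted from closest to farthest from the centroid.
    queues = [sorted(per_cluster[c]) for c in sorted(per_cluster)]
    ranking = []
    # zip_longest yields the closest image of every cluster, then the second
    # closest, and so on; clusters that have run out yield None and are skipped.
    for round_ in zip_longest(*queues):
        ranking.extend(item[1] for item in round_ if item is not None)
    return ranking

# Hypothetical toy data: 2-D feature vectors, cluster labels, and centroids.
features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "e": (0.2, 0.0),
            "c": (10.0, 10.0), "d": (10.0, 12.0)}
labels = {"a": 0, "b": 0, "e": 0, "c": 1, "d": 1}
centroids = {0: (0.4, 0.0), 1: (10.0, 10.5)}

ranking = rank_by_rotation(features, labels, centroids)
# ranking is ["e", "c", "a", "d", "b"]
```

Clusters of unequal size are handled by simply skipping a cluster once all of its images have been returned.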

First question: would this be a clever approach? Or am I on the wrong path?

Second question: I'm a bit confused. I thought I'd feed the data to Weka and it'd tell me "hey, if I were you, I'd split this data into 7 clusters", or something like that. I mean, that it'd be able to give me some information about the clusters I need. Instead, to use SimpleKMeans I'm supposed to know a priori how many clusters I'll use... how could I possibly know that?

One example of what I mean: let's say I have 3 mono-color images: light-blue, blue, red. I thought Weka would notice that the 2 blues are similar and cluster them together.

Btw I'm kind of new to Weka (as you might have seen) so if you could provide some information on which functions I might want to use (and why :P) I'd be grateful! Thank you!

Solution

SimpleKMeans is an algorithm where you have to specify the number of clusters in the data set up front.

If you don't know how many clusters there might be, it's better to use a different algorithm or to estimate the number of clusters first.

You can use X-means, where you don't need to fix the k parameter yourself; you give a range of cluster counts for it to search instead. (http://weka.sourceforge.net/doc.packages/XMeans/weka/clusterers/XMeans.html)

X-Means is K-Means extended by an Improve-Structure part. In this part of the algorithm, each center is tentatively split within its own region. The choice between keeping a center and replacing it with its two children is made by comparing the BIC values of the two structures.
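To give a feel for that BIC comparison, here is a rough sketch in plain Python. It is not Weka's implementation. It assumes hard cluster assignments, one spherical Gaussian per cluster with a shared variance, more points than clusters, and a BIC form where higher is better. The names `bic` and `should_split` and the toy points are made up for illustration.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

def bic(clusters):
    """BIC of a hard clustering under a shared-variance spherical Gaussian
    model (assumed form; higher is better). Needs more points than clusters."""
    k = len(clusters)
    n = sum(len(c) for c in clusters)
    dim = len(clusters[0][0])
    sse = 0.0
    for c in clusters:
        mu = centroid(c)
        sse += sum(math.dist(p, mu) ** 2 for p in c)
    # Pooled per-dimension variance estimate; guarded so log() stays finite.
    variance = max(sse / (dim * (n - k)), 1e-12)
    log_likelihood = (
        sum(len(c) * math.log(len(c) / n) for c in clusters)  # mixing weights
        - n * dim / 2 * math.log(2 * math.pi * variance)      # normalisation
        - sse / (2 * variance)                                # squared error
    )
    free_params = (k - 1) + k * dim + 1  # weights, centroids, one variance
    return log_likelihood - free_params / 2 * math.log(n)

def should_split(parent, child_a, child_b):
    """Keep the two children only if they score a higher BIC than the parent."""
    return bic([child_a, child_b]) > bic([parent])

# Two well-separated groups: the split is accepted.
left = [(0.0,), (1.0,), (2.0,)]
right = [(10.0,), (11.0,), (12.0,)]
split_separated = should_split(left + right, left, right)  # True

# One evenly spread group: the extra parameters are not worth it.
lo, hi = [(0.0,), (1.0,)], [(2.0,), (3.0,)]
split_uniform = should_split(lo + hi, lo, hi)  # False
```

In real X-means the two children come from running a local 2-means inside the parent's region; here they are passed in by hand to keep the example short.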

Or you can look at the dendrogram produced by AHC, the agglomerative hierarchical clustering algorithm (https://en.wikipedia.org/wiki/Hierarchical_clustering), pick a cut point on it, and then deduce the number of clusters.
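One common heuristic for picking the cut point is to look for the biggest jump between consecutive merge distances. Below is a small sketch of that in plain Python, using single linkage. It is only an illustration of the idea, not what Weka does internally. The function names and the toy feature vectors are made up.

```python
import math

def single_linkage_merges(points):
    """Naive agglomerative clustering with single linkage.
    Returns the merge distances in the order the merges happen."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
        heights.append(d)
    return heights

def suggest_cluster_count(points):
    """Cut the merge sequence at its largest jump and report how many
    clusters are left at that point."""
    heights = single_linkage_merges(points)
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    if not gaps:
        return 1
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    # cut + 1 merges have happened before the jump.
    return len(points) - (cut + 1)

# Hypothetical 2-D image features forming three visually obvious groups.
points = [(0, 0), (0, 1), (1, 0),   # group 1
          (10, 10), (10, 11),       # group 2
          (20, 0), (21, 0)]         # group 3
k = suggest_cluster_count(points)   # 3
```

The suggested count can then be passed as k to SimpleKMeans. The naive pairwise search is slow on large data sets, so treat this as a sketch for small inputs.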
