ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(): Silhouette performance algorithm


Question

I implemented the k-means algorithm in Python and am trying to compute the silhouette performance of the clusters for various values of k. Below are my code and the values of a few variables for a small part of the dataset.

import numpy as np

def avgdist(pt, clust):
    dists = []
    for elem in clust:
        dists.append(np.linalg.norm(pt-elem))
    return np.mean(dists)

def silhouette(data, clusts):
    s = []
    print("data-")
    print(data)
    for i in range(len(clusts)):
        for j in range(len(clusts[i])):
            clusts[i][j] = clusts[i][j].tolist()
    print("Clusters")
    print(clusts)
    for elem in data:
        a = []
        b = []
        print(elem)
        for clust in clusts:
            print(clust)
            if elem in clust: #Error in this line
                b.append(avgdist(elem, clust))
            else:
                a.append(avgdist(elem, clust))

        s.append((min(b)-min(a)/(max(min(b), min(a)))))
    return np.mean(s)

The terminal output is as follows:

data-
[[  0.   0.   5.]
 [  0.   0.   0.]
 [  0.   0.   0.]
 [  0.   0.   7.]
 [  0.   0.   0.]
 [  0.   0.  12.]
 [  0.   0.   0.]
 [  0.   0.   7.]
 [  0.   0.   9.]
 [  0.   0.  11.]]
Clusters
[[array([ 0.,  0.,  5.]), array([ 0.,  0.,  0.]), array([ 0.,  0.,  0.]), array([ 0.,  0.,  0.]), array([ 0.,  0.,  0.])], [array([ 0.,  0.,  7.]), array([  0.,   0.,  12.]), array([ 0.,  0.,  7.]), array([ 0.,  0.,  9.]), array([  0.,   0.,  11.])]]
[ 0.  0.  5.]
[[0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

This is printed along with the following error, raised at the commented line:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Please help, as I am not sure what this error means in my context. Similar questions gave me some idea of the nature of the error, but I don't think they apply here.

Edit: I solved this by changing the line that raised the error to:

.....
if elem.tolist() in clust:  # was: if elem in clust
    .....

Recommended answer

Your problem is that the line in question checks whether a list of lists (clust) contains elem, which is still a NumPy array. The in operator compares elem with each item of clust using ==, and with a NumPy array that comparison is done elementwise, so it yields an array of True/False values instead of a single bool. The line in question therefore evaluates along the lines of:

   if [True, False, ...]: #<- error here
       code

This produces the error in question.
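The behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming NumPy is installed; the values of elem and clust are made up for illustration and are not taken from the asker's full dataset.

```python
import numpy as np

# Illustrative values only: one point as an array, one cluster as a list of lists.
elem = np.array([0.0, 0.0, 5.0])
clust = [[0.0, 0.0, 0.0], [0.0, 0.0, 5.0]]

# Comparing an array with a list is elementwise and returns a boolean array.
comparison = elem == clust[0]
print(comparison)

# `in` needs a single True/False per item, so converting that boolean
# array to a bool raises the ValueError from the question.
raised = False
try:
    found = elem in clust
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

The error is raised on the very first comparison, whether or not the point is actually in the cluster, because any boolean array with more than one element has no single truth value.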

Instead of holding lists of lists, convert your data and cluster elements into lists of tuples, and the membership test will work.
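One way the tuple conversion might look is sketched below. The names data_t, clusts_t and membership are hypothetical, and the sample points are a shortened version of the data printed in the question.

```python
import numpy as np

# Shortened sample data in the same shape as the question's output.
data = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 7.0]])
clusts = [
    [np.array([0.0, 0.0, 5.0]), np.array([0.0, 0.0, 0.0])],
    [np.array([0.0, 0.0, 7.0])],
]

# Convert every point to a plain tuple of Python floats.
data_t = [tuple(row.tolist()) for row in data]
clusts_t = [[tuple(p.tolist()) for p in clust] for clust in clusts]

# tuple == tuple returns a single bool, so `in` works as expected.
membership = [[pt in clust for clust in clusts_t] for pt in data_t]
print(membership)

# Distances can still be computed, since np.subtract accepts tuples.
dist = np.linalg.norm(np.subtract(data_t[0], data_t[1]))
print(dist)
```

Note that this membership test relies on exact floating-point equality, which holds here because the cluster members are copies of the data points. If the points were recomputed or rounded along the way, tracking cluster membership by index would be a safer choice.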
