ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(): Silhouette performance algorithm
Question
I implemented the k-means algorithm in Python and am trying to compute the silhouette performance of the clusters for various values of k. Below is the code, followed by the values of a few variables for a small part of the dataset.
def avgdist(pt, clust):
    dists = []
    for elem in clust:
        dists.append(np.linalg.norm(pt-elem))
    return np.mean(dists)

def silhouette(data, clusts):
    s = []
    print("data-")
    print(data)
    for i in range(len(clusts)):
        for j in range(len(clusts[i])):
            clusts[i][j] = clusts[i][j].tolist()
    print("Clusters")
    print(clusts)
    for elem in data:
        a = []
        b = []
        print(elem)
        for clust in clusts:
            print(clust)
            if elem in clust: #Error in this line
                b.append(avgdist(elem, clust))
            else:
                a.append(avgdist(elem, clust))
        s.append((min(b)-min(a)/(max(min(b), min(a)))))
    return np.mean(s)
The terminal output obtained is as follows:
data-
[[ 0. 0. 5.]
[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 7.]
[ 0. 0. 0.]
[ 0. 0. 12.]
[ 0. 0. 0.]
[ 0. 0. 7.]
[ 0. 0. 9.]
[ 0. 0. 11.]]
Clusters
[[array([ 0., 0., 5.]), array([ 0., 0., 0.]), array([ 0., 0., 0.]), array([ 0., 0., 0.]), array([ 0., 0., 0.])], [array([ 0., 0., 7.]), array([ 0., 0., 12.]), array([ 0., 0., 7.]), array([ 0., 0., 9.]), array([ 0., 0., 11.])]]
[ 0. 0. 5.]
[[0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
This output is followed by the error, raised at the commented line:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Please help, as I am not sure what this error means in my context. Similar questions gave me some idea of the nature of the error, but I believe they do not apply here.
Edit: I solved this by changing the line with the error to:
.....
if elem.tolist() in clust: #Error in this line
.....
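A minimal check of why this change works, assuming (as the output above suggests) that elem is a NumPy array and clust is a list of plain lists. The sample values are copied from the output; the names raised and found are only illustrative.

```python
import numpy as np

elem = np.array([0.0, 0.0, 5.0])
clust = [[0.0, 0.0, 5.0], [0.0, 0.0, 0.0]]

# `elem in clust` compares elem with each entry using ==. With a NumPy array
# this gives an array of booleans, and turning that array into a single
# True/False value raises the ValueError.
try:
    elem in clust
    raised = False
except ValueError:
    raised = True

# A plain list is compared with another list as a whole, giving one boolean.
found = elem.tolist() in clust

print(raised, found)
```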
Recommended answer
Your problem is that the line in question checks whether a list of lists (clust) contains elem, which is still a NumPy array. The in operator compares elem with each entry of clust using ==, and for an array that comparison is done elementwise, so it produces an array of True/False values instead of a single boolean. The line in question is effectively evaluated along these lines:
if array([True, False, ...]): #<- error here
    code
which raises the error in question.
Instead of holding lists of lists, convert your data points and cluster elements into tuples, so that both data and each cluster are lists of tuples, and this evaluation will work.
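The conversion to tuples could look roughly like the sketch below. The small data and clusts values are taken from the question's output; the names data_tuples, clusts_tuples and membership are hypothetical and only serve the illustration.

```python
import numpy as np

data = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, 7.0]])
clusts = [
    [np.array([0.0, 0.0, 5.0]), np.array([0.0, 0.0, 0.0])],
    [np.array([0.0, 0.0, 7.0]), np.array([0.0, 0.0, 12.0])],
]

# Pack every point into a tuple of plain floats so that `in` compares
# whole points and returns a single boolean.
data_tuples = [tuple(row.tolist()) for row in data]
clusts_tuples = [[tuple(pt.tolist()) for pt in clust] for clust in clusts]

# For each point, record which clusters contain it.
membership = [[pt in clust for clust in clusts_tuples] for pt in data_tuples]
print(membership)
```

Note that tuples do not support subtraction, so a distance helper such as avgdist would need to convert them back first, for example with np.asarray(pt) - np.asarray(elem).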