How to cluster data with discrete binary attributes?

Views: 536 Posted: 2020/5/4 9:55:41 Tags: machine-learning data-mining cluster-analysis


Problem description

In my data there are ten million binary attributes, but only some of them are informative; most of them are zeros.

The format is as follows:

data  attribute1 attribute2 attribute3 attribute4   .........
A          0          1           0         1       .........
B          1          0           1         0       .........
C          1          1           0         1       .........
D          1          1           0         0       .........
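
Since most values are zeros, one way to picture this data is as a set of "on" attribute indices per row rather than a dense table. The snippet below is only a minimal sketch under that assumption: it uses just the four attribute columns visible in the sample (the elided columns are not filled in), and the names `rows`, `to_index_set`, and `sparse_rows` are hypothetical.

```python
# Hypothetical sketch: keep only the positions of the 1s in each row.
# Only the four attributes shown in the sample table are used here.
rows = {
    "A": [0, 1, 0, 1],
    "B": [1, 0, 1, 0],
    "C": [1, 1, 0, 1],
    "D": [1, 1, 0, 0],
}


def to_index_set(bits):
    """Return the set of 0-based attribute positions whose value is 1."""
    return {i for i, b in enumerate(bits) if b == 1}


# Each row becomes a small set, regardless of how many attributes are zero.
sparse_rows = {name: to_index_set(bits) for name, bits in rows.items()}

for name, on_attributes in sparse_rows.items():
    print(name, sorted(on_attributes))
```

With ten million mostly-zero attributes, a representation like this stores only the nonzero entries, so its size depends on the number of 1s rather than on the number of attributes.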

What is a smart way to cluster this? I know K-means clustering, but I don't think it is suitable in this case, because the binary values make distances less informative, and it will suffer from the curse of dimensionality. Even if I cluster based only on the few informative attributes, that is still too many attributes.
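
To make the concern about distances concrete, here is a minimal sketch using only the four sample rows and the four visible attributes (the names `rows`, `squared_euclidean`, `hamming`, and `distances` are hypothetical). For 0/1 vectors, the squared Euclidean distance that K-means minimizes is just the number of attributes on which two rows differ (the Hamming distance), so it can only take a few integer values and ties are common.

```python
from itertools import combinations

# The four sample rows, restricted to the attributes shown in the table.
rows = {
    "A": [0, 1, 0, 1],
    "B": [1, 0, 1, 0],
    "C": [1, 1, 0, 1],
    "D": [1, 1, 0, 0],
}


def squared_euclidean(x, y):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def hamming(x, y):
    """Number of positions where the two vectors differ."""
    return sum(a != b for a, b in zip(x, y))


distances = {}
for (name1, v1), (name2, v2) in combinations(rows.items(), 2):
    d = squared_euclidean(v1, v2)
    # On binary data the two measures coincide.
    assert d == hamming(v1, v2)
    distances[(name1, name2)] = d
    print(name1, name2, d)
```

On this tiny sample several pairs end up at exactly the same distance (for example A-C and C-D), which is the kind of coarse, tie-prone behaviour the question is worried about.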

I think a decision tree would be a nice way to cluster this data, but it is a classification algorithm!

What should I do?
