How to encode a categorical variable in sklearn?

Views: 459 Published: 2020/5/4 9:07:04 Tags: python machine-learning scikit-learn

Problem description

I'm trying to use the car evaluation dataset from the UCI repository, and I wonder whether there is a convenient way to binarize categorical variables in sklearn. One approach would be to use DictVectorizer or LabelBinarizer, but with those I get k different features, whereas you should have just k-1 to avoid collinearity. I guess I could write my own function and drop one column, but that bookkeeping is tedious. Is there an easy way to perform such transformations and get a sparse matrix as the result?
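
One possible approach, not taken from the original thread: a minimal sketch using `OneHotEncoder` with `drop="first"`, which assumes scikit-learn 0.21 or newer (the release that introduced the `drop` parameter). The rows below are made-up values in the style of the car evaluation data, not the real dataset, and the two columns (buying, safety) are chosen only for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Hypothetical sample rows (assumption: not the real UCI file).
# Columns: buying price, safety rating.
X = np.array(
    [
        ["vhigh", "low"],
        ["high", "med"],
        ["med", "high"],
        ["low", "med"],
    ],
    dtype=object,
)

# drop="first" removes the first category of each feature, so a feature
# with k categories is encoded in k-1 columns.
encoder = OneHotEncoder(drop="first")
X_encoded = encoder.fit_transform(X)

# The default output is a SciPy sparse matrix.
print(sparse.issparse(X_encoded))
print(X_encoded.shape)

# Categories learned per column; the first entry of each is the dropped one.
print(encoder.categories_)
```

Here the buying column has 4 categories and the safety column has 3, so the encoded matrix has (4 - 1) + (3 - 1) = 5 columns. Which category gets dropped depends on the sorted category order, so it is worth checking `encoder.categories_` before interpreting model coefficients.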
