KNN with class weights in SKLearn


Problem description

Is it possible to define class weights for a K-nearest neighbour classifier in SKLearn? I have looked at the API but cannot work it out. I have a KNN problem with very imbalanced class counts (10,000 samples of some classes versus 1 of others).

Recommended answer

The stock KNN classifier in sklearn does not seem to offer that option. You can, however, alter the source code by adding coefficients (weights) to the distance computation so that distances are amplified for records belonging to the majority class (e.g., with a coefficient of 1.5):

https://github.com/scikit-learn/scikit-learn/blob/7b136e9/sklearn/neighbors/classification.py#L23
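
Rather than patching the sklearn source, the same idea can be approximated from outside. The sketch below is a hypothetical wrapper (not part of sklearn's API; the class_weight mapping is illustrative): it queries kneighbors() for each test point and multiplies every neighbour's vote by a per-class weight, which mimics amplifying distances for the majority class.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

class ClassWeightedKNN:
    """KNN whose neighbour votes are scaled by per-class weights.

    class_weight maps each label to a vote multiplier, e.g.
    {0: 1.0, 1: 10.0} to boost a rare class.
    """

    def __init__(self, n_neighbors=5, class_weight=None):
        self.n_neighbors = n_neighbors
        self.class_weight = class_weight or {}
        self._knn = KNeighborsClassifier(n_neighbors=n_neighbors)

    def fit(self, X, y):
        self._knn.fit(X, y)
        self._y = np.asarray(y)
        self.classes_ = np.unique(self._y)
        return self

    def predict(self, X):
        # Indices of the k nearest training points for each query point.
        _, neigh_idx = self._knn.kneighbors(X)
        preds = []
        for idx in neigh_idx:
            # Weighted vote: each neighbour contributes its class weight.
            votes = {c: 0.0 for c in self.classes_}
            for label in self._y[idx]:
                votes[label] += self.class_weight.get(label, 1.0)
            preds.append(max(votes, key=votes.get))
        return np.array(preds)

For the 10,000-to-1 imbalance described above, something like class_weight={majority_label: 1.0, rare_label: 10.0} gives the rare class a fighting chance in the vote.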

Alternatively, the imbalanced-learn module, which is part of the scikit-learn-contrib project, can be used for data sets with a high degree of between-class imbalance:

http://contrib.scikit-learn.org/imbalanced-learn/stable/introduction.html
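
A minimal sketch of that route, assuming the imbalanced-learn package is installed (its RandomOverSampler with fit_resample is used here; the toy data set is only illustrative): the minority class is oversampled until the classes are balanced, and an ordinary KNeighborsClassifier is then fitted on the resampled data.

from collections import Counter

from imblearn.over_sampling import RandomOverSampler
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Toy data with a roughly 99:1 class imbalance.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.99, 0.01], random_state=0)
print("before resampling:", Counter(y))

# Randomly duplicate minority samples until the classes are balanced.
ros = RandomOverSampler(random_state=0)
X_res, y_res = ros.fit_resample(X, y)
print("after resampling: ", Counter(y_res))

knn = KNeighborsClassifier(n_neighbors=5).fit(X_res, y_res)

imbalanced-learn also offers smarter samplers (e.g. SMOTE) and under-sampling strategies, which may be preferable when duplicating rare points verbatim leads to overfitting.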

(In the case of binary classification, you may alternatively treat the problem as unsupervised outlier detection and use a method such as the one-class SVM in sklearn to perform the classification.)
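
As a rough sketch of that alternative (the synthetic arrays below stand in for real data), sklearn's OneClassSVM is fitted on the majority class only, and its +1/-1 predictions are read as "majority" versus "rare/outlier":

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Train only on the majority ("normal") class.
X_majority = rng.normal(size=(1000, 5))

# Mix of normal-looking and anomalous points to score.
X_test = np.vstack([rng.normal(size=(10, 5)),
                    rng.normal(loc=6.0, size=(10, 5))])

oc_svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
oc_svm.fit(X_majority)

# predict() returns +1 for inliers (majority class) and -1 for outliers.
pred = oc_svm.predict(X_test)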
