Difference in SGD classifier results and statsmodels results for logistic with l1


Problem Description

As a check on my work, I've been comparing the output of scikit-learn's SGDClassifier logistic implementation with statsmodels' logistic regression. Once I add some l1 regularization in combination with categorical variables, I get very different results. Is this a result of different solution techniques, or am I not using the correct parameters?

The differences are much bigger on my own dataset, but still pretty large using mtcars:

    import patsy
    import statsmodels.api as sm
    from sklearn.linear_model import SGDClassifier

    df = sm.datasets.get_rdataset("mtcars", "datasets").data

    y, X = patsy.dmatrices('am ~ standardize(wt) + standardize(disp) + C(cyl) - 1', df)

    # statsmodels l1-regularized logit
    logit = sm.Logit(y, X).fit_regularized(alpha=.0035)

    # scikit-learn SGD with log loss and a pure l1 penalty
    # (n_iter was renamed max_iter, and loss='log' to 'log_loss', in newer scikit-learn)
    clf = SGDClassifier(alpha=.0035, penalty='l1', loss='log', l1_ratio=1,
                        n_iter=1000, fit_intercept=False)
    clf.fit(X, y.ravel())  # ravel: sklearn expects a 1-d target array

This gives:

sklearn: [-3.79663192 -1.16145654  0.95744308 -5.90284803 -0.67666106]
statsmodels: [-7.28440744 -2.53098894  3.33574042 -7.50604097 -3.15087396]
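
(The two vectors above are the fitted coefficients; assuming the fits from the snippet above, they can be printed with something like:)

    import numpy as np

    print("sklearn:    ", clf.coef_.ravel())         # SGDClassifier weights
    print("statsmodels:", np.asarray(logit.params))  # regularized Logit params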

Recommended Answer

I've been working through some similar issues. I think the short answer might be that SGD doesn't work so well with only a few samples, but is much more performant with larger data. I'd be interested in hearing from sklearn devs. Compare, for example, using LogisticRegression:

    from sklearn.linear_model import LogisticRegression

    # liblinear supports the l1 penalty; C is the inverse regularization strength
    clf2 = LogisticRegression(penalty='l1', C=1/.0035, solver='liblinear', fit_intercept=False)
    clf2.fit(X, y.ravel())

which gives coefficients very similar to the l1-penalized Logit:

array([[-7.27275526, -2.52638167,  3.32801895, -7.50119041, -3.14198402]])
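
To illustrate the small-sample point, here is a minimal sketch of my own (not from the original answer): on a larger synthetic dataset, SGD and a LogisticRegression with a matched penalty should land close together. The matching C = 1/(alpha * n_samples) follows from the documented objectives: SGDClassifier penalizes the average loss by alpha, while LogisticRegression weights the summed loss by C.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression, SGDClassifier

    # a larger synthetic problem: 20,000 samples, 5 features
    X_big, y_big = make_classification(n_samples=20000, n_features=5,
                                       n_informative=3, random_state=0)

    alpha = .0035
    sgd = SGDClassifier(alpha=alpha, penalty='l1', loss='log', l1_ratio=1,
                        n_iter=20, fit_intercept=False)  # fewer epochs needed with more data
    sgd.fit(X_big, y_big)

    # match the objectives: C = 1 / (alpha * n_samples)
    lr = LogisticRegression(penalty='l1', C=1 / (alpha * len(X_big)),
                            solver='liblinear', fit_intercept=False)
    lr.fit(X_big, y_big)

    print(sgd.coef_)  # with this much data the two should be close
    print(lr.coef_)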
