Exporting SVM classifiers from sklearn to Java codebase

Question

I have used sklearn to train a set of SVM classifiers (mostly linear ones using LinearSVC, but some of them use the SVC class with an rbf kernel) and I am pretty happy with the results. Now I need to export the classifiers to production, into another codebase that uses Java. I am looking for libraries published in Maven that can be easily incorporated into this new codebase.

Do you have any suggestions?

Recommended answer

Linear classifiers are easy: they have a coef_ and an intercept_, described in the class docstrings. Those are regular NumPy arrays, so you can dump them to disk with standard NumPy functions.

>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> from sklearn.svm import LinearSVC
>>> clf = LinearSVC().fit(iris.data, iris.target)

Now let's dump this to a pseudo-file:

>>> import numpy as np
>>> from io import BytesIO
>>> outfile = BytesIO()
>>> np.savetxt(outfile, clf.coef_)
>>> print(outfile.getvalue().decode())
1.842426121444650788e-01 4.512319840786759295e-01 -8.079381916413134190e-01 -4.507115611351246720e-01
5.201335313639676022e-02 -8.941985347763323766e-01 4.052446671573840531e-01 -9.380586070674181709e-01
-8.506908158338851722e-01 -9.867329247779884627e-01 1.380997337625912147e+00 1.865393234038096981e+00

That's something you can parse from Java, right?
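As a rough sketch of what the parsing step involves, the snippet below reads the dumped text back using only string splitting and float conversion, which is the same logic a Java reader would need (split each line on whitespace, parse each token as a double). The function name parse_matrix is hypothetical and chosen only for illustration; the numbers are the ones printed above.

```python
def parse_matrix(text):
    """Parse whitespace-separated floats, one matrix row per line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines such as a trailing newline
        rows.append([float(token) for token in line.split()])
    return rows


# Text exactly as written by np.savetxt in the example above.
dump = """\
1.842426121444650788e-01 4.512319840786759295e-01 -8.079381916413134190e-01 -4.507115611351246720e-01
5.201335313639676022e-02 -8.941985347763323766e-01 4.052446671573840531e-01 -9.380586070674181709e-01
-8.506908158338851722e-01 -9.867329247779884627e-01 1.380997337625912147e+00 1.865393234038096981e+00
"""

coef = parse_matrix(dump)
print(len(coef), len(coef[0]))  # prints: 3 4 (three classes, four features)
```

The intercept_ array can be dumped and parsed the same way; it is one-dimensional, so np.savetxt writes one value per line.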

Now to get a score for the k'th class on a sample x, you need to evaluate

np.dot(x, clf.coef_[k]) + clf.intercept_[k]
# ==
(sum(x[i] * clf.coef_[k, i] for i in range(clf.coef_.shape[1]))
 + clf.intercept_[k])

which is also doable, I hope. The class with the highest score wins.
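To make the scoring rule concrete, here is a minimal sketch written with explicit loops and no NumPy, so that it maps almost line for line onto Java arrays. The function names and the toy weights are hypothetical and are not taken from a trained model; in practice coef and intercept would be the parsed contents of clf.coef_ and clf.intercept_.

```python
def class_scores(x, coef, intercept):
    """Return one score per class: dot(x, coef[k]) + intercept[k]."""
    scores = []
    for k in range(len(coef)):
        s = intercept[k]
        for i in range(len(x)):
            s += x[i] * coef[k][i]
        scores.append(s)
    return scores


def predict(x, coef, intercept):
    """Return the index of the class with the highest score."""
    scores = class_scores(x, coef, intercept)
    best = 0
    for k in range(1, len(scores)):
        if scores[k] > scores[best]:
            best = k
    return best


# Toy weights for illustration only (3 classes, 2 features).
toy_coef = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
toy_intercept = [0.0, 0.5, 0.0]

print(predict([2.0, 1.0], toy_coef, toy_intercept))
```

The returned index refers to a position in clf.classes_, so that array needs to be exported too if the labels are not simply 0, 1, 2, and so on. Note that for a two-class problem coef_ has a single row, and the sign of the one score decides the class instead of an argmax.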

For kernel SVMs, the situation is more complicated because you need to replicate the one-vs-one decision function, as well as the kernels, in the Java code. The SVM model is stored on SVC objects in the attributes support_vectors_ and dual_coef_.
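As a partial illustration, the sketch below covers only the two-class case with an rbf kernel, where no one-vs-one voting is needed: the decision value is the sum over support vectors of dual_coef_[0][i] * K(sv_i, x), plus intercept_[0], with K(a, b) = exp(-gamma * ||a - b||^2). It assumes the numeric gamma actually used during training has been exported alongside the arrays. All values below are toy numbers, not a trained model, and the function names are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma):
    """K(a, b) = exp(-gamma * squared Euclidean distance)."""
    sq_dist = 0.0
    for i in range(len(a)):
        diff = a[i] - b[i]
        sq_dist += diff * diff
    return math.exp(-gamma * sq_dist)


def binary_decision(x, support_vectors, dual_coef, intercept, gamma):
    """Decision value for a two-class rbf SVM.

    dual_coef is the single row dual_coef_[0]; intercept is intercept_[0].
    """
    value = intercept
    for sv, alpha in zip(support_vectors, dual_coef):
        value += alpha * rbf_kernel(sv, x, gamma)
    return value


# Toy model for illustration only: two support vectors in 2-D.
toy_svs = [[0.0, 0.0], [1.0, 1.0]]
toy_dual = [-1.0, 1.0]
toy_b = 0.0
toy_gamma = 0.5

d = binary_decision([1.0, 1.0], toy_svs, toy_dual, toy_b, toy_gamma)
print(d > 0)  # a positive value selects the second class, classes_[1]
```

With more than two classes, dual_coef_ has one row per class minus one, and the pairwise votes have to be reconstructed from it, which this sketch does not attempt.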
