What is the difference between transformer and estimator in sklearn?

Problem description

I saw that both transformer and estimator are mentioned in the sklearn documentation.

Is there any difference between these two words?

Solution

The basic difference is:

  • A Transformer transforms the input data (X) in some way.
  • An Estimator predicts a new value (or values) (y) from the input data (X).

Both the Transformer and Estimator should have a fit() method which can be used to train them (they learn some characteristics of the data). The signature is:

fit(X, y)

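fit() stores what it has learnt inside the object (in scikit-learn, as attributes whose names end with an underscore, such as mean_) and, by convention, returns the object itself so that calls can be chained.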

Here X represents the samples (feature vectors) and y is the target vector (which may have single or multiple values per corresponding sample in X). Note that y can be optional in some transformers where it's not needed, but it's mandatory for most estimators (supervised estimators). Look at StandardScaler, for example: it needs the initial data X to find the mean and standard deviation of the data (it learns the characteristics of X; y is not needed).
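As a minimal sketch of the StandardScaler case, the snippet below fits a scaler on X alone and reads back what it learnt. The toy values in X are made up for illustration; mean_ and scale_ are the fitted attributes that StandardScaler exposes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix (illustrative values): 3 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
scaler.fit(X)  # only X is passed; no target vector is needed

# What the scaler learnt from X is stored on the object itself.
print(scaler.mean_)   # per-feature mean
print(scaler.scale_)  # per-feature standard deviation
```

The fitted values live on the scaler object, which is what lets it apply the same scaling to other data later.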

Each Transformer should have a transform(X) method which, like fit(), takes the input X and returns a new transformed version of X (which generally should have the same number of samples but may or may not have the same features).
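To illustrate the point about samples and features, here is a small sketch with two transformers on made-up data: StandardScaler keeps the number of features, while PCA (with n_components=2, a value chosen only for this example) reduces it. In both cases the number of samples is unchanged.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy data (illustrative values): 4 samples, 3 features.
X = np.array([[1.0, 10.0, 0.5],
              [2.0, 19.0, 0.1],
              [3.0, 31.0, 0.9],
              [4.0, 42.0, 0.3]])

# A transformer that keeps the feature count.
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)

# A transformer that changes the feature count.
pca = PCA(n_components=2)
pca.fit(X)
X_reduced = pca.transform(X)

print(X.shape, X_scaled.shape, X_reduced.shape)
```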

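On the other hand, an Estimator should have a predict(X) method which outputs the predicted value of y for the given X. (The scikit-learn glossary calls such objects predictors and uses "estimator" for any object with a fit() method, transformers included; this answer uses Estimator in the narrower sense.)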
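A short sketch of the fit/predict side, using LinearRegression as one possible supervised model. The training data is invented for the example and follows y = 2x + 1 exactly, so the model can recover it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative training data generated from y = 2x + 1.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X_train, y_train)  # a supervised model needs both X and y

# predict() takes only X and returns one predicted y per sample.
X_new = np.array([[4.0], [5.0]])
y_pred = model.predict(X_new)
print(y_pred)
```

Note that fit() needs y here, whereas predict() only ever receives X.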

Some classes in scikit-learn implement both transform() and predict(), such as KMeans; in that case, carefully reading the documentation should resolve your doubts.
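For KMeans specifically, the two methods answer different questions: predict() returns the index of the closest cluster for each sample, and transform() returns the distance from each sample to every cluster center. The sketch below uses made-up, well-separated points; n_clusters=2, n_init=10 and random_state=0 are choices for this example only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious groups of points (illustrative values).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)  # unsupervised: no y

labels = kmeans.predict(X)       # one cluster index per sample
distances = kmeans.transform(X)  # one distance per (sample, cluster center)

print(labels.shape, distances.shape)
```

Here the predicted label of a sample is simply the column with the smallest value in its row of distances.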
