What is the difference between transformer and estimator in sklearn?
I saw both transformer and estimator were mentioned in the sklearn documentation.
Is there any difference between these two words?
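The basic difference is that:
A Transformer transforms the input data (X) in some way.
An Estimator predicts a new value (or values) (y) by using the input data (X).
(Strictly speaking, scikit-learn's own glossary calls any object with a fit() method an estimator, and what is called an Estimator here is a predictor. This answer uses the looser, everyday wording.)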
Both the Transformer and the Estimator should have a fit() method, which is used to train them (they learn some characteristics of the data). The signature is:
fit(X, y)
fit() stores what it has learnt inside the object and returns the object itself (self), so calls can be chained.
Here X represents the samples (feature vectors) and y is the target vector (which may have one or more values per corresponding sample in X). Note that y is optional for transformers that do not need it, but it is mandatory for most estimators (the supervised ones). Take StandardScaler, for example: it needs only the initial data X to find the mean and standard deviation of each feature (it learns the characteristics of X; y is not needed).
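As a minimal sketch of the StandardScaler case described above (the toy array X is made up purely for illustration), fitting without y looks roughly like this. In current scikit-learn, fit() returns the fitted object itself, and the learnt statistics are stored on attributes with a trailing underscore:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, invented for this example: 3 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
fitted = scaler.fit(X)   # no y is passed; the scaler only looks at X

print(fitted is scaler)  # fit() hands back the same object
print(scaler.mean_)      # per-feature mean learnt from X
print(scaler.scale_)     # per-feature standard deviation learnt from X
```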
Each Transformer should have a transform(X) method which, like fit(), takes the input X and returns a new, transformed version of X (which generally should have the same number of samples but may or may not have the same number of features).
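To illustrate the point about samples and features, here is a small sketch with two transformers on made-up data. StandardScaler keeps the number of features, while PolynomialFeatures (chosen here only as a convenient example) adds new ones; in both cases the number of samples is unchanged:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Toy data, invented for this example: 3 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# Same number of samples, same number of features.
X_scaled = StandardScaler().fit(X).transform(X)

# Same number of samples, more features (bias, x1, x2, x1^2, x1*x2, x2^2).
X_poly = PolynomialFeatures(degree=2).fit(X).transform(X)

print(X.shape, X_scaled.shape, X_poly.shape)
```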
On the other hand, an Estimator should have a predict(X) method, which outputs the predicted value of y for the given X.
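A rough sketch of the fit-then-predict pattern, using LinearRegression as one example of a supervised estimator (the data is invented and follows y = 2x exactly, so the fit is trivial):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data, invented for this example: y is exactly 2 * x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# A supervised estimator needs both X and y to fit.
model = LinearRegression().fit(X, y)

# predict() takes only X and returns the estimated y for each sample.
X_new = np.array([[5.0]])
y_pred = model.predict(X_new)
print(y_pred)
```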
Some classes in scikit-learn implement both transform() and predict(), such as KMeans; in that case, carefully reading the documentation should resolve your doubts.
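For KMeans specifically, a short sketch may help show how the two methods differ: predict() returns the index of the closest cluster for each sample, while transform() returns the distance from each sample to every cluster center. The data and the parameter values below (n_clusters=2, n_init=10, random_state=0) are illustrative choices, not recommendations:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data, invented for this example: two well-separated pairs of points.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [10.0, 10.0],
              [10.0, 11.0]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

labels = km.predict(X)       # one cluster index per sample
distances = km.transform(X)  # one column of distances per cluster center

print(labels.shape, distances.shape)
```

The predicted label of a sample corresponds to the column of the transform() output with the smallest distance.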