What is the difference between 'transform' and 'fit_transform' in sklearn?
In the sklearn Python toolbox, sklearn.decomposition.RandomizedPCA has two functions, transform and fit_transform. Both are described in the documentation, but what is the difference between them?
The .transform method is meant for when you have already computed the PCA, i.e. when you have already called its .fit method.
In [12]: pc2 = RandomizedPCA(n_components=3)

In [13]: pc2.transform(X)  # can't transform because it does not know how to do it.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-e3b6b8ea2aff> in <module>()
----> 1 pc2.transform(X)

/usr/local/lib/python3.4/dist-packages/sklearn/decomposition/pca.py in transform(self, X, y)
    714         # XXX remove scipy.sparse support here in 0.16
    715         X = atleast2d_or_csr(X)
--> 716         if self.mean_ is not None:
    717             X = X - self.mean_
    718

AttributeError: 'RandomizedPCA' object has no attribute 'mean_'

In [14]: pc2.fit_transform(X)
Out[14]:
array([[-1.38340578, -0.2935787 ],
       [-2.22189802,  0.25133484],
       [-3.6053038 , -0.04224385],
       [ 1.38340578,  0.2935787 ],
       [ 2.22189802, -0.25133484],
       [ 3.6053038 ,  0.04224385]])
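
The same behaviour can be checked with a current scikit-learn release. This is a minimal sketch under a few assumptions: RandomizedPCA was removed from scikit-learn in later versions, so the plain PCA class stands in for it here; the toy matrix X is made up for illustration and is not taken from the session above; and n_components is set to 2 because X has only two features.

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumed toy data (not from the original post): 6 samples, 2 features.
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])

pca = PCA(n_components=2)

# Calling transform before fit fails: nothing has been learned from data yet.
# scikit-learn's NotFittedError subclasses AttributeError, so catching
# AttributeError covers both the old and the current error type.
try:
    pca.transform(X)
    failed_before_fit = False
except AttributeError:
    failed_before_fit = True

# fit_transform learns the projection from X and applies it to X in one call.
scores_one_step = pca.fit_transform(X)

# transform only applies the projection that was already learned.
scores_two_step = pca.transform(X)

# On the training data both routes give the same scores (up to rounding).
same_result = np.allclose(scores_one_step, scores_two_step)
print(failed_before_fit, same_result)
```

On the data used for fitting, fit_transform(X) is therefore a shortcut for fit(X) followed by transform(X); the separate transform call matters when the data to project differs from the data used for fitting.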
So you want to fit RandomizedPCA and then transform, as:
In [20]: pca = RandomizedPCA(n_components=3)

In [21]: pca.fit(X)
Out[21]:
RandomizedPCA(copy=True, iterated_power=3, n_components=3, random_state=None,
       whiten=False)

In [22]: pca.transform(z)
Out[22]:
array([[ 2.76681156,  0.58715739],
       [ 1.92831932,  1.13207093],
       [ 0.54491354,  0.83849224],
       [ 5.53362311,  1.17431479],
       [ 6.37211535,  0.62940125],
       [ 7.75552113,  0.92297994]])
In particular, PCA's .transform applies the change of basis obtained through the PCA decomposition of the matrix X to the matrix z.
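
To make the change of basis concrete, here is a rough NumPy-only sketch of what fitting and transforming amount to. It is not scikit-learn's implementation: the helper names fit and transform, the toy X, and the shifted matrix z are all hypothetical, and whitening and sign conventions are ignored.

```python
import numpy as np

def fit(X):
    """Learn the PCA basis from X: its column means and principal axes."""
    mean = X.mean(axis=0)
    # The rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt

def transform(Z, mean, components):
    """Express Z in the basis learned from X: center with X's mean, then project."""
    return (Z - mean) @ components.T

# Assumed toy data (not from the original post).
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])
shift = np.array([4.0, 2.0])
z = X + shift  # hypothetical new data to project

mean, components = fit(X)                   # learned from X only
X_scores = transform(X, mean, components)
z_scores = transform(z, mean, components)   # z is projected with X's basis

print(z_scores.shape)
```

Because the mean and axes come from X alone, z is centered with X's mean rather than its own, which is why the scores of z need not be centered around zero.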