"with_std = False或True"之间的StandardScaler差异和"with_mean = False或True"; [英] StandardScaler difference between "with_std=False or True" and "with_mean=False or True"

Question

I am trying to standardize some data so that I can apply PCA to it. I am using sklearn.preprocessing.StandardScaler. I am having trouble understanding the difference between using True or False for the parameters with_mean and with_std (documentation).

Can someone give a more extended explanation?

Recommended answer

I have provided more details in this thread, but let me just explain this here as well.

The standardization of the data (each column/feature/variable individually) involves the following equation:

z = (x - μ) / σ

where μ is the mean of the column and σ is its standard deviation.

Explanation:

If you set both with_mean and with_std to False, then μ is treated as 0 and σ as 1, so no centering or scaling is applied and the data are left unchanged. This amounts to assuming that the columns/features already come from a standard normal (Gaussian) distribution, which has mean 0 and std 1.
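To make the both-False case concrete, here is a minimal sketch. The toy array X is made up for illustration and is not from the original question. It checks that the scaler returns the input unchanged when neither centering nor scaling is requested.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 3 samples, 2 features on very different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# Neither centering nor scaling is requested.
scaler = StandardScaler(with_mean=False, with_std=False)
X_out = scaler.fit_transform(X)

# The output should equal the input, element by element.
identity = np.allclose(X_out, X)
print(identity)
```

In this configuration the scaler is effectively a no-op, which is usually not what you want before PCA unless the features are already on comparable scales.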

If you set with_mean and with_std to True (the defaults), then the μ and σ estimated from your data are used, column by column. This is the most common approach.
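As a rough check of the both-True case, the sketch below compares StandardScaler against the formula computed by hand with NumPy. The toy array X is an assumption made for illustration. The two mixed settings (center only, scale only) are not discussed in the answer and are included only to show what each flag does on its own.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 3 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

mu = X.mean(axis=0)    # per-column mean
sigma = X.std(axis=0)  # per-column population std (ddof=0), as StandardScaler uses

both = StandardScaler(with_mean=True, with_std=True).fit_transform(X)
center_only = StandardScaler(with_mean=True, with_std=False).fit_transform(X)
scale_only = StandardScaler(with_mean=False, with_std=True).fit_transform(X)

# Compare each configuration against the hand-computed version.
matches_formula = np.allclose(both, (X - mu) / sigma)
matches_centering = np.allclose(center_only, X - mu)
matches_scaling = np.allclose(scale_only, X / sigma)
print(matches_formula, matches_centering, matches_scaling)
```

Note that the scaler divides by the population standard deviation (ddof=0), so results will differ slightly from a computation that uses the sample standard deviation (ddof=1).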
