How to normalize the Train and Test data using MinMaxScaler sklearn

Views: 1099 Published: 2020/5/4 9:07:07 Tags: python machine-learning scikit-learn normalization sklearn-pandas

Problem description

I have a doubt about this and have been looking for answers. The question arises when I use:

import pandas as pd
from sklearn import preprocessing

min_max_scaler = preprocessing.MinMaxScaler()

df = pd.DataFrame({'A':[1,2,3,7,9,15,16,1,5,6,2,4,8,9],'B':[15,12,10,11,8,14,17,20,4,12,4,5,17,19],'C':['Y','Y','Y','Y','N','N','N','Y','N','Y','N','N','Y','Y']})

# scale the feature columns A and B to the [0, 1] range
df[['A','B']] = min_max_scaler.fit_transform(df[['A','B']])
# encode the label column: 'N' -> 0, anything else ('Y') -> 1
df['C'] = df['C'].apply(lambda x: 0 if x.strip()=='N' else 1)

After that I will train and test the model (A and B as features, C as the label) and get some accuracy score. My doubt is what happens when I have to predict the label for a new set of data, say:

data = pd.DataFrame({'A':[25,67,24,76,23],'B':[2,54,22,75,19]})

When I normalize the columns, the values of A and B will be scaled according to the new data, not the data the model was trained on. So after the data preparation step below, my data will be:

data[['A','B']] = min_max_scaler.fit_transform(data[['A','B']])

The values of A and B in the new data will be scaled with respect to the max and min of data[['A','B']], whereas the training data in df[['A','B']] was scaled with respect to the max and min of df[['A','B']].

How can the data preparation be valid when it is based on different numbers? I don't understand how the prediction can be correct here.

Recommended answer

You should fit the `MinMaxScaler` on the `training` data and then apply the fitted scaler to the `testing` data before the prediction.

In summary:

Step 1: fit the scaler on the TRAINING data
Step 2: use the scaler to transform the TRAINING data
Step 3: use the transformed training data to fit the predictive model
Step 4: use the same scaler to transform the TEST data
Step 5: predict using the trained model and the transformed TEST data
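
To make steps 1 and 4 concrete, here is a minimal sketch of what the scaler stores when it is fitted and how it reuses those values later. It uses the A and B columns from the question as NumPy arrays; the variable names (`X_train`, `X_new`, `train_min`, `train_max`) are chosen for illustration and are not from the original post.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Training features (columns A and B) taken from the question.
X_train = np.array([
    [1, 15], [2, 12], [3, 10], [7, 11], [9, 8], [15, 14], [16, 17],
    [1, 20], [5, 4], [6, 12], [2, 4], [4, 5], [8, 17], [9, 19],
], dtype=float)
# New data to predict on, also taken from the question.
X_new = np.array([[25, 2], [67, 54], [24, 22], [76, 75], [23, 19]], dtype=float)

scaler = MinMaxScaler()
scaler.fit(X_train)  # learns the per-column min and max of the TRAINING data only

train_min = scaler.data_min_  # A: 1, B: 4
train_max = scaler.data_max_  # A: 16, B: 20

# transform() reuses the stored training min/max; nothing is re-learned here.
X_new_scaled = scaler.transform(X_new)

# The same result computed by hand from the training statistics.
X_new_manual = (X_new - train_min) / (train_max - train_min)

print(train_min, train_max)
print(X_new_scaled[0])  # first new row, approximately [1.6, -0.125]
```

Note that the scaled new values can fall outside [0, 1], because the new data lies outside the range seen during training. With the default settings the scaler does not clip them, and the model still receives features on the same scale it was trained on.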

Example using your data:

import pandas as pd
from sklearn import preprocessing
from sklearn.svm import SVC

min_max_scaler = preprocessing.MinMaxScaler()
# training data
df = pd.DataFrame({'A':[1,2,3,7,9,15,16,1,5,6,2,4,8,9],'B':[15,12,10,11,8,14,17,20,4,12,4,5,17,19],'C':['Y','Y','Y','Y','N','N','N','Y','N','Y','N','N','Y','Y']})
# fit the scaler on the training data, transform it, and use the result for model training
df[['A','B']] = min_max_scaler.fit_transform(df[['A','B']])
df['C'] = df['C'].apply(lambda x: 0 if x.strip()=='N' else 1)

# fit the model (the original answer leaves `model` undefined; SVC is an assumed stand-in, any classifier works)
model = SVC()
model.fit(df[['A','B']], df['C'])

# after training the model on the transformed training data, define the testing data df_test
df_test = pd.DataFrame({'A':[25,67,24,76,23],'B':[2,54,22,75,19]})

# before predicting on the test data, ONLY APPLY the already-fitted scaler (transform, not fit_transform)
df_test[['A','B']] = min_max_scaler.transform(df_test[['A','B']])

# predict with the trained model
y_predicted_from_model = model.predict(df_test[['A','B']])
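
The following sketch illustrates why the test data should only be transformed and never refitted. It compares the fitted-on-training scaler with a scaler mistakenly refitted on the new data, using column A from the question. The names `train_scaler`, `correct`, and `refit` are illustrative and not part of the original answer.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Column A from the question, reshaped to the 2-D layout scikit-learn expects.
A_train = np.array([1, 2, 3, 7, 9, 15, 16, 1, 5, 6, 2, 4, 8, 9], dtype=float).reshape(-1, 1)
A_new = np.array([25, 67, 24, 76, 23], dtype=float).reshape(-1, 1)

# Correct: fit on the training data, then only transform the new data.
train_scaler = MinMaxScaler().fit(A_train)
correct = train_scaler.transform(A_new)

# Mistake: refitting on the new data rescales it by its own min (23) and max (76).
refit = MinMaxScaler().fit_transform(A_new)

# The raw value 25 ends up in two very different places.
print(correct[0, 0])  # (25 - 1) / (16 - 1), about 1.6
print(refit[0, 0])    # (25 - 23) / (76 - 23), about 0.038
```

In the refitted version the value 25 looks like a small input, although it is larger than anything in the training set. A model trained on the original scaling would interpret it incorrectly.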

Example using the iris data:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

data = datasets.load_iris()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

model = SVC()
model.fit(X_train_scaled, y_train)

X_test_scaled = scaler.transform(X_test)
y_pred = model.predict(X_test_scaled)
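
As an optional alternative that the original answer does not use, scikit-learn's `make_pipeline` can bundle the scaler and the model so that the fit-on-train, transform-on-test order is applied automatically. The sketch below assumes the same iris split as above and checks that the pipeline gives the same predictions as the manual steps.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Pipeline version: fit() fits the scaler on X_train and trains the SVC on the scaled data;
# predict() applies the already-fitted scaler to X_test before calling the SVC.
pipe = make_pipeline(MinMaxScaler(), SVC())
pipe.fit(X_train, y_train)
y_pred_pipe = pipe.predict(X_test)

# Manual version, following the five steps from the answer.
scaler = MinMaxScaler().fit(X_train)
model = SVC().fit(scaler.transform(X_train), y_train)
y_pred_manual = model.predict(scaler.transform(X_test))

print((y_pred_pipe == y_pred_manual).all())  # whether both approaches agree
print(pipe.score(X_test, y_test))            # accuracy on the held-out test set
```

A pipeline is also convenient with cross-validation, since the scaler is then refitted on each training fold only.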

Hope this helps.
