How can I fit the test data using min max scaler when I am loading the model?


Problem description

I am building an autoencoder model. Before fitting and saving the model, I scaled the training data with a min max scaler.

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

After doing this I fitted the model and saved it as an 'h5' file. Now, when I load the saved model and give it test data, the test data naturally has to be scaled as well.

So when I load the model and scale the test data using

X_test_scaled  = scaler.transform(X_test)

It gives the error

NotFittedError: This MinMaxScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

So I tried X_test_scaled = scaler.fit_transform(X_test) instead (I had a hunch that this was foolish). It ran, but the result after loading the saved model and testing was different from the result I got when I trained and tested in the same session. I have saved around 4000 models for my purpose, so I can't train and save them all again because that would take a lot of time. I need a way out.

Is there a way I can scale the test data by transforming it the same way I transformed the training data (maybe by saving the scaling values, I do not know)? Or could I somehow descale the model so that I can test it on non-scaled data?

If I under-emphasized or over-emphasized any point, please let me know in the comments!

Solution

X_test_scaled  = scaler.fit_transform(X_test)

will scale X_test using the minimum and maximum values of the features in X_test, not those in X_train.
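To make the difference concrete, here is a minimal sketch with made-up one-feature data (the arrays are assumptions for illustration, not the asker's data). It compares a scaler fitted on the training data with one refitted on the test data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data, made up for illustration: a single feature.
X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[2.0], [4.0]])

# Fit on the training data, then reuse those parameters on the test data.
train_scaler = MinMaxScaler().fit(X_train)
scaled_with_train = train_scaler.transform(X_test)

# Refitting on the test data uses the test min/max (2 and 4) instead.
scaled_with_test = MinMaxScaler().fit_transform(X_test)

print(scaled_with_train.ravel())  # about [0.2 0.4], based on train range 0..10
print(scaled_with_test.ravel())   # about [0. 1.], based on test range 2..4
```

The same test rows end up with different values, so a model trained on data scaled with the training range would see inputs on a different scale.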

The reason your original code did not work is that you probably did not save scaler after fitting it to X_train, or you overwrote it somehow (e.g., by re-initializing it). The error was thrown because the scaler in your test session had not been fitted to any data.
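The error is easy to reproduce with a freshly created scaler. This is a small sketch, assuming the test script builds a new MinMaxScaler (or never restores the fitted one); the array is made up:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import MinMaxScaler

X_test = np.array([[2.0], [4.0]])  # made-up data for illustration

scaler = MinMaxScaler()  # fresh instance: no min/max has been learned yet
try:
    scaler.transform(X_test)
    raised = False
except NotFittedError as err:
    raised = True
    print("transform failed:", err)
```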

When you then call X_test_scaled = scaler.fit_transform(X_test), you are fitting scaler to X_test and simultaneously transforming X_test. That is why the code was able to run, but this step is incorrect, as you already surmised.

What you want is

from sklearn.preprocessing import MinMaxScaler
import pickle as pkl

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Save the scaler that was fitted on the training data
with open("scaler.pkl", "wb") as outfile:
    pkl.dump(scaler, outfile)

# Some other code for training your autoencoder
# ...

Then in your test script

# During test time
import pickle as pkl

# Load the scaler that was fitted on the training data
with open("scaler.pkl", "rb") as infile:
    scaler = pkl.load(infile)

X_test_scaled = scaler.transform(X_test)  # Note: not fit_transform.

Note that you don't have to re-fit the scaler object after loading it back from disk. It contains all the information (the scaling factors etc.) obtained from the training data. You just call transform on X_test.
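As a quick check of that claim, the sketch below round-trips a fitted scaler through pickle in memory and confirms that the learned minimum and maximum survive. The arrays are made up for illustration. It also refits a fresh scaler on the same training data, on the assumption that the original training data is still available:

```python
import pickle

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up training and test data with two features.
X_train = np.array([[1.0, 100.0], [3.0, 300.0], [5.0, 200.0]])
X_test = np.array([[2.0, 150.0]])

scaler = MinMaxScaler().fit(X_train)

# Round trip through pickle (in memory here instead of a file on disk).
restored = pickle.loads(pickle.dumps(scaler))
print(restored.data_min_, restored.data_max_)  # per-feature min and max from X_train

# The restored scaler transforms test data exactly like the original one.
restored_out = restored.transform(X_test)

# Refitting a fresh scaler on the same training data gives the same parameters.
refit = MinMaxScaler().fit(X_train)
same_after_refit = np.allclose(refit.transform(X_test), scaler.transform(X_test))
```

Fitting a MinMaxScaler only computes the per-feature minimum and maximum, so if the training data behind an already saved model can still be loaded, refitting the scaler on it should reproduce the original scaling without retraining the model.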
