Error raised in model.fit() with validation_data: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Problem description
I'm trying to run a simple autoencoder model. I'm reading the training data from a CSV file that contains word embeddings. With the code below, the error in the title is raised in the model.fit() function and is connected to my validation data. I have tried many things, but the error remains. I'm new to NLP, so maybe my logic is totally wrong. I'd appreciate any help. Here is my code:
def train_predict(df):
    X_train, X_validation = train_test_split(df, test_size=0.3, random_state=42, shuffle=True)
    X = X_train.iloc[:, :-1].to_numpy()  # shape is (1880, 220) here
    X = tf.expand_dims(X, axis=-1)  # shape is (1880, 220, 1)
    X_val = X_validation.iloc[:, :-1].to_numpy()  # shape is (300, 220)
    X_val = tf.expand_dims(X_val, axis=-1)  # shape is (300, 220, 1)
    inputs, decoder_output, visualization = autoEncoder(X)
    model = Model(inputs=inputs, outputs=decoder_output)
    encoder_model = Model(inputs=inputs, outputs=visualization)
    batch_size = 128
    train_steps = len(X) // batch_size
    val_steps = len(X_val) // batch_size
    model.summary()
    model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
    model.fit(X, steps_per_epoch=train_steps, validation_data=X_val, validation_steps=val_steps, epochs=100)
    result = model.evaluate(X_val, steps=10)
The details of my autoEncoder function are as follows:
def autoEncoder(X_train):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], 1))
    # ENCODER
    conv_1 = Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(inputs)
    max_pool_1 = MaxPool1D(pool_size=2)(conv_1)
    conv_2 = Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(max_pool_1)
    max_pool_2 = MaxPool1D(pool_size=2)(conv_2)
    # BOTTLENECK
    bottle_neck = Conv1D(filters=256, kernel_size=3, activation='relu', padding='same')(max_pool_2)
    visualization = Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(bottle_neck)
    # DECODER
    conv_3 = Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(bottle_neck)
    upsample_1 = UpSampling1D(size=2)(conv_3)
    conv_4 = Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(upsample_1)
    upsample_2 = UpSampling1D(size=2)(conv_4)
    decoder_output = Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(upsample_2)
    return inputs, decoder_output, visualization
Recommended answer
It would be excellent if you could copy-paste the entire error stack that your code produces. Everyone should do this for error-related questions, because it makes debugging much easier.
Here's an attempt to reproduce the same error using a dummy dataset:
import numpy as np
import tensorflow as tf
np.random.seed(11)
np.set_printoptions(precision=2)
def autoEncoder(X_train):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], 1))
    conv_1 = tf.keras.layers.Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(inputs)
    max_pool_1 = tf.keras.layers.MaxPool1D(pool_size=2)(conv_1)
    conv_2 = tf.keras.layers.Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(max_pool_1)
    max_pool_2 = tf.keras.layers.MaxPool1D(pool_size=2)(conv_2)
    bottle_neck = tf.keras.layers.Conv1D(filters=256, kernel_size=3, activation='relu', padding='same')(max_pool_2)
    visualization = tf.keras.layers.Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(bottle_neck)
    conv_3 = tf.keras.layers.Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(bottle_neck)
    upsample_1 = tf.keras.layers.UpSampling1D(size=2)(conv_3)
    conv_4 = tf.keras.layers.Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(upsample_1)
    upsample_2 = tf.keras.layers.UpSampling1D(size=2)(conv_4)
    decoder_output = tf.keras.layers.Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(upsample_2)
    return inputs, decoder_output, visualization
X = np.random.randn(1880, 220)
X_val = np.random.randn(300, 220)
X = np.expand_dims(X, axis=-1)
X = tf.convert_to_tensor(X) # (1880, 220, 1)
X_val = np.expand_dims(X_val, axis=-1)
X_val = tf.convert_to_tensor(X_val) # (300, 220, 1)
inputs, decoder_output, visualization = autoEncoder(X)
model = tf.keras.Model(inputs=inputs, outputs=decoder_output)
encoder_model = tf.keras.Model(inputs=inputs, outputs=visualization)
batch_size = 128
train_steps = len(X) // batch_size
val_steps = len(X_val) // batch_size
model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
model.fit(X, steps_per_epoch=train_steps, validation_data = X_val, validation_steps=val_steps, epochs=100)
On Google Colab this gives the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-29-a889c5a46f35> in <module>()
3 val_steps = len(X_val) // batch_size
4 model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
----> 5 model.fit(X, steps_per_epoch=train_steps, validation_data = X_val, validation_steps=val_steps, epochs=100)
1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1041 (x, y, sample_weight), validation_split=validation_split))
1042
-> 1043 if validation_data:
1044 val_x, val_y, val_sample_weight = (
1045 data_adapter.unpack_x_y_sample_weight(validation_data))
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in __bool__(self)
990
991 def __bool__(self):
--> 992 return bool(self._numpy())
993
994 __nonzero__ = __bool__
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
This is identical to the error in your question. The reason it is better to post the error stack is that the answer is hidden in these lines, specifically:
1043 if validation_data:
1044 val_x, val_y, val_sample_weight = (
1045 data_adapter.unpack_x_y_sample_weight(validation_data))
The format of validation_data is identical to (x, y, sample_weight). Here's what the fit method documentation has to say:
validation_data will override validation_split. validation_data could be:
- tuple (x_val, y_val) of Numpy arrays or tensors
- tuple (x_val, y_val, val_sample_weights) of Numpy arrays
- dataset
For the first two cases, batch_size must be provided. For the last case, validation_steps could be provided.
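To see why the check `if validation_data:` fails for a bare tensor but not for a tuple, here is a minimal sketch. It uses a NumPy array as a stand-in for the eager tensor (the traceback shows the tensor's `__bool__` delegating to `bool(self._numpy())`), and `truthiness` is a hypothetical helper written only for this illustration:

```python
import numpy as np

# Stand-in for the validation tensor from the question: 300 samples, 220 steps, 1 channel.
X_val = np.zeros((300, 220, 1))

def truthiness(value):
    """Return bool(value), or the error message if the truth value is ambiguous."""
    try:
        return bool(value)
    except ValueError as err:
        return str(err)

bare_array = truthiness(X_val)         # bool() of a multi-element array raises ValueError
as_tuple = truthiness((X_val, X_val))  # a non-empty tuple is truthy, whatever it contains

print(bare_array)
print(as_tuple)
```

The tuple never has its elements evaluated for truth, which is why wrapping the validation data as (x_val, y_val) gets past that line.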
I think you now understand why you're getting the error: validation_data was passed as a bare tensor rather than a tuple, because there is no Y for your autoencoder. That shouldn't be a concern, since your X is itself your Y. Here's a line from an encoder tutorial that helps in this situation:
Train the model using x_train as both the input and the target. The encoder will learn to compress the dataset from 784 dimensions to the latent space, and the decoder will learn to reconstruct the original images.
So, what you were expected to write is the following:
model.fit(X, X, steps_per_epoch=train_steps, validation_data=(X_val, X_val), validation_steps=val_steps, epochs=100)
This does start the training!
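Using X as its own target only works if the decoder output has the same shape as the input. A rough sketch of that check for this architecture follows. `decoder_output_length` is a hypothetical helper, and it assumes 'same'-padded convolutions (length unchanged), MaxPool1D with its default 'valid' padding and a stride equal to the pool size, and UpSampling1D with the same size:

```python
def decoder_output_length(n_steps, n_pools=2, pool_size=2):
    """Length of the step axis after n_pools pooling and n_pools upsampling layers.

    Assumption: 'same'-padded Conv1D layers keep the length, each MaxPool1D
    floors the division by pool_size, and each UpSampling1D multiplies by it.
    """
    length = n_steps
    for _ in range(n_pools):
        length = length // pool_size  # pooling drops a trailing odd step
    for _ in range(n_pools):
        length = length * pool_size   # upsampling repeats every step
    return length

out_220 = decoder_output_length(220)  # 220 -> 110 -> 55 -> 110 -> 220
out_222 = decoder_output_length(222)  # 222 -> 111 -> 55 -> 110 -> 220

print(out_220, out_222)
```

With 220 steps the output length matches the input, so (X, X) and (X_val, X_val) line up with the model. An input length that is not a multiple of 4, such as 222, would come back shorter than the target and would need padding or cropping first.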