Error raised in model.fit() with validation_data: "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"

Question

I'm trying to run a simple autoencoder model. I'm reading training data from a CSV file that consists of word embeddings. With the code below, the error in the title is raised inside model.fit() and is connected to my validation data. I have tried many things, but the error remains. I'm new to NLP, so my logic may be entirely wrong. I'd appreciate it if anybody could help. Here is my code:

def train_predict(df):
    X_train, X_validation = train_test_split(df, test_size=0.3, random_state=42, shuffle=True)
    X = X_train.iloc[:, :-1].to_numpy()           # shape is (1880,220) in here
    X = tf.expand_dims(X, axis=-1)                # shape is (1880,220,1)
    X_val = X_validation.iloc[:, :-1].to_numpy()  # shape is (300,220)
    X_val = tf.expand_dims(X_val, axis=-1)        # shape is (300,220,1)

    inputs, decoder_output, visualization = autoEncoder(X)

    model = Model(inputs=inputs, outputs=decoder_output)
    encoder_model = Model(inputs=inputs, outputs=visualization)

    batch_size = 128
    train_steps = len(X) // batch_size
    val_steps = len(X_val) // batch_size
    model.summary()
    model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
    model.fit(X, steps_per_epoch=train_steps, validation_data=X_val, validation_steps=val_steps, epochs=100)
    result = model.evaluate(X_val, steps=10)

My autoEncoder function is defined as follows:

def autoEncoder(X_train):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], 1))
    # parameters
    conv_1 = Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(inputs)
    max_pool_1 = MaxPool1D(pool_size=2)(conv_1)

    conv_2 = Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(max_pool_1)
    max_pool_2 = MaxPool1D(pool_size=2)(conv_2)

    # BOTTLE NECK

    bottle_neck = Conv1D(filters=256, kernel_size=3, activation='relu', padding='same')(max_pool_2)
    visualization = Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(bottle_neck)

    # DECODER
    conv_3 = Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(bottle_neck)
    upsample_1 = UpSampling1D(size=2)(conv_3)

    conv_4 = Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(upsample_1)
    upsample_2 = UpSampling1D(size=2)(conv_4)

    decoder_output = Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(upsample_2)

    return inputs, decoder_output, visualization

Recommended answer

It would help a lot if you copy-pasted the entire error stack that your code produces. Everyone should do this for error-related questions, because it makes debugging much easier.

Here's an attempt to reproduce the same error using a dummy dataset:

import numpy as np
import tensorflow as tf

np.random.seed(11)
np.set_printoptions(precision=2)

def autoEncoder(X_train):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], 1))
    conv_1 = tf.keras.layers.Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(inputs)
    max_pool_1 = tf.keras.layers.MaxPool1D(pool_size=2)(conv_1)

    conv_2 = tf.keras.layers.Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(max_pool_1)
    max_pool_2 = tf.keras.layers.MaxPool1D(pool_size=2)(conv_2)

    bottle_neck = tf.keras.layers.Conv1D(filters=256, kernel_size=3, activation='relu', padding='same')(max_pool_2)
    visualization = tf.keras.layers.Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(bottle_neck)

    conv_3 = tf.keras.layers.Conv1D(filters=128, kernel_size=3, activation='relu', padding='same')(bottle_neck)
    upsample_1 = tf.keras.layers.UpSampling1D(size=2)(conv_3)

    conv_4 = tf.keras.layers.Conv1D(filters=64, kernel_size=3, activation='relu', padding='same')(upsample_1)
    upsample_2 = tf.keras.layers.UpSampling1D(size=2)(conv_4)

    decoder_output = tf.keras.layers.Conv1D(filters=1, kernel_size=3, activation='sigmoid', padding='same')(upsample_2)

    return inputs, decoder_output, visualization

X = np.random.randn(1880, 220)
X_val = np.random.randn(300, 220)

X = np.expand_dims(X, axis=-1)
X = tf.convert_to_tensor(X)   # (1880, 220, 1)
X_val = np.expand_dims(X_val, axis=-1)
X_val = tf.convert_to_tensor(X_val)  # (300, 220, 1)

inputs, decoder_output, visualization = autoEncoder(X)
model = tf.keras.Model(inputs=inputs, outputs=decoder_output)
encoder_model = tf.keras.Model(inputs=inputs, outputs=visualization)

batch_size = 128
train_steps = len(X) // batch_size
val_steps = len(X_val) // batch_size
model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
model.fit(X, steps_per_epoch=train_steps, validation_data = X_val, validation_steps=val_steps, epochs=100)

On Google Colab this gives the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-a889c5a46f35> in <module>()
      3 val_steps = len(X_val) // batch_size
      4 model.compile(optimizer='adam', metrics=['accuracy'], loss='mean_squared_error')
----> 5 model.fit(X, steps_per_epoch=train_steps, validation_data = X_val, validation_steps=val_steps, epochs=100)

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1041               (x, y, sample_weight), validation_split=validation_split))
   1042 
-> 1043     if validation_data:
   1044       val_x, val_y, val_sample_weight = (
   1045           data_adapter.unpack_x_y_sample_weight(validation_data))

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in __bool__(self)
    990 
    991   def __bool__(self):
--> 992     return bool(self._numpy())
    993 
    994   __nonzero__ = __bool__

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

This is identical to the error in your question. Posting the error stack matters because the answer is hidden in these lines, specifically:

1043     if validation_data:
1044       val_x, val_y, val_sample_weight = (
1045           data_adapter.unpack_x_y_sample_weight(validation_data))
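To see that truth-value check in isolation, here is a minimal sketch using plain NumPy (an assumption made for brevity; the traceback shows the TensorFlow tensor delegating to bool(self._numpy()), so the behaviour is the same). The array shape mirrors X_val from the reproduction; the variable names raised, message and tuple_is_truthy are only illustrative.

```python
import numpy as np

# Stand-in for X_val: any array with more than one element behaves the same way.
x_val = np.zeros((300, 220, 1))

# `if validation_data:` boils down to bool(validation_data).
try:
    bool(x_val)
    raised = False
    message = ""
except ValueError as err:
    raised = True
    message = str(err)

# A non-empty tuple is truthy regardless of what it contains,
# so wrapping the arrays in a tuple passes the same check.
tuple_is_truthy = bool((x_val, x_val))

print(raised, tuple_is_truthy)
print(message)
```

The bare array fails the check, while a tuple holding the same array passes it without ever inspecting the array's contents.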

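The check if validation_data: asks for the truth value of whatever you passed, which is ambiguous for a bare multi-element tensor. validation_data is then unpacked in the same format as (x, y, sample_weight), so Keras expects a tuple here, not a single tensor. Here's what the fit method documentation has to say: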

validation_data will override validation_split. validation_data could be:
- tuple (x_val, y_val) of Numpy arrays or tensors
- tuple (x_val, y_val, val_sample_weights) of Numpy arrays
- dataset
For the first two cases, batch_size must be provided. For the last case, validation_steps could be provided.
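The tuple formats listed in the documentation can be illustrated with a small sketch. The helper unpack_validation_data below is hypothetical and deliberately simplified; it is not Keras's data_adapter.unpack_x_y_sample_weight, only an approximation of the idea that missing trailing entries are treated as absent.

```python
import numpy as np

def unpack_validation_data(validation_data):
    """Hypothetical, simplified stand-in (not Keras code).

    Returns a (val_x, val_y, val_sample_weight) triple, using None
    for any entry that was not supplied.
    """
    if not isinstance(validation_data, tuple):
        # A bare array carries inputs only: no targets, no weights.
        return validation_data, None, None
    if not 1 <= len(validation_data) <= 3:
        raise ValueError("expected (x,), (x, y) or (x, y, sample_weight)")
    # Pad the tuple with None up to three entries.
    return validation_data + (None,) * (3 - len(validation_data))

x_val = np.zeros((300, 220, 1))

# Autoencoder case: the inputs double as the targets.
val_x, val_y, val_w = unpack_validation_data((x_val, x_val))

# Bare array: there is nothing to use as a target.
bare_x, bare_y, bare_w = unpack_validation_data(x_val)

print(val_y is x_val, val_w is None, bare_y is None)
```

With the (x_val, x_val) tuple the target slot is filled; with the bare array it stays empty, which is the situation in the question.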

I think you can now see why you're getting the error: you passed no Y for your autoencoder. That is easy to fix, because for an autoencoder X itself is the Y. Here's a line from an autoencoder tutorial that helps in this situation:

Train the model using x_train as both the input and the target. The encoder will learn to compress the dataset from 784 dimensions to the latent space, and the decoder will learn to reconstruct the original images.

So what you need to write is the following:

model.fit(X, X, steps_per_epoch=train_steps, validation_data=(X_val, X_val), validation_steps=val_steps, epochs=100)

And this does start training!
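For a quick check of the corrected call pattern, here is a minimal sketch on a much smaller dummy problem. The layer sizes, the sequence length of 20, the two epochs and the use of batch_size in place of steps_per_epoch are all assumptions chosen to keep the run short; they are not taken from the original code.

```python
import numpy as np
import tensorflow as tf

# Small dummy data in [0, 1), matching the sigmoid output range.
rng = np.random.default_rng(0)
X = rng.random((64, 20, 1)).astype("float32")
X_val = rng.random((16, 20, 1)).astype("float32")

# Tiny convolutional autoencoder: pool by 2, then upsample by 2.
inputs = tf.keras.layers.Input(shape=(20, 1))
h = tf.keras.layers.Conv1D(8, 3, activation="relu", padding="same")(inputs)
h = tf.keras.layers.MaxPool1D(pool_size=2)(h)
h = tf.keras.layers.UpSampling1D(size=2)(h)
outputs = tf.keras.layers.Conv1D(1, 3, activation="sigmoid", padding="same")(h)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss="mean_squared_error")

# The inputs are also the targets, for training and for validation.
history = model.fit(
    X, X,
    batch_size=16,
    validation_data=(X_val, X_val),
    epochs=2,
    verbose=0,
)
print(sorted(history.history))
```

Passing batch_size follows the documentation quoted earlier for the tuple forms of validation_data. The history object then records a val_loss entry for each epoch, which confirms that validation ran.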
