Tensorflow Model fit format data correctly -- TypeError: Cannot convert a symbolic Keras input/output to a numpy array
Problem description
For an NLP task, my input dataset is transformed into a list of lists of integers. The features and the labels are the same dataset.
>>> training_data
[[    0     4    79  3179    11    44     8     1 11245   173   152    10     1  1138  1079]
 [    0     0     4    79  3179    11    44     8 11566   173   152     8     1  1138  1079]
 [    0     0     0     0     0     0     0     9    15   333    44     3    61    63   533]
 [    0     0     0     0     0     0     3    19   253    28    44     3    61    63   533]
 [    0     0     0     0     0     0     0     0     0     0     0     2     3    49  4395]
 [    0     0     0     0     0     0     0     0     0     0     0     0    75    65  4395]
 [    3     1  7128  3388   289    10   446   200   675     8  3320    14    32    82   234]
 [    7    74   268   577    23    49    31     5  1032    98    10  4270  5026    12  6570]
 [    0     0     0     0     0     0     0     2     3    39     7    27   155    29  4534]
 [    0     0     0     0     0     2     3    19    39     7    27   155    29    34  4534]]
The validation dataset is an excerpt of the main dataset, same format.
I then call the fit() method (my model is vae):
n_steps = (800000 / 2) / batch_size
for counter in range(nb_epoch):
    print('-------epoch: ', counter, '--------')
    vae.fit(x=np.array(training_data), y=np.array(training_data), steps_per_epoch=n_steps,
            epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))
which gives this error
TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
This error may indicate that you're trying to pass a symbolic value to a NumPy call,
which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to
a TF API that does not register dispatching, preventing Keras from automatically
converting the API call to a lambda layer in the Functional Model.
I also tried
vae.fit(x=training_data, y=training_data, steps_per_epoch=n_steps,
        epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))
with the same error.
Any nice solution or hint on how to format data towards training is welcome, using lists, np.arrays or generators.
EDIT: some code
training_data = pad_sequences(sequences, maxlen = MAX_SEQUENCE_LENGTH)
len_val = int(np.floor ( len(texts) * 0.2 )) # num samples for validation
data_1_val = data_1[-len_val:] #select len_val sentences as validation data
Building and training the model
x = Input(batch_shape=(None, max_len))
x_embed = Embedding(NB_WORDS, emb_dim, weights=[glove_embedding_matrix],
input_length=max_len, trainable=False)(x)
[...]
loss_layer = CustomVariationalLayer()([x, x_decoded_mean])
vae = Model(x, [loss_layer])
opt = Adam(lr=0.01) #SGD(lr=1e-2, decay=1e-6, momentum=0.9, nesterov=True)
vae.compile(optimizer='adam', loss=[zero_loss])
nb_epoch = 100
n_steps = (800000 / 2) / batch_size
for counter in range(nb_epoch):
    print('-------epoch: ', counter, '--------')
    vae.fit(training_data, training_data, steps_per_epoch=n_steps,
            epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))
In the original GitHub code, a generator was fed to the model through fit_generator, a method that is now deprecated in Keras:
for counter in range(nb_epoch):
    print('-------epoch: ', counter, '--------')
    vae.fit_generator(sent_generator(TRAIN_DATA_FILE, batch_size/2),
                      steps_per_epoch=n_steps, epochs=1, callbacks=[checkpointer],
                      validation_data=(data_1_val, data_1_val))
Since fit() also supports a generator argument, I first tried
for counter in range(nb_epoch):
    print('-------epoch: ', counter, '--------')
    vae.fit(sent_generator(TRAIN_DATA_FILE, batch_size/2),
            steps_per_epoch=n_steps, epochs=1, callbacks=[checkpointer],
            validation_data=(data_1_val, data_1_val))
which is crashing, with the same error as above.
Issue:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
Investigation
Because of the nature of this TypeError, I suggested you check whether the error still occurs with eager execution disabled:
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
You had no error but this warning: WARNING:tensorflow:When passing input data as arrays, do not specify steps_per_epoch/steps argument. Please use batch_size instead.
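That warning is worth acting on in its own right. Here is a minimal sketch of the arithmetic, assuming a batch size of 32 (the question does not give one, so that value is only illustrative). The expression in the question produces a float, while the Keras documentation describes steps_per_epoch as an integer. When the data is an in-memory array, the warning says to pass batch_size and leave steps_per_epoch out.

```python
import math

num_samples = 800000 // 2   # sample count implied by the question's formula
batch_size = 32             # assumption: the real value is not shown in the question

# The question's expression uses true division, so the result is a float.
n_steps_float = (800000 / 2) / batch_size

# An integer step count; ceil keeps a final, smaller batch if there is one.
n_steps = math.ceil(num_samples / batch_size)

print(type(n_steps_float).__name__, type(n_steps).__name__, n_steps)

# With arrays:     vae.fit(x, y, batch_size=batch_size, epochs=1, ...)
# With generators: vae.fit(gen, steps_per_epoch=n_steps, epochs=1, ...)
```

The two commented calls at the end only indicate where each value would go; they are not meant as a tested fix for the model in the question.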
Understand the issue
I'll first explain the reason for this suggestion. The behavior of models created with the Functional API can seem rather unpredictable with eager execution enabled, but we can work out why the error occurs and how to fix it.
Here you'll find the TypeError coming from the KerasTensor class: https://github.com/keras-team/keras/blob/4a978914d2298db2c79baa4012af5ceff4a4e203/keras/engine/keras_tensor.py#L244
Why disabling eager execution seems to solve the problem:
Let's first read this quotation from https://www.tensorflow.org/guide/eager
Enabling eager execution changes how TensorFlow operations behave—now they immediately evaluate and return their values to Python. tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using print() or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.
Eager execution works nicely with NumPy. NumPy operations accept tf.Tensor arguments. The TensorFlow tf.math operations convert Python objects and NumPy arrays to tf.Tensor objects. The tf.Tensor.numpy method returns the object's value as a NumPy ndarray.
But if eager execution works nicely with NumPy, why does the error appear when working with a NumPy array?
This error is not thrown by TensorFlow's eager execution. It is thrown by Keras, and more specifically by KerasTensor.
During the Functional API construction of your model, KerasTensors are created to represent the "symbolic inputs" and outputs of each Keras layer.
Your input is an np.ndarray. Keras takes your array and puts it in a tf.keras.Input layer, producing a KerasTensor.
The error is thrown because your model will try to convert this symbolic input/output into an np.ndarray.
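To make the mechanism concrete, here is a toy sketch that does not use Keras at all. ToySymbolic is a hypothetical class I made up for illustration. It mimics what, as far as I can tell, the linked KerasTensor code does: it refuses NumPy conversion by raising a TypeError from its __array__ hook. The message text is shortened and is not the real one.

```python
import numpy as np


class ToySymbolic:
    """Hypothetical stand-in for a symbolic tensor: a shape, but no values."""

    def __init__(self, shape):
        self.shape = shape

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this hook when asked to build an array from the object.
        raise TypeError("Cannot convert a symbolic placeholder to a numpy array.")


placeholder = ToySymbolic(shape=(None, 15))

try:
    np.asarray(placeholder)   # any NumPy call that needs concrete values
    converted = True
except TypeError as err:
    converted = False
    print("conversion refused:", err)
```

The point of the sketch is only that a placeholder has nothing to convert, so any code path that asks NumPy for its values fails in this way.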
But why this behavior?
Remember that during eager execution, tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph.
Therefore eager execution will try to get a concrete value from your KerasTensor, which throws this error: TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
With eager execution disabled, nothing ever tries to get a concrete value from your KerasTensor, so this error is never thrown.
Please read these two quotations from the KerasTensor class if you'd like to better understand what's happening inside your Functional model:
Passing a KerasTensor to a tf.keras.Layer __call__ lets the layer know that you are building a Functional model. The layer __call__ will infer the output signature and return KerasTensors with tf.TypeSpecs corresponding to the symbolic outputs of that layer call. These output KerasTensors will have all of the internal KerasHistory metadata attached to them that Keras needs to construct a Functional Model. Currently, layers infer the output signature by:
* creating a scratch FuncGraph
* making placeholders in the scratch graph that match the input typespecs
* Calling layer.call on these placeholders
* extracting the signatures of the outputs before clearing the scratch graph
If you are passing a KerasTensor to a TF API that supports dispatching, Keras will automatically turn that API call into a lambda layer in the Functional model, and return KerasTensors representing the symbolic outputs.
Suggested solution:
Disabling eager execution is not a satisfying solution.
I suggest you try converting training_data into a dataset with the tf.data.Dataset class, or into a tensor (tf.Tensor), before calling model.fit.
Also, if the issue is still not resolved, it would help if you could provide some code to reproduce the error.
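As a rough sketch of the tf.data.Dataset route: the random array below is only a stand-in for your padded training_data (I'm assuming 10 samples of length 15, as in the excerpt), and batch_size = 4 is arbitrary. I can't promise this alone removes the error, since the rest of the model (in particular CustomVariationalLayer) isn't shown.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the padded sequences (assumption: 10 samples, length 15).
training_data = np.random.randint(0, 100, size=(10, 15)).astype("int32")
batch_size = 4  # illustrative value only

# Features and labels are the same array, as described in the question.
dataset = (
    tf.data.Dataset.from_tensor_slices((training_data, training_data))
    .shuffle(buffer_size=len(training_data))
    .batch(batch_size)
)

# Pull one batch to check what the model would receive.
first_x, first_y = next(iter(dataset))
print(first_x.shape, first_y.shape)

# The Dataset already defines the batching, so fit() needs neither
# batch_size nor steps_per_epoch:
# vae.fit(dataset, epochs=nb_epoch, callbacks=[checkpointer],
#         validation_data=(data_1_val, data_1_val))
```

Because the dataset is finite and knows its own length, one fit() call with epochs=nb_epoch could also replace the outer Python loop, if you don't need to do anything between epochs.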