Tensorflow Model fit format data correctly -- TypeError: Cannot convert a symbolic Keras input/output to a numpy array

Problem description

For an NLP task, my input dataset is transformed into a list of lists of integers, as shown below. The features and labels are the same dataset.

>>>training_data = [[    0     4    79  3179    11    44     8     1 11245   173   152    10
      1  1138  1079]
 [    0     0     4    79  3179    11    44     8 11566   173   152     8
      1  1138  1079]
 [    0     0     0     0     0     0     0     9    15   333    44     3
     61    63   533]
 [    0     0     0     0     0     0     3    19   253    28    44     3
     61    63   533]
 [    0     0     0     0     0     0     0     0     0     0     0     2
      3    49  4395]
 [    0     0     0     0     0     0     0     0     0     0     0     0
     75    65  4395]
 [    3     1  7128  3388   289    10   446   200   675     8  3320    14
     32    82   234]
 [    7    74   268   577    23    49    31     5  1032    98    10  4270
   5026    12  6570]
 [    0     0     0     0     0     0     0     2     3    39     7    27
    155    29  4534]
 [    0     0     0     0     0     2     3    19    39     7    27   155
     29    34  4534]]

The validation dataset is an excerpt of the main dataset, same format.

I then call the fit() method on my model, vae:

n_steps = (800000 / 2) / batch_size   
for counter in range(nb_epoch):
    print('-------epoch: ',counter,'--------')
    vae.fit(x=np.array(training_data),y=np.array(training_data), steps_per_epoch=n_steps,
        epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))

which gives this error

   TypeError: Cannot convert a symbolic Keras input/output to a numpy array. 
   This error may indicate that you're trying to pass a symbolic value to a NumPy call, 
   which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to 
   a TF API that does not register dispatching, preventing Keras from automatically 
      converting the API call to a lambda layer in the Functional Model.

I tried

vae.fit(x=training_data,y=training_data, steps_per_epoch=n_steps,
            epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))

as well, with the same error.

Any good solution or hint on how to format the data for training is welcome, whether it uses lists, np.arrays or generators.

EDIT: some code

training_data = pad_sequences(sequences, maxlen = MAX_SEQUENCE_LENGTH)
len_val = int(np.floor ( len(texts) * 0.2 )) # num samples for validation
data_1_val = data_1[-len_val:] #select len_val sentences as validation data

Building and training the model

x = Input(batch_shape=(None, max_len))
x_embed = Embedding(NB_WORDS, emb_dim, weights=[glove_embedding_matrix],
                        input_length=max_len, trainable=False)(x)

[...]

loss_layer = CustomVariationalLayer()([x, x_decoded_mean])
vae = Model(x, [loss_layer])
opt = Adam(lr=0.01) #SGD(lr=1e-2, decay=1e-6, momentum=0.9, nesterov=True)
vae.compile(optimizer='adam', loss=[zero_loss])

nb_epoch = 100
n_steps = (800000 / 2) / batch_size   

for counter in range(nb_epoch):
    print('-------epoch: ',counter,'--------')
    vae.fit(training_data,training_data, steps_per_epoch=n_steps,
        epochs=1, callbacks=[checkpointer], validation_data=(data_1_val, data_1_val))

In the original GitHub code, a generator was passed to fit_generator, a method that is now deprecated in Keras:

for counter in range(nb_epoch):
    print('-------epoch: ',counter,'--------')
    vae.fit_generator(sent_generator(TRAIN_DATA_FILE, batch_size/2),
                      steps_per_epoch=n_steps, epochs=1, callbacks=[checkpointer],
                      validation_data=(data_1_val, data_1_val))

Since fit() also accepts a generator as input, I first tried:

for counter in range(nb_epoch):
    print('-------epoch: ',counter,'--------')
    vae.fit(sent_generator(TRAIN_DATA_FILE, batch_size/2),
                      steps_per_epoch=n_steps, epochs=1, callbacks=[checkpointer],
                      validation_data=(data_1_val, data_1_val))

which crashes with the same error as above.

Solution

Issue:

TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.

Investigation

Because of the nature of this TypeError, I suggested that you check whether the error still occurs when eager execution is disabled:

from tensorflow.python.framework.ops import disable_eager_execution 
disable_eager_execution()

You had no error but this warning: WARNING:tensorflow:When passing input data as arrays, do not specify steps_per_epoch/steps argument. Please use batch_size instead.
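The warning can be read as follows: when the inputs are in-memory arrays, Keras works out the number of steps from batch_size on its own, so steps_per_epoch is redundant. A minimal sketch of that arithmetic follows. The helper name steps_per_epoch and the batch size of 100 are assumptions made for illustration (the question does not state its batch size); the sample count is the one implied by the question's n_steps formula.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


num_samples = 800000 // 2  # sample count implied by the question's n_steps formula
batch_size = 100           # hypothetical value; the question does not state it

print(steps_per_epoch(num_samples, batch_size))  # 4000
```

Note that the question's own expression, (800000 / 2) / batch_size, evaluates to a float in Python 3, whereas a step count has to be an integer. With array inputs, a call along the lines of vae.fit(training_data, training_data, batch_size=batch_size, epochs=1) lets Keras handle the step count itself.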

Understand the issue

I'll first explain the reason for this suggestion. The behavior of models created with the Functional API can seem rather unpredictable with eager execution enabled. But we will see why this happens and how to fix it.

Here you'll find the TypeError coming from the KerasTensor class: https://github.com/keras-team/keras/blob/4a978914d2298db2c79baa4012af5ceff4a4e203/keras/engine/keras_tensor.py#L244

Why disabling eager execution seems to solve the problem:

Let's first read this quotation from https://www.tensorflow.org/guide/eager

Enabling eager execution changes how TensorFlow operations behave—now they immediately evaluate and return their values to Python. tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using print() or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.

Eager execution works nicely with NumPy. NumPy operations accept tf.Tensor arguments. The TensorFlow tf.math operations convert Python objects and NumPy arrays to tf.Tensor objects. The tf.Tensor.numpy method returns the object's value as a NumPy ndarray.

But if eager execution is supposed to work nicely with NumPy, why does the error seem to happen while working with a NumPy array?

This error is not thrown by Tensorflow's eager execution. This error is thrown by Keras and more specifically by KerasTensor.

During the Functional API construction of your model, KerasTensors are created to represent the "symbolic inputs" and outputs of each Keras layer. Your input is an np.ndarray. Keras takes your array and puts it in a tf.keras.Input layer, producing a KerasTensor. The error is thrown because your model will try to convert this symbolic input/output into an np.ndarray.

But why this behavior?

Remember that during eager execution, tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Eager execution will therefore try to get a concrete value from your KerasTensor, which throws this error: TypeError: Cannot convert a symbolic Keras input/output to a numpy array.

When disabling eager execution you'll never try to get a concrete value from your KerasTensor and this error will never be thrown.
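The mechanism can be illustrated without TensorFlow at all. The sketch below is not Keras code: SymbolicPlaceholder and try_convert are hypothetical names, and the class only mimics the idea that a symbolic object describes a shape but holds no values, so it refuses NumPy conversion by raising from its __array__ hook.

```python
import numpy as np


class SymbolicPlaceholder:
    """Hypothetical stand-in for a symbolic tensor: it has a shape but no values."""

    def __init__(self, shape):
        self.shape = shape

    def __array__(self, dtype=None, copy=None):
        # NumPy calls __array__ when asked to turn an object into an ndarray.
        raise TypeError("Cannot convert a symbolic input/output to a numpy array.")


def try_convert(value):
    """Return ('ok', array) on success, or ('error', message) if conversion fails."""
    try:
        return "ok", np.asarray(value)
    except TypeError as exc:
        return "error", str(exc)


concrete = [[0, 4, 79], [0, 0, 4]]               # real values, like the padded sequences
symbolic = SymbolicPlaceholder(shape=(None, 3))  # a description only, no data

print(try_convert(concrete)[0])  # ok
print(try_convert(symbolic)[0])  # error
```

Anything that asks for concrete values from such an object fails in the same way, which matches the explanation above: the error appears only when some code path requests actual numbers from a symbolic tensor.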

Please read these two quotations from the KerasTensor class documentation if you'd like to better understand what's happening inside your Functional model:

Passing a KerasTensor to a tf.keras.Layer __call__ lets the layer know that you are building a Functional model. The layer __call__ will infer the output signature and return KerasTensors with tf.TypeSpecs corresponding to the symbolic outputs of that layer call. These output KerasTensors will have all of the internal KerasHistory metadata attached to them that Keras needs to construct a Functional Model. Currently, layers infer the output signature by:
* creating a scratch FuncGraph
* making placeholders in the scratch graph that match the input typespecs
* Calling layer.call on these placeholders
* extracting the signatures of the outputs before clearing the scratch graph

If you are passing a KerasTensor to a TF API that supports dispatching, Keras will automatically turn that API call into a lambda layer in the Functional model, and return KerasTensors representing the symbolic outputs.

Suggested solution:

Disabling eager execution is not a satisfying solution.

I suggest you try converting training_data to a Dataset with the tf.data.Dataset class, or to a tensor with tf.convert_to_tensor, before calling model.fit.
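One way this suggestion could look in practice is sketched below, assuming TensorFlow 2.x. The random array is a hypothetical stand-in for the padded sequences from the question, and the batch size of 8 is arbitrary. Whether this conversion alone removes the error is not confirmed by the answer; it is only the first thing to try.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the padded integer sequences in the question.
rng = np.random.default_rng(0)
training_data = rng.integers(0, 100, size=(32, 15), dtype=np.int32)
batch_size = 8  # arbitrary value for the sketch

# Option 1: a batched tf.data.Dataset of (features, labels) pairs.
# Features and labels are the same array, as in the question.
dataset = (
    tf.data.Dataset.from_tensor_slices((training_data, training_data))
    .shuffle(buffer_size=len(training_data))
    .batch(batch_size)
)

# Option 2: a single eager tensor holding the whole array.
training_tensor = tf.convert_to_tensor(training_data)

# Inspect one batch to confirm the shapes before handing the data to fit().
for features, labels in dataset.take(1):
    print(features.shape, labels.shape)
```

A batched dataset already fixes the batch size, so it would be passed as vae.fit(dataset, epochs=1) with neither batch_size nor y. The tensor variant would be passed as vae.fit(training_tensor, training_tensor, batch_size=batch_size, epochs=1).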

Also, if the issue is still not resolved, it would help if you could provide some code that reproduces the error.
