Keras seems to hang after call to fit_generator

Problem description

I am trying to fit the Keras implementation of the SqueezeDet model to a new dataset. After making the appropriate changes to my config file, I tried to run the train script, but it seems to hang after the call to fit_generator(). I get the following output:

/anaconda/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Number of images: 536
Number of epochs: 100
Number of batches: 53
Batch size: 10
2018-07-04 14:18:49.711606: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-04 14:18:54.080912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 52a9:00:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-07-04 14:18:54.080958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-07-04 14:18:54.333214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-04 14:18:54.333270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0
2018-07-04 14:18:54.333290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N
2018-07-04 14:18:54.333559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10764 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 52a9:00:00.0, compute capability: 3.7)
Learning rate: 0.01
Weights initialized by name from ../main/model/imagenet.h5
Using single GPU
Backend Qt5Agg is interactive backend. Turning interactive mode on.
Epoch 1/100

Then nothing happens, even if I leave it alone for a day. The call that it seems to freeze on is:

squeeze.model.fit_generator(train_generator, epochs=EPOCHS, verbose=1,
                            steps_per_epoch=nbatches_train, callbacks=cb)

where the arguments are:

train_generator = generator_from_data_path(img_names, gt_names, config=cfg)
EPOCHS = 100
nbatches_train  = 53
callbacks = [# TensorBoard object, ReduceLROnPlateau object, ModelCheckpoint object #]
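
For reference, fit_generator expects train_generator to yield (inputs, targets) batches indefinitely. Below is a minimal, self-contained sketch of such a generator; the shapes and random data are placeholders for illustration only, not SqueezeDet's actual tensors or what generator_from_data_path really returns.

import numpy as np

def dummy_batch_generator(n_samples=536, batch_size=10,
                          img_shape=(384, 384, 3), n_outputs=10):
    # Stand-in for generator_from_data_path: yields (X, y) batches forever,
    # which is what fit_generator expects across epochs.
    steps = int(np.ceil(n_samples / batch_size))
    while True:
        for _ in range(steps):
            X = np.random.rand(batch_size, *img_shape).astype("float32")
            y = np.random.rand(batch_size, n_outputs).astype("float32")
            yield X, y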

My versions:

Python 3.5.4 :: Anaconda custom (64-bit)
tensorflow-gpu : 1.8.0
tensorflow : 1.8.0
Keras : 2.2.0

Answer

Formatting the conversation from the comments into an answer.

The culprit is train_generator.

I looked into the sources of model.fit_generator in Keras some time ago. It just retrieves some data from the generator and submits it to the backend; nothing magical :)

So my hypothesis was that it cannot retrieve data from the generator because the generator does not generate anything.

@Barker has confirmed it, stating that the call to next(train_generator) hangs.
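
That failure mode is easy to reproduce before training starts: pull a few batches from the generator by hand. A minimal check, using the train_generator variable from the question, is:

# If this loop hangs or raises, the problem is in the generator itself,
# not in fit_generator.
for _ in range(3):
    X_batch, y_batch = next(train_generator)
    print(X_batch.shape, y_batch.shape)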

I have personally moved to keras.utils.Sequence, which supports indexing and length and is much more convenient than ordinary generators, though that note is not related to the current problem.
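
For illustration only, a minimal Sequence could look like the sketch below; the class name, in-memory arrays, and batch size are made up here and are not part of the original answer.

import numpy as np
from keras.utils import Sequence

class ArraySequence(Sequence):
    # Serves in-memory arrays in batches: __len__ is the number of
    # batches per epoch and __getitem__ returns one indexed batch.
    def __init__(self, x, y, batch_size=10):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.x[sl], self.y[sl]

An instance of such a Sequence can be passed to fit_generator in place of a plain generator; since its length is known, Keras can infer the number of steps per epoch and can safely use multiprocessing workers.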
