First training epoch is very slow

Question

Hi, I'm running MNIST code on my AWS P3 machine, and the initialization seems to take much longer than it did on my previous P2 machine (even though the P3 is the more powerful instance).

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 265s 4ms/step - loss: 0.2674 - acc: 0.9175 - val_loss: 0.0602 - val_acc: 0.9811
Epoch 2/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0860 - acc: 0.9742 - val_loss: 0.0393 - val_acc: 0.9866
Epoch 3/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0647 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9884
Epoch 4/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0542 - acc: 0.9839 - val_loss: 0.0337 - val_acc: 0.9887
Epoch 5/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0453 - acc: 0.9863 - val_loss: 0.0311 - val_acc: 0.9900
Epoch 6/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0412 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9898
Epoch 7/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0368 - acc: 0.9891 - val_loss: 0.0300 - val_acc: 0.9901
Epoch 8/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0340 - acc: 0.9897 - val_loss: 0.0298 - val_acc: 0.9897
Epoch 9/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0320 - acc: 0.9908 - val_loss: 0.0267 - val_acc: 0.9916
Epoch 10/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0286 - acc: 0.9914 - val_loss: 0.0276 - val_acc: 0.9903
Test loss: 0.02757222411266339
Test accuracy: 0.9903

I'm using Keras 2.1.4 and tensorflow-gpu 1.5.0.

My keras.json file is configured as follows:

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}

Any idea why this happens?

Thanks in advance.

Recommended answer

Based on this issue:

The first epoch takes the same time as the others, but its counter also includes the time spent building the part of the computational graph that deals with training (a few seconds). This used to be done during the compile step, but it is now done lazily, on demand, to avoid unnecessary work.
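
To make the effect described above concrete, here is a minimal pure-Python sketch. It is not Keras code: `LazyModel`, `_make_train_function`, `time_epochs`, and the sleep durations are all hypothetical stand-ins, chosen only to show how a one-time, lazily triggered build step ends up in the first epoch's timer.

```python
import time


class LazyModel:
    """Toy stand-in for a model whose training function is built on first use.

    Hypothetical illustration only; this is not Keras code.
    """

    def __init__(self, build_seconds=0.2, step_seconds=0.01):
        self.build_seconds = build_seconds
        self.step_seconds = step_seconds
        self._train_function = None  # nothing is built up front

    def _make_train_function(self):
        # Simulate the one-time cost of building the training graph.
        time.sleep(self.build_seconds)

        def train_step(batch):
            time.sleep(self.step_seconds)  # simulate the work of one step
            return sum(batch) / len(batch)

        self._train_function = train_step

    def train_on_batch(self, batch):
        if self._train_function is None:  # lazy: build on first demand
            self._make_train_function()
        return self._train_function(batch)


def time_epochs(model, batches, epochs):
    """Return wall-clock seconds per epoch, as a progress counter would see them."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        for batch in batches:
            model.train_on_batch(batch)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    model = LazyModel()
    durations = time_epochs(model, batches=[[1.0, 2.0], [3.0, 4.0]], epochs=3)
    for i, seconds in enumerate(durations, start=1):
        print(f"Epoch {i}: {seconds:.3f}s")
```

In this sketch only the first epoch pays the simulated build cost, so its duration is larger even though every training step takes the same time. With a real Keras model, a similar comparison could probably be made by timing one small `train_on_batch` call before `fit`, so that the one-time setup is paid outside the epoch counter.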
