Float16比Keras中的Float32慢 [英] Float16 slower than float32 in keras

查看：234 发布时间：2020/4/25 9:51:40 python tensorflow keras

本文介绍了Float16比Keras中的Float32慢的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在测试我的新NVIDIA Titan V，它支持float16操作.我注意到在训练过程中，float16(〜800 ms/step)比float32(〜500 ms/step)要慢得多.

I'm testing out my new NVIDIA Titan V, which supports float16 operations. I noticed that during training, float16 is much slower (~800 ms/step) than float32 (~500 ms/step).

要进行float16操作，我将keras.json文件更改为:

To do float16 operations, I changed my keras.json file to:

{
"backend": "tensorflow",
"floatx": "float16",
"image_data_format": "channels_last",
"epsilon": 1e-07
}

为什么float16操作这么慢?我是否需要修改我的代码，而不仅仅是keras.json文件?

Why are the float16 operations so much slower? Do I need to make modifications to my code and not just the keras.json file?

我在Windows 10上使用CUDA 9.0，cuDNN 7.0，tensorflow 1.7.0和keras 2.1.5. 我的python 3.5代码如下:

I am using CUDA 9.0, cuDNN 7.0, tensorflow 1.7.0, and keras 2.1.5 on Windows 10. My python 3.5 code is below:

img_width, img_height = 336, 224

train_data_dir = 'C:\\my_dir\\train'
test_data_dir = 'C:\\my_dir\\test'
batch_size=128

datagen = ImageDataGenerator(rescale=1./255,
    horizontal_flip=True,   # randomly flip the images 
    vertical_flip=True) 

train_generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

test_generator = datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

# Architecture of NN
model = Sequential()
model.add(Conv2D(32,(3, 3), input_shape=(img_height, img_width, 3),padding='same',kernel_initializer='lecun_normal'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32,(3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64,(3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64,(3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(1))
model.add(Activation('sigmoid'))

my_rmsprop = keras.optimizers.RMSprop(lr=0.0001, rho=0.9, epsilon=1e-04, decay=0.0)
model.compile(loss='binary_crossentropy',
          optimizer=my_rmsprop,
          metrics=['accuracy'])

# Training 
nb_epoch = 32
nb_train_samples = 512
nb_test_samples = 512

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples/batch_size,
    epochs=nb_epoch,
    verbose=1,
    validation_data=test_generator,
    validation_steps=nb_test_samples/batch_size)

# Evaluating on the testing set
model.evaluate_generator(test_generator, nb_test_samples)

Float16比Keras中的Float32慢 [英] Float16 slower than float32 in keras

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Float16比Keras中的Float32慢 [英] Float16 slower than float32 in keras

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭