模型拟合未使用所有提供的数据 [英] Model fitting doesn't use all of the provided data
问题描述
我在玩Tensorflow 2.0 Keras入门教程( https:/时遇到问题. /www.tensorflow.org/tutorials/keras/classification ).
I ran into a problem when playing with the introduction Tutorial for Tensorflow 2.0 Keras (https://www.tensorflow.org/tutorials/keras/classification).
问题:
应该存在(并且有)60.000张图像以适合该模型.我通过打印出train_images
和train_labels
的长度来进行检查.
There should be (and there are) 60.000 Images to fit the model. I checked this by printing out the length of train_images
and train_labels
.
另一方面,在拟合模型时的输出让我相信,并非像1875/1875那样使用了所有数据.测试数据也一样.
The output when fitting the model on the other hand lets me believe that not all of the data was used as it says 1875/1875. Same for the Testing Data.
我停用了对此无效的GPU检测(看来).
我正在使用:
- Python 3.8.3
- Tensorflow 2.2.0
我的代码:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
data = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = data.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# preprocess the image data to have a pixel value between 0 and 1
train_images = train_images / 255.0
test_images = test_images / 255.0
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
输出:
2020-05-17 17:48:07.147033: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-17 17:48:10.075816: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-17 17:48:10.098581: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-05-17 17:48:10.105898: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-UU9P1OG
2020-05-17 17:48:10.109837: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-UU9P1OG
2020-05-17 17:48:10.113879: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-17 17:48:10.127711: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x14dc97288a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-17 17:48:10.132743: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Epoch 1/10
1875/1875 [==============================] - 2s 1ms/step - loss: 0.4943 - accuracy: 0.8264
Epoch 2/10
1875/1875 [==============================] - 2s 938us/step - loss: 0.3747 - accuracy: 0.8649
Epoch 3/10
1875/1875 [==============================] - 2s 929us/step - loss: 0.3403 - accuracy: 0.8762
Epoch 4/10
1875/1875 [==============================] - 2s 914us/step - loss: 0.3146 - accuracy: 0.8844
Epoch 5/10
1875/1875 [==============================] - 2s 937us/step - loss: 0.2985 - accuracy: 0.8900
Epoch 6/10
1875/1875 [==============================] - 2s 923us/step - loss: 0.2808 - accuracy: 0.8964
Epoch 7/10
1875/1875 [==============================] - 2s 939us/step - loss: 0.2702 - accuracy: 0.8998
Epoch 8/10
1875/1875 [==============================] - 2s 911us/step - loss: 0.2585 - accuracy: 0.9032
Epoch 9/10
1875/1875 [==============================] - 2s 918us/step - loss: 0.2482 - accuracy: 0.9073
Epoch 10/10
1875/1875 [==============================] - 2s 931us/step - loss: 0.2412 - accuracy: 0.9106
313/313 - 0s - loss: 0.3484 - accuracy: 0.8729
Test accuracy: 0.8729000091552734
推荐答案
该模型的批处理大小为32,因此有60,000/32 = 1875
个批处理.
The model is being trained with a batchsize of 32, hence there are 60,000/32 = 1875
batches.
尽管 tensorflow文档在
Despite tensorflow documentation shows batch_size=None
in the fit
function overview, the information about this argument says:
batch_size
:整数或无.每个梯度更新的样本数. 如果未指定,batch_size将默认为32.如果您的数据采用数据集,生成器或keras.utils.Sequence实例的形式(因为它们生成批次),则不要指定batch_size.
batch_size
: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches).
这篇关于模型拟合未使用所有提供的数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!