Why is the result of the code offered by Deep Learning with TensorFlow different from the snapshot in its book?
The first chapter of Deep Learning with TensorFlow gives an example of how to build a simple neural network for recognizing handwritten digits. According to its description, the code bundle for the book can be found on GitHub.
From the context, I think the section Running a simple TensorFlow 2.0 net and establishing a baseline uses the same code as Deep-Learning-with-TensorFlow-2-and-Keras/mnist_V1.py. When I run this example code, it gives me the following output:
The snapshot from the book is:
The result in my screenshot is 375/375, while the snapshot in the book shows 48000/48000. I am also missing the line Train on 48000 samples, validate on 12000 samples. Why does this happen? How can I get the same output as the snapshot from the book?
From my output, I think the size of my loaded dataset is the same as what the code describes:
# loading MNIST dataset
# verify
# the split between train and test is 60,000 and 10,000 respectively
# one-hot is automatically applied
mnist = keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
My package versions:
$ python --version
Python 3.6.8
$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
2.3.1
$ python3 -c 'import tensorflow as tf; print(tf.keras.__version__)'
2.4.0
I tried to find the answer in the source code. The fit method is defined in training.py. In this method, it instantiates a CallbackList object, which then creates a ProgbarLogger.
# training.py, class Model, method fit
# Container that configures and calls `tf.keras.Callback`s.
if not isinstance(callbacks, callbacks_module.CallbackList):
  callbacks = callbacks_module.CallbackList(
      callbacks,
      add_history=True,
      add_progbar=verbose != 0,
      model=self,
      verbose=verbose,
      epochs=epochs,
      steps=data_handler.inferred_steps)
# callbacks.py, class ProgbarLogger
def on_epoch_begin(self, epoch, logs=None):
  self._reset_progbar()
  if self.verbose and self.epochs > 1:
    print('Epoch %d/%d' % (epoch + 1, self.epochs))

def on_train_batch_end(self, batch, logs=None):
  self._batch_update_progbar(batch, logs)

def _batch_update_progbar(self, batch, logs=None):
  # ...
  if self.verbose == 1:
    # Only block async when verbose = 1.
    logs = tf_utils.to_numpy_or_python_type(logs)
    self.progbar.update(self.seen, list(logs.items()), finalize=False)
ProgbarLogger then calls the update method of Progbar to update the progress bar.
# generic_utils.py, class Progbar, method update
if self.verbose == 1:
  # ...
  if self.target is not None:
    numdigits = int(np.log10(self.target)) + 1
    bar = ('%' + str(numdigits) + 'd/%d [') % (current, self.target)
375 is the value of self.target. I then found that the value of self.target is passed from the steps parameter of the CallbackList object. In the first code snippet, you can see steps=data_handler.inferred_steps. The property inferred_steps is defined in data_adapter.py.
@property
def inferred_steps(self):
  """The inferred steps per epoch of the created `Dataset`.

  This will be `None` in the case where:

  (1) A `Dataset` of unknown cardinality was passed to the `DataHandler`, and
  (2) `steps_per_epoch` was not provided, and
  (3) The first epoch of iteration has not yet completed.

  Returns:
    The inferred steps per epoch of the created `Dataset`.
  """
  return self._inferred_steps
I got lost trying to work out how self._inferred_steps is calculated.
I think the missing line is related to training_arrays_v1.py, but I don't know what V1 means.
def _print_train_info(num_samples_or_steps, val_samples_or_steps, is_dataset):
  increment = 'steps' if is_dataset else 'samples'
  msg = 'Train on {0} {increment}'.format(
      num_samples_or_steps, increment=increment)
  if val_samples_or_steps:
    msg += ', validate on {0} {increment}'.format(
        val_samples_or_steps, increment=increment)
  print(msg)
Good question.
Let us break it into smaller parts.
You train on 48,000 samples and validate on 12,000. However, your output displays 375 instead of 48,000.
If you look at the batch size, its value is 128.
A quick division: 48,000 // 128 = 375
Your code is correct, which is good.
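The division can be checked directly. The snippet below is only a sketch: steps_per_epoch here is a hypothetical helper (not the Keras argument of the same name), and rounding up for a final partial batch is my assumption about how array inputs are handled.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once.

    Hypothetical helper for illustration; a final partial batch is
    assumed to count as one extra step.
    """
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(48000, 128)  # training samples from the question
val_steps = steps_per_epoch(12000, 128)    # validation samples from the question

print(train_steps)  # 375
print(val_steps)    # 94 (12000 / 128 = 93.75, rounded up)
```

The 48,000 training samples divide evenly by 128, so the progress bar ends at exactly 375.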
The difference comes from the fact that, in older versions of Keras and TensorFlow, the progress bar counted samples (48,000 in total), regardless of the batch_size used. In this example, the progress bar advanced like 0, 128, 256, ... up to 48,000.
Now, in more recent versions, the progress bar counts steps (batches) instead. The number of steps per epoch, which is what the steps_per_epoch and validation_steps parameters describe, equals the number of samples (say 48,000) divided by the batch_size (say 128), hence the 375.
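To make the two counting schemes concrete, here is a small pure-Python sketch. It does not call Keras; progress_counts and its count_by argument are hypothetical names that only imitate what each style of progress bar would display after every batch.

```python
def progress_counts(num_samples, batch_size, count_by="steps"):
    """Yield the counter a progress bar would show after each batch.

    count_by="samples" imitates the older display (samples seen so far);
    count_by="steps" imitates the newer display (batches completed).
    """
    seen = 0
    step = 0
    while seen < num_samples:
        # The last batch may be smaller than batch_size.
        seen = min(seen + batch_size, num_samples)
        step += 1
        yield seen if count_by == "samples" else step


by_samples = list(progress_counts(48000, 128, count_by="samples"))
by_steps = list(progress_counts(48000, 128, count_by="steps"))

print(by_samples[:3], by_samples[-1])  # [128, 256, 384] 48000
print(by_steps[:3], by_steps[-1])      # [1, 2, 3] 375
```

Both lists have the same length, one entry per batch; only the number shown differs.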
Both displays are correct; they are just different progress bars. I personally prefer the latter: with a batch_size of 128, it is more logical to see 1, 2, 3, ..., 375.
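The two labels come out of the same formatting expression, only with a different target. The sketch below reuses the expression quoted from generic_utils.py in the question; format_counter is a hypothetical wrapper, and math.log10 stands in for np.log10 so the snippet has no dependencies.

```python
import math


def format_counter(current, target):
    """Build the 'current/target [' prefix of the progress bar.

    Hypothetical wrapper around the format expression quoted in the
    question; the counter is right-aligned to the width of the target.
    """
    numdigits = int(math.log10(target)) + 1
    return ('%' + str(numdigits) + 'd/%d [') % (current, target)


print(repr(format_counter(375, 375)))      # '375/375 ['
print(repr(format_counter(48000, 48000)))  # '48000/48000 ['
print(repr(format_counter(128, 48000)))    # '  128/48000 ['
```

With target set to the step count you get 375/375; with target set to the sample count you get 48000/48000.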
Update for further clarification:
Here is a detailed description of the model.fit() arguments:
https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit
steps_per_epoch Integer or None. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch.
validation_steps Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch.