Iterating over a TF 2.0 Dataset with a for loop
Question
This question is about how to iterate over a TF Dataset given that make_initializable_iterator() is deprecated.
I read a data set with the function below:
def read_dataset_new(filename, target='delay'):
    ds = tf.data.TFRecordDataset(filename)
    ds = ds.map(lambda buf: parse(buf, target=target))
    ds = ds.batch(1)
    return ds
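The `parse` function is not shown in the question. For readers following along, a sketch of what such a function typically looks like for this pattern is below; the feature name `'x'`, the types, and the shapes are assumptions, not taken from the question, and the `target` entry is popped out of the parsed dict to serve as the label.

```python
import tensorflow as tf

def parse(buf, target='delay'):
    # Hypothetical feature spec: the real names/types in the
    # question's TFRecord files are not shown, so these are guesses.
    spec = {
        'x': tf.io.FixedLenFeature([1], tf.float32),
        target: tf.io.FixedLenFeature([1], tf.float32),
    }
    parsed = tf.io.parse_single_example(buf, spec)
    label = parsed.pop(target)  # separate the target from the features
    return parsed, label
```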
Then I want to iterate over the data set. I have been using: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#make_initializable_iterator
with tf.compat.v1.Session() as sess:
    data_set = tfr_utils.read_dataset_new(self.tf_rcrds_fl_nm)
    itrtr = data_set.make_initializable_iterator()
    sess.run(itrtr.initializer)
    features, label = itrtr.get_next()
    features_keys = features.keys()
    ...
But I get: "Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Use for ... in dataset: ..."
Apart from the deprecation warning, my code works as expected.
Given the deprecation warning, though, I am now trying this:
with tf.compat.v1.Session() as sess:
    data_set = tfr_utils.read_dataset_new(self.tf_rcrds_fl_nm)
    for features, label in data_set:
        features_keys = features.keys()
        ...
But that doesn't work. I get:
self = <tensorflow.python.client.session.Session object at 0x12f2e57d0>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x12f270440>
args = ({}, [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x12f3f75a0> >], [], None, None)
message = 'Resource AnonymousIterator/AnonymousIterator0/N10tensorflow4data16IteratorResourceE does not exist.\n\t [[node Iterat...tNext_1 (defined at /demo-routenet/tests/unit/test_tfrecord_utils.py:376) ]]'
m = <re.Match object; span=(102, 130), match='[[{{node IteratorGetNext_1}}'>
The code samples I have been able to find all explicitly create an iterator, which is apparently not what one is supposed to do. I can't find an example of what one is supposed to do, though.
I suspect that something has not been initialised. So, I also tried:
sess.run(data_set)
But that didn't work either (nor do I have any reason to suppose it should have, but just so you all know what I tried).
So, how does one use a Dataset in a for loop, as the deprecation comment suggests?
Answer
It is not very clear what you want to get as your output. If you want the values the dataset produces, you should execute eagerly. (In TF 2.x eager execution is enabled by default, so the enable_eager_execution() call below is only needed when running with v1 behaviour.) Example:
import numpy
import tensorflow as tf

tf.compat.v1.enable_eager_execution()

def read_dataset_new(filename, target='delay'):
    ds = tf.data.TFRecordDataset(filename)
    ds = ds.map(lambda buf: parse(buf, target=target))
    ds = ds.batch(1)
    return ds

# This should return the feature keys for each example.
for features, labels in read_dataset_new(self.tf_rcrds_fl_nm):
    features_keys = features.keys()

# This should return the tensor values, if they are supposed to be numeric.
for features, labels in read_dataset_new(self.tf_rcrds_fl_nm):
    features_array = numpy.array(features)
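To see why the eager loop works where the session-based one failed, the pattern can be mimicked in plain Python (a toy stand-in, not the tf.data API; all names here are illustrative): map applies the parse function per record, batch groups the results, and the for loop simply drains the resulting iterator, with no session or initializer involved.

```python
# Toy stand-in for the eager tf.data pipeline in the question:
# map(parse) then batch(1), driven by a plain for loop.
def toy_pipeline(records, parse, batch_size=1):
    batch = []
    for buf in records:
        batch.append(parse(buf))      # like ds.map(parse)
        if len(batch) == batch_size:  # like ds.batch(batch_size)
            yield batch
            batch = []

# parse is assumed to return a (features_dict, label) pair,
# mirroring `features, label = itrtr.get_next()` in the question.
records = ["rec-a", "rec-bb"]
pipeline = toy_pipeline(records, lambda buf: ({"delay": len(buf)}, 0))

for batch in pipeline:
    features, label = batch[0]
    print(sorted(features.keys()))  # prints ['delay'] for each batch
```

In eager mode, a `tf.data.Dataset` behaves like this generator: each pass through the loop yields concrete tensors, whereas in graph mode the loop body only builds symbolic ops that a session must then run, which is why the anonymous-iterator resource error appeared.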