Iterating over a Dataset TF 2.0 with for loop


Problem description

This problem is about how to iterate over a TF Dataset given that make_initializable_iterator() is deprecated.

I read a data set with the function below:

def read_dataset_new(filename, target='delay'):
    ds = tf.data.TFRecordDataset(filename)
    ds = ds.map(lambda buf: parse(buf, target=target))
    ds = ds.batch(1)
    return ds
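
The parse function is not shown in the question; a minimal sketch of what it might look like, assuming each record is a serialized tf.train.Example holding float features (the feature names below are purely illustrative), is:

def parse(buf, target='delay'):
    # Hypothetical feature spec: adjust the names and types to your TFRecords.
    feature_spec = {
        'traffic': tf.io.VarLenFeature(tf.float32),
        'delay': tf.io.VarLenFeature(tf.float32),
    }
    parsed = tf.io.parse_single_example(buf, feature_spec)
    # Use the requested target feature as the label, the rest as features.
    label = tf.sparse.to_dense(parsed.pop(target))
    return parsed, label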

Then I want to iterate over the data set. I have been using: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#make_initializable_iterator

with tf.compat.v1.Session() as sess:
    data_set = tfr_utils.read_dataset_new(self.tf_rcrds_fl_nm)
    itrtr = data_set.make_initializable_iterator()
    sess.run(itrtr.initializer)
    features, label = itrtr.get_next()
    features_keys = features.keys()
...

But I get the warning: "THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Use for ... in dataset: ..."

Apart from the deprecation warning, my code works as expected.

Given the deprecation warning, though, I am now trying this:

with tf.compat.v1.Session() as sess:
    data_set = tfr_utils.read_dataset_new(self.tf_rcrds_fl_nm)
    for features, label in data_set:
        features_keys = features.keys()
        ...

But that does not work. I get:

self = <tensorflow.python.client.session.Session object at 0x12f2e57d0>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x12f270440>
args = ({}, [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x12f3f75a0> >], [], None, None)
message = 'Resource AnonymousIterator/AnonymousIterator0/N10tensorflow4data16IteratorResourceE does not exist.\n\t [[node Iterat...tNext_1 (defined at /demo-routenet/tests/unit/test_tfrecord_utils.py:376) ]]'
m = <re.Match object; span=(102, 130), match='[[{{node IteratorGetNext_1}}'>

The code samples I have been able to find all explicitly create an iterator, which is apparently not what one is supposed to do. I can't find an example of what one is supposed to do though.

I suspect that something has not been initialised. So, I also tried:

sess.run(data_set)

But that didn't work either (nor do I have any reason to suppose it should have, but just so you all know what I tried).

So, how does one use a Dataset in a for loop, as the deprecation message suggests?

Answer

It is not very clear what you want to get as output. If you want to get the values the dataset yields, you should execute eagerly (iterating a Dataset with a plain Python for loop is only supported when eager execution is enabled). Example:

import numpy

tf.compat.v1.enable_eager_execution()

def read_dataset_new(filename, target='delay'):
    ds = tf.data.TFRecordDataset(filename)
    ds = ds.map(lambda buf: parse(buf, target=target))
    ds = ds.batch(1)
    return ds

# This should return your key values for each example.
for features, labels in read_dataset_new(self.tf_rcrds_fl_nm):
    features_keys = features.keys()

# This should return your tensor values if they are supposed to be numeric.
for features, labels in read_dataset_new(self.tf_rcrds_fl_nm):
    features_array = numpy.array(features)
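
Note that in TF 2.x eager execution is on by default, so the enable_eager_execution() call is only needed while you are still running under the tf.compat.v1 behaviour. Under eager execution each element the loop yields is an EagerTensor, so you can also pull out plain arrays per feature; a short sketch, assuming parse() returns a dict of dense tensors:

for features, label in read_dataset_new(self.tf_rcrds_fl_nm):
    for key, value in features.items():
        # value is an EagerTensor; .numpy() gives the underlying array.
        print(key, value.numpy())
    print('label:', label.numpy())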
