Can the map function supplied to `tf.data.Dataset.from_generator(...)` resolve a tensor object?


Question

I'd like to create a tf.data.Dataset.from_generator(...) dataset. I need to pass in a Python generator.

I would like to pass in a property of a previous dataset to the generator like so:

dataset = dataset.interleave(
  map_func=lambda x: tf.data.Dataset.from_generator(generator=lambda: gen(x), output_types=tf.int64),
  cycle_length=2
)

Here I define gen(...) to take a value (a pointer to some data, such as a filename, which gen knows how to access).
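
For illustration, a gen(...) of that shape might look like the sketch below (the file format and the one-value-per-line parsing are assumptions, not part of the original question):

def gen(filename):
    # Hypothetical: treat the filename as a text file containing one
    # integer per line, yielding each value as it is read.
    with open(filename) as f:
        for line in f:
            yield int(line)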

This fails because gen receives a tensor object, not a Python/NumPy value.


Is it possible to resolve the tensor object to a value inside gen(...)?

The reason for interleaving the generators is so I can manipulate the list of data-pointers/filenames with other dataset operations such as .shuffle() and .repeat() without the need to bake those into the gen(...) function, which would be necessary if I started with the generator directly from the list of data-pointers/filenames.
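
As a sketch, that outer pipeline might look like this (the filenames and the shuffle buffer size are hypothetical):

filenames = tf.data.Dataset.from_tensor_slices(["a.txt", "b.txt", "c.txt"])
# shuffle() and repeat() operate on the filename dataset itself,
# so gen(...) stays unaware of them.
dataset = filenames.shuffle(buffer_size=3).repeat()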

I want to use the generator because a large number of data values will be generated per data-pointer/filename.

Answer

TensorFlow now supports passing tensor arguments to the generator:

def map_func(tensor):
    # Tensors in args are evaluated and handed to the generator as
    # NumPy values, so the generator sees a concrete value, not a tensor.
    dataset = tf.data.Dataset.from_generator(generator, tf.float32, args=(tensor,))
    return dataset
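
Put together with the interleave pattern from the question, a minimal end-to-end sketch might look like this (the generator body and filenames are placeholders):

import tensorflow as tf

def generator(filename):
    # With args=(...), TensorFlow evaluates the tensors and passes them
    # in as NumPy values, so filename arrives here as a bytes object.
    for i in range(3):
        yield float(i)

filenames = tf.data.Dataset.from_tensor_slices(["a.txt", "b.txt"])
dataset = filenames.interleave(
    lambda x: tf.data.Dataset.from_generator(generator, tf.float32, args=(x,)),
    cycle_length=2
)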
