Using tensorflow's Dataset pipeline, how do I *name* the results of a `map` operation?
Question
I have the map function below (runnable example), which takes a string as input and outputs a string and an integer.
In tf.data.Dataset.from_tensor_slices I named the original input 'filenames'. But when I return the values from the map function map_element_counts, I can only return a tuple (returning a dictionary raises an exception).
Is there a way to name the 2 elements returned from my map_element_counts function?
import tensorflow as tf

filelist = ['fileA_6', 'fileB_10', 'fileC_7']

def map_element_counts(fname):
    # perform operations outside of tensorflow
    return 'test', 10

ds = tf.data.Dataset.from_tensor_slices({'filenames': filelist})
ds = ds.map(map_func=lambda x: tf.py_func(
    func=map_element_counts, inp=[x['filenames']], Tout=[tf.string, tf.int64]
))
element = ds.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    print(sess.run(element))
Result:
(b'test', 10)
Desired result:
{'elementA': b'test', 'elementB': 10}
Added detail:
When I do return {'elementA': 'test', 'elementB': 10} I get this exception:
tensorflow.python.framework.errors_impl.UnimplementedError: Unsupported object type dict
Recommended answer
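Applying tf.py_func inside the function passed to ds.map works: the Python function still returns a plain tuple, and the surrounding map function packs the resulting tensors into a dictionary.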
I created a very simple file as an example, containing just the number 10.
dummy_file.txt:
10
Here is the script:
import tensorflow as tf

filelist = ['dummy_file.txt', 'dummy_file.txt', 'dummy_file.txt']

def py_func(file_contents):
    # perform operations outside of tensorflow
    # (file_contents arrives as the raw bytes of the file)
    parsed_txt_file = int(file_contents)
    return 'test', parsed_txt_file

def map_element_counts(fname):
    # let tensorflow read the text file
    file_string = tf.read_file(fname['filenames'])
    # then use python function on the extracted string
    a, b = tf.py_func(
        func=py_func, inp=[file_string], Tout=[tf.string, tf.int64]
    )
    # name the outputs here, in the map function, not inside py_func
    return {'elementA': a, 'elementB': b, 'file': fname['filenames']}

ds = tf.data.Dataset.from_tensor_slices({'filenames': filelist})
ds = ds.map(map_element_counts)
element = ds.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    print(sess.run(element))
    print(sess.run(element))
    print(sess.run(element))
Output:
{'file': b'dummy_file.txt', 'elementA': b'test', 'elementB': 10}
{'file': b'dummy_file.txt', 'elementA': b'test', 'elementB': 10}
{'file': b'dummy_file.txt', 'elementA': b'test', 'elementB': 10}
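If more than two values need names, the naming step in the script above can be factored into a small helper. The following is only a sketch under stated assumptions: the names parse_contents, name_outputs, build_dataset and ELEMENT_NAMES are hypothetical and not part of TensorFlow, and the pipeline uses the same TF 1.x calls as the script (tf.read_file, tf.py_func), imported lazily so the plain-Python parts can be tried without TensorFlow installed.

```python
# Hypothetical names for the positional outputs of the Python function.
ELEMENT_NAMES = ('elementA', 'elementB')


def parse_contents(file_contents):
    """Plain-Python step that runs outside the TensorFlow graph.

    Returns a tuple, because tf.py_func cannot return a dict.
    """
    return 'test', int(file_contents)


def name_outputs(values, names=ELEMENT_NAMES):
    """Pair positional outputs with names; raise on a length mismatch."""
    values = tuple(values)
    if len(values) != len(names):
        raise ValueError(
            'expected %d values, got %d' % (len(names), len(values)))
    return dict(zip(names, values))


def build_dataset(filelist):
    """Build the same TF 1.x pipeline as the script above (assumes TF 1.x)."""
    import tensorflow as tf  # imported lazily; only needed for the pipeline

    def map_element_counts(fname):
        file_string = tf.read_file(fname['filenames'])
        outputs = tf.py_func(
            func=parse_contents, inp=[file_string],
            Tout=[tf.string, tf.int64])
        # The dict is built here, from tensors, not inside py_func.
        named = name_outputs(outputs)
        named['file'] = fname['filenames']
        return named

    ds = tf.data.Dataset.from_tensor_slices({'filenames': filelist})
    return ds.map(map_element_counts)
```

Under TF 1.x, build_dataset(['dummy_file.txt']) should produce a dataset that can be iterated with make_one_shot_iterator() and a tf.Session exactly as in the script. The order of ELEMENT_NAMES has to match the order of the tuple returned by parse_contents and of Tout.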