Using a dataset of filenames, create a dataset of images to tuples
Problem description
I created a TensorFlow dataset of the filenames of many images in a folder. The images are named [index].jpg, where index is an integer used to identify the image. I have a dictionary that maps each index, as a string, to a tuple of labels. How can I use tf.data.Dataset.map to map the index to its label tuple?
Here's the map_func I am trying to pass to the map function:
def grabImages(filepath):
    index = getIndexFromFilePath(filepath)
    img = tf.io.read_file(filepath)
    img = translateImage(img)
    dictionary = getLabelDictionary()
    return index, img
Here dictionary is the index-to-labels dict, index is the index taken from the filepath as a tf.Tensor, and img is the preprocessed image that was at the filepath.
This returns a dataset with the index, as a tensor, mapped to the corresponding image. Is there a way to get the labels for the index from dictionary, using something like dictionary[index]? Basically, I want to find the string content of index.
I have tried using .numpy() and .eval() with the current session within the grabImages function, but neither works.
Here is an example of how to get the string part of a tensor inside the tf.data.Dataset.map function.
Below are the steps I have implemented in the code to achieve this.
- You have to wrap the mapped function with tf.py_function(get_path, [x], [tf.string]). tf.py_function is described in the TensorFlow API documentation.
- You can get the string part by calling bytes.decode(file_path.numpy()) inside the mapped function.
Code -
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np
def get_path(file_path):
    print("file_path: ",bytes.decode(file_path.numpy()),type(bytes.decode(file_path.numpy())))
    return file_path
train_dataset = tf.data.Dataset.list_files('/content/bird.jpg')
train_dataset = train_dataset.map(lambda x: tf.py_function(get_path, [x], [tf.string]))
for one_element in train_dataset:
    print(one_element)
Output -
file_path: /content/bird.jpg <class 'str'>
(<tf.Tensor: shape=(), dtype=string, numpy=b'/content/bird.jpg'>,)
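The same approach might be extended to the dictionary lookup asked about in the question. The following is only a sketch under assumptions: LABELS, label_for_path and build_labelled_dataset are hypothetical names, LABELS stands in for whatever getLabelDictionary() returns, and each label tuple is assumed to hold two floats.

```python
import os

# Hypothetical label table: string index -> label tuple.
# It stands in for the dictionary returned by getLabelDictionary().
LABELS = {"1": (0.0, 1.0), "2": (1.0, 0.0)}


def label_for_path(path_bytes, labels=LABELS):
    """Decode a bytes path such as b'/content/1.jpg' and return labels['1']."""
    path = path_bytes.decode("utf-8")
    index = os.path.splitext(os.path.basename(path))[0]
    return labels[index]


def build_labelled_dataset(file_pattern):
    """Map every file path matching file_pattern to a (path, label) pair."""
    # Imported here so label_for_path stays usable without TensorFlow.
    import tensorflow as tf

    def lookup(file_path):
        # Inside tf.py_function the argument is an eager tensor,
        # so .numpy() returns the path as Python bytes.
        return tf.constant(label_for_path(file_path.numpy()), dtype=tf.float32)

    def add_label(file_path):
        label = tf.py_function(lookup, [file_path], tf.float32)
        # py_function drops static shape information; 2 matches the
        # assumed length of the label tuples in LABELS.
        label.set_shape([2])
        return file_path, label

    return tf.data.Dataset.list_files(file_pattern).map(add_label)
```

Iterating over build_labelled_dataset('/content/*.jpg') (the glob pattern is an assumption) should then yield a path tensor together with its label tensor. A file whose index is missing from the table raises a KeyError inside the lookup, so the table needs an entry for every image.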
Hope this answers your question.