Create TensorFlow Dataset with custom file format


Problem description

I am trying to create a tf.data.Dataset in which filenames are mapped to depth images. My images are saved as raw binary files of 320*240*4 bytes each: 320x240 pixels, with 4 bytes per pixel.

I cannot figure out how to write a parsing function that takes a tf.Tensor filename and returns a (240, 320) tf.Tensor containing my image.

Here is what I tried.

import tensorflow as tf
import numpy as np
import struct
import math
from os import listdir


class Dataset:
    def __init__(self):
        filenames = ["./depthframes/" + f for f in listdir("./depthframes/")]

        self._dataset = tf.data.Dataset.from_tensor_slices(filenames).map(Dataset._parse)

    @staticmethod
    def _parse(filename):
        img = DepthImage(filename)
        return img.frame


class DepthImage:
    def __init__(self, path):
        self.rows, self.cols = 240, 320
        self.f = open(path, 'rb')
        self.frame = []
        self.get_frame()

    def _get_frame(self):
        for row in range(self.rows):
            tmp_row = []
            for col in range(self.cols):
                tmp_row.append([struct.unpack('i', self.f.read(4))[0], ])
            tmp_row = [[0, ] if math.isnan(i[0]) else list(map(int, i)) for i in tmp_row]
            self.frame.append(tmp_row)

    def get_frame(self):
        self._get_frame()
        self.frame = tf.convert_to_tensor(np.array(self.frame).reshape(240, 320))


if __name__ == "__main__":
    Dataset()

I get the following error:

File "C:/Users/gcper/Code/STEM/msrdailyact3d.py", line 23, in __init__ 
    self.f = open(path, 'rb')
TypeError: expected str, bytes or os.PathLike object, not Tensor

Recommended answer

Following @kvsih's suggestion, wrapping the parse function in tf.py_func worked:

self._dataset = tf.data.Dataset.from_tensor_slices(filenames)\
    .map(lambda name: tf.py_func(self._parse, [name], tf.int32))
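tf.py_func is the TensorFlow 1.x name for this wrapper. On TensorFlow 2.x the comparable call is tf.numpy_function, so the same idea might look like the sketch below. It assumes TensorFlow 2.x and pixels stored as 4-byte signed integers in native byte order (matching the 'i' format used with struct in the question). The names load_depth_frame and make_dataset are hypothetical and do not come from the original post.

```python
import numpy as np
import tensorflow as tf

ROWS, COLS = 240, 320  # frame size stated in the question


def load_depth_frame(filename):
    """Plain-Python loader: returns one frame as a (240, 320) int32 array."""
    # Inside tf.numpy_function the filename arrives as bytes, not as a
    # Tensor; open() accepts both bytes and str paths.
    with open(filename, "rb") as f:
        raw = f.read()
    # Assumption: 4-byte signed integers in native byte order.
    return np.frombuffer(raw, dtype=np.int32).reshape(ROWS, COLS)


def make_dataset(filenames):
    """Map a list of file paths to a dataset of depth frames."""
    def _parse(name):
        frame = tf.numpy_function(load_depth_frame, [name], tf.int32)
        # The wrapper drops static shape information, so restore it here.
        frame.set_shape((ROWS, COLS))
        return frame

    return tf.data.Dataset.from_tensor_slices(filenames).map(_parse)
```

The wrapped function runs as ordinary Python, which is why open() now receives a usable path instead of a symbolic Tensor.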

Note that get_frame must not return a tensor. The function wrapped by tf.py_func has to return a NumPy array whose dtype matches the tf.int32 declared in the lambda above. The following code replaces get_frame:

def get_frame(self):
    self._get_frame()
    self.frame = np.array(self.frame).reshape(240, 320)
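The nested struct.unpack loop in _get_frame reads one pixel at a time. Because each file is a flat run of 4-byte integers, NumPy can probably decode the whole frame in a single call. The sketch below assumes every file holds exactly 240*320 native-order int32 values; read_depth_frame is a hypothetical helper, not part of the original answer. The math.isnan check from the question is dropped, since values unpacked with 'i' are Python ints and are never NaN.

```python
import numpy as np

ROWS, COLS = 240, 320  # frame size stated in the question


def read_depth_frame(path):
    """Read one raw depth frame into a (240, 320) int32 array."""
    # Assumption: 4-byte signed integers in native byte order.
    frame = np.fromfile(path, dtype=np.int32)
    if frame.size != ROWS * COLS:
        raise ValueError(
            f"expected {ROWS * COLS} pixels, got {frame.size}"
        )
    return frame.reshape(ROWS, COLS)
```

The size check makes a truncated or oversized file fail with a clear message instead of an opaque reshape error.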
