How to create `input_fn` using `read_batch_examples` with `num_epochs` set?

Problem description

I have a basic input_fn, shown below, that can be used with Tensorflow Estimators. It works flawlessly without setting the num_epochs parameter; the resulting tensor has a fully defined shape. Passing in num_epochs as anything other than None results in an unknown shape. My issue lies with constructing sparse tensors while using num_epochs; I cannot figure out how to generically create those tensors without knowing the shape of the input tensor.

Can anyone think of a solution to this problem? I'd like to be able to pass num_epochs=1 so that evaluation runs over the data set exactly once, and so that predict yields exactly one prediction per example in the data set, no more and no less.

def input_fn(batch_size):
    examples_op = tf.contrib.learn.read_batch_examples(
        FILE_NAMES,
        batch_size=batch_size,
        reader=tf.TextLineReader,
        num_epochs=1,
        parse_fn=lambda x: tf.decode_csv(x, [tf.constant([''], dtype=tf.string)] * len(HEADERS)))

    examples_dict = {}
    for i, header in enumerate(HEADERS):
        examples_dict[header] = examples_op[:, i]

    continuous_cols = {k: tf.string_to_number(examples_dict[k], out_type=tf.float32)
                       for k in CONTINUOUS_FEATURES}

    # Problems lay here while creating sparse categorical tensors
    categorical_cols = {
        k: tf.SparseTensor(
            indices=[[i, 0] for i in range(examples_dict[k].get_shape()[0])],
            values=examples_dict[k],
            shape=[int(examples_dict[k].get_shape()[0]), 1])
        for k in CATEGORICAL_FEATURES}

    feature_cols = dict(continuous_cols)
    feature_cols.update(categorical_cols)
    label = tf.string_to_number(examples_dict[LABEL], out_type=tf.int32)

    return feature_cols, label
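
For reference, here is a minimal sketch of how an input_fn like this is typically wired into a tf.contrib.learn Estimator. The estimator variable is a placeholder for any already-configured Estimator, and the sketch assumes input_fn is extended to accept a num_epochs argument that it forwards to read_batch_examples; neither detail is part of the original code above.

import functools

# Assumptions: `estimator` is any tf.contrib.learn Estimator whose feature
# columns match CONTINUOUS_FEATURES / CATEGORICAL_FEATURES, and `input_fn`
# takes a `num_epochs` argument that it passes through to read_batch_examples.
train_input_fn = functools.partial(input_fn, batch_size=128, num_epochs=None)  # cycle indefinitely
eval_input_fn = functools.partial(input_fn, batch_size=128, num_epochs=1)      # exactly one pass

estimator.fit(input_fn=train_input_fn, steps=1000)
estimator.evaluate(input_fn=eval_input_fn, steps=None)  # steps=None: run until the input queue is exhausted
predictions = estimator.predict(input_fn=eval_input_fn)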

Answer

I have solved the above issue by creating a function specific to what's expected by an input_fn; it takes a dense column and creates a SparseTensor without needing to know its shape in advance. The function is made possible by tf.range and tf.shape. Without further ado, here is the working generic input_fn code, which works regardless of whether num_epochs is set:

def input_fn(batch_size):
    examples_op = tf.contrib.learn.read_batch_examples(
        FILE_NAMES,
        batch_size=batch_size,
        reader=tf.TextLineReader,
        num_epochs=1,
        parse_fn=lambda x: tf.decode_csv(x, [tf.constant([''], dtype=tf.string)] * len(HEADERS)))

    examples_dict = {}
    for i, header in enumerate(HEADERS):
        examples_dict[header] = examples_op[:, i]

    feature_cols = {k: tf.string_to_number(examples_dict[k], out_type=tf.float32)
                    for k in CONTINUOUS_FEATURES}

    feature_cols.update({k: dense_to_sparse(examples_dict[k])
                         for k in CATEGORICAL_FEATURES})

    label = tf.string_to_number(examples_dict[LABEL], out_type=tf.int32)

    return feature_cols, label


def dense_to_sparse(dense_tensor):
    # Build one index pair [row, 0] per element; the row count comes from
    # tf.shape at run time, so a statically unknown batch size is fine.
    indices = tf.to_int64(tf.transpose([
        tf.range(tf.shape(dense_tensor)[0]),
        tf.zeros_like(dense_tensor, dtype=tf.int32)
    ]))
    values = dense_tensor
    # Dense shape [batch_size, 1], also computed at run time.
    shape = tf.to_int64([tf.shape(dense_tensor)[0], tf.constant(1)])

    return tf.SparseTensor(
        indices=indices,
        values=values,
        shape=shape
    )
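
As a quick sanity check (not part of the original answer), dense_to_sparse can be exercised on a small in-memory batch to see the shape-agnostic SparseTensor it produces; both the indices and the dense shape are derived from tf.shape at run time, so no static batch dimension is needed:

# Hypothetical demo: feed a tiny batch of strings and inspect the result.
with tf.Session() as sess:
    batch = tf.constant(['red', 'green', 'blue'])
    print(sess.run(dense_to_sparse(batch)))
    # Roughly: indices [[0, 0], [1, 0], [2, 0]],
    #          values  ['red', 'green', 'blue'],
    #          shape   [3, 1]  -- one categorical value per row, resolved at run time.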

Hope this helps someone!
