Using 'read_batch_record_features' with an Estimator


Problem description

(I'm using TensorFlow 1.0 and Python 2.7)

I'm having trouble getting an Estimator to work with queues. If I use the deprecated SKCompat interface with custom data files and a given batch size, the model trains properly. I'm trying to use the new interface with an input_fn that batches features out of TFRecord files (equivalent to my custom data files). The script runs properly, but the loss value doesn't change after 200 or 300 steps. It seems that the model is looping over a small input batch (which would explain why the loss converges so fast).

I have a 'run.py' script that looks like the following:

import tensorflow as tf
from tensorflow.contrib import learn, metrics

#[...]
evalMetrics = {'accuracy':learn.MetricSpec(metric_fn=metrics.streaming_accuracy)}
runConfig = learn.RunConfig(save_summary_steps=10)
estimator = learn.Estimator(model_fn=myModel,
                            params=myParams,
                            model_dir='/tmp/myDir',
                            config=runConfig)

session = tf.Session(graph=tf.get_default_graph())

with session.as_default():
  tf.global_variables_initializer()
  coordinator = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=session,coord=coordinator)

  estimator.fit(input_fn=lambda: inputToModel(trainingFileList),steps=10000)

  estimator.evaluate(input_fn=lambda: inputToModel(evalFileList),steps=10000,metrics=evalMetrics)

  coordinator.request_stop()
  coordinator.join(threads)
session.close()

My inputToModel function looks like this:

import tensorflow as tf

def inputToModel(fileList):
  features = {'rawData': tf.FixedLenFeature([100],tf.float32),
              'label': tf.FixedLenFeature([],tf.int64)}
  tensorDict = tf.contrib.learn.read_batch_record_features(fileList,
                                batch_size=100,
                                features=features,
                                randomize_input=True,
                                reader_num_threads=4,
                                num_epochs=1,
                                name='inputPipeline')
  tf.local_variables_initializer()
  data = tensorDict['rawData']
  labelTensor = tensorDict['label']
  inputTensor = tf.reshape(data,[-1,10,10,1])

  return inputTensor,labelTensor

Any help or advice is welcome!

Recommended answer

Try using: tf.global_variables_initializer().run()

I want to do a similar thing, but I do not know how to use the Estimator API with multi-threading. There is also an Experiment class for serving - it might be useful.
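
If you go the Experiment route, a minimal sketch might look like the following. This is an assumption on my part rather than code from the answer: it uses tf.contrib.learn.Experiment as it existed in TensorFlow 1.0 and reuses the estimator, inputToModel, file lists and evalMetrics defined in the question's run.py.

from tensorflow.contrib.learn import Experiment

# Sketch only: estimator, inputToModel, trainingFileList, evalFileList and
# evalMetrics are the objects defined in the question's run.py.
experiment = Experiment(estimator=estimator,
                        train_input_fn=lambda: inputToModel(trainingFileList),
                        eval_input_fn=lambda: inputToModel(evalFileList),
                        eval_metrics=evalMetrics,
                        train_steps=10000,
                        eval_steps=10000)

# Train, then run a final evaluation; Experiment drives the Estimator, which
# manages its own sessions and queue runners, so no manual Coordinator is needed.
experiment.train_and_evaluate()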

Remove the session = tf.Session(graph=tf.get_default_graph()) and session.close() lines and try:

with tf.Session() as sess:
  tf.global_variables_initializer().run()
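
Putting the two suggestions together, here is a sketch of how the question's run.py might look. This is my reading of the answer, not the answerer's verbatim code; myModel, myParams, inputToModel and the file lists are the names from the question.

import tensorflow as tf
from tensorflow.contrib import learn, metrics

# myModel, myParams, inputToModel, trainingFileList and evalFileList
# are defined elsewhere, as in the question's script.
evalMetrics = {'accuracy': learn.MetricSpec(metric_fn=metrics.streaming_accuracy)}
runConfig = learn.RunConfig(save_summary_steps=10)
estimator = learn.Estimator(model_fn=myModel,
                            params=myParams,
                            model_dir='/tmp/myDir',
                            config=runConfig)

# The explicit tf.Session(graph=...) / session.close() pair is gone; the
# context manager owns the session, and the initializer is actually run
# (the original script only built the init op without executing it).
with tf.Session() as sess:
  tf.global_variables_initializer().run()
  coordinator = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=sess, coord=coordinator)

  estimator.fit(input_fn=lambda: inputToModel(trainingFileList), steps=10000)
  estimator.evaluate(input_fn=lambda: inputToModel(evalFileList),
                     steps=10000, metrics=evalMetrics)

  coordinator.request_stop()
  coordinator.join(threads)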
