Tensorflow reading data from tfrecords into mini batches properly
Problem description
I am trying to convert data from CSV to TFRecords, read it back in mini batches, and train a simple MLP, but I am getting some errors that I can't figure out.
RuntimeError: Attempted to use a closed Session.
Followed by:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
I'm guessing the shuffle batch queue is closing and no longer feeding the expected data. Also, I think I am missing a step going from the shuffle queue to the feed dict. Any ideas how to make this work or a better way of doing it?
Here is my code:
import numpy as np
import tensorflow as tf
import pandas
filename = "test.tfrecords"
writer = tf.python_io.TFRecordWriter(filename)
csv = pandas.read_csv("TRAINING.csv").values
with tf.python_io.TFRecordWriter(filename) as writer:
    for row in csv:
        features, label = row[:-1], row[-1]
        example = tf.train.Example()
        example.features.feature["avgs"].float_list.value.extend(features)
        example.features.feature["pdiff"].float_list.value.append(label)
        writer.write(example.SerializeToString())

def read_and_decode_single_example(filename):
    filename_queue = tf.train.string_input_producer([filename],
                                                    num_epochs=None)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'pdiff': tf.FixedLenFeature([], np.float32),
            'avgs': tf.FixedLenFeature([14], np.float32)})
    pdiff = features['pdiff']
    avgs = features['avgs']
    return avgs, pdiff

avgs, pdiff = read_and_decode_single_example(filename)
avgs_batch, pdiff_batch = tf.train.shuffle_batch(
    [avgs, pdiff], batch_size=200,
    capacity=2000,
    min_after_dequeue=600)
n_features = 14
batch_size = 50
hidden_units = 7
lr = .03
X = tf.placeholder(tf.float32,[None,n_features])
Y = tf.placeholder(tf.float32,[None])
W = tf.Variable(tf.truncated_normal([n_features,hidden_units]))
Wout = tf.Variable(tf.truncated_normal([hidden_units,1]))
b = tf.Variable(tf.zeros([hidden_units]))
bout = tf.Variable(tf.zeros([1]))
hidden1 = tf.nn.sigmoid(tf.matmul(X,W) + b)
pred = tf.matmul(hidden1,Wout) + bout
loss = tf.reduce_sum(tf.pow(pred - Y,2))
optimizer = tf.train.AdamOptimizer(lr).minimize(loss)
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    tf.train.start_queue_runners(sess=sess)
    for step in range(1000):
        _, loss_val = sess.run([optimizer,loss],
                               feed_dict={X: avgs_batch, Y: pdiff_batch} )
Stack Trace:
ERROR:tensorflow:Exception in QueueRunner: Attempted to use a closed Session.
Traceback (most recent call last):
ERROR:tensorflow:Exception in QueueRunner: Attempted to use a closed Session.
File "tf_tb.py", line 110, in <module>
feed_dict={X: avgs_batch, Y: pdiff_batch} )
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 924, in _run
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 234, in _run
sess.run(enqueue_op)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 902, in _run
raise RuntimeError('Attempted to use a closed Session.')
RuntimeError: Attempted to use a closed Session.
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 234, in _run
sess.run(enqueue_op)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/Users/mnaymik/.envs/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 902, in _run
raise RuntimeError('Attempted to use a closed Session.')
RuntimeError: Attempted to use a closed Session.
raise TypeError('The value of a feed cannot be a tf.Tensor object. '
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
Placeholders are one way of getting data into your model. Queues are another. You can override the value of a tensor produced by a queue runner through feed_dict, just as you can for any other tensor (placeholders included), but the value you feed has to be a concrete Python or NumPy value. You can't feed the result of one tensor into a placeholder within the same graph/session run.
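To make the distinction concrete, here is a minimal sketch of the two feeds that do work: fetching a batch as NumPy arrays in one run and feeding it in the next, and overriding the queue tensor itself with a NumPy value. It is not the asker's pipeline. The random tensors are a hypothetical stand-in for the decoded records, the batch size is shrunk to 4, and the `tensorflow.compat.v1` import assumes TensorFlow 1.14+ or 2.x (on the older release in the question, plain `import tensorflow as tf` without the `disable_eager_execution` call should behave the same).

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_eager_execution()  # only needed when running on TF 2.x

# Hypothetical stand-in for the decoded (avgs, pdiff) example tensors.
avgs = tf.random_uniform([14])
pdiff = tf.random_uniform([])
avgs_batch, pdiff_batch = tf.train.shuffle_batch(
    [avgs, pdiff], batch_size=4, capacity=20, min_after_dequeue=8)

X = tf.placeholder(tf.float32, [None, 14])
row_sums = tf.reduce_sum(X, axis=1)
batch_total = tf.reduce_sum(avgs_batch)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Run 1 turns the queue output into NumPy arrays ...
    x_np, y_np = sess.run([avgs_batch, pdiff_batch])
    # ... which a second, separate run may feed into a placeholder.
    sums = sess.run(row_sums, feed_dict={X: x_np})

    # Overriding the queue tensor itself also works, again with a NumPy value.
    overridden = sess.run(
        batch_total, feed_dict={avgs_batch: np.ones([4, 14], np.float32)})

    # Stop the queue threads before the session closes.
    coord.request_stop()
    coord.join(threads)
```

The fetch-then-feed route costs an extra round trip through Python for every batch, which is why wiring the queue output straight into the model is usually preferable.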
In other words, rather than creating placeholders, just use the output of your tf.train.shuffle_batch call:
X = avgs_batch
Y = pdiff_batch
(or replace all references to X and Y with avgs_batch and pdiff_batch respectively)
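Put together, the change might look like the sketch below. Several things in it are assumptions rather than part of the original post: synthetic random rows stand in for TRAINING.csv, the file goes to a temporary directory, the import uses `tensorflow.compat.v1` (TensorFlow 1.14+ or 2.x), and the loop runs 100 steps to keep it quick. Two additions go beyond the placeholder swap. A `tf.train.Coordinator` stops the queue threads before the session closes, since the "closed Session" messages are most likely the queue runner threads still trying to enqueue after the TypeError ended the `with` block. And `pred` is squeezed to shape [batch], because subtracting a [batch] label from a [batch, 1] prediction would broadcast to [batch, batch].

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_eager_execution()  # only needed when running on TF 2.x

n_features = 14
hidden_units = 7
lr = .03

# Assumption: random rows (14 features + 1 label) stand in for TRAINING.csv.
rows = np.random.rand(1000, n_features + 1)
filename = os.path.join(tempfile.mkdtemp(), "test.tfrecords")
with tf.python_io.TFRecordWriter(filename) as writer:
    for row in rows:
        example = tf.train.Example()
        example.features.feature["avgs"].float_list.value.extend(row[:-1].tolist())
        example.features.feature["pdiff"].float_list.value.append(float(row[-1]))
        writer.write(example.SerializeToString())

# Same reading pipeline as in the question.
filename_queue = tf.train.string_input_producer([filename], num_epochs=None)
_, serialized_example = tf.TFRecordReader().read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={"pdiff": tf.FixedLenFeature([], tf.float32),
              "avgs": tf.FixedLenFeature([n_features], tf.float32)})
avgs_batch, pdiff_batch = tf.train.shuffle_batch(
    [features["avgs"], features["pdiff"]],
    batch_size=200, capacity=2000, min_after_dequeue=600)

# The queue outputs are the model inputs: no placeholders, no feed_dict.
X = avgs_batch
Y = pdiff_batch

W = tf.Variable(tf.truncated_normal([n_features, hidden_units]))
Wout = tf.Variable(tf.truncated_normal([hidden_units, 1]))
b = tf.Variable(tf.zeros([hidden_units]))
bout = tf.Variable(tf.zeros([1]))

hidden1 = tf.nn.sigmoid(tf.matmul(X, W) + b)
# Squeeze [batch, 1] to [batch] so it lines up with Y (not in the original).
pred = tf.squeeze(tf.matmul(hidden1, Wout) + bout, axis=1)
loss = tf.reduce_sum(tf.pow(pred - Y, 2))
optimizer = tf.train.AdamOptimizer(lr).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        for step in range(100):
            # Each run dequeues a fresh shuffled batch on its own.
            _, loss_val = sess.run([optimizer, loss])
    finally:
        # Ask the queue threads to stop and wait for them before closing.
        coord.request_stop()
        coord.join(threads)
```

Because every `sess.run` on `optimizer` pulls a new batch from the shuffle queue, the training loop no longer needs to handle the data at all.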