Tensorflow shuffle_batch speed


Problem description


I noticed a big difference in speed between loading my training data into memory and feeding it into the graph as a numpy array, versus using a shuffle batch of the same size. My data has ~1000 instances.

Using the in-memory data, 1000 iterations take less than a few seconds, but using a shuffle batch it takes almost 10 minutes. I get that the shuffle batch should be somewhat slower, but this seems far too slow. Why is this?

Added a bounty. Any suggestions on how to make shuffled mini-batches faster?

Here is the training data: Link to bounty_training.csv (pastebin)

Here is my code:

shuffle_batch

import numpy as np
import tensorflow as tf

# Load the CSV into memory; it is converted to a TFRecords file below.
data = np.loadtxt('bounty_training.csv',
    delimiter=',', skiprows=1,
    usecols=(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14))

filename = "test.tfrecords"

# Write each row out as a serialized tf.train.Example.
with tf.python_io.TFRecordWriter(filename) as writer:
    for row in data:
        features, label = row[:-1], row[-1]
        example = tf.train.Example()
        example.features.feature['features'].float_list.value.extend(features)
        example.features.feature['label'].float_list.value.append(label)
        writer.write(example.SerializeToString())

# Read and parse one serialized Example at a time from the TFRecords file.
def read_and_decode_single_example(filename):
    filename_queue = tf.train.string_input_producer([filename],
                                                   num_epochs=None)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], np.float32),
            'features': tf.FixedLenFeature([14], np.float32)})

    pdiff = features['label']
    avgs = features['features']

    return avgs, pdiff

avgs, pdiff = read_and_decode_single_example(filename)


n_features = 14
batch_size = 1000
hidden_units = 7
lr = .001

# Shuffle-batch the single parsed examples into batches of batch_size.
avgs_batch, pdiff_batch = tf.train.shuffle_batch(
    [avgs, pdiff], batch_size=batch_size,
    capacity=5000,
    min_after_dequeue=2000)

X = tf.placeholder(tf.float32,[None,n_features])
Y = tf.placeholder(tf.float32,[None,1])

W = tf.Variable(tf.truncated_normal([n_features,hidden_units]))
b = tf.Variable(tf.zeros([hidden_units]))

Wout = tf.Variable(tf.truncated_normal([hidden_units,1]))
bout = tf.Variable(tf.zeros([1]))

hidden1 = tf.matmul(X,W) + b
pred = tf.matmul(hidden1,Wout) + bout

loss = tf.reduce_mean(tf.squared_difference(pred,Y))

optimizer = tf.train.AdamOptimizer(lr).minimize(loss)

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for step in range(1000):
        # Dequeue one shuffled batch from the queue runners...
        x_, y_ = sess.run([avgs_batch, pdiff_batch])

        # ...then feed it back in through the placeholders for one update step.
        _, loss_val = sess.run([optimizer, loss],
              feed_dict={X: x_, Y: y_.reshape(batch_size, 1)})

        if step % 100 == 0:
            print(loss_val)


    coord.request_stop()
    coord.join(threads)
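A side note (my own observation, not part of the original post): each step above runs the graph twice, once just to pull the batch into Python and once to feed it back in through the placeholders. A rough sketch of avoiding that round trip, assuming the same model variables as above (hidden1_q, pred_q, train_q are hypothetical names), would be to build the loss directly on the batch tensors:

# Hypothetical variant, not the accepted solution: wire the queue's batch
# tensors straight into the model so each training step is a single sess.run.
# (These ops would be defined before global_variables_initializer / the session.)
hidden1_q = tf.matmul(avgs_batch, W) + b
pred_q = tf.matmul(hidden1_q, Wout) + bout
loss_q = tf.reduce_mean(
    tf.squared_difference(pred_q, tf.reshape(pdiff_batch, [batch_size, 1])))
train_q = tf.train.AdamOptimizer(lr).minimize(loss_q)

# Inside the session loop, one call per step:
# _, loss_val = sess.run([train_q, loss_q])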

Full batch via numpy array

"""
avgs and pdiff loaded into numpy arrays first...
Same model as above
"""
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)

    for step in range(1000):
        _, loss_value = sess.run([optimizer, loss],
                feed_dict={X: avgs, Y: pdiff.reshape(n_instances, 1)})

Solution

The trick is that instead of feeding single examples into shuffle_batch, you feed it an (n+1)-dimensional tensor of examples with enqueue_many=True. I found this thread very helpful:

TFRecordReader seems extremely slow , and multi-threads reading not working

def get_batch(batch_size):
    # Collect batch_size serialized-Example tensors in a list so that
    # shuffle_batch can enqueue them all at once with enqueue_many=True.
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)

    batch_list = []
    for i in range(batch_size):
        batch_list.append(serialized_example)

    return [batch_list]

batch_serialized_example = tf.train.shuffle_batch(
    get_batch(batch_size), batch_size=batch_size,
    capacity=100*batch_size,
    min_after_dequeue=batch_size*10,
    num_threads=1,
    enqueue_many=True)

# Parse the whole batch of serialized Examples in one op.
features = tf.parse_example(
    batch_serialized_example,
    features={
        'label': tf.FixedLenFeature([], np.float32),
        'features': tf.FixedLenFeature([14], np.float32)})

batch_pdiff = features['label']
batch_avgs = features['features']

...
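For completeness, a hypothetical usage sketch (the trailing "..." above is left as in the original answer; this just shows one way the batched tensors could be consumed, reusing the placeholder model, session, and queue runners from the question's code):

# Hypothetical continuation: evaluate the pre-batched tensors and feed them
# into the same placeholder-based model as before.
x_, y_ = sess.run([batch_avgs, batch_pdiff])
_, loss_val = sess.run([optimizer, loss],
        feed_dict={X: x_, Y: y_.reshape(batch_size, 1)})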
