Using multiple input pipelines in TensorFlow


Question

I know how to use an input pipeline to read data from files:

    input = ... # Read from file
    loss = network(input) # build a network
    train_op = ... # Using SGD or other algorithms to train the network.

But how can I switch between multiple input pipelines? Say, if I want to train a network for 1000 batches on the training set from the training pipeline, then validate it on a validation set from another pipeline, then keep training, then validate, then train, ..., and so forth.

It's easy to implement this with feed_dict. I also know how to use checkpoints to achieve this, just like in the cifar-10 example. But it's kind of cumbersome: I need to dump the model to disk then read it from disk again.
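For reference, here is a minimal sketch of the feed_dict pattern mentioned above, assuming TF 1.x; the toy model and the random in-memory data are stand-ins of my own, not part of the original question:

    import numpy as np
    import tensorflow as tf

    # Toy in-memory data standing in for the real training/validation sets.
    train_x, train_y = np.random.rand(8000, 10), np.random.rand(8000, 2)
    val_x, val_y = np.random.rand(1000, 10), np.random.rand(1000, 2)

    x = tf.placeholder(tf.float32, [None, 10])
    y_ = tf.placeholder(tf.float32, [None, 2])
    y = tf.layers.dense(x, 2)                   # stand-in for network(input)
    loss = tf.losses.mean_squared_error(y_, y)
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

    batch_size = 100
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(1000):                   # 1000 training batches...
            idx = np.random.randint(0, len(train_x), batch_size)
            sess.run(train_op, feed_dict={x: train_x[idx], y_: train_y[idx]})
        # ...then one validation pass, feeding the other set into the same graph:
        print(sess.run(loss, feed_dict={x: val_x, y_: val_y}))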

Can I just switch between two input pipelines (one for training data, one for validation data) to achieve this? That is, read 1000 batches from the training-data queue, then a few batches from the validation-data queue, and so forth. If that's possible, how do I do it?

Answer

Not sure if this is exactly what you are looking for, but I am doing training and validation in the same code in two separate loops. My code reads numeric and string data from .CSV files, not images. I am reading from two separate CSV files, one for training and one for validation. I'm sure you can generalize it to read from two 'sets' of files, rather than just single files, as the code is there.

Here are the code snippets in case they help. Note that this code first reads everything as strings and then converts the necessary cells into floats, given my own requirements. If your data is purely numeric, you should just set the defaults to floats and everything becomes easier. Also, there are a couple of lines in there that drop the weights and biases into a CSV file AND serialize them into the TF checkpoint file, depending on which way you'd prefer.

    import numpy as np
    import tensorflow as tf

    # First define the defaults: every cell is read as a string initially.
    # TD, TS, and TL are the numbers of date-label, feature, and label columns.
    rDefaults = [['a'] for row in range(TD + TS + TL)]

    # This function reads line-by-line from CSV and separates cells into chunks:
    def read_from_csv(filename_queue):
        reader = tf.TextLineReader(skip_header_lines=False)
        _, csv_row = reader.read(filename_queue)
        data = tf.decode_csv(csv_row, record_defaults=rDefaults)
        dateLbl = tf.slice(data, [0], [TD])
        features = tf.string_to_number(tf.slice(data, [TD], [TS]), tf.float32)
        label = tf.string_to_number(tf.slice(data, [TD + TS], [TL]), tf.float32)
        return dateLbl, features, label

    # This function loads the above lines and spits them out as batches of N:
    def input_pipeline(fName, batch_size, num_epochs=None):
        filename_queue = tf.train.string_input_producer(
            [fName],
            num_epochs=num_epochs,
            shuffle=True)
        dateLbl, features, label = read_from_csv(filename_queue)
        min_after_dequeue = 10000
        capacity = min_after_dequeue + 3 * batch_size  # max of how much to load into memory
        dateLbl_batch, feature_batch, label_batch = tf.train.shuffle_batch(
            [dateLbl, features, label],
            batch_size=batch_size,
            capacity=capacity,
            min_after_dequeue=min_after_dequeue)
        return dateLbl_batch, feature_batch, label_batch

    # These are the TRAINING features, labels, and meta-data, loaded from the train file:
    dateLbl, features, labels = input_pipeline(fileNameTrain, batch_size, try_epochs)
    # These are the TESTING features, labels, and meta-data, loaded from the test file:
    dateLblTest, featuresTest, labelsTest = input_pipeline(fileNameTest, batch_size, 1)  # 1 epoch here regardless of training

    # ... then you define the model, create the session (sess), etc. ...

    # Fire up the queue:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # This is the TRAINING loop:
    try:
        while not coord.should_stop():
            dateLbl_batch, feature_batch, label_batch = sess.run(
                [dateLbl, features, labels])
            _, acc, summary = sess.run(
                [train_step, accuracyTrain, merged_summary_op],
                feed_dict={x: feature_batch,
                           y_: label_batch,
                           keep_prob: dropout,
                           learning_rate: lRate})
    except tf.errors.OutOfRangeError:  # (so done reading the training file(s))
        # By the way, this dumps the weights into a CSV file, since you asked for that:
        np.savetxt(fPath + fIndex + '_weights.csv', sess.run(W), delimiter=',')  # delimiter assumed
        # ...and this serializes weights and biases into the TF-formatted protobuf:
        # tf.train.Saver({'varW': W, 'varB': b}).save(sess, fileNameCheck)
    finally:
        coord.request_stop()

    # Now re-start the runners for the testing file:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    try:
        while not coord.should_stop():
            # This reads features, labels, and meta-data, but this time from the TESTING file:
            dateLbl_batch, feature_batch, label_batch = sess.run(
                [dateLblTest, featuresTest, labelsTest])

            # (Note: building the argmax/accuracy ops inside the loop, as here,
            # grows the graph on every iteration; it works but is wasteful.)
            guessY = tf.argmax(y, 1).eval({x: feature_batch, keep_prob: 1})
            trueY = tf.argmax(label_batch, 1).eval()
            accuracy = round(tf.reduce_mean(
                tf.cast(tf.equal(guessY, trueY), tf.float32)).eval(), 2)
    except tf.errors.OutOfRangeError:  # (so done reading the test file)
        acCumTest /= i  # average the accuracy accumulator (maintained elsewhere; elided here)
    finally:
        coord.request_stop()

    coord.join(threads)

This may differ from what you are trying to do, in the sense that it first completes the training loop and THEN restarts the queues for the testing loop. I'm not sure how you'd do this if you want to go back and forth, but you can experiment with the two functions defined above by passing them the relevant file names (or lists) interchangeably.
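For instance, here is a hedged sketch of the "sets of files" generalization mentioned earlier: only the producer call changes, since string_input_producer already accepts a list. The function name and the example file names below are hypothetical:

    def input_pipeline_multi(file_names, batch_size, num_epochs=None):
        # Identical to input_pipeline above, except the producer takes the
        # list of file names directly instead of wrapping a single name.
        filename_queue = tf.train.string_input_producer(
            file_names,
            num_epochs=num_epochs,
            shuffle=True)
        dateLbl, features, label = read_from_csv(filename_queue)
        min_after_dequeue = 10000
        capacity = min_after_dequeue + 3 * batch_size
        return tf.train.shuffle_batch(
            [dateLbl, features, label],
            batch_size=batch_size,
            capacity=capacity,
            min_after_dequeue=min_after_dequeue)

    # e.g. (hypothetical file names):
    # dateLbl, features, labels = input_pipeline_multi(
    #     ['train_part1.csv', 'train_part2.csv'], batch_size, try_epochs)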

Also I'm not sure if re-starting the queues after training is the best way to go, but it works for me. Would love to see a better example out there, as most TF examples use some built-in wrappers around the MNIST dataset to do the training in one go...
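For what it's worth, on TF 1.4 and later one alternative (my own suggestion, not something the answer above uses) is the tf.data API with a reinitializable iterator, which switches between pipelines without restarting any queue runners. A minimal sketch, with hypothetical file names, that yields raw CSV lines you would decode much like read_from_csv:

    import tensorflow as tf

    def make_dataset(file_name, batch_size, shuffle):
        ds = tf.data.TextLineDataset(file_name)   # one string per CSV line
        if shuffle:
            ds = ds.shuffle(buffer_size=10000)
        return ds.batch(batch_size)

    train_ds = make_dataset('train.csv', 100, shuffle=True).repeat()  # hypothetical files
    val_ds = make_dataset('val.csv', 100, shuffle=False)

    # One iterator whose source dataset is swapped by running an initializer:
    iterator = tf.data.Iterator.from_structure(train_ds.output_types,
                                               train_ds.output_shapes)
    next_lines = iterator.get_next()
    train_init = iterator.make_initializer(train_ds)
    val_init = iterator.make_initializer(val_ds)

    with tf.Session() as sess:
        for _ in range(3):                        # alternate train/validate rounds
            sess.run(train_init)
            for _ in range(1000):                 # 1000 training batches
                batch = sess.run(next_lines)      # decode and run train_step here
            sess.run(val_init)                    # then sweep the validation set once
            while True:
                try:
                    batch = sess.run(next_lines)  # decode and compute accuracy here
                except tf.errors.OutOfRangeError:
                    break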
