Shuffling the training dataset with the TensorFlow Object Detection API

Views: 299 | Published: 2021/6/11 19:50:50 | Tags: tensorflow, queue, shuffle, object-detection


Problem description

I'm working on a logo detection algorithm using the Faster-RCNN model with the TensorFlow Object Detection API. My dataset is alphabetically ordered (so there are a hundred Adidas logos, then a hundred Apple logos, and so on), and I would like it to be shuffled during training.

I've put some values in the config file:

train_input_reader: {
  shuffle: true
  queue_capacity: some value
  min_after_dequeue: some other value
}

However, whatever values I put in, the algorithm first trains on all of the logos starting with "a" (adidas, apple and so on), and only after some time does it start to see the "b" logos (bmw etc.), then the "c" ones, and so on.

Of course I could just shuffle my input dataset directly, but I would like to understand the logic behind it.
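
For reference, the direct workaround mentioned above could look roughly like the sketch below: shuffle the list of examples once, before serializing it into the TFRecord file. The helper name `shuffled_copy`, the labels, and the file names are all hypothetical placeholders, and the TFRecord writing step itself is left out.

```python
import random


def shuffled_copy(examples, seed=None):
    """Return a shuffled copy of `examples`, leaving the input list untouched."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical alphabetically ordered dataset: 100 examples per logo class.
labels = ["adidas", "apple", "bmw"]
examples = [(label, f"{label}_{i:03d}.jpg") for label in labels for i in range(100)]

# Shuffle once, then write the examples to the record file in this order.
shuffled = shuffled_copy(examples, seed=42)
print(shuffled[:5])
```

With the records already mixed on disk, the order seen early in training no longer depends on how the input queue is configured.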

PS: I've seen this post about shuffling and min_after_dequeue, but I still don't quite get it. My batch size is 1, so it shouldn't be using tf.train.shuffle_batch() but only tf.RandomShuffleQueue.

My training dataset size is 5000, and if I write min_after_dequeue: 4000 or 5000 it is still not shuffled properly. Why?
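
As a rough mental model of what min_after_dequeue controls, here is a pure-Python sketch of a shuffle queue. It is not the TensorFlow implementation: it assumes a single producer, a consumer that dequeues as soon as it is allowed to, and it ignores queue_capacity, multiple readers, and asynchronous enqueue threads. The function name `simulate_shuffle_queue` is made up for this illustration.

```python
import random


def simulate_shuffle_queue(items, min_after_dequeue, seed=0):
    """Emit `items` in the order a simplified shuffle queue would dequeue them."""
    rng = random.Random(seed)
    source = iter(items)
    buffer = []
    output = []
    exhausted = False
    while True:
        # Producer: enqueue until one dequeue would still leave
        # `min_after_dequeue` elements in the buffer.
        while not exhausted and len(buffer) <= min_after_dequeue:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if not buffer:
            break
        # Consumer: remove one element chosen uniformly at random.
        idx = rng.randrange(len(buffer))
        buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
        output.append(buffer.pop())
    return output


# 5000 records sorted by class, 100 records per class (class id = index // 100).
dataset = list(range(5000))
order = simulate_shuffle_queue(dataset, min_after_dequeue=500)
first_classes = sorted({i // 100 for i in order[:100]})
print(first_classes)
```

In this model the k-th dequeued record can only come from the first min_after_dequeue + k + 1 records of the input, so with min_after_dequeue=500 the first 100 training examples are drawn from the first 600 records, that is, from the first six classes only. This shows why a small buffer keeps sorted data locally ordered; it does not by itself explain why values of 4000 or 5000 still fail to shuffle.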

Update: @AllenLavoie It's a bit hard for me, as there are a lot of dependencies and I'm new to TensorFlow. But in the end the queue is constructed by tf.contrib.slim.parallel_reader.parallel_read():

    _, string_tensor = parallel_reader.parallel_read(
        config.input_path,
        reader_class=tf.TFRecordReader,
        num_epochs=(input_reader_config.num_epochs
                    if input_reader_config.num_epochs else None),
        num_readers=input_reader_config.num_readers,
        shuffle=input_reader_config.shuffle,
        dtypes=[tf.string, tf.string],
        capacity=input_reader_config.queue_capacity,
        min_after_dequeue=input_reader_config.min_after_dequeue)

It seems that when I set num_readers = 1 in the config file, the dataset is finally shuffled the way I want (at least in the beginning), but when there are more readers, the logos somehow come out in alphabetical order at the start.
