Speed difference between ImageDataLayer and the LMDB data layer

Question

Caffe supports both an LMDB data layer and ImageDataLayer. Creating an LMDB database from a dataset takes some time and a lot of disk space. In contrast, ImageDataLayer only needs a text file, which is very convenient. My question is: is there a big speed difference between these two kinds of layers? Thank you very much!

Recommended answer

LMDB is designed for fast retrieval of data by key. The data is also stored in uncompressed form, so the machine can simply read it and pass it directly to the GPU for processing.

With ImageDataLayer, Caffe has to read each image's details from the text file and use OpenCV to load the image into memory. Decompressing (decoding) each image this way is computationally expensive.
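To make the text-file input concrete: the list file that ImageDataLayer reads has one image per line, a file path followed by an integer label. Below is a minimal Python sketch of parsing such a list. The file contents and the helper name `parse_image_list` are made up for illustration, and the OpenCV decode step is only indicated in a comment.

```python
from typing import List, Tuple


def parse_image_list(text: str) -> List[Tuple[str, int]]:
    """Parse lines of the form '<image path> <integer label>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so the label is always the final field.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


# Hypothetical list file contents (not from the original post).
example = "images/cat_001.jpg 0\nimages/dog_001.jpg 1\n"
entries = parse_image_list(example)
for path, label in entries:
    # This is the point where each file would be loaded and decoded
    # (in Python terms, something like cv2.imread(path)); that decode
    # is the expensive step the answer refers to.
    print(path, label)
```

The parsing itself is cheap; the per-image cost comes from opening and decoding every file listed.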

However, the LMDB layer does not always give the best performance; it depends heavily on the machine's configuration. Consider a batch size of 256 images of size 227x227x3, and assume you are using a very good GPU and a high-end processor. A single uncompressed image in LMDB then occupies about 151 KB (227 x 227 x 3 bytes), and a whole batch about 37 MB. If the GPU can process 10 batches per second, the disk has to sustain a read speed of about 370 MB/s. An ordinary SATA or external hard disk cannot deliver that, so reading such large chunks of data becomes a bottleneck.
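The arithmetic behind these figures can be checked with a short Python sketch. It assumes one byte per channel value and ignores any per-record overhead in LMDB; the function name `required_read_rate` is just for illustration.

```python
def required_read_rate(height, width, channels, batch_size, batches_per_second):
    """Return (bytes per image, bytes per batch, bytes per second)."""
    # Assumption: raw 8-bit pixels, i.e. 1 byte per channel value.
    bytes_per_image = height * width * channels
    bytes_per_batch = bytes_per_image * batch_size
    bytes_per_second = bytes_per_batch * batches_per_second
    return bytes_per_image, bytes_per_batch, bytes_per_second


KIB = 1024
MIB = 1024 * 1024

per_image, per_batch, per_second = required_read_rate(227, 227, 3, 256, 10)
print(f"per image:  {per_image / KIB:.0f} KiB")
print(f"per batch:  {per_batch / MIB:.1f} MiB")
print(f"per second: {per_second / MIB:.0f} MiB/s")
```

With binary units this comes to roughly 151 KiB per image, 37.7 MiB per batch and 377 MiB/s, which matches the rounded-down figures in the answer.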

If Caffe cannot fetch data at the required speed, this bottleneck slows down the entire training process. On the other hand, if you read the 256 images from individual files and use a multi-core build of OpenCV, the data prefetching may be handled more efficiently than reading from an LMDB.
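As a rough illustration of spreading the loading of a batch over several workers, here is a small Python sketch. This is not how Caffe's own prefetching is implemented; it uses a thread pool as a stand-in technique, and `decode_image`, `load_batch` and the file paths are hypothetical. The decode function is a placeholder so the sketch runs without OpenCV.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_image(path):
    # Placeholder for a real decode such as cv2.imread(path);
    # it only returns a tag so the sketch runs without OpenCV.
    return f"decoded:{path}"


def load_batch(paths, num_workers=4):
    # Executor.map() preserves input order, so results line up with paths.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(decode_image, paths))


# Hypothetical file names for one batch of 256 images.
paths = [f"images/img_{i:03d}.jpg" for i in range(256)]
batch = load_batch(paths)
print(len(batch), batch[0])
```

Whether this beats sequential LMDB reads depends on the disk and CPU, as the answer notes; the sketch only shows the shape of the approach.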

This will not happen if the LMDB data is stored on an SSD, though!
