Understanding tensorflow queues and cpu <-> gpu transfer

Question

After reading this GitHub issue, I feel like I'm missing something in my understanding of queues:

https://github.com/tensorflow/tensorflow/issues/3009

I thought that when data is loaded into a queue, it would be pre-transferred to the GPU while the previous batch is being computed, so that there is virtually no bandwidth bottleneck, assuming computation takes longer than loading the next batch.

But the link above suggests that there is an expensive copy from the queue into the graph (numpy <-> TF), and that it would be faster to load the files into the graph and do the preprocessing there instead. That doesn't make sense to me. Why does it matter whether I load a 256x256 image from a file or from a raw numpy array? If anything, I would think the numpy version is faster. What am I missing?
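(For concreteness, here is a minimal sketch of the "load the files in the graph" pipeline the issue recommends, assuming the TF 1.x queue-runner API; the filenames and image size are hypothetical. Decoding and preprocessing run as graph ops, so no numpy <-> TF feed copy is involved.)

    import tensorflow as tf

    # Hypothetical file list; in practice this would come from a real dataset.
    filename_queue = tf.train.string_input_producer(["img_0.jpg", "img_1.jpg"])
    reader = tf.WholeFileReader()
    _, raw = reader.read(filename_queue)

    # Decode and preprocess inside the graph: no numpy <-> TF feed copy.
    image = tf.image.decode_jpeg(raw, channels=3)
    image = tf.image.resize_images(image, [256, 256])
    batch = tf.train.batch([image], batch_size=32)

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        images = sess.run(batch)  # filled by background queue-runner threads
        coord.request_stop()
        coord.join(threads)

Even so, this queue lives in host memory: it removes the feed_dict copy but does not prefetch onto the GPU, which is exactly what the answer below addresses.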

Answer

There's no GPU implementation of queues, so a queue only loads data into main memory, and there is no asynchronous prefetching onto the GPU. You could make something like a GPU-based queue using variables pinned to gpu:0.
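A rough sketch of that idea, assuming the TF 1.x API: two variables pinned to gpu:0 serve as a double buffer, so the host-to-GPU copy of the next batch can overlap with compute on the current one. The batch shape, the load_batch helper, and the step count are all hypothetical.

    import numpy as np
    import tensorflow as tf

    BATCH_SHAPE = [32, 256, 256, 3]  # hypothetical batch shape

    def load_batch():
        # Hypothetical host-side loader returning one numpy batch.
        return np.zeros(BATCH_SHAPE, dtype=np.float32)

    with tf.device("/gpu:0"):
        # Two GPU-resident buffers acting as a two-slot "queue".
        buffers = [tf.Variable(tf.zeros(BATCH_SHAPE), trainable=False)
                   for _ in range(2)]

    next_batch = tf.placeholder(tf.float32, BATCH_SHAPE)
    stage_ops = [b.assign(next_batch) for b in buffers]  # host -> GPU copies
    losses = [tf.reduce_mean(b) for b in buffers]        # stand-in for the model

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(stage_ops[0], feed_dict={next_batch: load_batch()})  # prime slot 0
        for step in range(10):
            compute, fill = step % 2, (step + 1) % 2
            # Run the model on one buffer while the next batch is copied into
            # the other; the ops touch different variables, so they can overlap.
            sess.run([losses[compute], stage_ops[fill]],
                     feed_dict={next_batch: load_batch()})

This mirrors what the built-in queues do on the CPU side: one slot is consumed while the other is being filled.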
