Should a data batch be moved to the CPU and converted (from a torch Tensor) to a numpy array when evaluating a metric during training?


Question

I am going through Andrew Ng's tutorial from the CS230 Stanford course. In every training epoch, evaluation is performed by computing the metrics.

Before computing the metrics, however, the tutorial code moves each batch to the CPU and converts it to a numpy array (code here).

# extract data from torch Variable, move to cpu, convert to numpy arrays
output_batch = output_batch.data.cpu().numpy()
labels_batch = labels_batch.data.cpu().numpy()

# compute all metrics on this batch
summary_batch = {metric: metrics[metric](output_batch, labels_batch) for metric in metrics}

My question is: why do they do that? Why not compute the metrics (which is done here) on the GPU using torch methods (e.g. torch.sum as opposed to np.sum)?
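To make the two options concrete, here is a minimal sketch of an accuracy metric written both ways. The function names accuracy_np and accuracy_torch and the toy batch are assumptions for illustration, not the tutorial's actual code; the torch version runs on whatever device the tensors already live on.

```python
import numpy as np
import torch


def accuracy_np(outputs, labels):
    """Accuracy on numpy arrays: outputs is (batch, classes), labels is (batch,)."""
    preds = np.argmax(outputs, axis=1)
    return np.sum(preds == labels) / float(labels.size)


def accuracy_torch(outputs, labels):
    """Same metric with torch ops; the result stays on the tensors' device."""
    preds = torch.argmax(outputs, dim=1)
    return (preds == labels).float().mean()


# Use the GPU if one is present; the sketch also runs on CPU only.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy batch: 4 samples, 2 classes.
output_batch = torch.tensor(
    [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]], device=device
)
labels_batch = torch.tensor([1, 0, 0, 0], device=device)

# Option 1: stay in torch; the result is a 0-dim tensor on `device`.
acc_t = accuracy_torch(output_batch, labels_batch)

# Option 2: copy to the CPU and convert to numpy first, as the tutorial does.
acc_np = accuracy_np(
    output_batch.detach().cpu().numpy(), labels_batch.cpu().numpy()
)
```

Note that reading the torch result as a Python number (for logging, say, via acc_t.item()) still copies a scalar back to the host, so the torch version mainly avoids transferring the full batch rather than avoiding transfers altogether.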

I would think that GPU-to-CPU transfers slow things down, so there should be a very good reason for doing them?

I am new to PyTorch, so I might be missing something.

Recommended answer

Correct me if I'm wrong. Sending the data back to the CPU reduces the GPU load, even though that memory is replaced on the next loop iteration anyway. Furthermore, I believe converting to numpy has the advantage of freeing memory, since you are detaching your data from the computation graph. You end up manipulating labels_batch.cpu().numpy(), a fixed array, rather than labels_batch, a tensor attached to the entire network through linked backward_fn callbacks.
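The detachment point can be checked directly. The sketch below is an illustration under assumptions: the tiny Linear model and random inputs are hypothetical stand-ins for the tutorial's network. In current PyTorch the link to the graph is exposed as the grad_fn attribute, and in this setup it is the model output that carries it, while a plain labels tensor does not.

```python
import torch

# Hypothetical stand-in for the tutorial's network and data.
model = torch.nn.Linear(3, 2)
inputs = torch.randn(4, 3)
labels_batch = torch.tensor([0, 1, 1, 0])

output_batch = model(inputs)

# The model output is linked to the autograd graph; plain labels are not.
output_has_graph = output_batch.grad_fn is not None
labels_has_graph = labels_batch.grad_fn is not None

# numpy() refuses a tensor that still requires grad, so it must be detached.
try:
    output_batch.numpy()
    numpy_failed = False
except RuntimeError:
    numpy_failed = True

# detach() returns a tensor cut off from the graph; the numpy array made
# from it holds no reference to the graph.
output_np = output_batch.detach().cpu().numpy()

# Alternative for evaluation: do not record a graph in the first place.
with torch.no_grad():
    eval_output = model(inputs)
```

Wrapping the evaluation forward pass in torch.no_grad() addresses the memory concern without leaving the GPU, so detaching and moving to the CPU are separate decisions.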

