PyTorch - What does contiguous() do?


Question

I was going through this example of an LSTM language model on GitHub (link). What it does in general is pretty clear to me. But I'm still struggling to understand what calling contiguous() does, which occurs several times in the code.

For example, in lines 74/75 of the code, the input and target sequences of the LSTM are created. The data (stored in ids) is 2-dimensional, where the first dimension is the batch size.

for i in range(0, ids.size(1) - seq_length, seq_length):
    # Get batch inputs and targets
    inputs = Variable(ids[:, i:i+seq_length])
    targets = Variable(ids[:, (i+1):(i+1)+seq_length].contiguous())

So as a simple example, when using batch size 1 and seq_length 10, inputs and targets look like this:

inputs Variable containing:
0     1     2     3     4     5     6     7     8     9
[torch.LongTensor of size 1x10]

targets Variable containing:
1     2     3     4     5     6     7     8     9    10
[torch.LongTensor of size 1x10]

So in general my question is, what does contiguous() do and why do I need it?

Further, I don't understand why the method is called for the target sequence but not for the input sequence, as both variables consist of the same data.

How could targets be non-contiguous and inputs still be contiguous?

I tried to leave out calling contiguous(), but this leads to an error message when computing the loss.

RuntimeError: invalid argument 1: input is not contiguous at .../src/torch/lib/TH/generic/THTensor.c:231

So obviously calling contiguous() in this example is necessary.

Answer

There are a few operations on Tensors in PyTorch that do not change the contents of a tensor, but change the way the data is organized. These operations include:

narrow(), view(), expand() and transpose()

For example, when you call transpose(), PyTorch doesn't generate a new tensor with a new layout; it just modifies meta information in the Tensor object so that the offset and stride describe the desired new shape. In this example, the transposed tensor and the original tensor share the same memory:

import torch

x = torch.randn(3, 2)
y = torch.transpose(x, 0, 1)
x[0, 0] = 42
print(y[0, 0])
# prints 42

This is where the concept of being contiguous comes in. In the example above, x is contiguous but y is not, because its memory layout differs from that of a tensor of the same shape made from scratch. Note that the word "contiguous" is a bit misleading, because it's not that the content of the tensor is spread out around disconnected blocks of memory: the bytes are still allocated in one block of memory, but the order of the elements is different!
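
A quick way to see this, as a minimal sketch continuing the example above, is to inspect the strides and the is_contiguous() flag of both tensors:

import torch

x = torch.randn(3, 2)
y = torch.transpose(x, 0, 1)

# x walks memory row by row: stride (2, 1) for a 3x2 tensor
print(x.stride(), x.is_contiguous())   # (2, 1) True
# y reuses x's storage and only the metadata changed: stride (1, 2)
print(y.stride(), y.is_contiguous())   # (1, 2) False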

When you call contiguous(), it actually makes a copy of the tensor (if it isn't contiguous already), such that the order of its elements in memory is the same as if it had been created from scratch with the same data.
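
Continuing the same toy example, calling contiguous() on y gives a tensor with the same values but a fresh, row-major layout that no longer shares storage with x (a small sketch; z is just an illustrative name):

import torch

x = torch.randn(3, 2)
y = x.transpose(0, 1)         # non-contiguous view of x, shape 2x3
z = y.contiguous()            # copies the data into a fresh, row-major layout

print(z.is_contiguous())      # True
print(z.stride())             # (3, 1), the layout a freshly created 2x3 tensor would have
x[0, 0] = 42.0
print(y[0, 0].item(), z[0, 0].item())  # y reflects the change, the copy z does not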

Normally you don't need to worry about this. You're generally safe to assume everything will work, and to wait until you get a RuntimeError: input is not contiguous, i.e. somewhere PyTorch expects a contiguous tensor, before adding a call to contiguous().
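
For instance, a common place where this shows up is calling view() on a transposed tensor. The following is a minimal sketch (not code from the linked example): inserting contiguous() before view() avoids the error.

import torch

x = torch.randn(3, 2)
y = x.transpose(0, 1)          # non-contiguous view with 2x3 shape

# y.view(6) would typically raise a RuntimeError here, because view()
# requires a contiguous layout (the exact message depends on the PyTorch version)
flat = y.contiguous().view(6)  # copy into a contiguous layout first, then reshape
print(flat.shape)              # torch.Size([6])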
