torch / lua: retrieving n-best subset from Tensor


Problem description


I currently have the following code, which stores the index of the maximum score for each question in pred and converts it to a string.

I want to do the same for the n-best indices for each question, not just the single index with the maximum score, and convert them to strings. I also want to display the score for each index (or each converted string).

So scores will have to be sorted, and pred will need to have multiple rows/columns instead of 1 x nqs. The corresponding score value for each entry in pred must also be retrievable.

I am clueless about lua/torch syntax, and any help would be greatly appreciated.

nqs=dataset['question']:size(1);
scores=torch.Tensor(nqs,noutput);
qids=torch.LongTensor(nqs);
for i=1,nqs,batch_size do
    xlua.progress(i, nqs)
    r=math.min(i+batch_size-1,nqs);
    scores[{{i,r},{}}],qids[{{i,r}}]=forward(i,r);
end

tmp,pred=torch.max(scores,2);

answer=json_file['ix_to_ans'][tostring(pred[{i,1}])]
print(answer)

Solution

Here is my attempt. I demonstrate its behavior using a simple random scores tensor:

> scores=torch.floor(torch.rand(4,10)*100)
> =scores
 9   1  90  12  62   1  62  86  46  27
 7   4   7   4  71  99  33  48  98  63
 82   5  73  84  61  92  81  99  65   9
 33  93  64  77  36  68  89  44  19  25
[torch.DoubleTensor of size 4x10]

Now, since you want the N best indexes for each question (row), let's sort each row of the tensor:

> values,indexes=scores:sort(2)

Now, let's look at what the returned tensors contain:

> =values
  1   1   9  12  27  46  62  62  86  90
  4   4   7   7  33  48  63  71  98  99
  5   9  61  65  73  81  82  84  92  99
  19  25  33  36  44  64  68  77  89  93
  [torch.DoubleTensor of size 4x10]

> =indexes
  2   6   1   4  10   9   5   7   8   3
  2   4   1   3   7   8  10   5   9   6
  2  10   5   9   3   7   1   4   6   8
  9  10   1   5   8   3   6   4   7   2
  [torch.LongTensor of size 4x10]

As you can see, the i-th row of values is the i-th row of scores sorted in increasing order, and the i-th row of indexes gives the original column index in scores of each sorted value.

You can get the N best values/indexes for each question (i.e. row) with

> N_best_indexes=indexes[{{},{indexes:size(2)-N+1,indexes:size(2)}}]
> N_best_values=values[{{},{values:size(2)-N+1,values:size(2)}}]
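Because the sort above is ascending, the best entries end up in the last columns. As a minimal sketch of an alternative (not part of the original answer), sort accepts a second argument that requests descending order, so the k-th best entry lands in column k and the first N columns can be taken with narrow. The scores are the example values from above, written out by hand so the result is reproducible.

```lua
require 'torch'

-- Same example scores as above, written out explicitly.
local scores = torch.Tensor({
  { 9,  1, 90, 12, 62,  1, 62, 86, 46, 27},
  { 7,  4,  7,  4, 71, 99, 33, 48, 98, 63},
  {82,  5, 73, 84, 61, 92, 81, 99, 65,  9},
  {33, 93, 64, 77, 36, 68, 89, 44, 19, 25},
})
local N = 3

-- Passing true as the second argument sorts each row in descending order.
local values, indexes = scores:sort(2, true)

-- Keep the first N columns: column k now holds the k-th best entry.
local N_best_values = values:narrow(2, 1, N)
local N_best_indexes = indexes:narrow(2, 1, N)

print(N_best_values)
print(N_best_indexes)
```

With this layout the k-th best score of question j is simply N_best_values[j][k], at the cost of the columns being in the opposite order from the ascending version.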

Let's see their values for the given example, with N=3:

> return N_best_indexes
 7  8  3
 5  9  6
 4  6  8
 4  7  2
[torch.LongTensor of size 4x3]

> return N_best_values
 62  86  90
 71  98  99
 84  92  99
 77  89  93
[torch.DoubleTensor of size 4x3]
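If a full sort is not needed, the same selection can probably be done in one call. The sketch below assumes the installed Torch7 build provides torch.topk with the documented argument order (k, dimension, a boolean for "largest", a boolean for "return sorted"); this is an alternative to the answer's method, not something the answer itself uses.

```lua
require 'torch'

-- Same example scores as above, written out explicitly.
local scores = torch.Tensor({
  { 9,  1, 90, 12, 62,  1, 62, 86, 46, 27},
  { 7,  4,  7,  4, 71, 99, 33, 48, 98, 63},
  {82,  5, 73, 84, 61, 92, 81, 99, 65,  9},
  {33, 93, 64, 77, 36, 68, 89, 44, 19, 25},
})
local N = 3

-- k = N, along dimension 2 (within each row),
-- dir = true selects the largest entries,
-- sort = true returns them ordered, best first.
local N_best_values, N_best_indexes = torch.topk(scores, N, 2, true, true)

print(N_best_values)
print(N_best_indexes)
```

Here the best entry is in column 1 rather than the last column, so the order is reversed compared with the output shown above.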

In terms of the sorted tensors, the k-th best value for question j is values[{{j},{values:size(2)-k+1}}] (equivalently, N_best_values[{{j},{N-k+1}}] for k <= N), and its position in the scores matrix is given by the following row and column:

row=j
column=indexes[{{j},{indexes:size(2)-k+1}}]

For example, the best value (k=1) for the second question is 99, which lies in the 2nd row and 6th column of scores. Indeed, values[{{2},{values:size(2)}}] is 99, and indexes[{{2},{indexes:size(2)}}] gives 6, which is the column index in the scores matrix.
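To connect this back to the question, here is a rough sketch of turning the n-best indices into answer strings together with their scores. The scores tensor and the ix_to_ans table are made-up stand-ins (in the question the lookup table comes from json_file['ix_to_ans'] and is keyed by the index as a string), and results is a hypothetical name introduced only for this illustration. It sorts in descending order so that rank k is column k.

```lua
require 'torch'

-- Stand-in data (hypothetical): 2 questions, 4 candidate answers.
local scores = torch.Tensor({
  {0.10, 0.70, 0.15, 0.05},
  {0.60, 0.10, 0.05, 0.25},
})
-- Stand-in for json_file['ix_to_ans'], keyed by the index as a string.
local ix_to_ans = {['1'] = 'yes', ['2'] = 'no', ['3'] = 'red', ['4'] = 'two'}
local N = 2

-- Sort each row in descending order so column k is the k-th best answer.
local values, indexes = scores:sort(2, true)

local results = {}
for j = 1, scores:size(1) do
  results[j] = {}
  for k = 1, N do
    local idx = indexes[j][k]   -- column in scores of the k-th best entry
    local score = values[j][k]  -- the corresponding score
    -- Format the index as an integer string to match the table keys.
    local answer = ix_to_ans[string.format('%d', idx)]
    results[j][k] = {answer = answer, score = score}
    print(string.format('question %d, rank %d: %s (%.2f)', j, k, answer, score))
  end
end
```

Indexing with values[j][k] returns a plain Lua number, whereas the values[{{j},{k}}] form used above returns a 1x1 tensor, which is why the loop uses the former.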

Hope that I explained my solution well.
