有效地生成唯一的整数对 [英] Efficiently generating unique pairs of integers

查看：160 发布时间：2018/8/6 20:31:36 matlab random integer pair

本文介绍了有效地生成唯一的整数对的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

在MATLAB中，我想在 [1，m] 范围内生成 n 对随机整数，每对都是独一无二的。为了唯一性，我认为对中数字的顺序是无关紧要的，因此 [3,10] 等于 [10,3] 。
此外，每对应由两个不同的整数组成;即 [3,4] 即可，但 [3,3] 将被拒绝。
编辑：每个可能的对应该选择具有相同的可能性。

（显然对参数的约束是 n< = m（m-1）/ 2 。）

我能够在<$时成功执行此操作c $ c> m 很小，如下：

  m = 500; n = 10; ％设置参数
 
 A =（（1：m）'* ones（1，m））;每列的％具有数字1  - > m 
 idxs1 = squareform（tril（A'， -  1））'; 
 idxs2 = squareform（tril（A，-1））'; 
 all_pairs = [idxs1，idxs2]; ％这包含所有可能的对
 
 idx_to_use = randperm（size（all_pairs，1），n）; ％选择随机n对
对= all_pairs（idx_to_use，:) 
 
对= 
 
 254 414 
 247 334 
 111 146 
 207 297 
 45 390 
 229 411 
 9 16 
 75 395 
 12 338 
 25 442

然而，矩阵 A 的大小 mxm ，意思是当 m 变大（例如超过10,000）时，MATLAB内存不足。

<我考虑生成一堆随机数 randi（m，[n，2]），并反复拒绝重复的行，但我担心陷入困境 n 接近 m（m-1）/ 2 时的循环。

是否有更简单，更清晰的方法来生成唯一的不同整数对？

解决方案

以正确的方式查看时容易，轻松。

你希望生成n对整数，[p，q]，这样p和q位于区间[1，m]和p

有多少个可能的对？对的总数仅为m *（m-1）/ 2。（即，从1到m-1的数字之和。）

因此我们可以生成范围为[1，m *（m-1）的n个随机整数）/ 2]。 Randperm做得很好。（较旧的matlab版本不允许第二个参数为randperm。）

  k = randperm（m / 2 *（m-1） ）中，n）;

（请注意，我用滑稽的方式用m写了这个表达式，或许除以2一个奇怪的地方。这可以避免某些m值接近上限的精度问题。）

现在，如果我们将每个可能的对[p，q]与其中一个相关联在k中的整数，我们可以向后工作，从k中生成的整数到一对[p，q]。因此，该列表中的前几对是：

  {[1,2]，[1,3]，[2， 3]，[1,4]，[2,4]，[3,4]，...，[m-1，m]}

我们可以把它们看作是一个严格的上三角形数组中的元素，大小为m乘m，因此那些元素位于主对角线上方。

  q = floor（sqrt（8 *（k-1）+ 1）/ 2 + 1/2）; 
 p = k  -  q。*（q-1）/ 2;

看到这些公式从k中的展开元素中恢复p和q。我们可以说服自己这确实有效，但也许一个简单的方法就是这个测试：

  k = 1:21 ; 
 q = floor（sqrt（8 *（k-1）+ 1）/ 2 + 3/2）; 
 p = k  - （q-1）。*（q-2）/ 2; 
 [k; p; q]'
 
 ans = 
 1 1 2 
 2 1 3 
 3 2 3 
 4 1 4 
 5 2 4 
 6 3 4 
 7 1 5 
 8 2 5 
 9 3 5 
 10 4 5 
 11 1 6 
 12 2 6 
 13 3 6 
 14 4 6 
 15 5 6 
 16 1 7 
 17 2 7 
 18 3 7 
 19 4 7 
 20 5 7 
 21 6 7

另一种方式测试它是为了表明所有对都是为一个小案例生成的。

  m = 5; 
 n = 10; 
 k = randperm（m / 2 *（m-1），n）; 
 q = floor（sqrt（8 *（k-1）+ 1）/ 2 + 3/2）; 
 p = k  - （q-1）。*（q-2）/ 2; 
 
 sortrows（[p; q]'，[2 1]）$ b $ b ans = 
 1 2 
 1 3 
 2 3 
 1 4 
 2 4 
 3 4 
 1 5 
 2 5 
 3 5 
 4 5

是的，看起来一切都很完美。现在尝试使用m和n的一些大数来测试使用的时间。

  tic 
 m = 1e6; 
 n = 100000; 
 k = randperm（m / 2 *（m-1），n）; 
 q = floor（sqrt（8 *（k-1）+ 1）/ 2 + 3/2）; 
 p = k  - （q-1）。*（q-2）/ 2; 
 toc 
 
经过的时间是0.014689秒。

此方案适用于大约1e8的m，因为双精度错误导致失败精确。在m / 2 *（m-1）超过2 ^ 53之前，精确限制应为m不大于134217728。一个很好的功能是不需要拒绝重复对。

In MATLAB, I would like to generate n pairs of random integers in the range [1, m], where each pair is unique. For uniqueness, I consider the order of the numbers in the pair to be irrelevant such that [3, 10] is equal to [10, 3]. Also, each pair should consist of two distinct integers; i.e. [3, 4] is ok but [3, 3] would be rejected. EDIT: Each possible pair should be chosen with equal likelihood.

(Obviously a constraint on the parameters is that n <= m(m-1)/2.)

I have been able to successfully do this when m is small, like so:

m = 500; n = 10;                   % setting parameters

A = ((1:m)'*ones(1, m));           % each column has the numbers 1 -> m
idxs1 = squareform(tril(A', -1))'; 
idxs2 = squareform(tril(A, -1))';   
all_pairs = [idxs1, idxs2];        % this contains all possible pairs

idx_to_use = randperm( size(all_pairs, 1), n );  % choosing random n pairs
pairs = all_pairs(idx_to_use, :)       

pairs =

   254   414
   247   334
   111   146
   207   297
    45   390
   229   411
     9    16
    75   395
    12   338
    25   442

However, the matrix A is of size m x m, meaning when m becomes large (e.g. upwards of 10,000), MATLAB runs out of memory.

I considered generating a load of random numbers randi(m, [n, 2]), and repeatedly rejecting the rows which repeated, but I was concerned about getting stuck in a loop when n was close to m(m-1)/2.

Is there an easier, cleaner way of generating unique pairs of distinct integers?

解决方案

Easy, peasy, when viewed in the proper way.

You wish to generate n pairs of integers, [p,q], such that p and q lie in the interval [1,m], and p

How many possible pairs are there? The total number of pairs is just m*(m-1)/2. (I.e., the sum of the numbers from 1 to m-1.)

So we could generate n random integers in the range [1,m*(m-1)/2]. Randperm does this nicely. (Older matlab releases do not allow the second argument to randperm.)

k = randperm(m/2*(m-1),n);

(Note that I've written this expression with m in a funny way, dividing by 2 in perhaps a strange place. This avoids precision problems for some values of m near the upper limits.)

Now, if we associate each possible pair [p,q] with one of the integers in k, we can work backwards, from the integers generated in k, to a pair [p,q]. Thus the first few pairs in that list are:

{[1,2], [1,3], [2,3], [1,4], [2,4], [3,4], ..., [m-1,m]}

We can think of them as the elements in a strictly upper triangular array of size m by m, thus those elements above the main diagonal.

q = floor(sqrt(8*(k-1) + 1)/2 + 1/2);
p = k - q.*(q-1)/2;

See that these formulas recover p and q from the unrolled elements in k. We can convince ourselves that this does indeed work, but perhaps a simple way here is just this test:

k = 1:21;
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
[k;p;q]'

ans =
     1     1     2
     2     1     3
     3     2     3
     4     1     4
     5     2     4
     6     3     4
     7     1     5
     8     2     5
     9     3     5
    10     4     5
    11     1     6
    12     2     6
    13     3     6
    14     4     6
    15     5     6
    16     1     7
    17     2     7
    18     3     7
    19     4     7
    20     5     7
    21     6     7

Another way of testing it is to show that all pairs get generated for a small case.

m = 5;
n = 10;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;

sortrows([p;q]',[2 1])
ans =
     1     2
     1     3
     2     3
     1     4
     2     4
     3     4
     1     5
     2     5
     3     5
     4     5

Yup, it looks like everything works perfectly. Now try it for some large numbers for m and n to test the time used.

tic
m = 1e6;
n = 100000;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
toc

Elapsed time is 0.014689 seconds.

This scheme will work for m as large as roughly 1e8, before it fails due to precision errors in double precision. The exact limit should be m no larger than 134217728 before m/2*(m-1) exceeds 2^53. A nice feature is that no rejection for repeat pairs need be done.

这篇关于有效地生成唯一的整数对的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

有效地生成唯一的整数对 [英] Efficiently generating unique pairs of integers

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

有效地生成唯一的整数对 [英] Efficiently generating unique pairs of integers

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭