迭代算法的时间复杂度 [英] Time complexity of an iterative algorithm

查看:445
本文介绍了迭代算法的时间复杂度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试查找此算法的时间复杂度.

I am trying to find the Time Complexity of this algorithm.

迭代:算法从输入的位串生成给定汉明距离内的所有位串.它会生成所有递增的序列0 <= a[0] < ... < a[dist-1] < strlen(num),并还原相应索引处的位.

The iterative: algorithm produces all the bit-strings within a given Hamming distance, from the input bit-string. It generates all increasing sequences 0 <= a[0] < ... < a[dist-1] < strlen(num), and reverts bits at corresponding indices.

向量a应该保留必须反转其位的索引.因此,如果a包含当前索引i,我们将输出1而不是0,反之亦然.否则,我们按原样打印该位(请参见else-part),如下所示:

The vector a is supposed to keep indices for which bits have to be inverted. So if a contains the current index i, we print 1 instead of 0 and vice versa. Otherwise we print the bit as is (see else-part), as shown below:

// e.g. hamming("0000", 2);
void hamming(const char* num, size_t dist) {
    assert(dist > 0);
    vector<int> a(dist);
    size_t k = 0, n = strlen(num);
    a[k] = -1;
    while (true)
        if (++a[k] >= n)
            if (k == 0)
                return;
            else {
                --k;
                continue;
            }
        else
            if (k == dist - 1) {
                // this is an O(n) operation and will be called
                // (n choose dist) times, in total.
                print(num, a);
            }
            else {
                a[k+1] = a[k];
                ++k;
            }
}

此算法的时间复杂度是多少?

What is the Time Complexity of this algorithm?

我的尝试说:

dist * n +(n选择t)* n + 2

dist * n + (n choose t) * n + 2

但这似乎不是事实,请考虑以下示例,所有示例都带有dist = 2:

but this seems not to be true, consider the following examples, all with dist = 2:

len = 3, (3 choose 2) = 3 * O(n), 10 while iterations
len = 4, (4 choose 2) = 6 * O(n), 15 while iterations
len = 5, (5 choose 2) = 9 * O(n), 21 while iterations
len = 6, (6 choose 2) = 15 * O(n), 28 while iterations

这是两个有代表性的运行(打印将在循环开始时进行):

Here are two representative runs (with the print to be happening at the start of the loop):

000, len = 3
k = 0, total_iter = 1
vector a = -1 0 
k = 1, total_iter = 2
vector a = 0 0 
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1 
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2 
k = 0, total_iter = 5
vector a = 0 3 
k = 1, total_iter = 6
vector a = 1 1 
Paid O(n)
k = 1, total_iter = 7
vector a = 1 2 
k = 0, total_iter = 8
vector a = 1 3 
k = 1, total_iter = 9
vector a = 2 2 
k = 0, total_iter = 10
vector a = 2 3 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gsamaras@pythagoras:~/Desktop/generate_bitStrings_HammDistanceT$ ./iter
0000, len = 4
k = 0, total_iter = 1
vector a = -1 0 
k = 1, total_iter = 2
vector a = 0 0 
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1 
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2 
Paid O(n)
k = 1, total_iter = 5
vector a = 0 3 
k = 0, total_iter = 6
vector a = 0 4 
k = 1, total_iter = 7
vector a = 1 1 
Paid O(n)
k = 1, total_iter = 8
vector a = 1 2 
Paid O(n)
k = 1, total_iter = 9
vector a = 1 3 
k = 0, total_iter = 10
vector a = 1 4 
k = 1, total_iter = 11
vector a = 2 2 
Paid O(n)
k = 1, total_iter = 12
vector a = 2 3 
k = 0, total_iter = 13
vector a = 2 4 
k = 1, total_iter = 14
vector a = 3 3 
k = 0, total_iter = 15
vector a = 3 4 

推荐答案

while循环有些聪明和微妙,可以说它正在做两种不同的事情(如果算上a的初始化,甚至是三种).这就是使您的复杂度计算具有挑战性的原因,并且效率也比以前低.

The while loop is somewhat clever and subtle, and it's arguable that it's doing two different things (or even three if you count the initialisation of a). That's what's making your complexity calculations challenging, and it's also less efficient than it could be.

抽象地讲,要从当前索引中递增地计算下一组索引,其想法是找到小于n-dist+i的最后一个索引i,对其进行递增,然后将以下索引设置为a[i]+1a[i]+2等.

In the abstract, to incrementally compute the next set of indices from the current one, the idea is to find the last index, i, that's less than n-dist+i, increment it, and set the following indexes to a[i]+1, a[i]+2, and so on.

例如,如果dist = 5,n = 11,而您的索引是:

For example, if dist=5, n=11 and your indexes are:

0, 3, 5, 9, 10

然后5是小于n-dist+i的最后一个值(因为n-dist是6,并且10 = 6 + 4,9 = 6 + 3,但是5 <6 + 2).

Then 5 is the last value less than n-dist+i (because n-dist is 6, and 10=6+4, 9=6+3, but 5<6+2).

因此,我们增加5,并设置后续整数以获取索引集:

So we increment 5, and set the subsequent integers to get the set of indexes:

0, 3, 6, 7, 8

现在假设k=4

0, 3, 5, 9, 10

  • a[k] + 1为11,所以k变为3.
  • ++a[k]为10,所以a[k+1]变为10,而k变为4.
  • ++a[k]为11,所以k变为3.
  • ++a[k]为11,所以k变为2.
  • ++a[k]是6,所以a[k+1]变成6,k变成3.
  • ++a[k]为7,所以a[k+1]变为7,而k变为4.
  • ++a[k]为8,我们继续调用print函数.
    • a[k] + 1 is 11, so k becomes 3.
    • ++a[k] is 10, so a[k+1] becomes 10, and k becomes 4.
    • ++a[k] is 11, so k becomes 3.
    • ++a[k] is 11, so k becomes 2.
    • ++a[k] is 6, so a[k+1] becomes 6, and k becomes 3.
    • ++a[k] is 7, so a[k+1] becomes 7, and k becomes 4.
    • ++a[k] is 8, and we continue to call the print function.
    • 此代码是正确的,但效率不高,因为k在搜索可以递增而不会引起较高索引溢出的最高索引时会前后移动.实际上,如果最高索引从末尾开始是j,则代码将使用while循环的非线性数字迭代.如果您跟踪n==dist对于n的不同值时,while循环发生了多少次迭代,则可以轻松地自己演示一下.输出只有一行,但是您会看到迭代次数增加了O(2 ^ n)(实际上,您会看到2 ^(n + 1)-2个迭代).

      This code is correct, but it's not efficient because k scuttles backwards and forwards as it's searching for the highest index that can be incremented without causing an overflow in the higher indices. In fact, if the highest index is j from the end, the code uses a non-linear number iterations of the while loop. You can easily demonstrate this yourself if you trace how many iterations of the while loop occur when n==dist for different values of n. There is exactly one line of output, but you'll see an O(2^n) growth in the number of iterations (in fact, you'll see 2^(n+1)-2 iterations).

      这种划分会使您的代码不必要地效率低下,并且也难以分析.

      This scuttling makes your code needlessly inefficient, and also hard to analyse.

      相反,您可以以更直接的方式编写代码:

      Instead, you can write the code in a more direct way:

      void hamming2(const char* num, size_t dist) {
          int a[dist];
          for (int i = 0; i < dist; i++) {
              a[i] = i;
          }
          size_t n = strlen(num);
          while (true) {
              print(num, a);
              int i;
              for (i = dist - 1; i >= 0; i--) {
                  if (a[i] < n - dist + i) break;
              }
              if (i < 0) return;
              a[i]++;
              for (int j = i+1; j<dist; j++) a[j] = a[i] + j - i;
          }
      }
      

      现在,每次通过while循环时,都会产生一组新的索引.每次迭代的确切成本并不简单,但是由于print为O(n),而while循环中的其余代码处于最差的O(dist),因此总成本为O(N_INCR_SEQ(n,dist)* n ),其中N_INCR_SEQ(n,dist)是自然数<的递增序列数.长度dist的n.评论中的某人提供了一个链接,该链接为此提供了公式.

      Now, each time through the while loop produces a new set of indexes. The exact cost per iteration is not straightforward, but since print is O(n), and the remaining code in the while loop is at worst O(dist), the overall cost is O(N_INCR_SEQ(n, dist) * n), where N_INCR_SEQ(n, dist) is the number of increasing sequences of natural numbers < n of length dist. Someone in the comments provides a link that gives a formula for this.

      这篇关于迭代算法的时间复杂度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆