什么会导致算法具有 O(log n) 复杂度? [英] What would cause an algorithm to have O(log n) complexity?

查看:23
本文介绍了什么会导致算法具有 O(log n) 复杂度?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对 big-O 的了解是有限的,当等式中出现对数项时,它让我更加失望.

My knowledge of big-O is limited, and when log terms show up in the equation it throws me off even more.

有人可以简单地向我解释什么是 O(log n) 算法吗?对数从何而来?

Can someone maybe explain to me in simple terms what a O(log n) algorithm is? Where does the logarithm come from?

当我试图解决这个期中练习题时,我特别想到了这一点:

This specifically came up when I was trying to solve this midterm practice question:

让 X(1..n) 和 Y(1..n) 包含两个整数列表,每个列表都按非降序排序.给出一个 O(log n) 时间算法来找到所有 2n 个组合元素的中位数(或第 n 个最小整数).例如,X = (4, 5, 7, 8, 9) 和 Y = (3, 5, 8, 9, 10),则 7 是组合列表 (3, 4, 5, 5, 7, 8, 8, 9, 9, 10).[提示:使用二分查找的概念]

Let X(1..n) and Y(1..n) contain two lists of integers, each sorted in nondecreasing order. Give an O(log n)-time algorithm to find the median (or the nth smallest integer) of all 2n combined elements. For ex, X = (4, 5, 7, 8, 9) and Y = (3, 5, 8, 9, 10), then 7 is the median of the combined list (3, 4, 5, 5, 7, 8, 8, 9, 9, 10). [Hint: use concepts of binary search]

推荐答案

我必须同意,当您第一次看到 O(log n) 算法时,这很奇怪……这个对数到底是从哪里来的?然而,事实证明,有几种不同的方法可以使对数项以大 O 表示法显示.以下是一些:

I have to agree that it's pretty weird the first time you see an O(log n) algorithm... where on earth does that logarithm come from? However, it turns out that there's several different ways that you can get a log term to show up in big-O notation. Here are a few:

取任意数n;比如说, 16. 在得到一个小于或等于 1 的数之前,你可以将 n 除以 2 多少次?对于 16,我们有

Take any number n; say, 16. How many times can you divide n by two before you get a number less than or equal to one? For 16, we have that

16 / 2 = 8
 8 / 2 = 4
 4 / 2 = 2
 2 / 2 = 1

请注意,这最终需要四个步骤才能完成.有趣的是,我们还有 log2 16 = 4.嗯……128 怎么样?

Notice that this ends up taking four steps to complete. Interestingly, we also have that log2 16 = 4. Hmmm... what about 128?

128 / 2 = 64
 64 / 2 = 32
 32 / 2 = 16
 16 / 2 = 8
  8 / 2 = 4
  4 / 2 = 2
  2 / 2 = 1

这需要七步,log2 128 = 7.这是巧合吗?不!这是有充分理由的.假设我们将一个数 n 除以 2 i 次.然后我们得到数字 n/2i.如果我们想求解 i 的值,其中该值最多为 1,我们得到

This took seven steps, and log2 128 = 7. Is this a coincidence? Nope! There's a good reason for this. Suppose that we divide a number n by 2 i times. Then we get the number n / 2i. If we want to solve for the value of i where this value is at most 1, we get

n/2i ≤1

n ≤2i

log2 n ≤我

log2 n ≤ i

换句话说,如果我们选择一个整数 i 使得 i ≥log2 n,然后在将 n 一分为二 i 次后,我们将得到一个至多为 1 的值.可以保证的最小 i 大致为 log2n,所以如果我们有一个算法除以 2 直到数字变得足够小,那么我们可以说它以 O(log n) 步结束.

In other words, if we pick an integer i such that i ≥ log2 n, then after dividing n in half i times we'll have a value that is at most 1. The smallest i for which this is guaranteed is roughly log2 n, so if we have an algorithm that divides by 2 until the number gets sufficiently small, then we can say that it terminates in O(log n) steps.

一个重要的细节是,你用什么常数除 n 并不重要(只要它大于 1);如果除以常数 k,则需要 logk n 步才能达到 1.因此,任何将输入大小重复除以某个分数的算法都需要 O(log n) 次迭代才能终止.这些迭代可能需要很多时间,因此净运行时间不必是 O(log n),但步数将是对数的.

An important detail is that it doesn't matter what constant you're dividing n by (as long as it's greater than one); if you divide by the constant k, it will take logk n steps to reach 1. Thus any algorithm that repeatedly divides the input size by some fraction will need O(log n) iterations to terminate. Those iterations might take a lot of time and so the net runtime needn't be O(log n), but the number of steps will be logarithmic.

那么这是从哪里来的呢?一个经典示例是二分搜索,这是一种用于搜索已排序的快速算法一个值的数组.该算法的工作原理如下:

So where does this come up? One classic example is binary search, a fast algorithm for searching a sorted array for a value. The algorithm works like this:

  • 如果数组为空,则返回该元素不存在于数组中.
  • 否则:
    • 查看数组的中间元素.
    • 如果它等于我们要查找的元素,则返回成功.
    • 如果它大于我们要查找的元素:
      • 扔掉数组的后半部分.
      • 重复
      • 扔掉数组的前半部分.
      • 重复

      例如在数组中搜索5

      1   3   5   7   9   11   13
      

      我们首先看中间元素:

      1   3   5   7   9   11   13
                  ^
      

      由于 7 > 5,而且由于数组已排序,因此我们知道数字 5 不能位于数组的后半部分,因此我们可以将其丢弃.这离开

      Since 7 > 5, and since the array is sorted, we know for a fact that the number 5 can't be in the back half of the array, so we can just discard it. This leaves

      1   3   5
      

      所以现在我们看看这里的中间元素:

      So now we look at the middle element here:

      1   3   5
          ^
      

      由于 3 <5、我们知道5不能出现在数组的前半部分,所以我们可以把前半部分的数组扔掉

      Since 3 < 5, we know that 5 can't appear in the first half of the array, so we can throw the first half array to leave

              5
      

      我们再看看这个数组的中间:

      Again we look at the middle of this array:

              5
              ^
      

      因为这正是我们要查找的数字,所以我们可以报告数组中确实有 5.

      Since this is exactly the number we're looking for, we can report that 5 is indeed in the array.

      那么它的效率如何?好吧,在每次迭代中,我们至少丢弃了剩余数组元素的一半.一旦数组为空或者我们找到了我们想要的值,算法就会停止.在最坏的情况下,元素不存在,所以我们一直将数组的大小减半,直到用完元素.这需要多长时间?好吧,由于我们一遍又一遍地将数组切成两半,我们最多将在 O(log n) 次迭代中完成,因为在运行之前我们不能将数组切成两半以上 O(log n) 次超出数组元素.

      So how efficient is this? Well, on each iteration we're throwing away at least half of the remaining array elements. The algorithm stops as soon as the array is empty or we find the value we want. In the worst case, the element isn't there, so we keep halving the size of the array until we run out of elements. How long does this take? Well, since we keep cutting the array in half over and over again, we will be done in at most O(log n) iterations, since we can't cut the array in half more than O(log n) times before we run out of array elements.

      遵循分而治之的一般技术的算法(将问题分成几部分,解决这些部分,然后将问题重新组合起来)出于同样的原因,它们中往往包含对数项 - 您不能将某个对象切成一半以上的 O(log n) 次.您可能希望将 merge sort 视为一个很好的例子.

      Algorithms following the general technique of divide-and-conquer (cutting the problem into pieces, solving those pieces, then putting the problem back together) tend to have logarithmic terms in them for this same reason - you can't keep cutting some object in half more than O(log n) times. You might want to look at merge sort as a great example of this.

      基数为 10 的数字 n 中有多少位数字?好吧,如果数字中有 k 位,那么最大的数字是 10k 的倍数.最大的 k 位数字是 999...9,k 次,这等于 10k + 1 - 1.因此,如果我们知道 n 中有 k 位,那么我们知道 n 的值最多为 10k + 1 - 1.如果我们想用 n 来求解 k,我们得到

      How many digits are in the base-10 number n? Well, if there are k digits in the number, then we'd have that the biggest digit is some multiple of 10k. The largest k-digit number is 999...9, k times, and this is equal to 10k + 1 - 1. Consequently, if we know that n has k digits in it, then we know that the value of n is at most 10k + 1 - 1. If we want to solve for k in terms of n, we get

      n ≤10k+1 - 1

      n ≤ 10k+1 - 1

      n + 1 ≤10k+1

      n + 1 ≤ 10k+1

      log10 (n + 1) ≤k + 1

      log10 (n + 1) ≤ k + 1

      (log10 (n + 1)) - 1 ≤k

      (log10 (n + 1)) - 1 ≤ k

      从中我们得到 k 大约是 n 的以 10 为底的对数.换句话说,n中的位数是O(log n).

      From which we get that k is approximately the base-10 logarithm of n. In other words, the number of digits in n is O(log n).

      例如,让我们考虑将两个太大而无法放入机器字的大数相加的复杂性.假设我们将这些数字以 10 为基数表示,我们将这些数字称为 m 和 n.添加它们的一种方法是通过小学方法 - 一次写出一位数字,然后从右向左计算.例如,要添加 1337 和 2065,我们首先将数字写为

      For example, let's think about the complexity of adding two large numbers that are too big to fit into a machine word. Suppose that we have those numbers represented in base 10, and we'll call the numbers m and n. One way to add them is through the grade-school method - write the numbers out one digit at a time, then work from the right to the left. For example, to add 1337 and 2065, we'd start by writing the numbers out as

          1  3  3  7
      +   2  0  6  5
      ==============
      

      我们添加最后一位数字并携带 1:

      We add the last digit and carry the 1:

                1
          1  3  3  7
      +   2  0  6  5
      ==============
                   2
      

      然后我们添加倒数第二个(倒数第二个")数字并携带 1:

      Then we add the second-to-last ("penultimate") digit and carry the 1:

             1  1
          1  3  3  7
      +   2  0  6  5
      ==============
                0  2
      

      接下来,我们添加倒数第三个(倒数第二个")数字:

      Next, we add the third-to-last ("antepenultimate") digit:

             1  1
          1  3  3  7
      +   2  0  6  5
      ==============
             4  0  2
      

      最后,我们添加倒数第四个(preantepenultimate"...我爱英语)数字:

      Finally, we add the fourth-to-last ("preantepenultimate"... I love English) digit:

             1  1
          1  3  3  7
      +   2  0  6  5
      ==============
          3  4  0  2
      

      现在,我们做了多少工作?我们对每个数字总共做了 O(1) 个工作(即恒定的工作量),并且总共有 O(max{log n, log m}) 个数字需要处理.这给出了总共 O(max{log n, log m}) 的复杂度,因为我们需要访问两个数字中的每个数字.

      Now, how much work did we do? We do a total of O(1) work per digit (that is, a constant amount of work), and there are O(max{log n, log m}) total digits that need to be processed. This gives a total of O(max{log n, log m}) complexity, because we need to visit each digit in the two numbers.

      许多算法通过在某个基数中一次处理一位数字来获得 O(log n) 项.一个经典的例子是基数排序,它以一个数字对整数进行排序时间.基数排序有很多种,但它们通常在 O(n log U) 时间内运行,其中 U 是被排序的最大可能整数.这样做的原因是排序的每次传递需要 O(n) 时间,并且总共需要 O(log U) 次迭代来处理被排序的最大数的每个 O(log U) 位.许多高级算法,例如 Gabow 的最短路径算法Ford-Fulkerson 最大流算法的缩放版本,在其复杂性中有一个对数项因为他们一次只处理一位数.

      Many algorithms get an O(log n) term in them from working one digit at a time in some base. A classic example is radix sort, which sorts integers one digit at a time. There are many flavors of radix sort, but they usually run in time O(n log U), where U is the largest possible integer that's being sorted. The reason for this is that each pass of the sort takes O(n) time, and there are a total of O(log U) iterations required to process each of the O(log U) digits of the largest number being sorted. Many advanced algorithms, such as Gabow's shortest-paths algorithm or the scaling version of the Ford-Fulkerson max-flow algorithm, have a log term in their complexity because they work one digit at a time.

      关于你如何解决这个问题的第二个问题,你可能想看看这个相关问题 探索更高级的应用程序.鉴于此处描述的问题的一般结构,当您知道结果中存在对数项时,您现在可以更好地了解如何思考问题,因此我建议您在给出答案之前不要查看答案一些想法.

      As to your second question about how you solve that problem, you may want to look at this related question which explores a more advanced application. Given the general structure of problems that are described here, you now can have a better sense of how to think about problems when you know there's a log term in the result, so I would advise against looking at the answer until you've given it some thought.

      希望这有帮助!

      这篇关于什么会导致算法具有 O(log n) 复杂度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆