Codility峰复杂性 [英] Codility Peaks Complexity

查看:132
本文介绍了Codility峰复杂性的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我刚刚做了以下Codility 的问题。的问题如下:

I've just done the following Codility Peaks problem. The problem is as follows:

一个非空的零索引数组A由N个整数给出。 一峰比邻国更大的数组元素。更precisely,它是一个索引p,使得0℃ P< N - 1,A [P - 1]; A [P]和A [P]> A [P + 1]。 例如,下面的数组答:

A non-empty zero-indexed array A consisting of N integers is given. A peak is an array element which is larger than its neighbors. More precisely, it is an index P such that 0 < P < N − 1, A[P − 1] < A[P] and A[P] > A[P + 1]. For example, the following array A:

A[0] = 1
A[1] = 2
A[2] = 3
A[3] = 4
A[4] = 3
A[5] = 4
A[6] = 1
A[7] = 2
A[8] = 3
A[9] = 4
A[10] = 6
A[11] = 2

正好有三个高峰:3,5,10。 我们希望将这个数组到含有相同的元素数目的块。更多precisely,我们要选一个数K,将产生以下块: A [0],A [1],...,A [K - 1] A [K],A [K + 1],...,A [2K - 1] ... A [N - K],A [N - K + 1],...,A [N - 1]。 更重要的是,每块应该包含至少一个峰。注意块的极端元素(例如阿[K - 1]或甲[K])也可以是峰值,但只有当他们有与两个相邻(包括一个在相邻的块)。 的目标是找到块到其中的数组A可以分的最大​​数量。 数组A可分为块如下:

has exactly three peaks: 3, 5, 10. We want to divide this array into blocks containing the same number of elements. More precisely, we want to choose a number K that will yield the following blocks: A[0], A[1], ..., A[K − 1], A[K], A[K + 1], ..., A[2K − 1], ... A[N − K], A[N − K + 1], ..., A[N − 1]. What's more, every block should contain at least one peak. Notice that extreme elements of the blocks (for example A[K − 1] or A[K]) can also be peaks, but only if they have both neighbors (including one in an adjacent blocks). The goal is to find the maximum number of blocks into which the array A can be divided. Array A can be divided into blocks as follows:

1块(1,2,3,4,3,4,1,2,3,4,6,2)。该模块包含三个峰值。

one block (1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2). This block contains three peaks.

两个块(1,2,3,4,3,4)和(1,2,3,4,6,2)。每块都有一个峰值。

two blocks (1, 2, 3, 4, 3, 4) and (1, 2, 3, 4, 6, 2). Every block has a peak.

三个块(1,2,3,4),(3,4,1,2),(3,4,6,2)。每块都有一个高峰。

three blocks (1, 2, 3, 4), (3, 4, 1, 2), (3, 4, 6, 2). Every block has a peak.

特别注意第一个块(1,2,3,4)具有A处的峰值[3],因为A〔2〕&其中; A [3]> A [4],即使[4]是在相邻块。 然而,阵列A不能被划分成四个区块,(1,2,3),(4,3,4),(1,2,3),(4,6,2),由于(1,2, 3)块不包含的峰。特别注意的(4,3,4)块包含两个峰:A [3]和A [5]。 块数组A可以分成的最大数目为3。

Notice in particular that the first block (1, 2, 3, 4) has a peak at A[3], because A[2] < A[3] > A[4], even though A[4] is in the adjacent block. However, array A cannot be divided into four blocks, (1, 2, 3), (4, 3, 4), (1, 2, 3) and (4, 6, 2), because the (1, 2, 3) blocks do not contain a peak. Notice in particular that the (4, 3, 4) block contains two peaks: A[3] and A[5]. The maximum number of blocks that array A can be divided into is three.

写一个函数: 一流的解决方案{公众诠释解决方案(INT [] A); } 即,给定一个非空零索引数组甲选自N个整数的,返回到块,其中A可以分的最大​​数量。 如果A不能分割成块的一些数,函数应该返回0。 例如,给定:

Write a function: class Solution { public int solution(int[] A); } that, given a non-empty zero-indexed array A consisting of N integers, returns the maximum number of blocks into which A can be divided. If A cannot be divided into some number of blocks, the function should return 0. For example, given:

A[0] = 1
A[1] = 2 
A[2] = 3 
A[3] = 4 
A[4] = 3 
A[5] = 4 
A[6] = 1 
A[7] = 2 
A[8] = 3 
A[9] = 4 
A[10] = 6 
A[11] = 2

函数应该返回3,如以上所解释。 假设:

the function should return 3, as explained above. Assume that:

N是范围[1..100,000]内的整数。 数组A的每个元素的范围[0..1,000,000,000]内的整数。

N is an integer within the range [1..100,000]; each element of array A is an integer within the range [0..1,000,000,000].

复杂度:

预计最坏情况下的时间复杂度为O(N *日志(日志(N)))

expected worst-case time complexity is O(N*log(log(N)))

预计最坏情况下的空间复杂度为O(N),超出输入存储(不包括所需的输入参数的存储)。

expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).

输入数组的元素可以被修改。

Elements of input arrays can be modified.

所以,我解决这个问题有什么对我来说似乎是蛮力解决方案 - 通过从 1..1 每个组的大小,并检查各组是否具有至少有一个峰。第15分钟,我想解决这个问题,我试图找出一些更优化的方式,因为所需的复杂度为O(N *日志(日志(N)))。

So I solve this with what to me appears to be the brute force solution – go through every group size from 1..N, and check whether every group has at least one peak. The first 15 minutes I was trying to solve this I was trying to figure out some more optimal way, since the required complexity is O(N*log(log(N))).

这是我的蛮力code,它通过了所有测试,包括路数,对于一个得分100/100:

This is my "brute-force" code that passes all the tests, including the large ones, for a score of 100/100:

public int solution(int[] A) {
    int N = A.length;

    ArrayList<Integer> peaks = new ArrayList<Integer>();
    for(int i = 1; i < N-1; i++){
        if(A[i] > A[i-1] && A[i] > A[i+1]) peaks.add(i);
    }

    for(int size = 1; size <= N; size++){
        if(N % size != 0) continue;
        int find = 0;
        int groups = N/size;
        boolean ok = true;
        for(int peakIdx : peaks){
            if(peakIdx/size > find){
                ok = false;
                break;
            }
            if(peakIdx/size == find) find++;
        }
        if(find != groups) ok = false;
        if(ok) return groups;
    }

    return 0;
}

我的问题是我怎么推断,这其实就是O(N *日志(日志(N))),因为它不是在所有明显的给我,我很惊讶,我通过测试的情况。我在寻找哪怕是最简单的复杂性证明的草图,将说服这个运行时的我。我会假设,一个日志(日志(N))系数是指某种通过在每个迭代平方根减少的问题,但我不知道如何适用于我的问题。非常感谢您的帮助

My question is how do I deduce that this is in fact O(N*log(log(N))), as it's not at all obvious to me, and I was surprised I pass the test cases. I'm looking for even the simplest complexity proof sketch that would convince me of this runtime. I would assume that a log(log(N)) factor means some kind of reduction of a problem by a square root on each iteration, but I have no idea how this applies to my problem. Thanks a lot for any help

推荐答案

您是完全正确的:得到这个问题需要降低日志记录的性能。

You're completely right: to get the log log performance the problem needs to be reduced.

一个n.log(的log(n))解决方案。在这个问题上(!),但蟒蛇解分数100%的准确性Codility不再考业绩。

A n.log(log(n)) solution in python [below]. Codility no longer test 'performance' on this problem (!) but the python solution scores 100% for accuracy.

正如你已经猜到: 外环将是为O(n),因为它正在测试块的每个大小是否干净除数 内环必须是O(日志(日志(N)))即可得到O(N日志(的log(n)))的整体。

As you've already surmised: Outer loop will be O(n) since it is testing whether each size of block is a clean divisor Inner loop must be O(log(log(n))) to give O(n log(log(n))) overall.

我们可以得到良好的内循环的表现,因为我们只需要执行D(N)N的约数,人数。我们可以存储的preFIX总和的峰那么远的,它使用由所述问题规范允许的为O(n)的空间。检查的峰是否发生在每个'组'是接着一个O(1)查找操作使用组的开始和结束的索引

We can get good inner loop performance because we only need to perform d(n), the number of divisors of n. We can store a prefix sum of peaks-so-far, which uses the O(n) space allowed by the problem specification. Checking whether a peak has occurred in each 'group' is then an O(1) lookup operation using the group start and end indices.

按照这一逻辑,当候选块大小为3循环需要执行N / 3的峰值检查。的复杂度变的总和:N / A + N / B + ... + N / n其中分母(A,B,......)都n的因子。

Following this logic, when the candidate block size is 3 the loop needs to perform n / 3 peak checks. The complexity becomes a sum: n/a + n/b + ... + n/n where the denominators (a, b, ...) are the factors of n.

简单地说:钕(N)操作的复杂度为O(n.log(的log(n)))。

Short story: The complexity of n.d(n) operations is O(n.log(log(n))).

更长的版本: 如果你一直在做的Codility经验,你会记得从 第8课: Prime和合数 的谐波次数操作的总和会给O(日志(N))的复杂性。我们有一组减少,因为我们只是在寻找系数的分母。 第9课:​​埃拉托色尼的筛 显示了总和素数的倒数是为O(log(日志(N))),并声称证明是不平凡。在这种情况下 维基百科 告诉我们,除数西格玛的总和(N )有一个上限(见罗宾的不平等,大约一半在页面)。

Longer version: If you've been doing the Codility Lessons you'll remember from the Lesson 8: Prime and composite numbers that the sum of harmonic number operations will give O(log(n)) complexity. We've got a reduced set, because we're only looking at factor denominators. Lesson 9: Sieve of Eratosthenes shows how the sum of reciprocals of primes is O(log(log(n))) and claims that 'the proof is non-trivial'. In this case Wikipedia tells us that the sum of divisors sigma(n) has an upper bound (see Robin's inequality, about half way down the page).

这是否完全回答你的问题?还就如何提高我的蟒蛇code建议非常欢迎!

Does that completely answer your question? Suggestions on how to improve my python code are also very welcome!

def solution(data):

    length = len(data)

    # array ends can't be peaks, len < 3 must return 0    
    if len < 3:
        return 0

    peaks = [0] * length

    # compute a list of 'peaks to the left' in O(n) time
    for index in range(2, length):
        peaks[index] = peaks[index - 1]

        # check if there was a peak to the left, add it to the count
        if data[index - 1] > data[index - 2] and data[index - 1] > data[index]:
            peaks[index] += 1

    # candidate is the block size we're going to test
    for candidate in range(3, length + 1):

        # skip if not a factor
        if length % candidate != 0:
            continue

        # test at each point n / block
        valid = True
        index = candidate
        while index != length:

            # if no peak in this block, break
            if peaks[index] == peaks[index - candidate]:
                valid = False
                break

            index += candidate

        # one additional check since peaks[length] is outside of array    
        if index == length and peaks[index - 1] == peaks[index - candidate]:
            valid = False

        if valid:
            return length / candidate

    return 0

现金: 主要荣誉,以@tmyklebu他 SO回答这对我帮助很大。

Credits: Major kudos to @tmyklebu for his SO answer which helped me a lot.

这篇关于Codility峰复杂性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆