为什么我的日志(n)堆比我的n ^ 2选择排序慢 [英] Why is my n log(n) heapsort slower than my n^2 selection sort

查看:158
本文介绍了为什么我的日志(n)堆比我的n ^ 2选择排序慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经实现了两种从最高到最低排序元素的算法。



第一个,在真实RAM模型上取二次时间,第二个是O(n log(n))时间。
第二个使用优先级队列来减少。



这里是时间,这是上述程序的输出。




  • 第一列是整数随机数组的大小

  • 第二列是O(n ^ 2)技术的时间(以秒为单位)

  • 第三列是O(n log(n) b
    $ b

      9600 1.92663 7.58865 
    9800 1.93705 7.67376
    10000 2.08647 8.19094




尽管复杂性有很大差异,第三列大于所考虑的数组大小的第二列。为什么会这样?是C ++的优先级队列实现慢吗?



我在Windows 7,Visual Studio 2012 32位上执行这个代码。



这是代码,

  #includestdafx.h 
#include< iostream>
#include< iomanip>
#include< cstdlib>
#include< algorithm>
#include< vector>
#include< queue>
#include< Windows.h>
#include< assert.h>

using namespace std;

double time_slower_sort(vector< int>& a)
{
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)exit(0);

for(size_t i = 0; i {

矢量< int& max_element(a.begin()+ i,a.end());
int max_value = * it;
* it = a [i];
a [i] = max_value;

}
if(:: QueryPerformanceCounter(& end)== FALSE)exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/ frequency.QuadPart;
}



double time_faster_sort(vector< int>& a)
{
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)exit(0);


//推入优先级队列。每次插入的对数成本=> O(n log(n))总插入成本
priority_queue< int> pq;
for(size_t i = 0; i< a.size(); ++ i)
{
pq.push(a [i]);
}

//按优先级顺序从优先级队列中读取元素
//每次读取的对数读取成本=>对于整个向量的$(n log(n))读取成本
for(size_t i = 0; i {
a [i] = pq 。最佳();
pq.pop();
}
if(:: QueryPerformanceCounter(& end)== FALSE)exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/ frequency.QuadPart;

}




int main(int argc,char ** argv)
{
/对不同大小的向量进行迭代,并尝试两种不同的变体
for(size_t N = 1000; N <= 10000; N + = 100)
{

用相同的随机元素初始化两个向量
vector< int> a(N),b(N);

//用随机元素初始化
for(size_t i = 0; i {
a [i] = rand 1000;
b [i] = a [i];
}

//对两个不同的变体进行排序并计算它们
cout<< N<
<< time_slower_sort(a)<< \t\t
<< time_faster_sort(b)<< endl;

// Sanity check
for(size_t i = 0; i <= N-2; ++ i)
{
assert(a [i] = b [i]); //两者应该返回相同的答案
assert(a [i]> = a [i + 1]); // else not sorted
}

}
return 0;
}


解决方案

更微妙的一个预期。在你 O(N ^ 2)解决方案中你没有做任何分配,算法工作到位,搜索最大,并与当前位置交换。



但在 priority_queue 版本 O(N log N) (内部的 priority_queue 中有一个 std :: vector 默认情况下存储状态)。这个向量当你 push_back 元素有时需要增长(和它),但这是时间在 O(N ^ 2)版本中不会丢失。如果您在 priority_queue 的初始化中进行以下几点更改:



priority_queue< int> pq(a.begin(),a.end()); 而不是 for循环
$ b

O(N log N)击败 O(N ^ 2)的时间应该相当大。在建议的更改中,在 priority_queue 版本中仍然有分配,但只有一次(您为大向量 size,并且分配是重要的耗时操作之一),也许初始化(在 O(N)中可以利用 priority_queue ,不知道 STL 是否真的这样)。



示例代码(用于编译和运行):

  #include< iostream& 
#include< iomanip>
#include< cstdlib>
#include< algorithm>
#include< vector>
#include< queue>
#include< Windows.h>
#include< assert.h>

using namespace std;

double time_slower_sort(vector< int>& a){
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)
exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)
exit(0);

for(size_t i = 0; i
vector< int> :: iterator it = max_element(a。 begin()+ i,a.end());
int max_value = * it;
* it = a [i];
a [i] = max_value;
}
if(:: QueryPerformanceCounter(& end)== FALSE)
exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/
frequency.QuadPart;
}

double time_faster_sort(vector< int>& a){
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)
exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)
exit(0);

//推入优先级队列。每次插入的对数成本=> O(n
// log(n))总插入成本
priority_queue< int> pq(a.begin(),a.end()); //< -----唯一的变化在这里

//按优先级顺序从优先级队列中读取元素
//每次读取的对数读取成本=> (size_t i = 0; i a [i(n log(n))读取成本
//向量
] = pq.top();
pq.pop();
}
if(:: QueryPerformanceCounter(& end)== FALSE)
exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/
frequency.QuadPart;
}

int main(int argc,char ** argv){
//迭代不同大小的向量,然后尝试两个不同的
// variant
for(size_t N = 1000; N <= 10000; N + = 100){

//初始化具有相同随机元素的两个向量
向量< int> a(N),b(N);

//用随机元素初始化
for(size_t i = 0; i a [i] = rand()%1000;
b [i] = a [i];
}

//对两个不同的变体进行排序并计算它们
cout<< N< < time_slower_sort(a)<< \t\t
<< time_faster_sort(b)<< endl;

//正确性检查
for(size_t i = 0; i <= N - 2; ++ i){
assert(a [i] == b [一世]); //两者应该返回相同的答案
assert(a [i]> = a [i + 1]); // else not sorted
}
}
return 0;
}



在我的电脑(Core 2 Duo 6300) p>

  1100 0.000753738 0.000110263 
1200 0.000883201 0.000115749
1300 0.00103077 0.000124526
1400 0.00126994 0.000250698
...
9500 0.0497966 0.00114377
9600 0.051173 0.00123429
9700 0.052551 0.00115804
9800 0.0533245 0.00117614
9900 0.0555007 0.00119205
10000 0.0552341 0.00120466


I have implemented two algorithms for sorting elements from highest to lowest.

The first one, takes quadratic time on the real RAM model and the second an O(n log(n)) time. The second one uses priority queues to get the reduction.

Here are the timings, which are the output of the above program.

  • the first column is the size of the random array of integers
  • the second column is the time in seconds for the O(n^2) technique
  • the third column is the time in seconds for the O(n log(n)) technique

     9600  1.92663      7.58865
     9800  1.93705      7.67376
    10000  2.08647      8.19094
    

In spite of this great difference in complexity, the 3rd column is larger than the second for the array sizes considered. Why is this so? Is the priority queue implementation of C++ slow?

I executed this code on Windows 7, Visual Studio 2012 32-bit.

Here is the code,

#include "stdafx.h"
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <vector>
#include <queue>
#include <Windows.h>
#include <assert.h>

using namespace std;

double time_slower_sort(vector<int>& a) 
{
    LARGE_INTEGER frequency, start,end;
    if (::QueryPerformanceFrequency(&frequency) == FALSE  ) exit(0);
    if (::QueryPerformanceCounter(&start)       == FALSE  ) exit(0);

    for(size_t i=0 ; i < a.size() ; ++i)  
    {

        vector<int>::iterator it = max_element( a.begin() + i ,a.end() ) ;
        int max_value = *it; 
        *it = a[i];
        a[i] = max_value;    

    }
    if (::QueryPerformanceCounter(&end) == FALSE) exit(0);
    return static_cast<double>(end.QuadPart - start.QuadPart) / frequency.QuadPart;
}



double time_faster_sort(vector<int>& a) 
{
    LARGE_INTEGER frequency, start,end;
    if (::QueryPerformanceFrequency(&frequency) == FALSE  ) exit(0);
    if (::QueryPerformanceCounter(&start)       == FALSE  ) exit(0);


    // Push into the priority queue. Logarithmic cost per insertion = > O (n log(n)) total insertion cost
    priority_queue<int> pq;
    for(size_t i=0 ; i<a.size() ; ++i)
    {
        pq.push(a[i]);
    }

    // Read of the elements from the priority queue in order of priority
    // logarithmic reading cost per read => O(n log(n)) reading cost for entire vector
    for(size_t i=0 ; i<a.size() ; ++i)
    {
        a[i] = pq.top();
        pq.pop();
    }
    if (::QueryPerformanceCounter(&end) == FALSE) exit(0);
    return static_cast<double>(end.QuadPart - start.QuadPart) / frequency.QuadPart;

}




int main(int argc, char** argv)
{
    // Iterate over vectors of different sizes and try out the two different variants
    for(size_t N=1000; N<=10000 ; N += 100 ) 
    {

        // initialize two vectors with identical random elements
        vector<int> a(N),b(N);

        // initialize with random elements
        for(size_t i=0 ; i<N ; ++i) 
        {
            a[i] = rand() % 1000; 
            b[i] = a[i];
        }

        // Sort the two different variants and time them  
        cout << N << "  " 
             << time_slower_sort(a) << "\t\t" 
             << time_faster_sort(b) << endl;

        // Sanity check
        for(size_t i=0 ; i<=N-2 ; ++i) 
        {
            assert(a[i] == b[i]); // both should return the same answer
            assert(a[i] >= a[i+1]); // else not sorted
        }

    }
    return 0;
}

解决方案

I think the problem is really more subtle that one expected. In you O(N^2) solution you are making no allocation, the algorithm work in place, search the biggest and swap with the current position. This is OK.

But in the priority_queue version O(N log N) (the priority_queue in the internal have a std::vector by default to storage the state). This vector when you push_back element by element sometimes it need to grow (and it does) but this is time you are no losing in the O(N^2) version. If you make the following little change in the initialization of the priority_queue:

priority_queue<int> pq(a.begin(), a.end()); instead of the for loop

The time of the O(N log N) beat the O(N^2) as it should by a fair amount. In the proposed change there is still allocation in the priority_queue version, but is only one time (you are saving a lot of allocation for big vector sizes, and allocation is one of the important time consuming operation) and maybe the initialization (in O(N) could take advantage of the complete state of the priority_queue, don't know if the STL really does this).

Sample Code (for compile and run):

#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <vector>
#include <queue>
#include <Windows.h>
#include <assert.h>

using namespace std;

double time_slower_sort(vector<int>& a) {
    LARGE_INTEGER frequency, start, end;
    if (::QueryPerformanceFrequency(&frequency) == FALSE)
        exit(0);
    if (::QueryPerformanceCounter(&start) == FALSE)
        exit(0);

    for (size_t i = 0; i < a.size(); ++i) {

        vector<int>::iterator it = max_element(a.begin() + i, a.end());
        int max_value = *it;
        *it = a[i];
        a[i] = max_value;
    }
    if (::QueryPerformanceCounter(&end) == FALSE)
        exit(0);
    return static_cast<double>(end.QuadPart - start.QuadPart) /
           frequency.QuadPart;
}

double time_faster_sort(vector<int>& a) {
    LARGE_INTEGER frequency, start, end;
    if (::QueryPerformanceFrequency(&frequency) == FALSE)
        exit(0);
    if (::QueryPerformanceCounter(&start) == FALSE)
        exit(0);

    // Push into the priority queue. Logarithmic cost per insertion = > O (n
    // log(n)) total insertion cost
    priority_queue<int> pq(a.begin(), a.end());  // <----- THE ONLY CHANGE IS HERE

    // Read of the elements from the priority queue in order of priority
    // logarithmic reading cost per read => O(n log(n)) reading cost for entire
    // vector
    for (size_t i = 0; i < a.size(); ++i) {
        a[i] = pq.top();
        pq.pop();
    }
    if (::QueryPerformanceCounter(&end) == FALSE)
        exit(0);
    return static_cast<double>(end.QuadPart - start.QuadPart) /
           frequency.QuadPart;
}

int main(int argc, char** argv) {
    // Iterate over vectors of different sizes and try out the two different
    // variants
    for (size_t N = 1000; N <= 10000; N += 100) {

        // initialize two vectors with identical random elements
        vector<int> a(N), b(N);

        // initialize with random elements
        for (size_t i = 0; i < N; ++i) {
            a[i] = rand() % 1000;
            b[i] = a[i];
        }

        // Sort the two different variants and time them
        cout << N << "  " << time_slower_sort(a) << "\t\t"
             << time_faster_sort(b) << endl;

        // Sanity check
        for (size_t i = 0; i <= N - 2; ++i) {
            assert(a[i] == b[i]);     // both should return the same answer
            assert(a[i] >= a[i + 1]); // else not sorted
        }
    }
    return 0;
}

In my PC (Core 2 Duo 6300) the output obtained is:

1100  0.000753738      0.000110263
1200  0.000883201      0.000115749
1300  0.00103077       0.000124526
1400  0.00126994       0.000250698
...
9500  0.0497966        0.00114377
9600  0.051173         0.00123429
9700  0.052551         0.00115804
9800  0.0533245        0.00117614
9900  0.0555007        0.00119205
10000 0.0552341        0.00120466

这篇关于为什么我的日志(n)堆比我的n ^ 2选择排序慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆