为什么我的日志(n)堆比我的n ^ 2选择排序慢 [英] Why is my n log(n) heapsort slower than my n^2 selection sort
问题描述
我已经实现了两种从最高到最低排序元素的算法。
第一个,在真实RAM模型上取二次时间,第二个是O(n log(n))时间。
第二个使用优先级队列来减少。
这里是时间,这是上述程序的输出。
- 第一列是整数随机数组的大小
- 第二列是O(n ^ 2)技术的时间(以秒为单位)
-
第三列是O(n log(n) b
$ b9600 1.92663 7.58865
9800 1.93705 7.67376
10000 2.08647 8.19094
尽管复杂性有很大差异,第三列大于所考虑的数组大小的第二列。为什么会这样?是C ++的优先级队列实现慢吗?
我在Windows 7,Visual Studio 2012 32位上执行这个代码。
这是代码,
#includestdafx.h
#include< iostream>
#include< iomanip>
#include< cstdlib>
#include< algorithm>
#include< vector>
#include< queue>
#include< Windows.h>
#include< assert.h>
using namespace std;
double time_slower_sort(vector< int>& a)
{
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)exit(0);
for(size_t i = 0; i {
矢量< int& max_element(a.begin()+ i,a.end());
int max_value = * it;
* it = a [i];
a [i] = max_value;
}
if(:: QueryPerformanceCounter(& end)== FALSE)exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/ frequency.QuadPart;
}
double time_faster_sort(vector< int>& a)
{
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)exit(0);
//推入优先级队列。每次插入的对数成本=> O(n log(n))总插入成本
priority_queue< int> pq;
for(size_t i = 0; i< a.size(); ++ i)
{
pq.push(a [i]);
}
//按优先级顺序从优先级队列中读取元素
//每次读取的对数读取成本=>对于整个向量的$(n log(n))读取成本
for(size_t i = 0; i {
a [i] = pq 。最佳();
pq.pop();
}
if(:: QueryPerformanceCounter(& end)== FALSE)exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/ frequency.QuadPart;
}
int main(int argc,char ** argv)
{
/对不同大小的向量进行迭代,并尝试两种不同的变体
for(size_t N = 1000; N <= 10000; N + = 100)
{
用相同的随机元素初始化两个向量
vector< int> a(N),b(N);
//用随机元素初始化
for(size_t i = 0; i {
a [i] = rand 1000;
b [i] = a [i];
}
//对两个不同的变体进行排序并计算它们
cout<< N<
<< time_slower_sort(a)<< \t\t
<< time_faster_sort(b)<< endl;
// Sanity check
for(size_t i = 0; i <= N-2; ++ i)
{
assert(a [i] = b [i]); //两者应该返回相同的答案
assert(a [i]> = a [i + 1]); // else not sorted
}
}
return 0;
}
更微妙的一个预期。在你 O(N ^ 2)解决方案中你没有做任何分配,算法工作到位,搜索最大,并与当前位置交换。
但在 priority_queue
版本 O(N log N) (内部的 priority_queue
中有一个 std :: vector
默认情况下存储状态)。这个向量
当你 push_back
元素有时需要增长(和它),但这是时间在 O(N ^ 2)版本中不会丢失。如果您在 priority_queue
的初始化中进行以下几点更改:
priority_queue< int> pq(a.begin(),a.end());
而不是 for循环
$ b
O(N log N)击败 O(N ^ 2)的时间应该相当大。在建议的更改中,在 priority_queue
版本中仍然有分配,但只有一次(您为大向量
size,并且分配是重要的耗时操作之一),也许初始化(在 O(N)中可以利用
priority_queue
,不知道 STL
是否真的这样)。
示例代码(用于编译和运行):
#include< iostream&
#include< iomanip>
#include< cstdlib>
#include< algorithm>
#include< vector>
#include< queue>
#include< Windows.h>
#include< assert.h>
using namespace std;
double time_slower_sort(vector< int>& a){
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)
exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)
exit(0);
for(size_t i = 0; i
vector< int> :: iterator it = max_element(a。 begin()+ i,a.end());
int max_value = * it;
* it = a [i];
a [i] = max_value;
}
if(:: QueryPerformanceCounter(& end)== FALSE)
exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/
frequency.QuadPart;
}
double time_faster_sort(vector< int>& a){
LARGE_INTEGER frequency,start,end;
if(:: QueryPerformanceFrequency(& frequency)== FALSE)
exit(0);
if(:: QueryPerformanceCounter(& start)== FALSE)
exit(0);
//推入优先级队列。每次插入的对数成本=> O(n
// log(n))总插入成本
priority_queue< int> pq(a.begin(),a.end()); //< -----唯一的变化在这里
//按优先级顺序从优先级队列中读取元素
//每次读取的对数读取成本=> (size_t i = 0; i a [i(n log(n))读取成本
//向量
] = pq.top();
pq.pop();
}
if(:: QueryPerformanceCounter(& end)== FALSE)
exit(0);
return static_cast< double>(end.QuadPart - start.QuadPart)/
frequency.QuadPart;
}
int main(int argc,char ** argv){
//迭代不同大小的向量,然后尝试两个不同的
// variant
for(size_t N = 1000; N <= 10000; N + = 100){
//初始化具有相同随机元素的两个向量
向量< int> a(N),b(N);
//用随机元素初始化
for(size_t i = 0; i a [i] = rand()%1000;
b [i] = a [i];
}
//对两个不同的变体进行排序并计算它们
cout<< N< < time_slower_sort(a)<< \t\t
<< time_faster_sort(b)<< endl;
//正确性检查
for(size_t i = 0; i <= N - 2; ++ i){
assert(a [i] == b [一世]); //两者应该返回相同的答案
assert(a [i]> = a [i + 1]); // else not sorted
}
}
return 0;
}
在我的电脑(Core 2 Duo 6300) p>
1100 0.000753738 0.000110263
1200 0.000883201 0.000115749
1300 0.00103077 0.000124526
1400 0.00126994 0.000250698
...
9500 0.0497966 0.00114377
9600 0.051173 0.00123429
9700 0.052551 0.00115804
9800 0.0533245 0.00117614
9900 0.0555007 0.00119205
10000 0.0552341 0.00120466
I have implemented two algorithms for sorting elements from highest to lowest.
The first one, takes quadratic time on the real RAM model and the second an O(n log(n)) time. The second one uses priority queues to get the reduction.
Here are the timings, which are the output of the above program.
- the first column is the size of the random array of integers
- the second column is the time in seconds for the O(n^2) technique
the third column is the time in seconds for the O(n log(n)) technique
9600 1.92663 7.58865 9800 1.93705 7.67376 10000 2.08647 8.19094
In spite of this great difference in complexity, the 3rd column is larger than the second for the array sizes considered. Why is this so? Is the priority queue implementation of C++ slow?
I executed this code on Windows 7, Visual Studio 2012 32-bit.
Here is the code,
#include "stdafx.h"
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <vector>
#include <queue>
#include <Windows.h>
#include <assert.h>
using namespace std;
double time_slower_sort(vector<int>& a)
{
LARGE_INTEGER frequency, start,end;
if (::QueryPerformanceFrequency(&frequency) == FALSE ) exit(0);
if (::QueryPerformanceCounter(&start) == FALSE ) exit(0);
for(size_t i=0 ; i < a.size() ; ++i)
{
vector<int>::iterator it = max_element( a.begin() + i ,a.end() ) ;
int max_value = *it;
*it = a[i];
a[i] = max_value;
}
if (::QueryPerformanceCounter(&end) == FALSE) exit(0);
return static_cast<double>(end.QuadPart - start.QuadPart) / frequency.QuadPart;
}
double time_faster_sort(vector<int>& a)
{
LARGE_INTEGER frequency, start,end;
if (::QueryPerformanceFrequency(&frequency) == FALSE ) exit(0);
if (::QueryPerformanceCounter(&start) == FALSE ) exit(0);
// Push into the priority queue. Logarithmic cost per insertion = > O (n log(n)) total insertion cost
priority_queue<int> pq;
for(size_t i=0 ; i<a.size() ; ++i)
{
pq.push(a[i]);
}
// Read of the elements from the priority queue in order of priority
// logarithmic reading cost per read => O(n log(n)) reading cost for entire vector
for(size_t i=0 ; i<a.size() ; ++i)
{
a[i] = pq.top();
pq.pop();
}
if (::QueryPerformanceCounter(&end) == FALSE) exit(0);
return static_cast<double>(end.QuadPart - start.QuadPart) / frequency.QuadPart;
}
int main(int argc, char** argv)
{
// Iterate over vectors of different sizes and try out the two different variants
for(size_t N=1000; N<=10000 ; N += 100 )
{
// initialize two vectors with identical random elements
vector<int> a(N),b(N);
// initialize with random elements
for(size_t i=0 ; i<N ; ++i)
{
a[i] = rand() % 1000;
b[i] = a[i];
}
// Sort the two different variants and time them
cout << N << " "
<< time_slower_sort(a) << "\t\t"
<< time_faster_sort(b) << endl;
// Sanity check
for(size_t i=0 ; i<=N-2 ; ++i)
{
assert(a[i] == b[i]); // both should return the same answer
assert(a[i] >= a[i+1]); // else not sorted
}
}
return 0;
}
I think the problem is really more subtle that one expected. In you O(N^2) solution you are making no allocation, the algorithm work in place, search the biggest and swap with the current position. This is OK.
But in the priority_queue
version O(N log N) (the priority_queue
in the internal have a std::vector
by default to storage the state). This vector
when you push_back
element by element sometimes it need to grow (and it does) but this is time you are no losing in the O(N^2) version. If you make the following little change in the initialization of the priority_queue
:
priority_queue<int> pq(a.begin(), a.end());
instead of the for loop
The time of the O(N log N) beat the O(N^2) as it should by a fair amount. In the proposed change there is still allocation in the priority_queue
version, but is only one time (you are saving a lot of allocation for big vector
sizes, and allocation is one of the important time consuming operation) and maybe the initialization (in O(N) could take advantage of the complete state of the priority_queue
, don't know if the STL
really does this).
Sample Code (for compile and run):
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <vector>
#include <queue>
#include <Windows.h>
#include <assert.h>
using namespace std;
double time_slower_sort(vector<int>& a) {
LARGE_INTEGER frequency, start, end;
if (::QueryPerformanceFrequency(&frequency) == FALSE)
exit(0);
if (::QueryPerformanceCounter(&start) == FALSE)
exit(0);
for (size_t i = 0; i < a.size(); ++i) {
vector<int>::iterator it = max_element(a.begin() + i, a.end());
int max_value = *it;
*it = a[i];
a[i] = max_value;
}
if (::QueryPerformanceCounter(&end) == FALSE)
exit(0);
return static_cast<double>(end.QuadPart - start.QuadPart) /
frequency.QuadPart;
}
double time_faster_sort(vector<int>& a) {
LARGE_INTEGER frequency, start, end;
if (::QueryPerformanceFrequency(&frequency) == FALSE)
exit(0);
if (::QueryPerformanceCounter(&start) == FALSE)
exit(0);
// Push into the priority queue. Logarithmic cost per insertion = > O (n
// log(n)) total insertion cost
priority_queue<int> pq(a.begin(), a.end()); // <----- THE ONLY CHANGE IS HERE
// Read of the elements from the priority queue in order of priority
// logarithmic reading cost per read => O(n log(n)) reading cost for entire
// vector
for (size_t i = 0; i < a.size(); ++i) {
a[i] = pq.top();
pq.pop();
}
if (::QueryPerformanceCounter(&end) == FALSE)
exit(0);
return static_cast<double>(end.QuadPart - start.QuadPart) /
frequency.QuadPart;
}
int main(int argc, char** argv) {
// Iterate over vectors of different sizes and try out the two different
// variants
for (size_t N = 1000; N <= 10000; N += 100) {
// initialize two vectors with identical random elements
vector<int> a(N), b(N);
// initialize with random elements
for (size_t i = 0; i < N; ++i) {
a[i] = rand() % 1000;
b[i] = a[i];
}
// Sort the two different variants and time them
cout << N << " " << time_slower_sort(a) << "\t\t"
<< time_faster_sort(b) << endl;
// Sanity check
for (size_t i = 0; i <= N - 2; ++i) {
assert(a[i] == b[i]); // both should return the same answer
assert(a[i] >= a[i + 1]); // else not sorted
}
}
return 0;
}
In my PC (Core 2 Duo 6300) the output obtained is:
1100 0.000753738 0.000110263
1200 0.000883201 0.000115749
1300 0.00103077 0.000124526
1400 0.00126994 0.000250698
...
9500 0.0497966 0.00114377
9600 0.051173 0.00123429
9700 0.052551 0.00115804
9800 0.0533245 0.00117614
9900 0.0555007 0.00119205
10000 0.0552341 0.00120466
这篇关于为什么我的日志(n)堆比我的n ^ 2选择排序慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!