如何查找包含序列中所有元素的最小长度子序列 [英] How to find minimal-length subsequence that contains all element of a sequence
问题描述
给出一个序列,例如 S = {1,8,2,1,4,1,2,9,1,8,4},我需要找到最小长度的子序列包含 S 的所有元素(无重复,顺序无所谓)。
Given a sequence such as S = {1,8,2,1,4,1,2,9,1,8,4}, I need to find the minimal-length subsequence that contains all element of S (no duplicates, order does not matter). How do find this subsequence in an efficient way?
注意: S中有5个不同的元素: {1,2, 4,8,9}。最小长度的子序列必须包含所有这5个元素。
Note: There are 5 distinct elements in S: {1,2,4,8,9}. The minimum-length subsequence must contain all these 5 elements.
推荐答案
算法:
首先,确定数组中不同元素的数量-这可以在线性时间内轻松完成。允许存在 k
个不同的元素。
First, determine the quantity of different elements in the array - this can be easily done in linear time. Let there be k
different elements.
分配数组 cur
大小为10 ^ 5,每个显示了在当前子序列中使用了每个元素的数量(见下文)。
Allocate an array cur
of size 10^5, each showing how much of each element is used in current subsequence (see later).
按住a cnt
变量显示当前在考虑的序列中有多少个不同的元素。现在,使用两个索引,开始
和 end
,并通过以下方式遍历数组:
Hold a cnt
variable showing how many different elements are there currently in the considered sequence. Now, take two indexes, begin
and end
and iterate them through the array the following way:
- 将
cnt
和开始
初始化为0
,end
作为-1
(获得0
第一次递增)。然后在可能的情况下执行以下操作: -
如果
cnt!= k
:
- initialize
cnt
andbegin
as0
,end
as-1
(to get0
after first increment). Then while possible perform follows: If
cnt != k
:
2.1。递增 end
。如果 end
已经是数组的结尾,则中断。如果 cur [array [end]]
为零,则递增 cnt
。增量 cur [array [end]]
。
2.1. increment end
. If end
already is the end of array, then break. If cur[array[end]]
is zero, increment cnt
. Increment cur[array[end]]
.
其他:
2.2 {
尝试增加开始
迭代器:而 cur [ array [begin]]> 1
,将其递减,然后增加开始
( cur [array [begin]]> 1
表示当前子序列中还有另一个这样的元素)。毕竟,将 [开始,结束]
的时间间隔与当前答案进行比较,并存储起来更好。
Try to increment the begin
iterator: while cur[array[begin]] > 1
, decrement it, and increment the begin
(cur[array[begin]] > 1
means that we have another such element in our current subsequence). After all, compare the [begin, end]
interval with current answer and store it if it is better.
}
在进一步的过程变得不可能之后,您得到了答案。复杂度是 O(n)
-只是使两个插入器通过数组。
After the further process becomes impossible, you got the answer. The complexity is O(n)
- just passing two interators through the array.
C ++中的实现:
#include <iostream>
using namespace std;
const int MAXSIZE = 10000;
int arr[ MAXSIZE ];
int cur[ MAXSIZE ];
int main ()
{
int n; // the size of array
// read n and the array
cin >> n;
for( int i = 0; i < n; ++i )
cin >> arr[ i ];
int k = 0;
for( int i = 0; i < n; ++i )
{
if( cur[ arr[ i ] ] == 0 )
++k;
++cur[ arr[ i ] ];
}
// now k is the number of distinct elements
memset( cur, 0, sizeof( cur )); // we need this array anew
int begin = 0, end = -1; // to make it 0 after first increment
int best = -1; // best answer currently found
int ansbegin, ansend; // interval of the best answer currently found
int cnt = 0; // distinct elements in current subsequence
while(1)
{
if( cnt < k )
{
++end;
if( end == n )
break;
if( cur[ arr[ end ]] == 0 )
++cnt; // this elements wasn't present in current subsequence;
++cur[ arr[ end ]];
continue;
}
// if we're here it means that [begin, end] interval contains all distinct elements
// try to shrink it from behind
while( cur[ arr[ begin ]] > 1 ) // we have another such element later in the subsequence
{
--cur[ arr[ begin ]];
++begin;
}
// now, compare [begin, end] with the best answer found yet
if( best == -1 || end - begin < best )
{
best = end - begin;
ansbegin = begin;
ansend = end;
}
// now increment the begin iterator to make cur < k and begin increasing the end iterator again
--cur[ arr[ begin]];
++begin;
--cnt;
}
// output the [ansbegin, ansend] interval as it's the answer to the problem
cout << ansbegin << ' ' << ansend << endl;
for( int i = ansbegin; i <= ansend; ++i )
cout << arr[ i ] << ' ';
cout << endl;
return 0;
}
这篇关于如何查找包含序列中所有元素的最小长度子序列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!