Algorithm for subdividing an array into "semi-equal", uniform sub-arrays
Problem description
Given an array with N elements, I am looking for M (M < N) successive sub-arrays with equal lengths, or with lengths that differ by at most 1. For example, if N = 12 and M = 4, all sub-arrays would have equal lengths of N/M = 3. If N = 100 and M = 12, I expect sub-arrays with lengths 8 and 9, and both sizes should be uniformly spread within the original array. This simple task turned out to be a little subtle to implement. I came up with an adaptation of Bresenham's line algorithm, which looks like this when coded in C++:
#include <cstddef>
#include <cstdint>
#include <vector>

/// The function suggests how an array with num_data items can be
/// subdivided into successively arranged groups (intervals) with
/// equal or "similar" length. The number of intervals is specified
/// by the parameter num_intervals. The result is stored into an array
/// with (num_intervals + 1) items, each of which indicates the start index
/// of an interval, the last additional index being a sentinel item which
/// contains the value num_data.
///
/// Example:
///
///   Input:  num_data ........... 14,
///           num_intervals ...... 4
///
///   Result: result_start_idx ... [ 0, 3, 7, 10, 14 ]
///
void create_uniform_intervals( const std::size_t num_data,
                               const std::size_t num_intervals,
                               std::vector<std::size_t>& result_start_idx )
{
    const std::size_t avg_interval_len  = num_data / num_intervals;
    // remainder of the division: the number of intervals that get one extra item
    const std::size_t last_interval_len = num_data % num_intervals;

    // establish the new size of the result vector
    result_start_idx.resize( num_intervals + 1 );
    // write the sentinel value at the end:
    result_start_idx[ num_intervals ] = num_data;

    std::size_t offset = 0; // current offset

    // use Bresenham's line algorithm to distribute
    // last_interval_len over num_intervals (signed arithmetic throughout):
    std::intptr_t error = static_cast<std::intptr_t>( num_intervals / 2 );
    for( std::size_t i = 0; i < num_intervals; i++ )
    {
        result_start_idx[ i ] = offset;
        offset += avg_interval_len;
        error -= static_cast<std::intptr_t>( last_interval_len );
        if( error < 0 )
        {
            offset++;
            error += static_cast<std::intptr_t>( num_intervals );
        } // if
    } // for
}
This code calculates the interval lengths for N = 100, M=12: 8 9 8 8 9 8 8 9 8 8 9 8
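For anyone who wants to experiment without compiling, here is a rough Python transcription of the routine above. It is only a sketch: the local names base_len and remainder are my own, and it returns the start indices as a list instead of filling an output vector.

```python
def create_uniform_intervals(num_data, num_intervals):
    """Return num_intervals + 1 start indices; the last one is the sentinel num_data."""
    # base_len and remainder are illustrative names, not taken from the C++ code
    base_len, remainder = divmod(num_data, num_intervals)
    starts = []
    offset = 0
    error = num_intervals // 2  # Bresenham-style error term, as in the C++ version
    for _ in range(num_intervals):
        starts.append(offset)
        offset += base_len
        error -= remainder
        if error < 0:
            offset += 1  # this interval receives one extra element
            error += num_intervals
    starts.append(num_data)  # sentinel
    return starts


starts = create_uniform_intervals(100, 12)
lengths = [b - a for a, b in zip(starts, starts[1:])]
print(lengths)  # [8, 9, 8, 8, 9, 8, 8, 9, 8, 8, 9, 8]
```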
My actual problem is that I don't know what this task is called, so I had difficulty searching for it.
- Are there other algorithms for accomplishing such a task?
- What are they called? Maybe the names would come up if I knew other areas of application.
I needed the algorithm as a part of a bigger algorithm for clustering of data. I think it could also be useful for implementing a parallel sort(?).
Recommended answer
If your language has integer division that truncates, an easy way to compute the size of section i is via (N*i+N)/M - (N*i)/M. For example, the Python program
N = 100; M = 12
# in Python 3, // is the integer division operator (plain / yields a float)
for i in range(M): print((N*i + N)//M - (N*i)//M)
outputs the numbers 8 8 9 8 8 9 8 8 9 8 8 9. With N=12; M=5 it outputs 2 2 3 2 3. With N=12; M=3 it outputs 4 4 4.
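Since each size is the difference of two consecutive values of (N*i)/M, the section boundaries can also be computed directly. The following is a minimal Python 3 sketch of that idea; the helper names section_starts and section_sizes are made up for illustration, and the final entry N plays the role of a sentinel like the one in the question.

```python
def section_starts(n, m):
    """Start index of each of the m sections, followed by n as a sentinel."""
    return [(n * i) // m for i in range(m + 1)]


def section_sizes(n, m):
    """Sizes of the m sections, taken as differences of consecutive starts."""
    starts = section_starts(n, m)
    return [b - a for a, b in zip(starts, starts[1:])]


for n, m in [(100, 12), (12, 5), (12, 3)]:
    sizes = section_sizes(n, m)
    assert sum(sizes) == n                # the sections cover the whole array
    assert max(sizes) - min(sizes) <= 1   # sizes differ by at most 1
    print(n, m, sizes)
```

Because every boundary comes from the same floor expression, the sizes always add up to N and no running error term has to be carried along.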
If your section numbers are 1-based rather than 0-based, the expression is instead (N*i)/M - (N*i-N)/M.
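As a quick sanity check, the two indexing conventions can be compared side by side. This is only an illustrative Python 3 sketch; the function names size_zero_based and size_one_based are hypothetical.

```python
def size_zero_based(n, m, i):
    """Size of section i when sections are numbered 0 .. m-1."""
    return (n * i + n) // m - (n * i) // m


def size_one_based(n, m, i):
    """Size of section i when sections are numbered 1 .. m."""
    return (n * i) // m - (n * i - n) // m


n, m = 100, 12
zero_based = [size_zero_based(n, m, i) for i in range(m)]
one_based = [size_one_based(n, m, i) for i in range(1, m + 1)]
assert zero_based == one_based  # same sizes, only the numbering differs
print(one_based)
```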