从0〜替换重复号码具有唯一编号(N-1) [英] Replace duplicate numbers with unique numbers from 0-(N-1)
问题描述
我有积极的随机数是一定会包含重复的N长度的数组。
例如10,4,5,7,10,9,10,9,8,10,5
修改: N是可能是32或两个关于大小一些其他的力量的
I have an N-length array of positive random numbers that are certain to contain duplicates.
e.g. 10,4,5,7,10,9,10,9,8,10,5
Edit: N is likely to be 32, or some other power of two about that size.
我试图找到从0-(N-1)替换丢失号码的重复的最快方法。使用上面的例子,我想要的结果看起来是这样的:
10,4,5,7,0,9,1,2,8,3,6
其目标是使每个编号从0到N-1中的一个,而不只是更换所有的数字与O-(N-1)(随机顺序很重要)。
修改:同样重要的是,这种更换是确定的,即相同的输入会产生相同的输出(不是随机的)的
I am trying to find the fastest way to replace the duplicates with the missing numbers from 0-(N-1). Using the above example, I want a result that looks like this:
10,4,5,7,0,9,1,2,8,3,6
The goal being to have one of each number from 0 to N-1, without just replacing all the numbers with 0-(N-1) (the random order is important).
Edit: It's also important that this replacement is deterministic, i.e. the same input will have the same output (not random).
目前实施的Java,使用2布尔数组来跟踪使用/未使用的数字(唯一号码/丢失号码的范围在[0,N)),和具有N近似最坏情况下的运行时间+ N * SQRT (N)。
在code如下:
Currently implemented in Java, uses 2 boolean arrays to keep track of used/unused numbers (unique numbers/missing numbers in the range [0,N) ), and has an approximate worst-case runtime of N+N*sqrt(N).
The code follows:
public byte[] uniqueify(byte[] input)
{
boolean[] usedNumbers = new boolean[N];
boolean[] unusedIndices = new boolean[N];
byte[] result = new byte[N];
for(int i = 0; i < N; i++) // first pass through
{
int newIdx = (input[i] + 128) % N; // first make positive
if(!usedNumbers[newIdx]) // if this number has not been used
{
usedNumbers[newIdx] = true; // mark as used
result[i] = newIdx; // save it in the result
}
else // if the number is used
{
unusedIndices[i] = true; // add it to the list of duplicates
}
}
// handle all the duplicates
for(int idx = 0; idx < N; idx++) // iterate through all numbers
{
if(unusedIndices[idx]) // if unused
for(int i = 0; i < N; i++) // go through all numbers again
{
if(!usedNumbers[i]) // if this number is still unused
{
usedNumbers[i] = true; // mark as used
result[i] = idx;
break;
}
}
}
return result;
}
这似乎是最快的,我可以希望,但我想我会问互联网,因为还有人比我更聪明谁可能有一个更好的解决方案。
This seems like the fastest I can hope for, but I thought I'd ask the internet, because there are people much more clever than I who may have a better solution.
N.B。建议/解决方案不必使用Java语言。
N.B. Suggestions/solutions do not have to be in Java.
感谢您。
修改:我忘了提,我把它转换为C ++。我贴我的Java实现,因为它更完整。的
推荐答案
要做到这一点,最快的方式可能是最直接的。我会接一传,通过数据保持每个不同的值进行计数和标记,其中重复出现的列表。然后是形成未使用的值的列表,并在那里被发现重复的地方应用它们又只是早晚的问题。
The fastest way to do this is probably the most straightforward one. I would take a pass through the list of data keeping a count of each distinct value and marking where duplicates appeared. Then it is just a matter of forming a list of unused values and applying them in turn at the places where duplicates were found.
诱人的,因为它可能是用C ++ 列表
,如果速度是最重要的一个简单的C数组是最高效的。
Tempting as it may be to use a C++ List
, if speed is of the essence a simple C array is the most efficient.
该程序显示的原则。
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
int data[] = { 10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5 };
int N = sizeof(data) / sizeof(data[0]);
int tally[N];
memset(tally, 0, sizeof(tally));
int dup_indices[N];
int ndups = 0;
// Build a count of each value and a list of indices of duplicate data
for (int i = 0; i < N; i++) {
if (tally[data[i]]++) {
dup_indices[ndups++] = i;
}
}
// Replace each duplicate with the next value having a zero count
int t = 0;
for (int i = 0; i < ndups; i++) {
while (tally[t]) t++;
data[dup_indices[i]] = t++;
}
for (int i = 0; i < N; i++) {
cout << data[i] << " ";
}
return 0;
}
输出
10 4 5 7 0 9 1 2 8 3 6
这篇关于从0〜替换重复号码具有唯一编号(N-1)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!