从0〜替换重复号码具有唯一编号(N-1) [英] Replace duplicate numbers with unique numbers from 0-(N-1)

查看:202
本文介绍了从0〜替换重复号码具有唯一编号(N-1)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有积极的随机数是一定会包含重复的N长度的数组。 例如10,4,5,7,10,9,10,9,8,10,5
修改 N是可能是32或两个关于大小一些其他的力量

I have an N-length array of positive random numbers that are certain to contain duplicates. e.g. 10,4,5,7,10,9,10,9,8,10,5
Edit: N is likely to be 32, or some other power of two about that size.

我试图找到从0-(N-1)替换丢失号码的重复的最快方法。使用上面的例子,我想要的结果看起来是这样的:
10,4,5,7,0,9,1,2,8,3,6
其目标是使每个编号从0到N-1中的一个,而不只是更换所有的数字与O-(N-1)(随机顺序很重要)。
修改同样重要的是,这种更换是确定的,即相同的输入会产生相同的输出(不是随机的)

I am trying to find the fastest way to replace the duplicates with the missing numbers from 0-(N-1). Using the above example, I want a result that looks like this:
10,4,5,7,0,9,1,2,8,3,6
The goal being to have one of each number from 0 to N-1, without just replacing all the numbers with 0-(N-1) (the random order is important).
Edit: It's also important that this replacement is deterministic, i.e. the same input will have the same output (not random).

目前实施的Java,使用2布尔数组来跟踪使用/未使用的数字(唯一号码/丢失号码的范围在[0,N)),和具有N近似最坏情况下的运行时间+ N * SQRT (N)。
在code如下:

Currently implemented in Java, uses 2 boolean arrays to keep track of used/unused numbers (unique numbers/missing numbers in the range [0,N) ), and has an approximate worst-case runtime of N+N*sqrt(N).
The code follows:

public byte[] uniqueify(byte[] input)
{
    boolean[] usedNumbers = new boolean[N];
    boolean[] unusedIndices = new boolean[N];
    byte[] result = new byte[N];

    for(int i = 0; i < N; i++) // first pass through
    {
        int newIdx = (input[i] + 128) % N; // first make positive
        if(!usedNumbers[newIdx]) // if this number has not been used
        {
            usedNumbers[newIdx] = true; // mark as used
            result[i] = newIdx; // save it in the result
        }
        else // if the number is used
        {
            unusedIndices[i] = true; // add it to the list of duplicates
        }
    }

    // handle all the duplicates
    for(int idx = 0; idx < N; idx++) // iterate through all numbers
    {
        if(unusedIndices[idx]) // if unused
            for(int i = 0; i < N; i++) // go through all numbers again
            {
                if(!usedNumbers[i]) // if this number is still unused
                {
                    usedNumbers[i] = true; // mark as used
                    result[i] = idx;
                    break;
                }
            }
    }
    return result;
}  

这似乎是最快的,我可以希望,但我想我会问互联网,因为还有人比我更聪明谁可能有一个更好的解决方案。

This seems like the fastest I can hope for, but I thought I'd ask the internet, because there are people much more clever than I who may have a better solution.

N.B。建议/解决方案不必使用Java语言。

N.B. Suggestions/solutions do not have to be in Java.

感谢您。

修改我忘了提,我把它转换为C ++。我贴我的Java实现,因为它更完整。

推荐答案

要做到这一点,最快的方式可能是最直接的。我会接一传,通过数据保持每个不同的值进行计数和标记,其中重复出现的列表。然后是形成未使用的值的列表,并在那里被发现重复的地方应用它们又只是早晚的问题。

The fastest way to do this is probably the most straightforward one. I would take a pass through the list of data keeping a count of each distinct value and marking where duplicates appeared. Then it is just a matter of forming a list of unused values and applying them in turn at the places where duplicates were found.

诱人的,因为它可能是用C ++ 列表,如果速度是最重要的一个简单的C数组是最高效的。

Tempting as it may be to use a C++ List, if speed is of the essence a simple C array is the most efficient.

该程序显示的原则。

#include <iostream>
#include <cstring>

using namespace std;

int main()
{
  int data[] = { 10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5 };
  int N = sizeof(data) / sizeof(data[0]);

  int tally[N];
  memset(tally, 0, sizeof(tally));

  int dup_indices[N];
  int ndups = 0;

  // Build a count of each value and a list of indices of duplicate data
  for (int i = 0; i < N; i++) {
    if (tally[data[i]]++) {
      dup_indices[ndups++] = i;
    }
  }

  // Replace each duplicate with the next value having a zero count
  int t = 0;
  for (int i = 0; i < ndups; i++) {
    while (tally[t]) t++;
    data[dup_indices[i]] = t++;
  }

  for (int i = 0; i < N; i++) {
    cout << data[i] << " ";
  }

  return 0;
}

输出

10 4 5 7 0 9 1 2 8 3 6

这篇关于从0〜替换重复号码具有唯一编号(N-1)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆