Removing Duplicates in an array in C


Question

The question is a little complex. The problem is to get rid of duplicates and save the unique elements of an array into another array, keeping their original order.

For example:

If the input is: b a c a d t

The result should be: b a c d t, in exactly the order in which the input was entered.

So sorting the array and then checking for duplicates doesn't work, since I lose the original sequence. I was advised to use an array of indices, but I don't know how to do that. What would you advise?

For those who are willing to answer, I want to add some specific information.

char** finduni(char *words[100],int limit)
{
//
//Methods here
//
}

is my function. words[100] is the array whose duplicates should be removed, with the unique elements stored in a different array, so the process will be done on it. I first thought about copying all the elements of words into another array and sorting that array, but after some tests that doesn't work. Just a reminder for solvers :).

Recommended answer

Well, here is a version for char types. Note that it doesn't scale.

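#include <stdio.h>
#include <string.h>

void removeDuplicates(char *string)
{
   unsigned char allCharacters[256] = { 0 };  // one "seen" flag per possible byte value
   size_t length = strlen(string);            // measure once, not on every iteration
   size_t lookAt;
   size_t writeTo = 0;
   for(lookAt = 0; lookAt < length; lookAt++)
   {
      // index with an unsigned value so bytes >= 0x80 never become negative
      unsigned char c = (unsigned char)string[lookAt];
      if(allCharacters[c] == 0)
      {
         allCharacters[c] = 1;                 // mark it seen
         string[writeTo++] = string[lookAt];   // copy it
      }
   }
   string[writeTo] = '\0';
}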

int main()
{
   char word[] = "abbbcdefbbbghasdddaiouasdf";
   removeDuplicates(word);
   printf("Word is now [%s]\n", word);
   return 0;
}

Here is the output:

Word is now [abcdefghsiou]

Is that something like what you want? You can modify the method if there are spaces between the letters, but it relies on a lookup table with one entry per possible value, so if you use int, float, double or char * as the types, this method won't scale at all.

Edit

I posted and then saw your clarification that it's an array of char *. I'll update the method.

I hope this isn't too much code. I adapted this QuickSort algorithm and basically added index memory to it. The algorithm is O(n log n) on average, as the 3 steps below are additive and that is the average-case complexity of 2 of them (the two QuickSort passes, whose worst case is O(n^2)).


  1. Sort the array of strings, but reflect every swap in the index array as well. After this stage, the i'th element of originalIndices holds the original index of the i'th element of the sorted array.
  2. Remove duplicate elements in the sorted array by setting them to NULL and setting their index value to elements (the array length), which is higher than any valid index.
  3. Sort the array of original indices, and make sure every swap is reflected in the array of strings. This gives us back the original array of strings, except that the duplicates are at the end and they are all NULL.
  4. For good measure, I return the new count of elements.

Code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void sortArrayAndSetCriteria(char **arr, int elements, int *originalIndices)
{
   #define  MAX_LEVELS  1000
   char *piv;
   int  beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R;
   int idx, cidx;
   for(idx = 0; idx < elements; idx++)
      originalIndices[idx] = idx;
   beg[0] = 0;
   end[0] = elements;
   while (i>=0)
   {
      L = beg[i];
      R = end[i] - 1;
      if (L<R)
      {
         piv = arr[L];
         cidx = originalIndices[L];
         if (i==MAX_LEVELS-1)
            return;
         while (L < R)
         {
            while (strcmp(arr[R], piv) >= 0 && L < R) R--;
            if (L < R)
            {
               arr[L] = arr[R];
               originalIndices[L++] = originalIndices[R];
            }
            while (strcmp(arr[L], piv) <= 0 && L < R) L++;
            if (L < R)
            {
               arr[R] = arr[L];
               originalIndices[R--] = originalIndices[L];
            }
         }
         arr[L] = piv;
         originalIndices[L] = cidx;
         beg[i + 1] = L + 1;
         end[i + 1] = end[i];
         end[i++] = L;
      }
      else
      {
         i--;
      }
   }
}

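int removeDuplicatesFromBoth(char **arr, int elements, int *originalIndices)
{
   // now remove duplicates
   int i = 1, newLimit = 1;
   int keep = 0;   // position of the entry kept for the current run of equal strings
   if (elements <= 0)
      return 0;
   while (i < elements)
   {
      if(strcmp(arr[keep], arr[i]) == 0)
      {
         // the sort is not stable, so the kept entry must take the smallest
         // original index of its run to land where the first occurrence was
         if (originalIndices[i] < originalIndices[keep])
            originalIndices[keep] = originalIndices[i];
         arr[i] = NULL;   // free this if it was malloc'd
         originalIndices[i] = elements;  // place it at the end
      }
      else
      {
         keep = i;
         newLimit++;
      }
      i++;
   }
   return newLimit;
}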

void sortArrayBasedOnCriteria(char **arr, int elements, int *originalIndices)
{
   #define  MAX_LEVELS  1000
   int piv;
   int beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R;
   int idx;
   char *cidx;
   beg[0] = 0;
   end[0] = elements;
   while (i>=0)
   {
      L = beg[i];
      R = end[i] - 1;
      if (L<R)
      {
         piv = originalIndices[L];
         cidx = arr[L];
         if (i==MAX_LEVELS-1)
            return;
         while (L < R)
         {
            while (originalIndices[R] >= piv && L < R) R--;
            if (L < R)
            {
               arr[L] = arr[R];
               originalIndices[L++] = originalIndices[R];
            }
            while (originalIndices[L] <= piv && L < R) L++;
            if (L < R)
            {
               arr[R] = arr[L];
               originalIndices[R--] = originalIndices[L];
            }
         }
         arr[L] = cidx;
         originalIndices[L] = piv;
         beg[i + 1] = L + 1;
         end[i + 1] = end[i];
         end[i++] = L;
      }
      else
      {
         i--;
      }
   }
}

int removeDuplicateStrings(char *words[], int limit)
{
   int *indices = (int *)malloc(limit * sizeof(int));
   int newLimit;
   sortArrayAndSetCriteria(words, limit, indices);
   newLimit = removeDuplicatesFromBoth(words, limit, indices);
   sortArrayBasedOnCriteria(words, limit, indices);
   free(indices);
   return newLimit;
}

int main()
{
   char *words[] = { "abc", "def", "bad", "hello", "captain", "def", "abc", "goodbye" };
   int newLimit = removeDuplicateStrings(words, 8);
   int i = 0;
   for(i = 0; i < newLimit; i++) printf(" Word @ %d = %s\n", i, words[i]);
   return 0;
}
