卸下JavaScript的数组重复 [英] Remove Duplicates from JavaScript Array

查看:171
本文介绍了卸下JavaScript的数组重复的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这似乎是这样一个简单的需要,但我花了过多的时间量尝试这样做无济于事。我看了上的其他问题,我还没有找到我需要什么。我有一个非常简单的JavaScript数组,如 peoplenames =新的Array(迈克,马特,南希,亚当,珍妮,南希,卡尔); ,可能会或可能不会包含重复,我需要简单地删除重复项,并把唯一值在一个新的数组。而已。我可以指向所有codeS,我试过,但我认为这是无用的,因为他们没有工作。如果有人这样做,可以帮我我真的AP preciate它。 JavaScript或jQuery的解决方案,都是可以接受的。

相关阅读:<一href=\"http://stackoverflow.com/questions/840781/easiest-way-to-find-duplicate-values-in-a-javascript-array\">Easiest这样一个JavaScript数组中找到重复的值


解决方案

智能,但天真的方式

  uniqueArray = a.filter(函数(项目,POS){
    返回a.indexOf(项目)== POS;
})

基本上,我们迭代阵列上,并且对于每个元件,检查是否在阵列中该元件的第一位置等于当前位置。显然,这两个位置都是重复的元素不同。

使用3日(本阵)的滤波器回调的参数,我们可以避开数组变量的封闭:

  uniqueArray = a.filter(函数(项目,POS,自助){
    返回self.indexOf(项目)== POS;
})

虽然简洁,这种算法是不适合大阵列(二次时间)特别有效的。

散列表救援

 函数的uniq(一){
    变种看出= {};
    返回a.filter(函数(项目){
        返回seen.hasOwnProperty(项目)?假:(看到[项目] = TRUE);
    });
}

这是它是如何做的通常。我们的想法是每个元素放置在一个哈希表,然后检查其presence瞬间。这为我们提供了线性的时间,但至少有两个缺点:


  • 因为哈希键只能在Javascript中的字符串,这个code不区分数字和数字串。也就是说,的uniq([1,1])将仅返回 [1]

  • 出于同样的原因,所有的对象都将被认为是相等:的uniq([{富:1},{富:2]})将仅返回 [{富:1}]。

这是说,如果你的阵列只包含原语,你不关心类型(例如,它总是号),该解决方案是最佳的。

从两个世界

最好的

这是通用的解决方案结合了这两种方法:它使用哈希查找原语和对象的线性搜索

 函数的uniq(一){
    VAR prims = {布尔:{},数字:{},串:{}},OBJ文件= [];    返回a.filter(函数(项目){
        VAR类型= ty​​peof运算项目;
        如果(输入prims)
            返回prims【类型】.hasOwnProperty(项目)?假:(prims【类型】[项目] = TRUE);
        其他
            返回objs.indexOf(项目)&GT; = 0?假:objs.push(项目);
    });
}

排序| uniq的

另一个选择是数组排序第一,然后取出每一个元素等于preceding之一:

 函数的uniq(一){
    返回a.sort()。过滤器(函数(项目,POS进制){
        回报!POS ||项目=元[POS - 1]!;
    })
}

此外,这并不使用对象(因为所有的对象都是相等排序),除非能够提供一个特殊的比较功能。此外,这种方法原始数组默默地改变的副作用 - 没有好!但是,如果你的输入已经排序,这是要走的路(只是删除排序从上面的)。

按...

独特

有时它是希望uniquify基于不仅仅是平等的,例如一些其他的标准清单,以筛选出不同的对象,但大家分享一些财产。这可以通过优雅传递一个回调来完成。这种钥匙的回调被施加到每个元素,并以同样的钥匙元素被删除。由于有望回归原始,哈希表将正常工作在这里:

 函数uniqBy(A,键){
    变种看出= {};
    返回a.filter(函数(项目){
        VAR K =键(项目);
        返回seen.hasOwnProperty(K)?假:(看到[K] = TRUE);
    })
}

一个特别有用的键() JSON.stringify 这将删除物理上不同的对象,而是看是相同的:

  A = [[1,2,3],[4,5,6],[1,2,3]
B = uniqBy(一,JSON.stringify)
的console.log(二)// [[1,2,3],[4,5,6]

罗短跑 uniq的方法。他们的算法基本上类似于上面的第一个片段,并归结为这样:

  VAR的结果= [];
a.forEach(函数(项目){
     如果(result.indexOf(项目)小于0){
         result.push(项目);
     }
});

这是二次,但也有不错的额外的东西,比如封装本地的indexOf ,由一个键(uniqify能力 iteratee 在他们的说法),以及用于优化已排序数组。

如果你使用jQuery和之前没有一块钱不能忍受任何东西,它是这样的:

  $ .uniqArray =函数(){
        返回$ .grep(一,函数(项目,POS){
            返回$ .inArray(项目一)=== POS;
        });
  }

是,再次,第一片段的变体。

性能

函数调用是昂贵的在Javascript,因此,上述的解决方案,因为简洁因为它们是,不是特别有效。为了最大性能,更换过滤器一个循环,摆脱其他函数调用:

 函数uniq_fast(一){
    变种看出= {};
    变量超出= [];
    VAR LEN =则为a.length;
    变种J = 0;
    对于(VAR I = 0; I&LT; LEN,我++){
         VAR项目= a [i];
         如果(见于[项目]!== 1){
               看到[项目] = 1;
               出[J ++] =项目;
         }
    }
    返回的;
}

丑code这一块不一样的片段#3以上的,而是一个量级更快:

\r

ES6

ES6提供的对象,这使得事情的整体轻松很多:

 函数的uniq(一){
   返回Array.from(新集(一));
}

需要注意的是,不像在巨蟒,ES6集在插入顺序迭代,所以这code preserves原始数组的顺序。

不过,如果你需要具有独特元素的数组,为什么不使用套从一开始?

This seems like such a simple need but I've spent an inordinate amount of time trying to do this to no avail. I've looked at other questions on SO and I haven't found what I need. I have a very simple JavaScript array such as peoplenames = new Array("Mike","Matt","Nancy","Adam","Jenny","Nancy","Carl"); that may or may not contain duplicates and I need to simply remove the duplicates and put the unique values in a new array. That's it. I could point to all the codes that I've tried but I think it's useless because they don't work. If anyone has done this and can help me out I'd really appreciate it. JavaScript or jQuery solutions are both acceptable.

Related: Easiest way to find duplicate values in a JavaScript array

解决方案

"Smart" but naïve way

uniqueArray = a.filter(function(item, pos) {
    return a.indexOf(item) == pos;
})

Basically, we iterate over the array and, for each element, check if the first position of this element in the array is equal to the current position. Obviously, these two positions are different for duplicate elements.

Using the 3rd ("this array") parameter of the filter callback we can avoid a closure of the array variable:

uniqueArray = a.filter(function(item, pos, self) {
    return self.indexOf(item) == pos;
})

Although concise, this algorithm is not particularly efficient for large arrays (quadratic time).

Hashtables to the rescue

function uniq(a) {
    var seen = {};
    return a.filter(function(item) {
        return seen.hasOwnProperty(item) ? false : (seen[item] = true);
    });
}

This is how it's usually done. The idea is to place each element in a hashtable and then check for its presence instantly. This gives us linear time, but has at least two drawbacks:

  • since hash keys can only be strings in Javascript, this code doesn't distinguish numbers and "numeric strings". That is, uniq([1,"1"]) will return just [1]
  • for the same reason, all objects will be considered equal: uniq([{foo:1},{foo:2}]) will return just [{foo:1}].

That said, if your arrays contain only primitives and you don't care about types (e.g. it's always numbers), this solution is optimal.

The best from two worlds

An universal solution combines both approaches: it uses hash lookups for primitives and linear search for objects.

function uniq(a) {
    var prims = {"boolean":{}, "number":{}, "string":{}}, objs = [];

    return a.filter(function(item) {
        var type = typeof item;
        if(type in prims)
            return prims[type].hasOwnProperty(item) ? false : (prims[type][item] = true);
        else
            return objs.indexOf(item) >= 0 ? false : objs.push(item);
    });
}

sort | uniq

Another option is to sort the array first, and then remove each element equal to the preceding one:

function uniq(a) {
    return a.sort().filter(function(item, pos, ary) {
        return !pos || item != ary[pos - 1];
    })
}

Again, this doesn't work with objects (because all objects are equal for sort), unless a special compare function can be provided. Additionally, this method silently changes the original array as a side effect - not good! However, if your input is already sorted, this is the way to go (just remove sort from the above).

Unique by...

Sometimes it's desired to uniquify a list based on some criteria other than just equality, for example, to filter out objects that are different, but share some property. This can be done elegantly by passing a callback. This "key" callback is applied to each element, and elements with equal "keys" are removed. Since key is expected to return a primitive, hash table will work fine here:

function uniqBy(a, key) {
    var seen = {};
    return a.filter(function(item) {
        var k = key(item);
        return seen.hasOwnProperty(k) ? false : (seen[k] = true);
    })
}

A particularly useful key() is JSON.stringify which will remove objects that are physically different, but "look" the same:

a = [[1,2,3], [4,5,6], [1,2,3]]
b = uniqBy(a, JSON.stringify)
console.log(b) // [[1,2,3], [4,5,6]]

Libraries

Both underscore and Lo-Dash provide uniq methods. Their algorithms are basically similar to the first snippet above and boil down to this:

var result = [];
a.forEach(function(item) {
     if(result.indexOf(item) < 0) {
         result.push(item);
     }
});

This is quadratic, but there are nice additional goodies, like wrapping native indexOf, ability to uniqify by a key (iteratee in their parlance), and optimizations for already sorted arrays.

If you're using jQuery and can't stand anything without a dollar before it, it goes like this:

  $.uniqArray = function(a) {
        return $.grep(a, function(item, pos) {
            return $.inArray(item, a) === pos;
        });
  }

which is, again, a variation of the first snippet.

Performance

Function calls are expensive in Javascript, therefore the above solutions, as concise as they are, are not particularly efficient. For maximal performance, replace filter with a loop and get rid of other function calls:

function uniq_fast(a) {
    var seen = {};
    var out = [];
    var len = a.length;
    var j = 0;
    for(var i = 0; i < len; i++) {
         var item = a[i];
         if(seen[item] !== 1) {
               seen[item] = 1;
               out[j++] = item;
         }
    }
    return out;
}

This chunk of ugly code does the same as the snippet #3 above, but an order of magnitude faster:

function uniq(a) {
    var seen = {};
    return a.filter(function(item) {
        return seen.hasOwnProperty(item) ? false : (seen[item] = true);
    });
}

function uniq_fast(a) {
    var seen = {};
    var out = [];
    var len = a.length;
    var j = 0;
    for(var i = 0; i < len; i++) {
         var item = a[i];
         if(seen[item] !== 1) {
               seen[item] = 1;
               out[j++] = item;
         }
    }
    return out;
}

/////

var r = [0,1,2,3,4,5,6,7,8,9],
    a = [],
    LEN = 1000,
    LOOPS = 1000;

while(LEN--)
    a = a.concat(r);

var d = new Date();
for(var i = 0; i < LOOPS; i++)
    uniq(a);
document.write('<br>uniq, ms/loop: ' + (new Date() - d)/LOOPS)

var d = new Date();
for(var i = 0; i < LOOPS; i++)
    uniq_fast(a);
document.write('<br>uniq_fast, ms/loop: ' + (new Date() - d)/LOOPS)

ES6

ES6 provides the Set object, which makes things a whole lot easier:

function uniq(a) {
   return Array.from(new Set(a));
}

Note that, unlike in python, ES6 sets are iterated in insertion order, so this code preserves the order of the original array.

However, if you need an array with unique elements, why not use sets right from the beginning?

这篇关于卸下JavaScript的数组重复的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆