Check array for duplicates, return only items which appear more than once

Views: 110 Published: 2017/7/21 18:53:26 Tags: c# arrays string text duplicates


Problem description

I have a text document of emails such as
Google12@gmail.com,
MyUSERNAME@me.com,
ME@you.com,
ratonabat@co.co,
iamcool@asd.com,
ratonabat@co.co,
I need to check said document for duplicates and create a unique array from it (so if "ratonabat@co.co" appears 500 times, it will appear only once in the new array).

Edit:
For example:
username1@hotmail.com
username2@hotmail.com
username1@hotmail.com
username1@hotmail.com
username1@hotmail.com
username1@hotmail.com
This is my "data" (either in an array or a text document, I can handle that).

I want to be able to see whether it contains a duplicate, and move each duplicate ONCE to another array. So the output would be
username1@hotmail.com

Solution

You can simply use LINQ's Distinct extension method:
var input = new string[] { ... };
var output = input.Distinct().ToArray();
You may also want to consider refactoring your code to use a HashSet<string> instead of a simple array, as it will gracefully handle duplicates. 
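As a rough sketch of the two options just mentioned, the class below puts Distinct and HashSet<string> side by side. The class name, the method names, and the case-insensitive variant are illustrative assumptions, not part of the original answer; whether addresses should compare case-insensitively depends on your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class UniqueSketch
{
    // Each value once, via LINQ's Distinct and the default string comparer.
    public static string[] UniqueWithDistinct(IEnumerable<string> lines)
    {
        return lines.Distinct().ToArray();
    }

    // Each value once, via HashSet<string>. Add returns false and leaves
    // the set unchanged when the value is already present.
    public static HashSet<string> UniqueWithHashSet(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>();
        foreach (var line in lines)
        {
            seen.Add(line);
        }
        return seen;
    }

    // Assumption: "ME@you.com" and "me@you.com" count as the same address.
    public static HashSet<string> UniqueIgnoringCase(IEnumerable<string> lines)
    {
        return new HashSet<string>(lines, StringComparer.OrdinalIgnoreCase);
    }
}
```

With the addresses from the question as input, both UniqueWithDistinct and UniqueWithHashSet keep "ratonabat@co.co" a single time.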



To get an array containing only those records which are duplicates, it's a little more complex, but you can still do it with a little LINQ:
var output = input.GroupBy(x => x)
                  .Where(g => g.Skip(1).Any())
                  .Select(g => g.Key)
                  .ToArray();
Explanation:


.GroupBy groups identical strings together.
.Where filters the groups by the following criterion:
    .Skip(1).Any() returns true if there are 2 or more items in the group. This is equivalent to .Count() > 1, but it's slightly more efficient because it stops counting after it finds a second item.
.Select returns a sequence consisting only of a single string per group (rather than the group itself).
.ToArray converts the result set to an array.
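The steps above can be wrapped in a small method to try them on the sample data from the question. This is only a sketch: the class and method names are made up for illustration, and the query itself is the one from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper, for illustration only.
public static class GroupBySketch
{
    public static string[] DuplicatesOnly(IEnumerable<string> input)
    {
        return input
            .GroupBy(x => x)               // one group per distinct string
            .Where(g => g.Skip(1).Any())   // keep groups that contain a second item
            .Select(g => g.Key)            // project each remaining group to its string
            .ToArray();
    }
}
```

Called with the six lines from the question's edit (username1@hotmail.com five times and username2@hotmail.com once), DuplicatesOnly returns an array whose only element is username1@hotmail.com.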




Here's another solution using a custom extension method:
public static class MyExtensions
{
    public static IEnumerable<T> Duplicates<T>(this IEnumerable<T> input)
    {
        var a = new HashSet<T>();
        var b = new HashSet<T>();
        foreach(var x in input)
        {
            if (!a.Add(x) && b.Add(x))
                yield return x;
        }
    }
}
And then you can call this method like this:
var output = input.Duplicates().ToArray();
I haven't benchmarked this, but it should be more efficient than the previous method.
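If the addresses should be compared case-insensitively, one possible variation is an overload that takes a comparer. This is a sketch under that assumption: the class name, the comparer parameter, and the set names are hypothetical additions, and the two-set logic is the same as in the Duplicates method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the extension method above, for illustration only.
public static class DuplicateExtensionsSketch
{
    public static IEnumerable<T> Duplicates<T>(
        this IEnumerable<T> input, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);      // every value encountered so far
        var reported = new HashSet<T>(comparer);  // values already yielded as duplicates
        foreach (var x in input)
        {
            // !seen.Add(x) is true when x was seen before;
            // reported.Add(x) is true only the first time x is reported.
            if (!seen.Add(x) && reported.Add(x))
                yield return x;
        }
    }
}
```

For example, input.Duplicates(StringComparer.OrdinalIgnoreCase).ToArray() would treat "ME@you.com" and "me@you.com" as the same address and yield it once, at its second occurrence.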
