How to find duplicate lines in a text file using C#


Question


I have a text file.
I want to display all the repeated lines in the text document and store those repeated lines in a new text file.

How can I do this using C#?

Solution

The easy way is to use LINQ:

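using System.IO;
using System.Linq;
string[] lines = File.ReadAllLines(path);
// Group identical lines, keep groups with more than one member, take each group's key.
lines = lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
File.WriteAllLines(newPath, lines);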
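
To try the snippet end to end, here is a minimal sketch written as a C# top-level program. The file names in the temp directory and the sample fruit lines are assumptions made up for illustration; they are not part of the original question.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical paths in the temp directory; substitute your own files.
string path = Path.Combine(Path.GetTempPath(), "duplicates-input.txt");
string newPath = Path.Combine(Path.GetTempPath(), "duplicates-output.txt");

// Sample input: "apple" appears three times, "pear" twice, "plum" once.
File.WriteAllLines(path, new[] { "apple", "pear", "apple", "plum", "pear", "apple" });

string[] lines = File.ReadAllLines(path);

// Group identical lines, keep groups with more than one member,
// and take each group's key (the text of the repeated line).
string[] duplicates = lines
    .GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToArray();

// Display the repeated lines, then store them in the new file.
foreach (string line in duplicates)
{
    Console.WriteLine(line);
}
File.WriteAllLines(newPath, duplicates);
```

Each repeated line is reported once, in the order it first appears in the input. The comparison is exact, so lines that differ only in case or in trailing whitespace count as different lines.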





OP:

string[] lines = File.ReadAllLines(path);
lines = lines.GroupBy(x => "Books").Where(g => g.Count() > 1).Select(g => g.Key).ToArray();
File.WriteAllLines(newPath, lines);

Was that right?

No, it's not right!
Instead of making changes at random (which could take us both all day!), let's look at it logically:

lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();


lines is a collection (lists and arrays are both examples of collections).
So we can use a LINQ method to "collect them together" so that identical lines are "grouped":

lines.GroupBy(x => x)

The x => x is a lambda which uses the whole line itself as the key to group by.
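
To see what the key selector does, this small sketch compares grouping by the line itself with grouping by a constant such as the "Books" string in the OP's attempt. The sample lines are invented for illustration.

```csharp
using System;
using System.Linq;

string[] lines = { "apple", "pear", "apple", "plum" };

// Key = the line itself: identical lines land in the same group.
var byLine = lines.GroupBy(x => x).ToList();
foreach (var g in byLine)
{
    Console.WriteLine($"{g.Key}: {g.Count()}");
}

// Key = the constant "Books": every line gets the same key, so there is
// a single group holding all the lines, whatever their text.
var byConstant = lines.GroupBy(x => "Books").ToList();
foreach (var g in byConstant)
{
    Console.WriteLine($"{g.Key}: {g.Count()}");
}
```

With a constant key, the only group is keyed "Books" and contains every line, so selecting g.Key can only ever return the word "Books" and never any text from the file.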
We can then select only those groups where the number of lines in the group (i.e. the number of lines that are identical) is greater than one, so just the duplicates:

lines.GroupBy(x => x).Where(g => g.Count() > 1)


Then we select the whole text from the group:

lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key);

(The Key of the group is the value you grouped by.)
And convert it into an array:

lines.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();

So we can put it back into the lines array.
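
The same chain can be split into named steps, which makes the type at each stage visible. This is only a sketch with made-up sample data and hypothetical variable names (groups, repeated, keys); the one-line version does exactly the same work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] lines = { "a", "b", "a", "c", "b", "a" };

// Step 1: group identical lines; each group's Key is the line text.
IEnumerable<IGrouping<string, string>> groups = lines.GroupBy(x => x);

// Step 2: keep only the groups that hold more than one line.
IEnumerable<IGrouping<string, string>> repeated = groups.Where(g => g.Count() > 1);

// Step 3: reduce each remaining group to its key.
IEnumerable<string> keys = repeated.Select(g => g.Key);

// Step 4: run the query, collect the results in an array,
// and put that array back into lines.
lines = keys.ToArray();

Console.WriteLine(string.Join(", ", lines)); // prints "a, b"
```

Steps 1 to 3 only describe the query; nothing is actually enumerated until ToArray() is called in step 4.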

Now, if you had a string like:

string s = "hello, this line is about books with hard back covers.";


How would you find out if it contained your search word?
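
One possible answer, offered as a sketch rather than as what the answerer necessarily had in mind, is string.Contains, or IndexOf with a StringComparison when case should be ignored. The searchWord variable and the second sample line are assumptions for illustration.

```csharp
using System;
using System.Linq;

string s = "hello, this line is about books with hard back covers.";
string searchWord = "Books"; // hypothetical search word

// Case-sensitive check: false here, because "Books" is not "books".
bool exact = s.Contains(searchWord);

// Case-insensitive check via IndexOf with an explicit comparison.
bool ignoreCase = s.IndexOf(searchWord, StringComparison.OrdinalIgnoreCase) >= 0;

Console.WriteLine($"{exact} {ignoreCase}"); // prints "False True"

// The same test used as a LINQ filter over several lines.
string[] lines = { s, "a line about something else" };
string[] matches = lines
    .Where(line => line.IndexOf(searchWord, StringComparison.OrdinalIgnoreCase) >= 0)
    .ToArray();
```

A filter like this belongs in a Where clause that tests each line, not in the GroupBy key selector.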

