找重复的两列LINQ [英] Get duplicates for two columns with LINQ
问题描述
LINQ要把我逼疯了。为什么下面的查询不返回重复的,而它的工作原理,只有一个标识?哪里是我的错误?
LINQ drives me crazy. Why does following query not return the duplicates, whereas it works with only one identifier? Where is my error?
' generate some test-data '
Dim source As New DataTable
source.Columns.Add(New DataColumn("RowNumber", GetType(Int32)))
source.Columns.Add(New DataColumn("Value1", GetType(Int32)))
source.Columns.Add(New DataColumn("Value2", GetType(Int32)))
source.Columns.Add(New DataColumn("Text", GetType(String)))
Dim rnd As New Random()
For i As Int32 = 1 To 100
Dim newRow = source.NewRow
Dim value = rnd.Next(1, 20)
newRow("RowNumber") = i
newRow("Value1") = value
newRow("Value2") = (value + 1)
newRow("Text") = String.Format("RowNumber{0}-Text", i)
source.Rows.Add(newRow)
Next
' following query does not work, it always has Count=0 '
' although it works with only one identifier '
Dim dupIdentifiers = From row In source
Group row By grp = New With {.Val1 = row("Value1"), .Val2 = row("Value2")}
Into Group
Where Group.Count > 1
Select idGroup = New With {grp.Val1, grp.Val2, Group.Count}
修改:下面是完整的解决方案,这要归功于<一个href="http://stackoverflow.com/questions/7530906/get-duplicates-for-two-columns-with-linq/7530944#7530944">@Jon飞碟双向的答案:)
Edit: Following is the complete solution, thanks to @Jon Skeet's answer :)
Dim dupKeys = From row In source
Group row By grp = New With {Key .Val1 = CInt(row("Value1")), Key .Val2 = CInt(row("Value2"))}
Into Group Where Group.Count > 1
Select RowNumber = CInt(Group.FirstOrDefault.Item("RowNumber"))
Dim dupRows = From row In source
Join dupKey In dupKeys
On row("RowNumber") Equals dupKey
Select row
If dupRows.Any Then
' create a new DataTable from the first duplicate rows '
Dim dest = dupRows.CopyToDataTable
End If
与分组的主要问题是,我必须让他们键
属性。
在我上面的code接下来的问题是,从原始表中得到重复的行。
因为几乎所有的行具有重复(根据两个字段),则结果数据表中包含99的100行,而不是仅在19重复值。我需要只选择第一个重复的行,并加入他们的PK原始表。
The main problem with grouping was that i must make them key
properties.
The next problem in my above code was to get the duplicate rows from the original table.
Because nearly every row has a duplicate(according to two fields), the result DataTable contained 99 of 100 rows and not only the 19 duplicate values. I needed to select only the first duplicate row and join them with the original table on the PK.
Select RowNumber = CInt(Group.FirstOrDefault.Item("RowNumber"))
虽然这部作品在我的情况下,也许有人可以解释我如何从原始表中只选择重复的,如果我将不得不只组合键。
Although this works in my case, maybe someone can explain me how to select only the duplicates from the original table if i would have had only composite keys.
修改:我心中已经回答了这个问题的最后一部分我自己,所以这里是我需要的:
Edit: I'v answered the last part of the question myself, so here is all i need:
Dim dups = From row In source
Group By grp = New With {Key .Value1 = CInt(row("Value1")), Key .Value2 = CInt(row("Value2"))}
Into Group Where Group.Count > 1
Let Text = Group.First.Item("Text")
Select Group.First
If dups.Any Then
Dim dest = dups.CopyToDataTable
End If
我所需要的<一个href="http://stackoverflow.com/questions/641931/is-linqs-let-keyword-better-than-its-into-keyword/641944#641944">Let-Keyword为了保持另一列(多个)到相同的上下文,并返回分组的DUP只有第一行。在这条路上,我可以使用 CopyToDataTable
以创建从重复行的数据表。
I needed the Let-Keyword in order to keep the other column(s) into the same context and return only the first row of the grouped dups. On this way i can use CopyToDataTable
to create a DataTable from the duplicate rows.
只有几行(我能救第二个查询找到原始表中的行)找到多个列的重复和创建它们的数据表。
Only a few lines of code overall (i can save the second query to find the rows in the original table) to find duplicates on multiple columns and create a DataTable of them.
推荐答案
这个问题是这样的匿名类型的工作在VB中 - 他们是可变的默认;仅键
属性包含的散列和平等。试试这个:
The problem is the way anonymous types work in VB - they're mutable by default; only Key
properties are included for hashing and equality. Try this:
Group row By grp = New With {Key .Val1 = row("Value1"), Key .Val2 = row("Value2")}
(在C#这不会是一个问题 - 匿名类型在C#中始终不变的所有属性)
(In C# this wouldn't be a problem - anonymous types in C# are always immutable in all properties.)
这篇关于找重复的两列LINQ的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!