Removal of Duplicate Rows from a DataTable Based on Multiple Columns
Problem description
I have a DataTable that contains many duplicate rows. I need to filter those rows based on multiple columns so that the resulting DataTable contains only distinct rows.
Barcode Itemid PacktypeId
1 100 1
1 100 2
1 100 3
1 100 1
1 100 3
Only the rows with PacktypeId 1, 2 and 3 are needed; the remaining 4th and 5th rows should be removed.
I have tried two methods, but neither gave the desired result.
The DataTable contains more than 10 columns, but the columns that identify a unique row are "Barcode", "ItemID" and "PackTypeID".
Method 1:
dt_Barcode = dt_Barcode.DefaultView.ToTable(true, "Barcode", "ItemID", "PackTypeID");
The method above filters the rows, but it returns only the 3 key columns; I need all 10 columns.
Method 2:
List<string> keyColumns = new List<string>();
keyColumns.Add("Barcode");
keyColumns.Add("ItemID");
keyColumns.Add("PackTypeID");

void RemoveDuplicates(DataTable table, List<string> keyColumns)
{
    var uniqueness = new HashSet<string>();
    StringBuilder sb = new StringBuilder();
    int rowIndex = 0;
    DataRow row;
    DataRowCollection rows = table.Rows;
    int i = rows.Count;   // row count is read once, before any row is removed
    while (rowIndex < i)
    {
        row = rows[rowIndex];
        sb.Length = 0;
        // Build a composite key such as "1|100|1|" from the key columns.
        foreach (string colname in keyColumns)
        {
            sb.Append(row[colname]);
            sb.Append("|");
        }
        if (uniqueness.Contains(sb.ToString()))
        {
            rows.Remove(row);
        }
        else
        {
            uniqueness.Add(sb.ToString());
            rowIndex++;
        }
    }
}
The method above throws an exception such as "There is no row at position 5".
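The exception in Method 2 most likely comes from the loop bound: `i` is read from `rows.Count` once, before any row is removed, so after each removal the table is shorter than `i` and `rows[rowIndex]` eventually runs past the end. A minimal sketch of one possible repair, assuming the same key columns as in the question, tests against the live `rows.Count` on every pass. The class name `DataTableDedup` is a placeholder for illustration, and the `|` separator assumes the key values never contain that character.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Text;

public static class DataTableDedup   // hypothetical helper class
{
    // Removes, in place, every row whose key-column values repeat an earlier row.
    public static void RemoveDuplicates(DataTable table, IList<string> keyColumns)
    {
        var seen = new HashSet<string>();
        var sb = new StringBuilder();
        DataRowCollection rows = table.Rows;
        int rowIndex = 0;

        // Compare against the live count: it shrinks whenever a row is removed.
        while (rowIndex < rows.Count)
        {
            DataRow row = rows[rowIndex];
            sb.Length = 0;
            foreach (string colName in keyColumns)
            {
                sb.Append(row[colName]).Append('|');
            }

            // HashSet.Add returns false when the key is already present.
            if (seen.Add(sb.ToString()))
            {
                rowIndex++;          // first occurrence: keep it and move on
            }
            else
            {
                rows.Remove(row);    // duplicate: the next row slides into rowIndex
            }
        }
    }
}
```

Because the rows are removed from the original table, all other columns of the surviving rows stay untouched.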
Recommended answer
Try LINQ:
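// Group on all three key columns (assumed to be int, as in the sample table)
// and keep the first full row of each group.
dt_Barcode = dt_Barcode.AsEnumerable()
    .GroupBy(r => new
    {
        Barcode = r.Field<int>("Barcode"),
        Itemid = r.Field<int>("Itemid"),
        PacktypeId = r.Field<int>("PacktypeId")
    })
    .Select(g => g.First())
    .CopyToDataTable();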
Sample test code:
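// Requires: using System.Data; using System.Linq;
// (on .NET Framework, also a reference to System.Data.DataSetExtensions)
void Main()
{
    DataTable dt_Barcode = GetTable();
    // Keep the first row of each (Barcode, Itemid, PacktypeId) combination.
    dt_Barcode = dt_Barcode.AsEnumerable()
        .GroupBy(r => new
        {
            Barcode = r.Field<int>("Barcode"),
            Itemid = r.Field<int>("Itemid"),
            PacktypeId = r.Field<int>("PacktypeId")
        })
        .Select(g => g.First())
        .CopyToDataTable();
}

// Builds the five-row sample table from the question.
DataTable GetTable()
{
    DataTable table = new DataTable();
    table.Columns.Add("Barcode", typeof(int));
    table.Columns.Add("Itemid", typeof(int));
    table.Columns.Add("PacktypeId", typeof(int));
    table.Rows.Add(1, 100, 1);
    table.Rows.Add(1, 100, 2);
    table.Rows.Add(1, 100, 3);
    table.Rows.Add(1, 100, 1);
    table.Rows.Add(1, 100, 3);
    return table;
}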
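One caveat with the LINQ query: `CopyToDataTable` throws an `InvalidOperationException` when the sequence contains no rows, and `Field<int>` assumes every key column is an `int`. The following is a rough sketch of an alternative that sidesteps both, assuming it is acceptable to compare key values by their string form. The names `DataTableDistinct` and `DistinctByColumns` are made up for this example and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableDistinct   // hypothetical helper class
{
    // Returns a new table with the same columns, keeping the first row per key.
    public static DataTable DistinctByColumns(DataTable source, params string[] keyColumns)
    {
        DataTable result = source.Clone();   // copies the schema only, no rows
        var seen = new HashSet<string>();

        foreach (DataRow row in source.Rows)
        {
            // Join the key values with a separator unlikely to occur in the data.
            string key = string.Join("\u001F",
                keyColumns.Select(c => Convert.ToString(row[c])));

            if (seen.Add(key))
            {
                result.ImportRow(row);       // copies every column of the row
            }
        }
        return result;
    }
}
```

It could be called as `dt_Barcode = DataTableDistinct.DistinctByColumns(dt_Barcode, "Barcode", "Itemid", "PacktypeId");`. An empty input simply yields an empty table with the same columns.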
Example code for your problem.
// The five sample rows from the question.
List<items> arritems = new List<items>
{
    new items { barcode = 1, itemid = 100, packtypeid = 1 },
    new items { barcode = 1, itemid = 100, packtypeid = 2 },
    new items { barcode = 1, itemid = 100, packtypeid = 3 },
    new items { barcode = 1, itemid = 100, packtypeid = 1 },
    new items { barcode = 1, itemid = 100, packtypeid = 3 }
};

// Anonymous types compare by value, so Distinct() drops repeated key combinations.
var distinctList = arritems
    .Select(x => new { x.barcode, x.itemid, x.packtypeid })
    .Distinct()
    .ToList();

public class items
{
    public int barcode;
    public int itemid;
    public int packtypeid;
}
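Note that projecting to an anonymous type before `Distinct()` keeps only the projected fields, which is the same limitation the question ran into with Method 1. If the whole object is needed, grouping on the key and taking the first element of each group is one option. The sketch below uses a hypothetical `Item` class with an extra `Description` field as a stand-in for the `items` class above; the names are illustrative only.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Item                // hypothetical stand-in for the `items` class
{
    public int Barcode;
    public int ItemId;
    public int PackTypeId;
    public string Description;   // example of a non-key field that is preserved
}

public static class ItemDedup    // hypothetical helper class
{
    // Keeps the first full Item for each (Barcode, ItemId, PackTypeId) combination.
    public static List<Item> DistinctByKey(IEnumerable<Item> source)
    {
        return source
            .GroupBy(x => new { x.Barcode, x.ItemId, x.PackTypeId })
            .Select(g => g.First())
            .ToList();
    }
}
```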
This has been discussed many times on other forums and on CodeProject itself.
See the links below:
http://stackoverflow.com/questions/1199176/how-to-select-distinct-rows-in-a-datatable-and-store-into-an-array
http://stackoverflow.com/questions/17221561/datatable-distinct-rows
http://stackoverflow.com/questions/12723289/linq-datatable-select-distinct-rows
Select DISTINCT records based on specified fields for DataTable