How to Left Outer Join two DataTables in C#?


Problem description


How can I Left Outer Join (I think it is Left Outer Join but I am not 100% sure) two data tables with the following tables and conditions while keeping all columns from both tables?

dtblLeft:

 id   col1   anotherColumn2
 1    1      any2
 2    1      any2
 3    2      any2
 4    3      any2
 5    3      any2
 6    3      any2
 7           any2

dtblRight:

 col1   col2      anotherColumn1
 1      Hi        any1
 2      Bye       any1
 3      Later     any1
 4      Never     any1

dtblJoined:

 id   col1  col2     anotherColumn1     anotherColumn2
 1    1     Hi       any1               any2
 2    1     Hi       any1               any2
 3    2     Bye      any1               any2
 4    3     Later    any1               any2
 5    3     Later    any1               any2
 6    3     Later    any1               any2
 7                                      any2

Conditions:

  • In dtblLeft, col1 is not required to have unique values.
  • In dtblRight, col1 has unique values.
  • If dtblLeft is missing a foreign key in col1 or it has one that does not exist in dtblRight then empty or null fields will be inserted.
  • Joining on col1.

I can use regular DataTable operations, LINQ, or whatever.
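
For experimenting with the conditions above, the two sample tables could be built in code roughly as follows. This is only a sketch: the class and method names (SampleTables, BuildLeft, BuildRight) are hypothetical, and the column types (int for id and col1, string for the rest) are assumptions, since the question does not state them.

```csharp
using System;
using System.Data;

// Hypothetical helper: builds the sample tables from the question.
public static class SampleTables
{
    // dtblLeft: col1 may repeat, and row 7 leaves it empty (DBNull).
    public static DataTable BuildLeft()
    {
        var t = new DataTable("dtblLeft");
        t.Columns.Add("id", typeof(int));          // assumed type
        t.Columns.Add("col1", typeof(int));        // assumed type
        t.Columns.Add("anotherColumn2", typeof(string));
        t.Rows.Add(1, 1, "any2");
        t.Rows.Add(2, 1, "any2");
        t.Rows.Add(3, 2, "any2");
        t.Rows.Add(4, 3, "any2");
        t.Rows.Add(5, 3, "any2");
        t.Rows.Add(6, 3, "any2");
        t.Rows.Add(7, DBNull.Value, "any2");
        return t;
    }

    // dtblRight: col1 is unique.
    public static DataTable BuildRight()
    {
        var t = new DataTable("dtblRight");
        t.Columns.Add("col1", typeof(int));        // assumed type
        t.Columns.Add("col2", typeof(string));
        t.Columns.Add("anotherColumn1", typeof(string));
        t.Rows.Add(1, "Hi", "any1");
        t.Rows.Add(2, "Bye", "any1");
        t.Rows.Add(3, "Later", "any1");
        t.Rows.Add(4, "Never", "any1");
        return t;
    }
}
```

With these tables, a left outer join on col1 should keep all seven rows of dtblLeft, while the right-hand row with col1 = 4 ("Never") never appears in the result.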

I tried this but it removes duplicates:

dtblA.PrimaryKey = new DataColumn[] {dtblA.Columns["col1"]};

DataTable dtblJoined = new DataTable();
dtblJoined.Merge(dtblA, false, MissingSchemaAction.AddWithKey);
dtblJoined.Merge(dtblB, false, MissingSchemaAction.AddWithKey);


EDIT 1:

This is close to what I want, but it only has columns from one of the tables (found at this link):

    dtblJoined = (from t1 in dtblA.Rows.Cast<DataRow>()
                  join t2 in dtblB.Rows.Cast<DataRow>() on t1["col1"] equals t2["col1"]
                  select t1).CopyToDataTable();

EDIT 2:

An answer from this link seems to work for me but I had to change it a bit as follows:

DataTable targetTable = dtblA.Clone();
var dt2Columns = dtblB.Columns.OfType<DataColumn>().Select(dc =>
    new DataColumn(dc.ColumnName, dc.DataType, dc.Expression, dc.ColumnMapping));
var dt2FinalColumns = from dc in dt2Columns.AsEnumerable()
                      where targetTable.Columns.Contains(dc.ColumnName) == false
                      select dc;

targetTable.Columns.AddRange(dt2FinalColumns.ToArray());

var rowData = from row1 in dtblA.AsEnumerable()
              join row2 in dtblB.AsEnumerable()
              on row1["col1"] equals row2["col1"]
              select row1.ItemArray.Concat(row2.ItemArray.Where(r2 => row1.ItemArray.Contains(r2) == false)).ToArray();

foreach (object[] values in rowData)
    targetTable.Rows.Add(values);
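
One thing to watch in the snippet above: the Where filter compares cell values, not column names, so a right-hand value that happens to equal any value in the left row is dropped as well. A possible alternative is to pick the right-hand values by column name. The following is a minimal sketch; RowConcatSketch and ConcatByColumnName are hypothetical names, not part of any library.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper illustrating selection by column name instead of by value.
public static class RowConcatSketch
{
    // Returns the values of 'left' followed by the values of every column of
    // 'right' whose name does not already exist in the left row's table.
    public static object[] ConcatByColumnName(DataRow left, DataRow right)
    {
        var extraValues = right.Table.Columns.Cast<DataColumn>()
            .Where(c => !left.Table.Columns.Contains(c.ColumnName))
            .Select(c => right[c]);

        return left.ItemArray.Concat(extraValues).ToArray();
    }
}
```

The resulting array lines up with a target table built as "all left columns, then the right columns that are not already present", which is how targetTable is assembled above.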

I also found this link and I might try that out since it seems more concise.

EDIT 3 (11/18/2013):

Updated tables to reflect more situations.

Solution

Thanks all for your help. Here is what I came up with based on multiple resources:

public static class DataTableHelper
{
    public enum JoinType
    {
        /// <summary>
        /// Same as regular join. Inner join produces only the set of records that match in both Table A and Table B.
        /// </summary>
        Inner = 0,
        /// <summary>
        /// Same as Left Outer join. Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.
        /// </summary>
        Left = 1
    }

    /// <summary>
    /// Joins the passed in DataTables on the colToJoinOn.
    /// <para>Returns an appropriate DataTable with zero rows if the colToJoinOn does not exist in both tables.</para>
    /// </summary>
    /// <param name="dtblLeft"></param>
    /// <param name="dtblRight"></param>
    /// <param name="colToJoinOn"></param>
    /// <param name="joinType"></param>
    /// <returns></returns>
    /// <remarks>
    /// <para>http://stackoverflow.com/questions/2379747/create-combined-datatable-from-two-datatables-joined-with-linq-c-sharp?rq=1</para>
    /// <para>http://msdn.microsoft.com/en-us/library/vstudio/bb397895.aspx</para>
    /// <para>http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html</para>
    /// <para>http://stackoverflow.com/questions/406294/left-join-and-left-outer-join-in-sql-server</para>
    /// </remarks>
    public static DataTable JoinTwoDataTablesOnOneColumn(DataTable dtblLeft, DataTable dtblRight, string colToJoinOn, JoinType joinType)
    {
        //Change column name to a temp name so the LINQ for getting row data will work properly.
        string strTempColName = colToJoinOn + "_2";
        if (dtblRight.Columns.Contains(colToJoinOn))
            dtblRight.Columns[colToJoinOn].ColumnName = strTempColName;

        //Get columns from dtblLeft
        DataTable dtblResult = dtblLeft.Clone();

        //Get columns from dtblRight
        var dt2Columns = dtblRight.Columns.OfType<DataColumn>().Select(dc => new DataColumn(dc.ColumnName, dc.DataType, dc.Expression, dc.ColumnMapping));

        //Get columns from dtblRight that are not in dtblLeft
        var dt2FinalColumns = from dc in dt2Columns.AsEnumerable()
                              where !dtblResult.Columns.Contains(dc.ColumnName)
                              select dc;

        //Add the rest of the columns to dtblResult
        dtblResult.Columns.AddRange(dt2FinalColumns.ToArray());

        //No reason to continue if the colToJoinOn does not exist in both DataTables.
        if (!dtblLeft.Columns.Contains(colToJoinOn) || (!dtblRight.Columns.Contains(colToJoinOn) && !dtblRight.Columns.Contains(strTempColName)))
        {
            if (!dtblResult.Columns.Contains(colToJoinOn))
                dtblResult.Columns.Add(colToJoinOn);
            return dtblResult;
        }

        switch (joinType)
        {

            default:
            case JoinType.Inner:
                #region Inner
                //get row data
                //To use the DataTable.AsEnumerable() extension method you need to add a reference to the System.Data.DataSetExtensions assembly in your project.
                var rowDataLeftInner = from rowLeft in dtblLeft.AsEnumerable()
                                       join rowRight in dtblRight.AsEnumerable() on rowLeft[colToJoinOn] equals rowRight[strTempColName]
                                       select rowLeft.ItemArray.Concat(rowRight.ItemArray).ToArray();


                //Add row data to dtblResult
                foreach (object[] values in rowDataLeftInner)
                    dtblResult.Rows.Add(values);

                #endregion
                break;
            case JoinType.Left:
                #region Left
                var rowDataLeftOuter = from rowLeft in dtblLeft.AsEnumerable()
                                       join rowRight in dtblRight.AsEnumerable() on rowLeft[colToJoinOn] equals rowRight[strTempColName] into gj
                                       from subRight in gj.DefaultIfEmpty()
                                       select rowLeft.ItemArray.Concat((subRight == null) ? dtblRight.NewRow().ItemArray : subRight.ItemArray).ToArray();


                //Add row data to dtblResult
                foreach (object[] values in rowDataLeftOuter)
                    dtblResult.Rows.Add(values);

                #endregion
                break;
        }

        //Change column name back to original
        dtblRight.Columns[strTempColName].ColumnName = colToJoinOn;

        //Remove extra column from result
        dtblResult.Columns.Remove(strTempColName);

        return dtblResult;
    }
}

EDIT 3:

This method now works correctly and it is still fast when the tables have 2000+ rows. Any recommendations/suggestions/improvements would be appreciated.

EDIT 4:

I had a certain scenario that led me to realize the previous version was really doing an inner join. The function has been modified to fix that problem. I used info at this link to figure it out.
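
To illustrate the point of EDIT 4: in LINQ, a join becomes a left outer join only when the matches are grouped with "into" and then expanded with DefaultIfEmpty(). The sketch below shows that pattern in a variant that does not rename any column on the input tables. It is not the method above; LeftJoinSketch and LeftOuterJoin are hypothetical names, and it assumes the join column is the only column name the two tables share.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical, non-mutating variant of a single-column left outer join.
public static class LeftJoinSketch
{
    // Assumption: 'key' exists in both tables and is the only shared column name.
    // Note: DBNull keys compare equal to each other, so an empty key on the left
    // would match an empty key on the right if the right table contained one.
    public static DataTable LeftOuterJoin(DataTable left, DataTable right, string key)
    {
        DataTable result = left.Clone();

        // Right-hand columns to append: everything except the join column.
        DataColumn[] extra = right.Columns.Cast<DataColumn>()
            .Where(c => !string.Equals(c.ColumnName, key, StringComparison.OrdinalIgnoreCase))
            .ToArray();
        foreach (DataColumn c in extra)
            result.Columns.Add(c.ColumnName, c.DataType);

        // "into matches" + DefaultIfEmpty() keeps left rows that have no match;
        // for those rows 'm' is null and the right-hand cells are filled with DBNull.
        var rows = from l in left.AsEnumerable()
                   join r in right.AsEnumerable() on l[key] equals r[key] into matches
                   from m in matches.DefaultIfEmpty()
                   select l.ItemArray
                           .Concat(extra.Select(c => m == null ? (object)DBNull.Value : m[c]))
                           .ToArray();

        foreach (object[] values in rows)
            result.Rows.Add(values);

        return result;
    }
}
```

A call would look like LeftJoinSketch.LeftOuterJoin(dtblLeft, dtblRight, "col1"). Dropping the "into matches" and DefaultIfEmpty() lines turns the query back into an inner join, which silently discards left rows without a match.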
