Complex LINQ to XML query assistance


Problem description

I don't know if the query I am trying to do is even possible, but if one of you LINQ to SQL/XML gurus can figure this out I will be so thankful and salute you as a LINQ God. My end goal is to identify all of the XML Models that are duplicates and show the CECID for all the duplicates except one. So let's say I have an XDocument that looks like this:

<ApplianceModels xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" ApplianceType="IceMakers">
    <Model>
        <ReferenceNumber>201877149</ReferenceNumber>
        <Action>C</Action>
        <Brand>4564</Brand>
        <ModelNumber>1234212</ModelNumber>
        <EquipmentType>A</EquipmentType>
        <CoolingType>W</CoolingType>
        <IceType>C</IceType>
        <IceMakerProcessType>B</IceMakerProcessType>
        <TestLabCode>ARN3190</TestLabCode>
        <ManufacturerCode>ARN2396</ManufacturerCode>
        <HarvestRateLbs24Hr>56</HarvestRateLbs24Hr>
        <EnergyCons_kWhPer100Lbs>4.00</EnergyCons_kWhPer100Lbs>
        <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
        <IceHardnessAdjustmentFactor xsi:nil="true" />
        <RegulatoryStatus>I</RegulatoryStatus>
        <CECID>d579ae7a-f3f7-4627-a3f1-f17b23aa28e3</CECID>
    </Model>
    <Model>
        <ReferenceNumber>201877143</ReferenceNumber>
        <Action>C</Action>
        <Brand>4564</Brand>
        <ModelNumber>12342</ModelNumber>
        <EquipmentType>A</EquipmentType>
        <CoolingType>W</CoolingType>
        <IceType>C</IceType>
        <IceMakerProcessType>B</IceMakerProcessType>
        <TestLabCode>ARN3190</TestLabCode>
        <ManufacturerCode>ARN2396</ManufacturerCode>
        <HarvestRateLbs24Hr>56</HarvestRateLbs24Hr>
        <EnergyCons_kWhPer100Lbs>4.00</EnergyCons_kWhPer100Lbs>
        <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
        <IceHardnessAdjustmentFactor xsi:nil="true" />
        <RegulatoryStatus>I</RegulatoryStatus>
        <CECID>94c6d6e6-5b6a-4f45-a7ff-70a64e50e4e6</CECID>
    </Model>
    <Model>
        <ReferenceNumber>201877152</ReferenceNumber>
        <Action>C</Action>
        <Brand>4564</Brand>
        <ModelNumber>1231114234</ModelNumber>
        <EquipmentType>A</EquipmentType>
        <CoolingType>W</CoolingType>
        <IceType>C</IceType>
        <IceMakerProcessType>C</IceMakerProcessType>
        <TestLabCode>ARN3190</TestLabCode>
        <ManufacturerCode>ARN2396</ManufacturerCode>
        <HarvestRateLbs24Hr>81</HarvestRateLbs24Hr>
        <EnergyCons_kWhPer100Lbs>1.10</EnergyCons_kWhPer100Lbs>
        <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
        <IceHardnessAdjustmentFactor>4.45</IceHardnessAdjustmentFactor>
        <RegulatoryStatus>I</RegulatoryStatus>
        <CECID>d97a603c-1836-43a3-b564-ab8d1bdec65f</CECID>
    </Model>
</ApplianceModels>

Then in SQL Server I have a table called tApplianceTypeColumns that looks like this for a given appliance type:

ApplianceTypeID       ApplianceColumnUnique        ApplianceColumnName
10                    0                            ReferenceNumber
10                    1                            Brand
10                    1                            ModelNumber
10                    0                            EquipmentType
10                    0                            CoolingType
10                    0                            IceType
10                    0                            IceMakerProcessType
10                    0                            HarvestRateLbs24Hr
10                    0                            EnergyCons_kWhPer100Lbs
10                    0                            WaterCons_galPer100lbs
10                    1                            RegulatoryStatus

So here is what I started with but I am far from being close:

var DupeItems = from m in doc.Descendants("Model").Elements()
                join at in entities.tApplianceTypeColumns on m.Name equals at.ApplianceColumnName
                group m by m.Element(at.ApplianceColumnName).Value into d
                where at.ApplianceTypeID == ApplianceTypeID

So really I want to be able to group by Brand, Model Number, and RegulatoryStatus which are the columns in the tApplianceTypeColumns table that have the ApplianceColumnUnique bit column set to true. The number of true bits could vary depending on the ApplianceTypeID I am looking up in that table.

Additionally, I also need to include two elements in the grouping that are never in the tApplianceTypeColumns table and those elements are Action then ManufacturerCode followed by all the other unique elements from the tApplianceTypeColumns in no specific order.

The ApplianceTypeID is a known parameter that will be passed to the query. So for any set of duplicates I need to display the CECID for the 2nd and subsequent duplicates so that I can take those CECIDs and do lookups in other tables to change their status. But this first step is tough. I don't care which of the duplicates does not get displayed. I just need to display all others except one. I hope I have explained this well enough.

Recommended answer

The task can be split into 3 steps:

  1. Find the unique columns to group with:

So really I want to be able to group by Brand, Model Number, and RegulatoryStatus which are the columns in the tApplianceTypeColumns table that have the ApplianceColumnUnique bit column set to true. The number of true bits could vary depending on the ApplianceTypeID I am looking up in that table. Additionally, I also need to include two elements in the grouping that are never in the tApplianceTypeColumns table and those elements are Action then ManufacturerCode followed by all the other unique elements from the tApplianceTypeColumns in no specific order.

Enumerable.Concat(
    "Action,ManufacturerCode".Split(','),
    applianceTypeColumns
        .Where(at => at.ApplianceColumnUnique)
        .Select(at => at.ApplianceColumnName)
);

  2. Group the models by the columns from the previous step:

    We project the column names into the column values of each model:

    applianceModels.GroupBy(
        model => uniqueColumns.Select(columnName => model.Element(columnName)?.Value).ToArray()
    

    However, we can't just group by an array of strings, because arrays compare by reference rather than by content, so we need to provide a custom IEqualityComparer:

    new LambdaComparer<string[]>((a, b) => a.SequenceEqual(b), x => x.Aggregate(13, (hash, y) => hash * 7 + (y?.GetHashCode() ?? 0)))
    

  3. Aggregate the duplicates:

    .Select(g => new { g.Key, Duplicates = g.Select(x => x.Element("CECID")?.Value) })
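The point in step 2 about arrays can be checked in isolation. The sketch below is a minimal illustration, not part of the original answer: `StringArrayComparer` is a hypothetical helper name, the key values are made up to resemble the sample data, and the program assumes a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var keys = new[]
{
    new[] { "C", "ARN2396", "4564" },
    new[] { "C", "ARN2396", "4564" },   // same contents, different array instance
    new[] { "C", "ARN2396", "9999" },
};

// Default comparer: arrays compare by reference, so every key is its own group.
Console.WriteLine(keys.GroupBy(k => k).Count());                             // prints 3

// Content-based comparer: the first two keys fall into the same group.
Console.WriteLine(keys.GroupBy(k => k, new StringArrayComparer()).Count());  // prints 2

// Hypothetical helper: compares string arrays element by element.
sealed class StringArrayComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] a, string[] b) =>
        ReferenceEquals(a, b) || (a != null && b != null && a.SequenceEqual(b));

    public int GetHashCode(string[] key)
    {
        unchecked // let the arithmetic wrap instead of throwing in a checked build
        {
            int hash = 17;
            foreach (var s in key)
                hash = hash * 31 + (s?.GetHashCode() ?? 0); // null elements contribute 0
            return hash;
        }
    }
}
```

Equal arrays must produce equal hash codes, which is why the hash is built from the elements rather than from the array instance.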
    


    Everything put together:

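    // LINQPad-style script (.Dump() is a LINQPad extension). Outside LINQPad, add:
    // using System; using System.Collections.Generic; using System.Linq;
    // using System.Text.RegularExpressions; using System.Xml.Linq; using System.Xml.XPath;
    void Main()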
    {
        const int ApplianceTypeID = 10;
    
        var applianceModels = GetApplianceModels().XPathSelectElements("Model"); //.Dump();
        var applianceTypeColumns = GetApplianceTypeColumns().Where(x => x.ApplianceTypeID == ApplianceTypeID); //.Dump();
    
        var uniqueColumns = Enumerable.Concat(
            "Action,ManufacturerCode".Split(','),
            applianceTypeColumns
                .Where(at => at.ApplianceColumnUnique)
                .Select(at => at.ApplianceColumnName)
        ).ToArray(); // materialize once so the column list is not rebuilt for every model
    
        var query = applianceModels
            .GroupBy(
                model => uniqueColumns.Select(columnName => model.Element(columnName)?.Value).ToArray(),
                new LambdaComparer<string[]>((a, b) => a.SequenceEqual(b), x => x.Aggregate(13, (hash, y) => hash * 7 + (y?.GetHashCode() ?? 0)))
            )
            .Where(x => x.Count() > 1)
            .Select(g => new { g.Key, Duplicates = g.Select(x => x.Element("CECID")?.Value) });
            //.Dump();
    }
    
    // Define other methods and classes here
    XElement GetApplianceModels()
    {
        return XElement.Parse(
    @"<ApplianceModels xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"" ApplianceType=""IceMakers"">
        <Model>
            <ReferenceNumber>201877149</ReferenceNumber>
            <Action>C</Action>
            <Brand>4564</Brand>
            <ModelNumber>1234212</ModelNumber>
            <EquipmentType>A</EquipmentType>
            <CoolingType>W</CoolingType>
            <IceType>C</IceType>
            <IceMakerProcessType>B</IceMakerProcessType>
            <TestLabCode>ARN3190</TestLabCode>
            <ManufacturerCode>ARN2396</ManufacturerCode>
            <HarvestRateLbs24Hr>56</HarvestRateLbs24Hr>
            <EnergyCons_kWhPer100Lbs>4.00</EnergyCons_kWhPer100Lbs>
            <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
            <IceHardnessAdjustmentFactor xsi:nil=""true"" />
            <RegulatoryStatus>I</RegulatoryStatus>
            <CECID>d579ae7a-f3f7-4627-a3f1-f17b23aa28e3</CECID>
        </Model>
        <Model>
            <ReferenceNumber>201877143</ReferenceNumber>
            <Action>C</Action>
            <Brand>4564</Brand>
            <ModelNumber>12342</ModelNumber>
            <EquipmentType>A</EquipmentType>
            <CoolingType>W</CoolingType>
            <IceType>C</IceType>
            <IceMakerProcessType>B</IceMakerProcessType>
            <TestLabCode>ARN3190</TestLabCode>
            <ManufacturerCode>ARN2396</ManufacturerCode>
            <HarvestRateLbs24Hr>56</HarvestRateLbs24Hr>
            <EnergyCons_kWhPer100Lbs>4.00</EnergyCons_kWhPer100Lbs>
            <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
            <IceHardnessAdjustmentFactor xsi:nil=""true"" />
            <RegulatoryStatus>I</RegulatoryStatus>
            <CECID>94c6d6e6-5b6a-4f45-a7ff-70a64e50e4e6</CECID>
        </Model>
        <Model>
            <ReferenceNumber>201877152</ReferenceNumber>
            <Action>C</Action>
            <Brand>4564</Brand>
            <ModelNumber>1231114234</ModelNumber>
            <EquipmentType>A</EquipmentType>
            <CoolingType>W</CoolingType>
            <IceType>C</IceType>
            <IceMakerProcessType>C</IceMakerProcessType>
            <TestLabCode>ARN3190</TestLabCode>
            <ManufacturerCode>ARN2396</ManufacturerCode>
            <HarvestRateLbs24Hr>81</HarvestRateLbs24Hr>
            <EnergyCons_kWhPer100Lbs>1.10</EnergyCons_kWhPer100Lbs>
            <WaterCons_galPer100Lbs>12</WaterCons_galPer100Lbs>
            <IceHardnessAdjustmentFactor>4.45</IceHardnessAdjustmentFactor>
            <RegulatoryStatus>I</RegulatoryStatus>
            <CECID>d97a603c-1836-43a3-b564-ab8d1bdec65f</CECID>
        </Model>
    </ApplianceModels>");
    }
    IEnumerable<(int ApplianceTypeID, bool ApplianceColumnUnique, string ApplianceColumnName)> GetApplianceTypeColumns()
    {
        var data =
    @"ApplianceTypeID       ApplianceColumnUnique        ApplianceColumnName
    10                    0                            ReferenceNumber
    10                    1                            Brand
    10                    1                            ModelNumber
    10                    0                            EquipmentType
    10                    0                            CoolingType
    10                    0                            IceType
    10                    0                            IceMakerProcessType
    10                    0                            HarvestRateLbs24Hr
    10                    0                            EnergyCons_kWhPer100Lbs
    10                    0                            WaterCons_galPer100lbs
    10                    1                            RegulatoryStatus";
        return Regex.Matches(data, @"^(\d+)\s+(\d+)\s+(\w+)", RegexOptions.Multiline)
            .Cast<Match>()
            .Select(x => 
            (
                /*ApplianceTypeID = */int.Parse(x.Groups[1].Value),
                /*ApplianceColumnUnique = */int.Parse(x.Groups[2].Value) != 0,
                /*ApplianceColumnName = */x.Groups[3].Value
            ));
    }
    
    class LambdaComparer<T> : IEqualityComparer<T>
    {
        private readonly Func<T, T, bool> equals;
        private readonly Func<T, int> getHashCode;
    
        public LambdaComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
        {
            this.equals = equals;
            this.getHashCode = getHashCode;
        }
    
        public bool Equals(T x, T y) => equals(x, y);
        public int GetHashCode(T obj) => getHashCode(obj);
    }
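The query above lists every CECID in each duplicate group, whereas the question asks for all duplicates except one. A possible way to get that is to skip the first member of each group. The sketch below is only an illustration under stated assumptions: `RedundantCecids` is a hypothetical name, the XML is a cut-down sample with made-up values, and for brevity the grouping key is a joined string instead of the array plus comparer used in the answer. That shortcut assumes no value contains the separator character U+001F, and it treats a missing element the same as an empty one. It needs C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample: the first two models agree on every key column,
// the third differs in ModelNumber.
var root = XElement.Parse(@"<ApplianceModels>
  <Model><Action>C</Action><ManufacturerCode>M1</ManufacturerCode><Brand>B</Brand>
         <ModelNumber>1</ModelNumber><RegulatoryStatus>I</RegulatoryStatus><CECID>id-1</CECID></Model>
  <Model><Action>C</Action><ManufacturerCode>M1</ManufacturerCode><Brand>B</Brand>
         <ModelNumber>1</ModelNumber><RegulatoryStatus>I</RegulatoryStatus><CECID>id-2</CECID></Model>
  <Model><Action>C</Action><ManufacturerCode>M1</ManufacturerCode><Brand>B</Brand>
         <ModelNumber>2</ModelNumber><RegulatoryStatus>I</RegulatoryStatus><CECID>id-3</CECID></Model>
</ApplianceModels>");

var keyColumns = new[] { "Action", "ManufacturerCode", "Brand", "ModelNumber", "RegulatoryStatus" };

foreach (var cecid in RedundantCecids(root, keyColumns))
    Console.WriteLine(cecid);   // prints id-2

// Returns the CECID of every model except the first one in each group of duplicates.
static List<string> RedundantCecids(XElement root, IEnumerable<string> keyColumns)
{
    return root.Elements("Model")
        // Build one string key per model from the values of the key columns.
        .GroupBy(m => string.Join("\u001F", keyColumns.Select(c => (string)m.Element(c) ?? "")))
        // Keep everything after the first member; groups of size 1 contribute nothing.
        .SelectMany(g => g.Skip(1))
        .Select(m => (string)m.Element("CECID"))
        .ToList();
}
```

Because `Skip(1)` yields nothing for a group with a single member, a separate `Count() > 1` filter is not needed in this variant.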
    
