更新列为不同的聚合值 [英] Update column to be different aggregate values

查看：150 发布时间：2017/7/20 23:33:25 tsql duplicate-removal

本文介绍了更新列为不同的聚合值的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在创建一个用于合并并从表中删除重复行的脚本。该表包含地址信息，并使用整数字段来存储有关电子邮件的信息作为位标志（列名称lngValue）。例如，lngValue& 1 == 1表示其主地址。

有两次输入相同电子邮件的实例，但有时使用不同的lngValues。要解决这个问题，我需要从所有重复的文件中获取lngValue，并将它们分配给一个幸存的记录，然后删除其余的记录。

到目前为止，我最大的头痛是记录的合并。我想做的是将重复记录按位或所有lngValues在一起。这是我到目前为止，它只发现所有lngValues的值按位或一起。

警告：凌乱的代码

  declare @duplicates table 
（
 lngInternetPK int，
 lngContactFK int，
 lngValue int 
）
 
插入到@duplicates（lngInternetPK，lngContactFK，lngValue）
（
选择tblminternet.lngInternetPK，tblminternet.lngContactFK，tblminternet.lng从tblminternet内部连接的$ val $ b（选择strAddress，lngcontactfk，count（*）作为从tblminternet计数，其中lngValue& 256 256 group by strAddress，lngcontactfk）secondemail 
在tblminternet.strAddress = secondemail.strAddress和
 tblminternet.lngcontactfk = secondemail.lngcontactfk 
其中count> 1和tblminternet.strAddress不为空和tblminternet。 lngValue&256 256  -  order by lngContactFK，strAddress 
）
 
 update @duplicates set lngValue = t.val 
 
 from 
（select（sum（dupes.lngValue）& 65535）as val from 
（select h ere.lngInternetPK，here.lngContactFK，here.lng从tblminternet这里的值加入
（select strAddress，lngcontactfk，count（*）as count from tblminternet where lngValue& 256 256组由strAddress，lngcontactfk）secondemail 
在here.strAddress = secondemail.strAddress和
 here.lngcontactfk = secondemail.lngcontactfk 
其中count> 1和here.strAddress不为null，here.lngValue& 256 256）dupes，tblminternet这个
 
其中this.lngContactFK = dupes.lngContactFK 
）t 
其中lngInternetPK in（select lngInternetPK from @duplicates）

编辑：

根据要求，这里有一些示例数据：

表名称：tblminternet

列名称：

lngInternetPK

lngContactFK

lngValue

strAddress

示例行1：

lngInternetPK：1

lngContactFK：1

lngValue：33

strAddress：me@myaddress.com

示例行2：

lngInternetPK：2

lngContactFK：1

lngValue：40

strAddress：me@myaddress.com

如果这两个合并在这里是所需的结果：

lngInternetPK：1

lngContactFK：1

lngValue：41

strAddress：me@myaddress.com

其他必要的规则：

每个联系人都可以有多个iple电子邮件，但每个电子邮件行必须是不同的（每个电子邮件只能显示为一行）。

解决方案

SQL Server 缺少原生的按位聚合，这就是为什么我们需要模拟它们。

这里的主要思想是生成一组位从 0 到 15 ，每个位应用位掩码值并选择 MAX （这将为给定位给我们一个 OR ），然后选择 SUM 这将合并位掩码）。

我们只是为任何给定的$ $ c $更新了第一个 lngInternetPK c>（lngContactFK，strValue），新值 lngValue ，并删除所有重复项。

$ b $ （

UNION ALL
SELECT b + 1
FROM bits
WHERE b< 15
），
v AS
（
SELECT i。*，
（
SELECT SUM（value）
FROM（
SELECT MAX（lngValue& POWER（2，b））AS值
FROM tblmInternet ii
CROSS JOIN
位
WHERE ii.lngContactFK = i.lngContactFK
AND ii.strAddress = i.strAddress
GROUP BY
b
）q
）AS lngNewValue
FROM（
SELECT ii。*，ROW_NUMBER（）OVER分区由lngContactFK，strAddress ORDER BY lngInternetPK）AS rn
FROM tblmInternet ii
）i
WHERE rn = 1
）
更新v
SET lngValue = lngNewValue;

; WITH v AS
（
SELECT ii。*，ROW_NUMBER（）OVER（PARTITION BY lngContactFK，strAddress ORDER BY lngInternetPK）AS rn
FROM tblmInternet ii
）
DELETE v
WHERE rn> 1

有关详细说明，请参阅我的博客中的这篇文章：

SQL Server：按位或OR 汇总

I am creating a script that for "merging" and deleting duplicate rows from a table. The table contains address information, and uses an integer field for storing information about the email as bit flags (column name lngValue). For example, lngValue & 1 == 1 means its the primary address.

There are instances of the same email being entered twice, but sometimes with different lngValues. To resolve this, I need to take the lngValue from all duplicates and assign them to one surviving record and delete the rest.

My biggest headache so far as been with the "merging" of the records. What I want to do is bitwise or all lngValues of duplicate records together. Here is what I have so far, which only finds the value of all lngValues bitwise or'ed together.

Warning: messy code ahead

declare @duplicates table
(
lngInternetPK int,
lngContactFK int,
lngValue int
)

insert into @duplicates (lngInternetPK, lngContactFK, lngValue) 
(
select  tblminternet.lngInternetPK, tblminternet.lngContactFK, tblminternet.lngValue   from tblminternet  inner join 
(select strAddress, lngcontactfk, count(*) as count from tblminternet where lngValue & 256 <> 256 group by strAddress, lngcontactfk) secondemail
On tblminternet.strAddress = secondemail.strAddress and
tblminternet.lngcontactfk = secondemail.lngcontactfk 
where count > 1 and tblminternet.strAddress is not null and tblminternet.lngValue & 256 <> 256 --order by lngContactFK, strAddress
)

update @duplicates set lngValue = t.val

from 
                (select (sum(dupes.lngValue) & 65535) as val from 
                    (select  here.lngInternetPK,                     here.lngContactFK, here.lngValue from tblminternet here  inner join 
                    (select strAddress, lngcontactfk, count(*) as count from tblminternet where lngValue & 256 <> 256 group by strAddress, lngcontactfk) secondemail
                    On here.strAddress = secondemail.strAddress     and
                    here.lngcontactfk = secondemail.lngcontactfk 
                    where count > 1 and here.strAddress is not      null and here.lngValue & 256 <> 256) dupes, tblminternet this

                where this.lngContactFK = dupes.lngContactFK
                ) t
where lngInternetPK in (select lngInternetPK from @duplicates)

Edit:
As requested here is some sample data:

Table Name: tblminternet
Column Names:
lngInternetPK
lngContactFK
lngValue
strAddress

Example row 1:
lngInternetPK: 1
lngContactFK: 1
lngValue: 33
strAddress: "me@myaddress.com"

Example row 2:
lngInternetPK: 2
lngContactFK: 1
lngValue: 40
strAddress: "me@myaddress.com"

If these two were merged here is the desired result:
lngInternetPK: 1
lngContactFK: 1
lngValue: 41
strAddress: "me@myaddress.com"

Other necessary rules:
Each contact can have multiple emails, but each email row must be distinct ( each email can only appear as one row).

解决方案

SQL Server lacks native bitwise aggregates, that's why we need to emulate them.

The main idea here is to generate a set of bits from 0 to 15, for each bit apply the bitmask to the value and select MAX (which will give us an OR for a given bit), then select the SUM (which will merge the bit masks).

The we just update the first lngInternetPK for any given (lngContactFK, strValue) with the new value of lngValue, and delete all duplicates.

;WITH   bits AS
        (
        SELECT  0 AS b
        UNION ALL
        SELECT  b + 1
        FROM    bits
        WHERE   b < 15
        ),
        v AS
        (
        SELECT  i.*,
                (
                SELECT  SUM(value)
                FROM    (
                        SELECT  MAX(lngValue & POWER(2, b)) AS value
                        FROM    tblmInternet ii
                        CROSS JOIN
                                bits
                        WHERE   ii.lngContactFK = i.lngContactFK
                                AND ii.strAddress = i.strAddress
                        GROUP BY
                                b
                        ) q
                ) AS lngNewValue
        FROM    (
                SELECT  ii.*, ROW_NUMBER() OVER (PARTITION BY lngContactFK, strAddress ORDER BY lngInternetPK) AS rn
                FROM    tblmInternet ii
                ) i
        WHERE   rn = 1
        )
UPDATE  v
SET     lngValue = lngNewValue;

;WITH    v AS
        (
        SELECT  ii.*, ROW_NUMBER() OVER (PARTITION BY lngContactFK, strAddress ORDER BY lngInternetPK) AS rn
        FROM    tblmInternet ii
        )
DELETE  v
WHERE   rn > 1

See this article in my blog for more detailed explanations:

SQL Server: aggregate bitwise OR

这篇关于更新列为不同的聚合值的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

更新列为不同的聚合值 [英] Update column to be different aggregate values

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

更新列为不同的聚合值 [英] Update column to be different aggregate values

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭