SQL: select 3 columns and dedupe on two columns

Problem description

I have a job that currently selects records from a table that does not have a unique index. I realize this could be solved by putting a unique index on the relevant columns, but in this scenario, for testing purposes, I need to remove the index and then run a select that also removes duplicates based on 2 columns:

SELECT DISTINCT [author], [pubDate], [dateadded]
FROM [Feeds].[dbo].[socialPosts]
WHERE CAST(FLOOR(CAST(dateadded AS float)) AS datetime) > 
                               DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE() - 2), 0)  
AND CAST(FLOOR(CAST(dateadded AS float)) AS datetime) < 
                               DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)

This selects all records from the previous day, and I want to dedupe them on author and pubDate. The dedupe could happen before or after the select, but the idea is to find out whether it can be done within the select itself.

Solution

You can GROUP BY author and pubDate and apply any aggregate function (MAX in the query below) to the dateadded column, which returns one row per unique author, pubDate pair.

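-- One row per (author, pubDate); MAX keeps the latest dateadded of each group
SELECT  [author]
        ,[pubDate]
        ,MAX([dateadded]) AS [dateadded]
 FROM   [Feeds].[dbo].[socialPosts]
 -- Keep only rows whose dateadded falls on the previous calendar day
 WHERE  CAST(FLOOR(CAST([dateadded] AS float)) AS datetime) > DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE() - 2), 0)
        AND CAST(FLOOR(CAST([dateadded] AS float)) AS datetime) < DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)
 GROUP BY
        [author]
        ,[pubDate]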
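To see why SELECT DISTINCT over all three columns does not dedupe on author and pubDate while the GROUP BY version does, here is a minimal sketch. It rests on several assumptions: it uses Python's built-in sqlite3 module with an in-memory table as a stand-in for SQL Server, the sample rows are made up for illustration, and the date filter is left out because the T-SQL date functions do not exist in SQLite.

```python
import sqlite3

# In-memory SQLite table standing in for [Feeds].[dbo].[socialPosts] (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE socialPosts (author TEXT, pubDate TEXT, dateadded TEXT)"
)

# Hypothetical sample rows: the first two share author and pubDate
# but were added at different times.
rows = [
    ("alice", "2024-01-01 09:00:00", "2024-01-02 10:00:00"),
    ("alice", "2024-01-01 09:00:00", "2024-01-02 11:30:00"),
    ("bob", "2024-01-01 12:00:00", "2024-01-02 10:05:00"),
]
conn.executemany("INSERT INTO socialPosts VALUES (?, ?, ?)", rows)

# DISTINCT compares all three columns, so both "alice" rows survive.
distinct_rows = conn.execute(
    "SELECT DISTINCT author, pubDate, dateadded FROM socialPosts"
).fetchall()

# GROUP BY collapses rows sharing (author, pubDate);
# MAX keeps the latest dateadded within each group.
deduped_rows = conn.execute(
    """
    SELECT author, pubDate, MAX(dateadded) AS dateadded
    FROM socialPosts
    GROUP BY author, pubDate
    ORDER BY author, pubDate
    """
).fetchall()

print(len(distinct_rows))  # 3
print(deduped_rows)
```

With these sample rows the DISTINCT query still returns three rows, while the grouped query returns two, keeping the later dateadded for the duplicated pair. Swapping MAX for MIN would keep the earliest one instead.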
