SQL: How to merge case-insensitive duplicates


Question





What would be the best way to remove duplicates while merging their records into one?

I have a situation where the table keeps track of player names and their records like this:

stats
-------------------------------
nick     totalgames     wins   ...
John     100            40
john     200            97
Whistle  50             47
wHiStLe  75             72
...

I would need to merge the rows where nick is duplicated (when ignoring case) and merge the records into one, like this:

    stats
    -------------------------------
    nick     totalgames     wins   ...
    john     300            137
    whistle  125            119
    ...

I'm doing this in Postgres. What would be the best way to do this?

I know that I can get the names where duplicates exist by doing this:

select lower(nick) as nick, totalgames, count(*) 
from stats 
group by lower(nick), totalgames
having count(*) > 1;

I thought of something like this:

update stats
set totalgames = totalgames + s.totalgames
from (that query up there) s
where lower(nick) = s.nick

Except this doesn't work properly. And I still can't seem to be able to delete the other duplicate rows containing the duplicate names. What can I do? Any suggestions?

Solution

SQL Fiddle

Here is your update:

 UPDATE stats
 SET totalgames = x.games, wins = x.wins
 FROM (SELECT LOWER(nick) AS nick, SUM(totalgames) AS games, SUM(wins) AS wins
       FROM stats
       GROUP BY LOWER(nick)) AS x
 WHERE LOWER(stats.nick) = x.nick;
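
Before running the update, it may be worth previewing what the merged rows would look like. This is only a sketch that reuses the same grouping as the subquery; the `variants` column is an alias introduced here for illustration and is not part of the original table.

```sql
-- Preview the merged totals without modifying anything.
-- "variants" is an illustrative alias counting the rows in each group.
SELECT LOWER(nick)     AS nick,
       SUM(totalgames) AS games,
       SUM(wins)       AS wins,
       COUNT(*)        AS variants
FROM stats
GROUP BY LOWER(nick)
HAVING COUNT(*) > 1   -- only nicks that actually have case-variant duplicates
ORDER BY 1;
```

With the four sample rows from the question, this should list john (300 games, 137 wins) and whistle (125 games, 119 wins), each with two variants.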

Here is the delete to blow away the duplicate rows:

 DELETE FROM stats USING stats s2
 WHERE lower(stats.nick) = lower(s2.nick) AND stats.nick < s2.nick;
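
The delete only gives the right result once the update has run, because the surviving row must already hold the merged totals. One way to keep the two steps together is a single transaction. The following is a sketch, assuming nothing else writes to `stats` while it runs. The delete keeps, for each group, the row whose `nick` compares greatest, and it does not remove rows whose `nick` values are exactly identical.

```sql
BEGIN;

-- Step 1: give every row in a case-insensitive group the group's totals.
UPDATE stats
SET totalgames = x.games, wins = x.wins
FROM (SELECT LOWER(nick) AS nick, SUM(totalgames) AS games, SUM(wins) AS wins
      FROM stats
      GROUP BY LOWER(nick)) AS x
WHERE LOWER(stats.nick) = x.nick;

-- Step 2: keep one row per group (the one whose nick compares greatest).
DELETE FROM stats USING stats s2
WHERE LOWER(stats.nick) = LOWER(s2.nick) AND stats.nick < s2.nick;

-- Step 3: sanity check; any row returned here is a remaining duplicate.
SELECT LOWER(nick), COUNT(*)
FROM stats
GROUP BY LOWER(nick)
HAVING COUNT(*) > 1;

COMMIT;  -- or ROLLBACK; if the check above returned rows
```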

(Note that the 'UPDATE ... FROM' and 'DELETE ... USING' syntax is Postgres-specific, and was stolen shamelessly from this answer and this answer.)

You'll probably also want to run this to downcase all the names:

 UPDATE stats SET nick = LOWER(nick);

Aaaand throw in a unique index on the lowercase version of 'nick' so case-variant duplicates can't come back (or add a CHECK constraint to that column to disallow non-lowercase values, alongside an ordinary unique constraint):

CREATE UNIQUE INDEX ON stats (LOWER(nick)); 
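
For the constraint-based alternative, a CHECK constraint on its own only forces lowercase values, so it still needs a uniqueness guarantee next to it. A minimal sketch, meant to be run after the names have been downcased and the duplicates removed; the constraint names `stats_nick_lowercase` and `stats_nick_unique` are made up for illustration.

```sql
-- Reject any nick that is not already lowercase.
-- This fails if existing rows still contain uppercase characters.
ALTER TABLE stats
    ADD CONSTRAINT stats_nick_lowercase CHECK (nick = LOWER(nick));

-- With only lowercase values allowed, a plain unique constraint is enough.
ALTER TABLE stats
    ADD CONSTRAINT stats_nick_unique UNIQUE (nick);
```

The expression index accepts mixed-case input and only blocks case-insensitive collisions, whereas this variant rejects mixed-case nicks outright. Which one fits depends on whether the application should preserve the casing players typed.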
