GROUP BY clause sees all VARCHAR fields as different
I have witnessed a strange behaviour while trying to GROUP BY a VARCHAR field.
Take the following example, where I try to spot customers who have changed their name at least once in the past.
CREATE TABLE #CustomersHistory
(
Id INT IDENTITY(1,1),
CustomerId INT,
Name VARCHAR(200)
)
INSERT INTO #CustomersHistory VALUES (12, 'AAA')
INSERT INTO #CustomersHistory VALUES (12, 'AAA')
INSERT INTO #CustomersHistory VALUES (12, 'BBB')
INSERT INTO #CustomersHistory VALUES (44, '444')
SELECT ch.CustomerId, count(ch.Name) AS cnt
FROM #CustomersHistory ch
GROUP BY ch.CustomerId HAVING count(ch.Name) != 1
This oddly yields the following (as if the 'AAA' from the first INSERT were different from the second one):
CustomerId cnt // (I was expecting)
12 3 // 2
44 1 // 1
- Is this behaviour specific to T-SQL?
- Why does it behave in this rather counter-intuitive way?
- How is it customary to overcome this limitation?
Note: This question is very similar to GROUP BY problem with varchar, where I didn't find an answer as to why this happens.
Side Note: Is it good practice to use HAVING count(ch.Name) != 1 instead of HAVING count(ch.Name) > 1?
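COUNT(ch.Name) counts every row in the group where Name is not NULL, duplicates included. The query groups by CustomerId only, so the Name values are never compared with each other. I think you might want to use COUNT(DISTINCT ch.Name), which will only count unique names.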
SELECT ch.CustomerId, count(DISTINCT ch.Name) AS cnt
FROM #CustomersHistory ch
GROUP BY ch.CustomerId HAVING count(DISTINCT ch.Name) > 1
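To see the difference side by side, here is a minimal sketch that puts the three forms of COUNT in one query. It assumes the #CustomersHistory table and the four rows from the question are still in place; the column aliases are made up for illustration.

```sql
-- Assumes #CustomersHistory holds the four rows inserted in the question.
SELECT ch.CustomerId,
       COUNT(*)                AS all_rows,        -- every row in the group
       COUNT(ch.Name)          AS non_null_names,  -- rows where Name IS NOT NULL, duplicates included
       COUNT(DISTINCT ch.Name) AS distinct_names   -- unique non-NULL names only
FROM #CustomersHistory ch
GROUP BY ch.CustomerId
-- CustomerId 12: all_rows 3, non_null_names 3, distinct_names 2
-- CustomerId 44: all_rows 1, non_null_names 1, distinct_names 1
```

Only the last column treats the two 'AAA' rows as one, which is the number the question was expecting.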
For more information, take a look at the COUNT() article in SQL Server Books Online.
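On the side note, the two predicates should only differ when a count is 0, which count(ch.Name) returns for a customer whose names are all NULL. The sketch below assumes the table from the question plus one hypothetical extra row (customer 77, not part of the original data) to show the effect.

```sql
-- Hypothetical extra row: customer 77 has a history entry with no name recorded.
INSERT INTO #CustomersHistory VALUES (77, NULL)

-- != 1 also lets through groups whose count is 0.
SELECT ch.CustomerId, COUNT(ch.Name) AS cnt
FROM #CustomersHistory ch
GROUP BY ch.CustomerId HAVING COUNT(ch.Name) != 1
-- returns customer 12 (cnt 3) and customer 77 (cnt 0)

-- > 1 keeps only groups with at least two counted values.
SELECT ch.CustomerId, COUNT(ch.Name) AS cnt
FROM #CustomersHistory ch
GROUP BY ch.CustomerId HAVING COUNT(ch.Name) > 1
-- returns customer 12 only
```

If the intent is "more than one name", > 1 says so directly and does not pick up the all-NULL case.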