MySQL - edge case use of GROUP BY


Problem description

My understanding of GROUP BY is that its standard use is to aggregate items. So a typical example might be:

select
    count(id),
    department
from table
group by department

The above would return a count of all ids per department.

So, I got taught a very useful (but possibly pretty dodgy!) trick using group by. I was wondering if this usage has any problems. Although the query runs as expected [results as expected in all cases], my spidey sense is tingling a bit...

Imagine the following data set:

id  |  user_id  |  cost  |  note
----------------------------------
1         1         120     Test 1
2         1         150     Test 2
3         2         100     Test 3
4         3         120     Test 4

Now if we do the following SQL:

select * from table
group by user_id

You get the following result set.

id  |  user_id  |  cost  |  note
----------------------------------
1         1         120     Test 1
3         2         100     Test 3
4         3         120     Test 4

The query apparently runs as follows:

  • run through the table
  • when a groupable user id is found, ignore the subsequent ones
  • return this table of unique user_id items
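The three steps above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in the question, not of what MySQL is documented to do; the names `ROWS` and `first_row_per_user` are made up for illustration. It also shows why the trick is fragile: the row that "wins" depends entirely on the order in which rows are scanned.

```python
from typing import Dict, List, Tuple

# (id, user_id, cost, note), copied from the example data set
Row = Tuple[int, int, int, str]

ROWS: List[Row] = [
    (1, 1, 120, "Test 1"),
    (2, 1, 150, "Test 2"),
    (3, 2, 100, "Test 3"),
    (4, 3, 120, "Test 4"),
]


def first_row_per_user(rows: List[Row]) -> List[Row]:
    """Keep the first row encountered for each user_id, in scan order."""
    seen: Dict[int, Row] = {}
    for row in rows:
        user_id = row[1]
        # setdefault only stores the row if this user_id has not been seen yet
        seen.setdefault(user_id, row)
    return list(seen.values())


# Scanning in table order reproduces the result set shown above.
in_table_order = first_row_per_user(ROWS)

# Scanning the same rows in reverse picks a different row for user_id 1.
in_reverse_order = first_row_per_user(list(reversed(ROWS)))
```

With the rows scanned in table order, user_id 1 resolves to the "Test 1" row (cost 120); scanned in reverse, it resolves to "Test 2" (cost 150). Nothing in the query text itself pins down which scan order the engine uses.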

Effectively I get a "unique" with specific boundaries, and I am able to select * from this list. Furthermore, by ordering the table prior to the group by, I can use this to control which cost is returned for each user.

So - this is also as you'd expect. BUT:

In the ABOVE example - say I actually ensured that for user_id 1, the value 120 was shown (as opposed to its other possible value - 150 in this case). Then 120 seems to be guaranteed to be the response. The approach could then be to sort by some order (alphabetical/numeric/other advanced filters etc.), THEN use this sort to force the first item in the table to be the "answer".

The actual query I want to do is pretty complex. Using MIN or similar is not suitable for the end value I want... However: this "order your table then take the first unique item using group by" approach is actually quite elegant (I think). I am actually using group by constrained across 4 fields, and this, combined with other SQL, produces a CORRECT answer.

So. After that long background: a question!

All documentation I have used only talks about using group by with aggregate functions. I can't seem to find the behaviour of JUST group by. This strikes me as one of two things:

  • a correct (mis)use case that's not been documented
  • an accidental behaviour of whichever version of MySQL I'm using.

So... which one is it? If it's a correct, but edge case, behaviour, then great. If I'm tricking the SQL engine into spitting something out, then I've got no proof this is compatible with future versions, so I'd be uneasy about using it.

Cheers in advance all.

Solution

After looking into this through the above links/help, I think it's unfortunately the case that while the answer is correct, it's not guaranteed to be correct. More accurately, it is "indeterminate".

I am genuinely confident, following my repeated successful use of this, that the internal workings are "first come first show", but since the spec says this isn't guaranteed, I can't rely on it.
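For anyone who needs the same "first row per group" result without leaning on that behaviour, one option is to state the ordering rule in the query itself. The sketch below is a minimal example under several assumptions: the table name `costs` is hypothetical, the rule "lowest cost wins, ties broken by lowest id" stands in for whatever ordering the real query needs, and SQLite (via Python's `sqlite3` module) is used only so the example can be run. The `NOT EXISTS` form is standard SQL and should carry over to MySQL, but it has not been tested there. It is also worth knowing that from MySQL 5.7.5 the default `ONLY_FULL_GROUP_BY` mode rejects a `select *` combined with `group by user_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE costs ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, cost INTEGER, note TEXT)"
)

# Insert the rows out of order on purpose: the result must not depend on it.
conn.executemany(
    "INSERT INTO costs (id, user_id, cost, note) VALUES (?, ?, ?, ?)",
    [
        (2, 1, 150, "Test 2"),
        (4, 3, 120, "Test 4"),
        (1, 1, 120, "Test 1"),
        (3, 2, 100, "Test 3"),
    ],
)

# Keep a row only if no other row for the same user sorts before it
# under the (cost, id) ordering. Exactly one row per user_id survives.
QUERY = """
SELECT c.id, c.user_id, c.cost, c.note
FROM costs AS c
WHERE NOT EXISTS (
    SELECT 1
    FROM costs AS other
    WHERE other.user_id = c.user_id
      AND (other.cost < c.cost
           OR (other.cost = c.cost AND other.id < c.id))
)
ORDER BY c.user_id
"""

rows = conn.execute(QUERY).fetchall()
conn.close()
```

Because the tie-break on `id` makes the ordering total, each user_id maps to exactly one row no matter how the table is stored or scanned. With a group by across 4 fields, the `other.user_id = c.user_id` condition would be extended to compare all four grouping columns.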

Cheers for help all. Have up-voted all comments.
