How to select the latest date from this query (not in an existing table)?

Problem description

I'd like to show the latest action for each task. These are the tables (dummy data, but the same structure):

//t_task
task_id             task_name
     A1              PC  Proc
     A2         Printer  Proc
     A3       Stationery Proc

//t_task_d
task_id           assigned_to
     A1                  John
     A1                 Sally
     A2                  John
     A3                 Sally

//t_act
no       act_id       act_date     task_id
 1           C1     2017-07-10          A1
 2           C2     2017-07-14          A1
 3           C3     2017-07-17          A1
 4           C1     2017-07-21          A2

//t_act_d
act_id       act_name
    C1      Surveying 
    C2       Contract
    C3      Execution

From the above tables, I'd like to create some kind of report. This is my expected output:

no            task_name   dates_of_act       status
 1              PC Proc     2017-07-17    Execution
 2         Printer Proc     2017-07-21    Surveying
 3      Stationery Proc         -           Pending /*if it's NULL, then it should be pending, but I can change this in the PHP section*/

This is the closest I have come so far:

SELECT 
    t_task.task_name, 
    DATE(t_act.act_date) AS 'dates_of_act', 
    t_act_d.act_name 
FROM t_task 

INNER JOIN t_task_d ON t_task.task_id = t_task_d.task_id
LEFT OUTER JOIN t_act ON t_task.task_id = t_act.task_id 
LEFT OUTER JOIN t_act_d ON t_act.act_id = t_act_d.act_id

GROUP BY t_task.task_id 
ORDER BY t_act.act_date ASC

My query result is:

no            task_name   dates_of_act       status
 1              PC Proc     2017-07-10    Surveying
 2         Printer Proc     2017-07-21    Surveying
 3      Stationery Proc         -           Pending

Note
Speed matters because the data set is huge. I would also like to avoid subqueries if possible.

Solution

I think this does it.

select
  t_task.task_id,
  t_task.task_name,
  latest_action.act_date,
  IFNULL(t_act_d.act_name, 'Pending') as act_name
from
  t_task
  left outer join (
    -- Number each task's actions from newest (1) to oldest.
    -- `row_number` is backtick-quoted because ROW_NUMBER is a reserved word in MySQL 8.0.
    select
      @row_num := IF(@prev_value = concat_ws('', t_act.task_id), @row_num + 1, 1) as `row_number`,
      t_act.task_id,
      t_act.act_id,
      t_act.act_date,
      @prev_value := concat_ws('', t_act.task_id) as z
    from
      t_act,
      (select @row_num := 1) x,
      (select @prev_value := '') y
    order by
      t_act.task_id,
      t_act.act_date desc
  ) as latest_action on
    t_task.task_id = latest_action.task_id
  left outer join t_act_d on
    latest_action.act_id = t_act_d.act_id
where
  -- Keep the newest action per task, plus tasks that have no action at all.
  latest_action.`row_number` = 1 or
  latest_action.`row_number` is null
order by
  -- Tasks without any action sort last.
  case when latest_action.act_date is null then '9999-01-01' else latest_action.act_date end

The results from the data you provided are:

+---------+-----------------+------------+-----------+
| task_id | task_name       | act_date   | act_name  |
+---------+-----------------+------------+-----------+
| A1      | PC  Proc        | 2017-07-17 | Execution |
| A2      | Printer  Proc   | 2017-07-21 | Surveying |
| A3      | Stationery Proc | NULL       | Pending   |
+---------+-----------------+------------+-----------+

I'm more familiar with T-SQL, where I'd use the row_number() window function. The idea is to have the row_number field show a ranking of each row in terms of whether it's the most recent (value 1), second most recent (value 2) etc. action for each task. The most recent action for each task winds up with a row_number of 1, so you can get these out by filtering on row_number = 1 from this latest_action subquery.
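For comparison, here is a minimal sketch of the row_number() idea written with a real window function. It is not part of the original answer. It assumes a database that supports window functions (MySQL 8.0 or later; the sketch uses Python's built-in sqlite3 module with an in-memory SQLite 3.25+ database as a stand-in, so it can run without a MySQL server). The alias rn and the constant name LATEST_ACTION_SQL are made up for illustration, and t_task_d is left out because the report does not use it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption). Window functions
# need SQLite 3.25+; the same SELECT should also be valid on MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same structure as the question's tables; t_act's running "no" column is omitted.
    CREATE TABLE t_task  (task_id TEXT PRIMARY KEY, task_name TEXT);
    CREATE TABLE t_act   (act_id TEXT, act_date TEXT, task_id TEXT);
    CREATE TABLE t_act_d (act_id TEXT PRIMARY KEY, act_name TEXT);

    INSERT INTO t_task VALUES
        ('A1', 'PC Proc'), ('A2', 'Printer Proc'), ('A3', 'Stationery Proc');
    INSERT INTO t_act VALUES
        ('C1', '2017-07-10', 'A1'), ('C2', '2017-07-14', 'A1'),
        ('C3', '2017-07-17', 'A1'), ('C1', '2017-07-21', 'A2');
    INSERT INTO t_act_d VALUES
        ('C1', 'Surveying'), ('C2', 'Contract'), ('C3', 'Execution');
""")

LATEST_ACTION_SQL = """
    SELECT t.task_id, t.task_name, la.act_date,
           COALESCE(d.act_name, 'Pending') AS act_name
    FROM t_task AS t
    LEFT JOIN (
        -- rn = 1 marks the most recent action of each task.
        SELECT task_id, act_id, act_date,
               ROW_NUMBER() OVER (PARTITION BY task_id
                                  ORDER BY act_date DESC) AS rn
        FROM t_act
    ) AS la ON la.task_id = t.task_id AND la.rn = 1
    LEFT JOIN t_act_d AS d ON d.act_id = la.act_id
    -- Tasks without any action sort last.
    ORDER BY la.act_date IS NULL, la.act_date
"""

rows = conn.execute(LATEST_ACTION_SQL).fetchall()
for row in rows:
    print(row)
```

Putting the rn = 1 condition in the ON clause rather than in WHERE keeps tasks without any action (A3 here) in the result without a separate "is null" check.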

Because the latest_action subquery runs once overall, not once per row, it should not be much of a performance hit. Unfortunately, I can't promise the same for the variable setting and incrementing: this is the first time I've used this logic in MySQL, and I don't know how well it performs.
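Since the question asks to avoid subqueries and the cost of the user-variable approach is unknown, one alternative that may be worth measuring is a self-join "anti-join": join each action to any newer action of the same task and keep only the rows where no newer one exists. The sketch below is an assumption-laden illustration, not the answer's method. It again uses Python's sqlite3 as a stand-in for MySQL, the index name idx_t_act_task_date and the alias newer are hypothetical, and nothing here has been benchmarked on large data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption); the SELECT uses only
# plain joins, so it should not need window functions or user variables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_task  (task_id TEXT PRIMARY KEY, task_name TEXT);
    CREATE TABLE t_act   (act_id TEXT, act_date TEXT, task_id TEXT);
    CREATE TABLE t_act_d (act_id TEXT PRIMARY KEY, act_name TEXT);

    -- Hypothetical index; the self-join compares (task_id, act_date) pairs.
    CREATE INDEX idx_t_act_task_date ON t_act (task_id, act_date);

    INSERT INTO t_task VALUES
        ('A1', 'PC Proc'), ('A2', 'Printer Proc'), ('A3', 'Stationery Proc');
    INSERT INTO t_act VALUES
        ('C1', '2017-07-10', 'A1'), ('C2', '2017-07-14', 'A1'),
        ('C3', '2017-07-17', 'A1'), ('C1', '2017-07-21', 'A2');
    INSERT INTO t_act_d VALUES
        ('C1', 'Surveying'), ('C2', 'Contract'), ('C3', 'Execution');
""")

ANTI_JOIN_SQL = """
    SELECT t.task_id, t.task_name, a.act_date,
           COALESCE(d.act_name, 'Pending') AS act_name
    FROM t_task AS t
    LEFT JOIN t_act AS a ON a.task_id = t.task_id
    -- Look for a newer action on the same task ...
    LEFT JOIN t_act AS newer
           ON newer.task_id = a.task_id AND newer.act_date > a.act_date
    LEFT JOIN t_act_d AS d ON d.act_id = a.act_id
    -- ... and keep only the rows for which none exists.
    WHERE newer.task_id IS NULL
    ORDER BY a.act_date IS NULL, a.act_date
"""

latest = conn.execute(ANTI_JOIN_SQL).fetchall()
for row in latest:
    print(row)
```

One caveat: if two actions of the same task share the latest act_date, this query returns both, whereas a row-numbering approach returns exactly one row per task.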

The logic for how to reproduce T-SQL's row_number() functionality came from here: ROW_NUMBER() in MySQL
