Date Difference between consecutive rows - complicated
Question
I had previously posted a question which was answered but I need a query for this too. I have a table structure with data like this (dates in the format dd/mm/yyyy).
ID  Account Number  Unit  Admit_Date  Disch_Date
1   1001            w32   01/04/2012
2   1002            w32   01/04/2012  01/04/2012
3   1001            ccu   03/04/2012
4   1001            w33   05/04/2012
5   1003            cicu  04/04/2012
6   1001            ccu   07/04/2012
7   1001            ccu   07/04/2012  10/04/2012
8   1003            w33   05/04/2012
9   1003            w33   05/04/2012  08/04/2012
Basically this table deals with patients getting admitted to a particular ward, transferred between wards, and then finally discharged either on the same day or a few days later. The expected result from the query would be:
Account_Number  From_Date   To_Date     No. Of Days
1001            01/04/2012  03/04/2012  2
1001            03/04/2012  05/04/2012  2
1001            05/04/2012  07/04/2012  2
1001            07/04/2012  10/04/2012  3
1002            01/04/2012  01/04/2012  0
1003            04/04/2012  05/04/2012  1
1003            05/04/2012  08/04/2012  3
The discharge date field will only be filled when the patient is discharged, hence I would like to calculate date difference between each date of movement of the patient including both admission and the date of discharge.
I use MS Access 2003.
I hope that someone will be able to help me with this.
Recommended answer
Filtering out the irrelevant data
With any complex query, part of the art is building up the query piece by piece, testing as you go.
I'm assuming that the table name is PatientMovements and that:
Given pairs of rows like ID = {6,7} and ID = {8,9}, the row with a null discharge date should be ignored when there is also a row for the same patient (account number), unit and admit date with a non-null discharge date.
So, step one is to generate the rows that we need to work on, filtering out the irrelevant data from the table recorded in the database. This is a UNION of two sets of data:
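- Those rows where the discharge date is not null.
- Those rows where the discharge date is null and there is no row for the same account, unit and admit date with a non-null discharge date.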
Clearly, the first part of the UNION is:
SELECT * FROM PatientMovements WHERE DischargeDate IS NOT NULL
Less obviously, the second part of the UNION is:
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
Now you can combine those into a single result set:
SELECT *
  FROM PatientMovements
 WHERE DischargeDate IS NOT NULL
UNION
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
You can verify the query above by checking that it returns rows with IDs 1..5, 7, and 9.
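One way to run that check without Access is a small Python script using the standard-library sqlite3 module. This is only a sketch under assumptions: the column names (Account, AdmitDate, DischargeDate) follow the queries above rather than the headings in the question, the dates are stored as ISO text, and SQLite stands in for Access, whose SQL dialect differs in places.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO text (an assumption
# made for SQLite), with None standing for an empty discharge date.
ROWS = [
    (1, 1001, "w32",  "2012-04-01", None),
    (2, 1002, "w32",  "2012-04-01", "2012-04-01"),
    (3, 1001, "ccu",  "2012-04-03", None),
    (4, 1001, "w33",  "2012-04-05", None),
    (5, 1003, "cicu", "2012-04-04", None),
    (6, 1001, "ccu",  "2012-04-07", None),
    (7, 1001, "ccu",  "2012-04-07", "2012-04-10"),
    (8, 1003, "w33",  "2012-04-05", None),
    (9, 1003, "w33",  "2012-04-05", "2012-04-08"),
]

# The UNION query from the answer, selecting only the ID column.
UNION_QUERY = """
SELECT ID
  FROM PatientMovements
 WHERE DischargeDate IS NOT NULL
UNION
SELECT P1.ID
  FROM PatientMovements AS P1
 WHERE P1.DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT 1
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
"""


def filtered_ids():
    """Load the sample data into an in-memory database and return the kept IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientMovements "
        "(ID INTEGER, Account INTEGER, Unit TEXT, AdmitDate TEXT, DischargeDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)", ROWS)
    ids = sorted(row[0] for row in conn.execute(UNION_QUERY))
    conn.close()
    return ids


print(filtered_ids())  # [1, 2, 3, 4, 5, 7, 9]
```

Rows 6 and 8 drop out because rows 7 and 9 carry the same account, unit and admit date together with a discharge date.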
Warning: untested code. None of the SQL in this answer has been near a DBMS, so it is untested.
And then you can apply your learning from the other question to order the data and calculate the date differences, etc. The only complication is that you have to write that query out twice, which is painful (unless MS Access 2003 supports the 'WITH' clause, or common table expression).
But would there be no single query to obtain this required output?
The UNION is a single query, of course. I suppose you could just write:
SELECT *
  FROM PatientMovements AS P1
 WHERE (DischargeDate IS NOT NULL)
    OR (DischargeDate IS NULL
        AND NOT EXISTS
            (SELECT *
               FROM PatientMovements AS P2
              WHERE P1.Account = P2.Account
                AND P1.Unit = P2.Unit
                AND P1.AdmitDate = P2.AdmitDate
                AND P2.DischargeDate IS NOT NULL
            )
       )
I can't immediately think of a more compact way of doing the query.
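As a rough check that the single-SELECT form keeps the same rows as the UNION, here is a sketch using Python's sqlite3 module. It rests on the same assumptions as the queries above (column names Account, AdmitDate, DischargeDate), plus ISO text dates and SQLite standing in for Access. Note that the outer table must carry the alias P1 for the correlated subquery to resolve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PatientMovements "
    "(ID INTEGER, Account INTEGER, Unit TEXT, AdmitDate TEXT, DischargeDate TEXT)"
)
# Sample rows from the question, dates as ISO text, None for no discharge date.
conn.executemany(
    "INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1001, "w32",  "2012-04-01", None),
        (2, 1002, "w32",  "2012-04-01", "2012-04-01"),
        (3, 1001, "ccu",  "2012-04-03", None),
        (4, 1001, "w33",  "2012-04-05", None),
        (5, 1003, "cicu", "2012-04-04", None),
        (6, 1001, "ccu",  "2012-04-07", None),
        (7, 1001, "ccu",  "2012-04-07", "2012-04-10"),
        (8, 1003, "w33",  "2012-04-05", None),
        (9, 1003, "w33",  "2012-04-05", "2012-04-08"),
    ],
)

# Single SELECT with OR instead of UNION; the alias P1 is required because
# the subquery refers to P1.Account, P1.Unit and P1.AdmitDate.
OR_QUERY = """
SELECT P1.ID
  FROM PatientMovements AS P1
 WHERE (P1.DischargeDate IS NOT NULL)
    OR (P1.DischargeDate IS NULL
        AND NOT EXISTS
            (SELECT 1
               FROM PatientMovements AS P2
              WHERE P1.Account = P2.Account
                AND P1.Unit = P2.Unit
                AND P1.AdmitDate = P2.AdmitDate
                AND P2.DischargeDate IS NOT NULL
            )
       )
 ORDER BY P1.ID
"""

or_ids = [row[0] for row in conn.execute(OR_QUERY)]
print(or_ids)  # [1, 2, 3, 4, 5, 7, 9]
```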
The accepted answer to the other question has two possible solutions (as amended by comments and reformatted):
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
 INNER JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
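To see how the two forms behave, here is a sketch that runs both against a small made-up table in SQLite via Python's sqlite3 module. The sample rows are hypothetical, the date column is named MoveDate here to avoid using Date as a column name, and SQLite's julianday() replaces DATEDIFF, which SQLite does not have. The join form includes a GROUP BY clause, which most DBMSs require when an aggregate such as MIN appears next to plain columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (ID INTEGER, AccountNumber INTEGER, MoveDate TEXT)"
)
# Hypothetical sample data: three movements for one account, two for another.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, 1001, "2012-04-01"),
        (2, 1001, "2012-04-03"),
        (3, 1001, "2012-04-05"),
        (4, 1003, "2012-04-04"),
        (5, 1003, "2012-04-05"),
    ],
)

# Self-join form: pairs each row with all later rows of the same account,
# then keeps the earliest of those later dates.
JOIN_QUERY = """
SELECT T1.ID, T1.AccountNumber, T1.MoveDate,
       MIN(T2.MoveDate) AS NextDate,
       CAST(julianday(MIN(T2.MoveDate)) - julianday(T1.MoveDate) AS INTEGER) AS DaysDiff
  FROM YourTable AS T1
 INNER JOIN YourTable AS T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.MoveDate > T1.MoveDate
 GROUP BY T1.ID, T1.AccountNumber, T1.MoveDate
 ORDER BY T1.ID
"""

# Correlated-subquery form: looks up the next date for every row.
SUBQUERY_QUERY = """
SELECT ID, AccountNumber, MoveDate, NextDate,
       CAST(julianday(NextDate) - julianday(MoveDate) AS INTEGER) AS DaysDiff
  FROM (SELECT ID, AccountNumber, MoveDate,
               (SELECT MIN(T2.MoveDate)
                  FROM YourTable AS T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.MoveDate > T1.MoveDate
               ) AS NextDate
          FROM YourTable AS T1
       ) AS T
 ORDER BY ID
"""

join_rows = conn.execute(JOIN_QUERY).fetchall()
subquery_rows = conn.execute(SUBQUERY_QUERY).fetchall()

for row in join_rows:
    print(row)
print(len(subquery_rows))  # 5: one row per input row
```

The visible difference is that the inner join drops the last row of each account, since it has no later row to pair with, while the subquery form keeps it with a null NextDate and DaysDiff.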
As noted in a comment, the absence of the table name in the question leads to different table names appearing in the answers; what I called PatientMovements is called YourTable in that answer. The other difference is that the original question did not include the Unit or DischargeDate columns in the data, and its queries use the column names AccountNumber and Date where the queries above use Account and AdmitDate, so the names have to be reconciled with your actual table before the combined queries will run. The UNION query I gave provides the relevant data on which to run these queries, so all that's left to do is write it (here in its single-SELECT form) into the other answers in place of YourTable. This leads to:
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T1
 INNER JOIN
       (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM (SELECT *
                          FROM PatientMovements AS P1
                         WHERE (DischargeDate IS NOT NULL)
                            OR (DischargeDate IS NULL
                                AND NOT EXISTS
                                    (SELECT *
                                       FROM PatientMovements AS P2
                                      WHERE P1.Account = P2.Account
                                        AND P1.Unit = P2.Unit
                                        AND P1.AdmitDate = P2.AdmitDate
                                        AND P2.DischargeDate IS NOT NULL
                                    )
                               )
                       ) AS T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM (SELECT *
                  FROM PatientMovements AS P1
                 WHERE (DischargeDate IS NOT NULL)
                    OR (DischargeDate IS NULL
                        AND NOT EXISTS
                            (SELECT *
                               FROM PatientMovements AS P2
                              WHERE P1.Account = P2.Account
                                AND P1.Unit = P2.Unit
                                AND P1.AdmitDate = P2.AdmitDate
                                AND P2.DischargeDate IS NOT NULL
                            )
                       )
               ) AS T1
       ) AS T
So, as long as you are careful, and develop queries in fragments, and then combine them consistently, the most awful looking query can be tamed.
Note that the SQL Standard has 'common table expressions' (CTEs) aka 'WITH clauses' which can make things still easier.
WITH YourTable AS
    (SELECT *
       FROM PatientMovements AS P1
      WHERE (DischargeDate IS NOT NULL)
         OR (DischargeDate IS NULL
             AND NOT EXISTS
                 (SELECT *
                    FROM PatientMovements AS P2
                   WHERE P1.Account = P2.Account
                     AND P1.Unit = P2.Unit
                     AND P1.AdmitDate = P2.AdmitDate
                     AND P2.DischargeDate IS NOT NULL
                 )
            )
    )
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
 INNER JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
WITH YourTable AS
    (SELECT *
       FROM PatientMovements AS P1
      WHERE (DischargeDate IS NOT NULL)
         OR (DischargeDate IS NULL
             AND NOT EXISTS
                 (SELECT *
                    FROM PatientMovements AS P2
                   WHERE P1.Account = P2.Account
                     AND P1.Unit = P2.Unit
                     AND P1.AdmitDate = P2.AdmitDate
                     AND P2.DischargeDate IS NOT NULL
                 )
            )
    )
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
One of the major advantages of using a CTE is that the optimizer is told explicitly that the table expression is the same in all places it is used, whereas when it is written out several times, it might not spot that commonality. Plus, writing the query out several times opens up the possibility that the two 'meant to be the same' queries are actually slightly different because of an editing error; that possibility is precluded by the CTE. The other advantage in the current context was that combining the CTE with the solutions to the other question was child's play.
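For anyone who wants to try the CTE shape end to end, here is a sketch in SQLite (which does support WITH) driven from Python's sqlite3 module. Several things in it are my assumptions rather than part of the queries above: the CTE names Kept and Steps, ISO text dates, julianday() in place of DATEDIFF, and above all the COALESCE that falls back to a row's own DischargeDate when the account has no later admission. That fallback is what brings the discharge dates into the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PatientMovements "
    "(ID INTEGER, Account INTEGER, Unit TEXT, AdmitDate TEXT, DischargeDate TEXT)"
)
# Sample rows from the question, dates as ISO text, None for no discharge date.
conn.executemany(
    "INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1001, "w32",  "2012-04-01", None),
        (2, 1002, "w32",  "2012-04-01", "2012-04-01"),
        (3, 1001, "ccu",  "2012-04-03", None),
        (4, 1001, "w33",  "2012-04-05", None),
        (5, 1003, "cicu", "2012-04-04", None),
        (6, 1001, "ccu",  "2012-04-07", None),
        (7, 1001, "ccu",  "2012-04-07", "2012-04-10"),
        (8, 1003, "w33",  "2012-04-05", None),
        (9, 1003, "w33",  "2012-04-05", "2012-04-08"),
    ],
)

CTE_QUERY = """
WITH Kept AS
    -- The filter from the answer, written once and reused below.
    (SELECT P1.*
       FROM PatientMovements AS P1
      WHERE (P1.DischargeDate IS NOT NULL)
         OR (P1.DischargeDate IS NULL
             AND NOT EXISTS
                 (SELECT 1
                    FROM PatientMovements AS P2
                   WHERE P1.Account = P2.Account
                     AND P1.Unit = P2.Unit
                     AND P1.AdmitDate = P2.AdmitDate
                     AND P2.DischargeDate IS NOT NULL
                 )
            )
    ),
Steps AS
    -- Next admission for the same account, else the row's own discharge
    -- date (this fallback is an assumption added for this sketch).
    (SELECT T1.Account, T1.AdmitDate AS FromDate,
            COALESCE((SELECT MIN(T2.AdmitDate)
                        FROM Kept AS T2
                       WHERE T2.Account = T1.Account
                         AND T2.AdmitDate > T1.AdmitDate),
                     T1.DischargeDate) AS ToDate
       FROM Kept AS T1
    )
SELECT Account, FromDate, ToDate,
       CAST(julianday(ToDate) - julianday(FromDate) AS INTEGER) AS DaysDiff
  FROM Steps
 ORDER BY Account, FromDate
"""

result = conn.execute(CTE_QUERY).fetchall()
for row in result:
    print(row)
```

On the sample data this gives the seven rows listed as the expected result in the question, with day counts 2, 2, 2, 3 for account 1001, 0 for 1002, and 1, 3 for 1003.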
Sadly for you, it is unlikely that MS Access 2003 supports CTEs. I share your pain; the DBMS I work with mainly doesn't either.