SQL Query where I get most recent rows from timestamp from another table
Question
I've got some sensor information going into a table. I have figured out the query that will tell me exactly when the value at a particular device changes.
What I need to know is the status of all of the other sensors at that time. The trick is, the timestamps won't be equal. I could get a data point from sensor 1, then 3 minutes later, one from sensor 2, and then 30 seconds later, another from sensor 1.
So, here is an example of what I am talking about:
--- data_table ---
sensor | state | stime
-------+-------+---------------------
1 | A | 2014-08-17 21:42:00
1 | A | 2014-08-17 21:43:00
2 | B | 2014-08-17 21:44:00
3 | C | 2014-08-17 21:45:00
2 | D | 2014-08-17 21:46:00
3 | C | 2014-08-17 21:47:00
1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00
2 | D | 2014-08-17 21:50:00
2 | A | 2014-08-17 21:51:00
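To experiment with the sample data, a setup script along these lines can be used. It is only a sketch: the column types are assumptions (the question does not state them), and the table is named data to match the queries in the answer, whereas the listing above labels it data_table.

```sql
-- Assumed schema; adapt names and types to the real table.
CREATE TABLE data (
   sensor int       NOT NULL
 , state  text      NOT NULL
 , stime  timestamp NOT NULL
);

-- The ten sample rows from the listing above.
INSERT INTO data (sensor, state, stime) VALUES
   (1, 'A', '2014-08-17 21:42:00')
 , (1, 'A', '2014-08-17 21:43:00')
 , (2, 'B', '2014-08-17 21:44:00')
 , (3, 'C', '2014-08-17 21:45:00')
 , (2, 'D', '2014-08-17 21:46:00')
 , (3, 'C', '2014-08-17 21:47:00')
 , (1, 'B', '2014-08-17 21:48:00')
 , (3, 'A', '2014-08-17 21:49:00')
 , (2, 'D', '2014-08-17 21:50:00')
 , (2, 'A', '2014-08-17 21:51:00');
```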
Now, I know the query that will deliver me the state changes. I've got this down, and it's in a view. Its output looks like this:
--- state_changed_view ---
sensor | state | stime
-------+-------+---------------------
2 | D | 2014-08-17 21:46:00
1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00
2 | A | 2014-08-17 21:51:00
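The question does not show how the view is defined. One plausible definition, given here purely as an assumption, compares each row's state with the previous state of the same sensor using lag(). It assumes the table is called data, as in the answer's queries.

```sql
-- Hypothetical definition; the asker's real view may differ.
CREATE VIEW state_changed_view AS
SELECT sensor, state, stime
FROM  (
   SELECT sensor, state, stime
          -- previous state of the same sensor; NULL for its first row
        , lag(state) OVER (PARTITION BY sensor ORDER BY stime) AS prev_state
   FROM   data
   ) sub
WHERE  state <> prev_state   -- a NULL prev_state never counts as a change
ORDER  BY stime;
```

Applied to the sample data, this yields the four rows listed above: the first reading of each sensor is not reported as a change, and repeated identical states are skipped.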
What I want is a JOIN where I get all of the values of 'state_changed_view', but also the values of the other sensors as of the timestamp ('stime') of each row in the view.
So, ideally, I want my result to look like (or something similar to):
sensor | state | stime | sensor | state | stime
-------+-------+---------------------+--------+-------+---------------------
2 | D | 2014-08-17 21:46:00 | 1 | A | 2014-08-17 21:43:00
2 | D | 2014-08-17 21:46:00 | 2 | D | 2014-08-17 21:46:00
2 | D | 2014-08-17 21:46:00 | 3 | C | 2014-08-17 21:45:00
1 | B | 2014-08-17 21:48:00 | 1 | B | 2014-08-17 21:48:00
1 | B | 2014-08-17 21:48:00 | 2 | D | 2014-08-17 21:46:00
1 | B | 2014-08-17 21:48:00 | 3 | C | 2014-08-17 21:47:00
3 | A | 2014-08-17 21:49:00 | 1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00 | 2 | D | 2014-08-17 21:46:00
3 | A | 2014-08-17 21:49:00 | 3 | A | 2014-08-17 21:49:00
2 | A | 2014-08-17 21:51:00 | 1 | B | 2014-08-17 21:48:00
2 | A | 2014-08-17 21:51:00 | 2 | A | 2014-08-17 21:51:00
2 | A | 2014-08-17 21:51:00 | 3 | A | 2014-08-17 21:49:00
As you can see, I need the most recent row in 'data_table' for each sensor, for every row that exists in state_changed_view.
I just don't know how to get the SQL to get me the most recent row according to a particular timestamp.
This is on a PL/pgSQL system, so anything compatible with Postgres is handy.
Recommended answer
Query
For a small, given set of sensors to retrieve (this works for Postgres 8.4 or later):
SELECT c.sensor AS sensor_change
     , d1.state AS state_1, d1.stime AS stime_1
     , d2.state AS state_2, d2.stime AS stime_2
     , d3.state AS state_3, d3.stime AS stime_3
FROM  (
   SELECT sensor, stime
        , lag(state) OVER (PARTITION BY sensor ORDER BY stime)
           <> state AS change
        , max(CASE WHEN sensor = 1 THEN stime ELSE NULL END) OVER w AS last_1
        , max(CASE WHEN sensor = 2 THEN stime ELSE NULL END) OVER w AS last_2
        , max(CASE WHEN sensor = 3 THEN stime ELSE NULL END) OVER w AS last_3
   FROM   data d
   WINDOW w AS (ORDER BY stime)
   ) c
JOIN   data d1 ON d1.sensor = 1 AND d1.stime = c.last_1
JOIN   data d2 ON d2.sensor = 2 AND d2.stime = c.last_2
JOIN   data d3 ON d3.sensor = 3 AND d3.stime = c.last_3
WHERE  c.change
ORDER  BY c.stime;
This does not use the view at all; it builds on the table directly, which is faster.
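To see what the inner subquery computes, it can help to run the window part on its own. The sketch below assumes the sample table data from the question and only exposes the running "latest timestamp per sensor" columns that the outer joins rely on.

```sql
-- For every row, the latest stime seen so far for each of the three sensors.
-- With ORDER BY in the window and no explicit frame, max() runs from the
-- first row up to the current row (including rows with the same stime).
SELECT sensor, state, stime
     , max(CASE WHEN sensor = 1 THEN stime END) OVER w AS last_1
     , max(CASE WHEN sensor = 2 THEN stime END) OVER w AS last_2
     , max(CASE WHEN sensor = 3 THEN stime END) OVER w AS last_3
FROM   data
WINDOW w AS (ORDER BY stime)
ORDER  BY stime;
```

For the row of sensor 2 at 21:46:00, for example, this gives last_1 = 21:43:00, last_2 = 21:46:00 and last_3 = 21:45:00. A last_n column stays NULL until sensor n has reported for the first time, which is why the inner joins in the full query drop such early rows.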
This is assuming a UNIQUE INDEX on (sensor, stime) to be unambiguous. Performance also heavily depends on such an index.
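Such an index could be created as shown below. The index name is made up for illustration, and the table name data is taken from the query above.

```sql
-- Enforces at most one reading per sensor and timestamp,
-- and supports the lookups on (sensor, stime) in the joins.
CREATE UNIQUE INDEX data_sensor_stime_idx ON data (sensor, stime);
```

Creating the index fails if the table already contains two rows with the same sensor and stime, so duplicates would have to be resolved first.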
As opposed to @Nick's solution, which builds on JOIN LATERAL (Postgres 9.3+), this returns a single row with all values for every change in state.
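For comparison, a LATERAL-based query could look roughly like the following. This is a sketch of the general technique, not necessarily @Nick's exact query; it assumes the view state_changed_view from the question and the table name data used above.

```sql
-- One output row per (state change, sensor): the most recent reading
-- of each sensor at or before the time of the change.
SELECT c.sensor AS changed_sensor, c.state AS changed_state, c.stime AS changed_at
     , d.sensor, d.state, d.stime
FROM   state_changed_view c
CROSS  JOIN (SELECT DISTINCT sensor FROM data) s
CROSS  JOIN LATERAL (
   SELECT d0.sensor, d0.state, d0.stime
   FROM   data d0
   WHERE  d0.sensor = s.sensor
   AND    d0.stime <= c.stime
   ORDER  BY d0.stime DESC
   LIMIT  1
   ) d
ORDER  BY c.stime, d.sensor;
```

This produces the "long" layout asked for in the question, with three rows per state change for the sample data. Sensors without any reading up to the change are omitted because of the CROSS JOIN; LEFT JOIN LATERAL ... ON true would keep them with NULL values instead.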
Since you mentioned PL/pgSQL, I would expect this (highly optimized) plpgsql function to perform better, since it can make do with a single sequential scan of the table:
CREATE OR REPLACE FUNCTION f_sensor_change()
  RETURNS TABLE (sensor_change int   -- adapt to actual data types!
               , state_1 "char", stime_1 timestamp
               , state_2 "char", stime_2 timestamp
               , state_3 "char", stime_3 timestamp) AS
$func$
DECLARE
   r data%rowtype;
BEGIN
   FOR r IN
      TABLE data ORDER BY stime
   LOOP
      CASE r.sensor
      WHEN 1 THEN
         IF    r.state =  state_1 THEN  -- just save stime
            stime_1 := r.stime;
         ELSIF r.state <> state_1 THEN  -- save all & RETURN
            stime_1 := r.stime; state_1 := r.state;
            sensor_change := 1; RETURN NEXT;
         ELSE                           -- still NULL: init
            stime_1 := r.stime; state_1 := r.state;
         END IF;
      WHEN 2 THEN
         IF    r.state =  state_2 THEN
            stime_2 := r.stime;
         ELSIF r.state <> state_2 THEN
            stime_2 := r.stime; state_2 := r.state;
            sensor_change := 2; RETURN NEXT;
         ELSE
            stime_2 := r.stime; state_2 := r.state;
         END IF;
      WHEN 3 THEN
         IF    r.state =  state_3 THEN
            stime_3 := r.stime;
         ELSIF r.state <> state_3 THEN
            stime_3 := r.stime; state_3 := r.state;
            sensor_change := 3; RETURN NEXT;
         ELSE
            stime_3 := r.stime; state_3 := r.state;
         END IF;
      ELSE  -- do nothing, ignore other sensors
      END CASE;
   END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_sensor_change();
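One behavioural difference from the plain SQL query may matter: the function emits a row for every state change, even when some other sensor has not reported yet (its columns are then NULL), while the inner joins in the query drop such rows. If the query's behaviour is wanted, a filter like the following could be added to the call; this is a sketch based on the column names declared in the function above.

```sql
-- Keep only changes for which all three sensors already have a reading.
SELECT *
FROM   f_sensor_change()
WHERE  stime_1 IS NOT NULL
AND    stime_2 IS NOT NULL
AND    stime_3 IS NOT NULL;
```

With the sample data from the question the filter removes nothing, because all three sensors have reported before the first state change at 21:46:00.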
The function makes sense for repeated use.
SQL Fiddle for Postgres 9.3.
SQL Fiddle for Postgres 8.4.