TimescaleDB: efficiently select last row
Problem description
I have a PostgreSQL database with the TimescaleDB extension.
My primary index is a timestamp, and I would like to select the latest row.
If I happen to know that the latest row falls after a certain time, I can use a query such as:
query = 'select * from prices where time > %(dt)s'
Here I specify a datetime and execute the query using psycopg2:
import datetime
import psycopg2

# 2018-01-10 11:15:00
dt = datetime.datetime(2018, 1, 10, 11, 15, 0)
# params holds the connection keyword arguments
with psycopg2.connect(**params) as conn:
    cur = conn.cursor()
    # start timing
    beg = datetime.datetime.now()
    # execute query
    cur.execute(query, {'dt': dt})
    rows = cur.fetchall()
    # stop timing
    end = datetime.datetime.now()
print('took {} ms'.format((end - beg).total_seconds() * 1e3))
Timing output:
took 2.296 ms
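A single wall-clock measurement like this can be noisy, so it may be worth repeating the query a few times before comparing numbers. The following is a minimal sketch, not part of the original measurements: `time_query` and `summarize` are hypothetical helper names, and `cur` is assumed to be any DB-API cursor such as the psycopg2 cursor used above.

```python
import statistics
import time


def time_query(cur, query, params=None, repeats=5):
    """Run `query` `repeats` times and return the per-run durations in ms.

    `cur` is assumed to be a DB-API cursor (for example from psycopg2).
    """
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        cur.execute(query, params)
        cur.fetchall()  # include row transfer in the measurement
        durations_ms.append((time.perf_counter() - start) * 1e3)
    return durations_ms


def summarize(durations_ms):
    """Return (minimum, median) of the measured durations, in ms."""
    return min(durations_ms), statistics.median(durations_ms)
```

For example, `summarize(time_query(cur, query, {'dt': dt}))` would give the fastest and the median run. Repeated runs mostly hit warm caches, so the first run is often the slowest.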
If, however, I don't know what time to pass to the above query, I can use a query such as:
query = 'select * from prices order by time desc limit 1'
I execute the query in a similar fashion:
with psycopg2.connect(**params) as conn:
    cur = conn.cursor()
    # start timing
    beg = datetime.datetime.now()
    # execute query
    cur.execute(query)
    rows = cur.fetchall()
    # stop timing
    end = datetime.datetime.now()
print('took {} ms'.format((end - beg).total_seconds() * 1e3))
Timing output:
took 19.173 ms
So that's more than 8 times slower.
I'm no SQL expert, but I would have thought the query planner would recognize that "limit 1" combined with "order by" on the primary index amounts to an O(1) operation.
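One way to check what the planner actually does is to ask PostgreSQL for the execution plan. The sketch below is an illustration rather than something from the original post: `explain_analyze` is a hypothetical helper name, and `cur` is assumed to be a DB-API cursor such as the psycopg2 one used above. It relies only on PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` statement.

```python
def explain_analyze(cur, query, params=None):
    """Return the PostgreSQL execution plan for `query` as one string.

    Note: EXPLAIN ANALYZE really executes the statement, so only use it
    with queries that are safe to run.
    """
    cur.execute("EXPLAIN (ANALYZE, BUFFERS) " + query, params)
    # Each result row carries one line of the plan in its first column.
    return "\n".join(row[0] for row in cur.fetchall())
```

Calling `print(explain_analyze(cur, 'select * from prices order by time desc limit 1'))` would show whether the plan touches the index of every chunk or only the most recent one, which may explain the gap between the two timings.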
Question:
Is there a more efficient way to select the last row in my table?
In case it is useful, here is the description of my table:
# \d+ prices
                                          Table "public.prices"
 Column |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
--------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 time   | timestamp without time zone |           | not null |         | plain   |              |
 AAPL   | double precision            |           |          |         | plain   |              |
 GOOG   | double precision            |           |          |         | plain   |              |
 MSFT   | double precision            |           |          |         | plain   |              |
Indexes:
    "prices_time_idx" btree ("time" DESC)
Child tables: _timescaledb_internal._hyper_12_100_chunk,
              _timescaledb_internal._hyper_12_101_chunk,
              _timescaledb_internal._hyper_12_102_chunk,
              ...
Recommended answer
An efficient way to get the last / first record in TimescaleDB:
First record:
SELECT <COLUMN>, time FROM <TABLE_NAME> ORDER BY time ASC LIMIT 1 ;
Last record:
SELECT <COLUMN>, time FROM <TABLE_NAME> ORDER BY time DESC LIMIT 1 ;
The question has already been answered, but I believe this might be useful to people who land here. Using first() and last() in TimescaleDB takes much longer.
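If the unbounded ORDER BY ... LIMIT 1 query is still slow on a table with many chunks, one possible workaround is to combine it with the time bound from the question, so that older chunks can be skipped. The sketch below is an assumption-laden illustration, not a measured result: `fetch_latest_row` and `DEFAULT_WINDOWS` are hypothetical names, the lookback windows are arbitrary, `cur` is assumed to be a DB-API cursor such as one from psycopg2, and `now` must be supplied in the same time zone convention as the `time` column. Whether it helps depends on the TimescaleDB version and the chunk layout.

```python
import datetime

BOUNDED = "select * from prices where time > %(dt)s order by time desc limit 1"
UNBOUNDED = "select * from prices order by time desc limit 1"

# Hypothetical lookback windows; tune them to how often rows arrive.
DEFAULT_WINDOWS = (
    datetime.timedelta(hours=1),
    datetime.timedelta(days=1),
    datetime.timedelta(days=30),
)


def fetch_latest_row(cur, now, windows=DEFAULT_WINDOWS):
    """Return the newest row of `prices`, or None if the table is empty.

    Tries time-bounded queries over progressively wider windows ending at
    `now`, then falls back to the unbounded query.
    """
    for window in windows:
        cur.execute(BOUNDED, {"dt": now - window})
        row = cur.fetchone()
        if row is not None:
            return row
    # Nothing recent enough: scan without a time bound.
    cur.execute(UNBOUNDED)
    return cur.fetchone()
```

With a psycopg2 cursor this could be called as `fetch_latest_row(cur, datetime.datetime(2018, 1, 10, 12, 0, 0))`. When data arrives regularly, the first window usually succeeds and only one short query is issued.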