TimescaleDB: efficiently select last row


Question

I have a PostgreSQL database with the TimescaleDB extension.

My primary index is a timestamp, and I would like to select the latest row.

If I happen to know that the latest row was inserted after a certain time, I can use a query such as:

query = 'select * from prices where time > %(dt)s'

Here I specify a datetime and execute the query using psycopg2:

import datetime
import psycopg2

# params: dict of connection settings, defined elsewhere
# 2018-01-10 11:15:00
dt = datetime.datetime(2018, 1, 10, 11, 15, 0)

with psycopg2.connect(**params) as conn:
    cur = conn.cursor()
    # start timing
    beg = datetime.datetime.now()
    # execute query
    cur.execute(query, {'dt':dt})
    rows = cur.fetchall()
    # stop timing
    end = datetime.datetime.now()

print('took {} ms'.format((end-beg).total_seconds() * 1e3))

Timing output:

took 2.296 ms

If, however, I don't know which time to pass to the query above, I can use a query such as:

query = 'select * from prices order by time desc limit 1'

I execute the query in a similar fashion:

with psycopg2.connect(**params) as conn:
    cur = conn.cursor()
    # start timing
    beg = datetime.datetime.now()
    # execute query
    cur.execute(query)
    rows = cur.fetchall()
    # stop timing
    end = datetime.datetime.now()

print('took {} ms'.format((end-beg).total_seconds() * 1e3))

Timing output:

took 19.173 ms

So that's more than 8 times slower.
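Both figures above come from a single run each, so part of the gap may be noise (caching, first-use overhead). A minimal sketch of a repeated measurement follows; `time_query` and its `repeats` parameter are hypothetical names introduced here, and `cur` is assumed to be any DB-API cursor such as the psycopg2 one used above.

```python
import statistics
import time


def time_query(cur, query, params=None, repeats=5):
    """Run `query` `repeats` times and return the median wall-clock time in ms.

    `cur` is assumed to be a DB-API cursor (for example from psycopg2).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        cur.execute(query, params)
        cur.fetchall()  # include row transfer in the measurement
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)
```

Calling `time_query(cur, query, {'dt': dt})` for the first query and `time_query(cur, query)` for the second would give two medians that are easier to compare than single measurements.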

I'm no expert in SQL, but I would have thought the query planner would figure out that "limit 1" combined with "order by primary index" equates to an O(1) operation.
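One way to see what the planner actually does is to ask PostgreSQL for the execution plan. The sketch below is only an illustration: `explain_analyze` is a hypothetical helper name, and `cur` is again assumed to be a DB-API cursor. Note that `EXPLAIN (ANALYZE)` really executes the statement, so it should be used on read-only queries like the ones here.

```python
def explain_analyze(cur, query, params=None):
    """Return the PostgreSQL plan for `query` as a list of text lines.

    Prefixes the statement with EXPLAIN (ANALYZE, BUFFERS), which runs the
    query and reports the chosen plan together with timing information.
    """
    cur.execute("EXPLAIN (ANALYZE, BUFFERS) " + query, params)
    # Each result row holds one line of the plan in its first column.
    return [row[0] for row in cur.fetchall()]
```

Printing the result of `explain_analyze(cur, 'select * from prices order by time desc limit 1')` line by line shows whether the index on `time` is used and how the reported time splits between planning and execution.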

Question:

Is there a more efficient way to select the last row in my table?

In case it is useful, here is the description of my table:

# \d+ prices

                                           Table "public.prices"
 Column |            Type             | Collation | Nullable | Default | Storage | Stats target | Description 
--------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 time   | timestamp without time zone |           | not null |         | plain   |              | 
 AAPL   | double precision            |           |          |         | plain   |              | 
 GOOG   | double precision            |           |          |         | plain   |              | 
 MSFT   | double precision            |           |          |         | plain   |              | 
Indexes:
    "prices_time_idx" btree ("time" DESC)
Child tables: _timescaledb_internal._hyper_12_100_chunk,
              _timescaledb_internal._hyper_12_101_chunk,
              _timescaledb_internal._hyper_12_102_chunk,
              ...

Recommended answer

An efficient way to get the last / first record in TimescaleDB:

First record:

SELECT <COLUMN>, time FROM <TABLE_NAME> ORDER BY time ASC LIMIT 1 ;

Last record:

SELECT <COLUMN>, time FROM <TABLE_NAME> ORDER BY time DESC LIMIT 1 ;
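When a lower bound on the latest timestamp is known, as in the first query of the question, it can be combined with the `ORDER BY ... LIMIT 1` form so that only one row is returned. The following is a sketch under assumptions, not the answer author's method: `latest_row`, `BOUNDED_QUERY` and `UNBOUNDED_QUERY` are hypothetical names, the table is the `prices` table from the question, and `cur` is assumed to be a DB-API cursor.

```python
BOUNDED_QUERY = (
    "select * from prices where time > %(dt)s order by time desc limit 1"
)
UNBOUNDED_QUERY = "select * from prices order by time desc limit 1"


def latest_row(cur, since=None):
    """Return the newest row of `prices`, or None if the table is empty.

    If `since` (a datetime) is given, first look only at rows newer than it;
    fall back to the unbounded query when that window contains no rows.
    """
    if since is not None:
        cur.execute(BOUNDED_QUERY, {"dt": since})
        row = cur.fetchone()
        if row is not None:
            return row
    cur.execute(UNBOUNDED_QUERY)
    return cur.fetchone()
```

Whether the bounded form is actually faster on a given hypertable depends on the data and the chunk layout, so it is worth measuring both.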

The question has already been answered, but I believe this may be useful to people who end up here. Using first() and last() in TimescaleDB takes much longer.
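To check this claim on your own data, the two forms can be generated side by side and run against the same table. The sketch below only builds the SQL strings; `quote_ident` and `latest_value_queries` are hypothetical helpers, and the example uses the `prices` table and `AAPL` column from the question. The `last(value, time)` call follows the argument order of TimescaleDB's `last()` aggregate.

```python
def quote_ident(name):
    """Double-quote an SQL identifier so mixed-case names such as AAPL work."""
    return '"' + name.replace('"', '""') + '"'


def latest_value_queries(table, column, time_column="time"):
    """Return two equivalent ways to fetch the newest value of `column`."""
    t = quote_ident(table)
    c = quote_ident(column)
    ts = quote_ident(time_column)
    return {
        # Index-friendly form recommended in the answer.
        "order_by_limit": f"SELECT {c}, {ts} FROM {t} ORDER BY {ts} DESC LIMIT 1",
        # TimescaleDB aggregate form, reported above as much slower.
        "last_aggregate": f"SELECT last({c}, {ts}) FROM {t}",
    }


queries = latest_value_queries("prices", "AAPL")
print(queries["order_by_limit"])
# SELECT "AAPL", "time" FROM "prices" ORDER BY "time" DESC LIMIT 1
print(queries["last_aggregate"])
# SELECT last("AAPL", "time") FROM "prices"
```

The manual quoting keeps the sketch free of dependencies; in real psycopg2 code, `psycopg2.sql.Identifier` is the safer way to compose identifiers.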
