范围:按两个日期之间的差异排序 [英] Scope: Sort by difference between two dates

查看:49
本文介绍了范围:按两个日期之间的差异排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想创建一个范围,从最近的日期到最远的日期对数据进行排序.

I would like to create a scope which sort data from the closest date to the farthest.

例如我有 3 个值:

<Value id: 1, date: '2012-12-20'>
<Value id: 2, date: '2012-12-28'>
<Value id: 3, date: '2012-12-31'>

然后我想对最接近给定日期的日期进行排序:2012-12-29.
我应该得到这个顺序:2, 3, 1.
如果我选择 2012-12-30 结果必须是:3, 2, 1.

Then I would like to sort the date closest to a given date: 2012-12-29.
I should have as result this order: 2, 3, 1.
If I choose 2012-12-30 the result must be: 3, 2, 1.

我尝试过这样的事情:

scope :order_by_closest_date, lambda{|time| 
  select("*, (date - DATE('#{time}')) AS time").order("time ASC")
}

但它不起作用.
有关信息:Rails 3.2.9 Ruby 1.9.3 Postgresql 9.1.4.
有什么想法吗?

But it doesn't work.
For information: Rails 3.2.9 Ruby 1.9.3 Postgresql 9.1.4.
Any ideas?

推荐答案

简单查询

第一个示例根据初始问题使用 date 列.

不确定 Ruby 语法,但正确的 SQL 语句应该是:

Not sure about Ruby syntax, but a proper SQL statement would be:

SELECT * 
FROM   tbl
ORDER  BY @(date_col - '2012-12-29'::date)

@ 是绝对的"价值"运算符.

@ being the "absolute value" operator.

切勿使用 datetime 作为标识符.虽然在 PostgreSQL 中被允许(有一些限制),但这些是 保留SQL 标准中的词,它会导致混淆错误消息和可能出现意外错误.

Never use date or time as identifiers. While being allowed (with some restrictions) in PostgreSQL, these are reserved words in the SQL standard and it leads to confusing error messages and possibly unexpected errors.

其余部分根据注释中的更新使用 timestamp 列.

对于小表或即席查询,上述解决方案就可以了.对于中型或大型表,如果性能很重要,我建议采用更复杂的方法.

For small tables or ad-hoc queries, the above solution is just fine. For medium or big tables, if performance matters, I suggest a more sophisticated approach.

Condicio sine qua non 是datetimestamp 列上的索引.像这样:

Condicio sine qua non is an index on the date or timestamp column. Like this:

CREATE INDEX tbl_my_timestamp_idx ON tbl(my_timestamp);

有了索引,下面的查询将破坏简单查询的性能,用于更大的表:

With the index in place the following query which will nuke the performance of the simple one for bigger tables:

SELECT *
FROM  (
    (
    SELECT *
    FROM   tbl
    WHERE  my_timestamp >= '2012-12-30 11:32'::timestamp
    ORDER  BY my_timestamp
    LIMIT  3
    )

    UNION ALL
    (
    SELECT *
    FROM   tbl
    WHERE  my_timestamp < '2012-12-30 11:32'::timestamp
    ORDER  BY my_timestamp DESC
    LIMIT  3
    )
    ) x
ORDER  BY @extract('epoch' FROM (my_timestamp - '2012-12-28 11:32'::timestamp))
LIMIT  3;

  • UNION ALL - SELECT 两条腿周围的括号不是可选的.需要对每条腿应用 LIMIT.

    • Parenthesis around the two legs of the UNION ALL - SELECT are not optional. Needed to apply LIMIT to each leg.

      如果您按其他列排序,请在索引中反映这一点 - 使用 多列索引.

      If you order by additional columns, reflect that in your index - use a multi-column index in that case.

      第一个查询使用表达式作为条件.Postgres 必须为每一行计算一个值,然后按结果排序并选择前几行.小桌子没问题,但大桌子非常很贵.O(n);n 是表中的行数.它不能使用普通索引.加上在所有行中排序和挑选获胜者的一些非平凡成本.
      可以在一个表达式上创建一个索引,这会是最快的,但这只适用于一个恒定的时间戳来进行比较——这几乎不是一个现实的用例.

      The first query uses an expression as condition. Postgres has to compute a value for every single row, then order by the result and pick the first few. No problem for a small table, but very expensive for big tables. O(n); n being the number of rows in the table. It can't use a plain index. Plus some non-trivial cost to sort and pick the winners among all rows.
      You could create an index on an expression, which would be fastest, but that only works for a constant timestamp to compare to - hardly a realistic use case.

      第二个查询根据您的时间戳在索引中找到位置,依次读取下几行的元组指针并直接从表中获取它们(或者甚至直接从索引中获取,在 9.2 中只进行索引扫描).两次,一次上升,一次下降,因为我们不知道同行如何比较.但这只是 2 x O(log(n))(典型的 b-tree 查找成本) 仅对少数预先选择的行进行计算.从小样本中挑选获胜者的成本微不足道.

      The second query finds the position according to your timestamp in the index, reads tuple pointers for next couple of rows sequentially and fetches them straight from the table (or even directly from the index with an index-only scan in 9.2). Twice, once up, once down, since we don't know how the peers compare. But that's just 2 x O(log(n)) (typical b-tree look-up cost) Computation is only done for the few pre-selected rows. Picking the winners from the small sample carries a trivial constant cost.

      只需使用 EXPLAIN ANALYZE.在对真实表格的快速测试中,我得到了 因子 1000 和 50k 行的表格.而且它会不断扩展以容纳更大的桌子.

      Just test with EXPLAIN ANALYZE. In a quick test on a real-life table I got a factor 1000 with a table of 50k rows. And it keeps scaling up for bigger tables.

      这篇关于范围:按两个日期之间的差异排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆