通过均匀跳过行来选择固定数量的行 [英] SELECT fixed number of rows by evenly skipping rows

查看：64 发布时间：2021/6/19 18:47:59 sql postgresql window-functions postgresql-9.4

本文介绍了通过均匀跳过行来选择固定数量的行的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试编写一个查询，该查询返回任意大小的代表性数据样本.我想通过只选择 n^th 行来做到这一点，其中 n 使得整个结果集尽可能接近任意大小.

I am trying to write a query which returns an arbitrary sized representative sample of data. I would like to do this by only selecting n^th rows where n is such that the entire result set is as close as possible to an arbitrary size.

我希望这在结果集通常小于任意大小的情况下起作用.在这种情况下，应该返回整个结果集.

I want this to work in cases where the result set would normally be less than the arbitrary size. In such a case, the entire result set should be returned.

我发现了这个问题，它显示了如何选择每个 n^第行.

I found this question which shows how to select every n^th row.

这是我目前所拥有的:

Here is what I have so far:

SELECT * FROM (
   SELECT *, ((row_number() OVER (ORDER BY "time"))
               % ceil(count(*)::FLOAT / 500::FLOAT)::BIGINT) AS rn
   FROM data_raw) sa
WHERE sa.rn=0;

这会导致以下错误:

错误:列data_raw.serial"必须出现在 GROUP BY 子句中或用于聚合函数位置:23

ERROR: column "data_raw.serial" must appear in the GROUP BY clause or be used in an aggregate function Position: 23

像这样删除 n 的计算:

SELECT * FROM (
   SELECT *, (row_number() OVER (ORDER BY "time"))
              % 50 AS rn FROM data_raw) sa
LIMIT 500;

我还尝试将计算移至 WHERE 子句:

I also tried moving the calculation to the WHERE clause:

SELECT * FROM (
   SELECT *, (row_number() OVER (ORDER BY "time")) AS rn
   FROM data_raw) sa
WHERE (sa.rn % ceil(count(*)::FLOAT / 500::FLOAT)::BIGINT)=0;

这也会导致错误:

错误:WHERE 位置不允许聚合函数:108

ERROR: aggregate functions are not allowed in WHERE Position: 108

有没有人对如何解决我的查询或更好的方法有任何想法?

Does anyone have any ideas on either how to fix my query or a better way to do this?

我也考虑过使用随机数和概率来选择行，但我宁愿做一些确定性的事情，而不会出现结块的可能性.

I have also thought about using random numbers and probability to select rows, but I would rather do something deterministic without the possibility of clumping.

推荐答案

你第一次尝试的错误是你不能将聚合函数 count(*) 与 un-聚合行选择.您可以使用 count() 作为窗口聚合函数来解决此问题:

The mistake in your first attempt is that you can't mix the aggregate function count(*) with the un-aggregated selection of rows. You can fix this by using count() as window-aggregate function instead:

SELECT * FROM (
   SELECT *, ((row_number() OVER (ORDER BY "time"))
               % ceil(count(*) OVER () / 500.0)::int) AS rn
   FROM   data_raw
   ) sub
WHERE sub.rn = 0;

这里有详细解释:

在应用 LIMIT 之前获得结果计数的最佳方法

@Alexander 为您的最后一次尝试提供了解决方案.

@Alexander has a fix for your last attempt.

这篇关于通过均匀跳过行来选择固定数量的行的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

通过均匀跳过行来选择固定数量的行 [英] SELECT fixed number of rows by evenly skipping rows

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

通过均匀跳过行来选择固定数量的行 [英] SELECT fixed number of rows by evenly skipping rows

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭