为什么SQLite CTE JOIN中的RANDOM()行为与其他RDBMS不同? [英] Why does RANDOM() in a SQLite CTE JOIN behave differently to other RDBMSs?

查看:108
本文介绍了为什么SQLite CTE JOIN中的RANDOM()行为与其他RDBMS不同?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

公用表表达式(CTE)联接中的

RANDOM()值在 SQLite



SQL:

 
tbl1(n)AS(SELECT 1 UNION ALL SELECT 2),
tbl2(n,r)AS(SELECT n,RANDOM()FROM tbl1)
SELECT * FROM tbl2 t1交叉联接tbl2 t2;

示例SQLite结果:

  nrnr 
1 7058971975145008000 1 8874103142384122000
1 1383551786055205600 2 8456124381892735000
2 2646187515714600000 1 7558324128446983000
2 -1529979429149869800 2 7003770339419606000 $ b

每列中的随机数都不同。但是 CROSS JOIN 重复行-所以我希望每列有两对相同的数字- PostgreSQL Oracle 11g SQL Server 2014 (使用基于行的种子时)。



示例PostgreSQL / Oracle 11g / SQL Server 2014结果:

  nrnr 
1 0.117551110684872 1 0.117551110684872
1 0.117551110684872 2 0.221985165029764
2 0.221985165029764 1 0.117551110684872
2 0.221985165029764 2 0.221985165029764


问题


  1. 行为可以SQLite可以解释吗?是错误吗?

  2. CTE中的表B(基于同一CTE中的表A)是否有办法增加一列随机生成的数字,该数字将保持固定在 JOIN 中使用时?


解决方案

您的问题相当漫长而漫不经心-不是一个问题。但是,这很有趣,而且我学到了一些东西。



此语句不正确:


SQL Server为RAND()函数分配一个随机种子:在
a SELECT中使用时,它仅被种子一次,而不是为每一行种子。


SQL Server具有运行时常量函数的概念。这些函数是从编译查询中提取的,并在查询开始时每个表达式 执行一次。最突出的示例是 getdate()(以及相关的日期/时间函数)和 rand()



如果运行,您可以很容易地看到这一点:

  select rand(), rand()
from(值(1),(2),(3))v(x);

每个列具有相同的值,但列之间的值不同。



大多数数据库-包括SQLite-对 rand() / random()都有更直观的解释。 (作为个人说明,在每行上返回相同值的随机函数非常违反直觉。)每次调用该函数时,您将获得不同的值。对于SQL Server,通常使用使用 newid()的表达式:

 从(值(1),(2),(3))v [x)中选择rand(),rand(),rand(checksum(newid()))
;

对于第二个问题,似乎表明SQLite实现了递归CTE。所以这就是您想要的:

 与tbl1(n)AS(
选择1 UNION ALL SELECT 2
),
tbl2(n,r)AS(
SELECT n,RANDOM()
FROM tbl1
union all
select *
from tbl2
其中1 = 0

选择*
FROM tbl2 t1交叉加入tbl2 t2;

我没有看到这种情况的记录,因此使用后果自负。 这里是DB提琴。



而且,据记录,这似乎在SQL Server中也适用。我刚刚学到了一些东西!



编辑:



如评论中所建议,实现不一定总是发生。它似乎确实适用于同一级别的两个引用:

 与tbl1(n)AS(
SELECT 1 UNION ALL SELECT 2),
tbl2(n,r)AS(
SELECT n,RANDOM()
FROM tbl1
union all
select *
从tbl2
其中1 = 0

选择t2a.r,count(*)
从tbl2 t2a离开在t2a上加入
tbl2 t2b
。 r = t2b.r
GROUP BY t2a.r;


RANDOM() values in a Common Table Expression (CTE) join aren't behaving as expected in SQLite.

SQL:

WITH
  tbl1(n) AS (SELECT 1 UNION ALL SELECT 2),
  tbl2(n, r) AS (SELECT n, RANDOM() FROM tbl1)
SELECT * FROM tbl2 t1 CROSS JOIN tbl2 t2;

Sample SQLite results:

n   r                       n   r
1   7058971975145008000     1   8874103142384122000
1   1383551786055205600     2   8456124381892735000
2   2646187515714600000     1   7558324128446983000
2   -1529979429149869800    2   7003770339419606000

The random numbers in each column are all different. But a CROSS JOIN repeats rows - so I expected 2 pairs of the same number in each column - which is the case in PostgreSQL, Oracle 11g and SQL Server 2014 (when using a row-based seed).

Sample PostgreSQL / Oracle 11g / SQL Server 2014 results:

n   r                   n   r
1   0.117551110684872   1   0.117551110684872
1   0.117551110684872   2   0.221985165029764
2   0.221985165029764   1   0.117551110684872
2   0.221985165029764   2   0.221985165029764

Questions

  1. Can the behaviour in SQLite be explained? Is it a bug?
  2. Is there a way for Table B in a CTE (based on Table A in the same CTE) to have an additional column of randomly generated numbers, which will remain fixed when used in a JOIN?

解决方案

Your question is rather long and rambling -- not a single question. But, it is interesting and I learned something.

This statement is not true:

SQL Server assigns a random seed to the RAND() function: When used in a SELECT, it is only seeded once rather than for each row.

SQL Server has the concept of run-time constant functions. These are functions that are pulled from the compiled query and executed once per expression at the beginning of the query. The most prominent examples are getdate() (and related date/time functions) and rand().

You can readily see this if you run:

select rand(), rand()
from (values (1), (2), (3)) v(x);

Each column has the same values, but the values between the columns are different.

Most databases -- including SQLite -- have the more intuitive interpretation of rand()/random(). (As an personal note, a "random" function that returns the same value on each row is highly counter-intuitive.) Each time it is called you get a different value. For SQL Server, you would typically use an expression using newid():

select rand(), rand(), rand(checksum(newid()))
from (values (1), (2), (3)) v(x);

As for your second question, it appears that SQLite materializes recursive CTEs. So this does what you want:

WITH tbl1(n) AS (
      SELECT 1 UNION ALL SELECT 2
     ),
     tbl2(n, r) AS (
       SELECT n, RANDOM()
       FROM tbl1
       union all
       select *
       from tbl2
       where 1=0
      )
SELECT *
FROM tbl2 t1 CROSS JOIN tbl2 t2;

I have seen no documentation that this is the case, so use at your own risk. Here is a DB-Fiddle.

And, for the record, this seems to work in SQL Server as well. I just learned something!

EDIT:

As suggested in the comment, the materialization may not always happen. It does seem to apply to two references at the same level:

WITH tbl1(n) AS (
      SELECT 1 UNION ALL SELECT 2),
     tbl2(n, r) AS (
       SELECT n, RANDOM()
       FROM tbl1
       union all
       select *
       from tbl2
       where 1=0
      )
SELECT t2a.r, count(*)
FROM tbl2 t2a left JOIN
     tbl2 t2b
     on t2a.r = t2b.r
GROUP BY t2a.r;

这篇关于为什么SQLite CTE JOIN中的RANDOM()行为与其他RDBMS不同?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆