SQL Select内部连接一对一 [英] SQL Select Inner join one by one
问题描述
我有一个特定的请求要在我的数据库上执行(PostgreSQL v9.4.5),并且我看不到任何纯SQL的优雅解决方案来解决它(我知道我可以使用Python或其他方法来完成它,但是我有数十亿行数据,计算时间将大大增加.
I have a specific request to do on my database (PostgreSQL v9.4.5), and I don't see any elegant solution in pure SQL to solve it (I know I can do it using Python or other, but I have several billions lines of data, and the calculation time would be greatly increased).
我有两个表:交易和事件.这些表都表示一天中订单中发生的交易(这就是为什么我有数十亿行,我的数据超过几年)的原因,事件比交易多em>.
I have two tables : trades and events. These tables both represent the trades occurring in an orderbook during a day (this is why I have several billions lines, my data is over several years) but there are many more events than trades.
两个表都有 time , volume 和 quantity 列,但是每个表都有其他列(分别说 foo 和 bar ),并提供具体信息.我想在 time , volume 和 price 列的两个表之间建立对应关系,因为我知道这种对应关系存在为从交易到事件的注入(如果交易中有 n 行,同时 t ,价格 p 和相同体积的 v ,我知道在事件中还有 n 行,时间为 t ,价格 p 和数量 v ).
Both tables have columns time, volume and quantity, however each one has other columns (let's say respectively foo and bar) with specific information. I want to make a correspondence between the two tables on the columns time, volume and price, as I know this correspondence exists as an injection from trades to events (if there are n rows in trades with the same time t, the same price p and the same volume v, I know there are also n rows in events with the time t, the price p and the volume v).
交易:
id | time | price | volume | foo
-----+-----------+---------+--------+-------
201 | 32400.524 | 53 | 2085 | xxx
202 | 32400.530 | 53 | 1162 | xxx
203 | 32400.531 | 52.99 | 50 | xxx
204 | 32400.532 | 52.91 | 3119 | xxx
205 | 32400.837 | 52.91 | 3119 | xxx <--
206 | 32400.837 | 52.91 | 3119 | xxx <--
207 | 32400.837 | 52.91 | 3119 | xxx <--
208 | 32400.839 | 52.92 | 3220 | xxx <--
209 | 32400.839 | 52.92 | 3220 | xxx <--
210 | 32400.839 | 52.92 | 3220 | xxx <--
事件:
id | time | price | volume | bar
-----+-----------+---------+--------+------
328 | 32400.835 | 52.91 | 3119 | yyy
329 | 32400.837 | 52.91 | 3119 | yyy <--
330 | 32400.837 | 52.91 | 3119 | yyy <--
331 | 32400.837 | 52.91 | 3119 | yyy <--
332 | 32400.838 | 52.91 | 3119 | yyy
333 | 32400.838 | 52.91 | 3119 | yyy
334 | 32400.839 | 52.92 | 3220 | yyy <--
335 | 32400.839 | 52.92 | 3220 | yyy <--
336 | 32400.839 | 52.92 | 3220 | yyy <--
337 | 32400.840 | 52.91 | 2501 | yyy
我想要的是:
time | price | volume | bar | foo
-----------+---------+--------+------+-------
32400.837 | 52.91 | 3119 | xxx | yyy
32400.837 | 52.91 | 3119 | xxx | yyy
32400.837 | 52.91 | 3119 | xxx | yyy
32400.839 | 52.92 | 3220 | xxx | yyy
32400.839 | 52.92 | 3220 | xxx | yyy
32400.839 | 52.92 | 3220 | xxx | yyy
我无法进行经典的INNER JOIN操作,否则我将在两个表之间进行所有可能的交叉操作(在这种情况下,我将拥有6x6,然后是36行).
I cannot do a classic INNER JOIN, or else I will have all the possible crossing between the two tables (in this case I would have 6x6 then 36 rows).
尽管虽然可以容纳几行,但只有一行而不是一行.
The though thing is to have only one row versus one row, although several rows could fit.
谢谢您的帮助.
如我所说,例如,如果我使用经典的INNER JOIN
As I said, if I use a classic INNER JOIN, for example
SELECT * FROM events e,
INNER JOIN trades t
ON t.time = e.time AND t.price = e.price AND t.volume = e.volume
我会有类似的东西:
trade_id | event_id | time | price | volume | bar | foo
---------+----------+-----------+---------+--------+------+-------
205 | 329 | 32400.837 | 52.91 | 3119 | xxx | yyy
205 | 330 | 32400.837 | 52.91 | 3119 | xxx | yyy
205 | 331 | 32400.837 | 52.91 | 3119 | xxx | yyy
206 | 329 | 32400.837 | 52.91 | 3119 | xxx | yyy
206 | 330 | 32400.837 | 52.91 | 3119 | xxx | yyy
206 | 331 | 32400.837 | 52.91 | 3119 | xxx | yyy
207 | 329 | 32400.839 | 52.91 | 3119 | xxx | yyy
207 | 330 | 32400.839 | 52.91 | 3119 | xxx | yyy
207 | 331 | 32400.839 | 52.91 | 3119 | xxx | yyy
208 | 334 | 32400.837 | 52.92 | 3220 | xxx | yyy
208 | 335 | 32400.837 | 52.92 | 3220 | xxx | yyy
208 | 336 | 32400.837 | 52.92 | 3220 | xxx | yyy
209 | 334 | 32400.837 | 52.92 | 3220 | xxx | yyy
209 | 335 | 32400.837 | 52.92 | 3220 | xxx | yyy
209 | 336 | 32400.837 | 52.92 | 3220 | xxx | yyy
210 | 334 | 32400.839 | 52.92 | 3220 | xxx | yyy
210 | 335 | 32400.839 | 52.92 | 3220 | xxx | yyy
210 | 336 | 32400.839 | 52.92 | 3220 | xxx | yyy
但是我想要的是:
trade_id | event_id | time | price | volume | bar | foo
---------+----------+-----------+---------+--------+------+-------
205 | 329 | 32400.837 | 52.91 | 3119 | xxx | yyy
206 | 330 | 32400.837 | 52.91 | 3119 | xxx | yyy
207 | 331 | 32400.839 | 52.91 | 3119 | xxx | yyy
208 | 334 | 32400.837 | 52.92 | 3220 | xxx | yyy
209 | 335 | 32400.837 | 52.92 | 3220 | xxx | yyy
210 | 336 | 32400.839 | 52.92 | 3220 | xxx | yyy
推荐答案
这是我关于row_number的示例.
Here is my example with row_number.
此外,SQL Fiddle: SO 33608351
Also, SQL Fiddle: SO 33608351
with
trades AS
(
select 201 as id, 32400.524 as time, 53 as price, 2085 as volume, 'xxx' as foo union all
select 202, 32400.530, 53, 1162, 'xxx' union all
select 203, 32400.531, 52.99, 50, 'xxx' union all
select 204, 32400.532, 52.91, 3119, 'xxx' union all
select 205, 32400.837, 52.91, 3119, 'xxx' union all
select 206, 32400.837, 52.91, 3119, 'xxx' union all
select 207, 32400.837, 52.91, 3119, 'xxx' union all
select 208, 32400.839, 52.92, 3220, 'xxx' union all
select 209, 32400.839, 52.92, 3220, 'xxx' union all
select 210, 32400.839, 52.92, 3220, 'xxx'
),
events as
(
select 328 as id, 32400.835 as time , 52.91 as price , 3119 as volume , 'yyy' as bar union all
select 329 , 32400.837 , 52.91 , 3119 , 'yyy' union all
select 330 , 32400.837 , 52.91 , 3119 , 'yyy' union all
select 331 , 32400.837 , 52.91 , 3119 , 'yyy' union all
select 332 , 32400.838 , 52.91 , 3119 , 'yyy' union all
select 333 , 32400.838 , 52.91 , 3119 , 'yyy' union all
select 334 , 32400.839 , 52.92 , 3220 , 'yyy' union all
select 335 , 32400.839 , 52.92 , 3220 , 'yyy' union all
select 336 , 32400.839 , 52.92 , 3220 , 'yyy' union all
select 337 , 32400.840 , 52.91 , 2501 , 'yyy'
),
tradesWithRowNumber AS
(
select *
,ROW_NUMBER() over (PARTITION by time, price, volume order by time, price, volume) as RowNum
from trades
),
eventsWithRowNumber AS
(
select *
,ROW_NUMBER() over (PARTITION by time, price, volume order by time, price, volume) as RowNum
from events
)
select t.time,
t.price,
t.volume,
t.foo,
e.bar
FROM tradesWithRowNumber t
inner JOIN
eventsWithRowNumber e on e.time = t.time
AND e.price = t.price
AND e.volume = t.volume
and e.RowNum = t.RowNum
这篇关于SQL Select内部连接一对一的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!