将大量的行分解成较小的查询?平行性 [英] Breaking down a large number of rows into smaller queries? Parallelism
问题描述
我想创建一个外部应用程序,该应用程序将从一个大型Oracle数据库中查询一个表.
I want to create a external application which will query one table from a large Oracle database.
查询每天运行一次,预计将处理30,000+行.
The query will run daily and I am expecting to handle 30,000+ rows.
要分解这些行的大小,我想为存在的每10,000行创建一个新的线程/进程.因此,按照上图来看,将有3个线程来处理所有这些行.
To break down the size of these rows, I would like to create a new thread/ process for each 10,000 rows that exist. So going by the above figure it would be 3 threads to process all those rows.
我不希望每个线程彼此重叠行集,所以我知道我需要在表中添加一列以作为范围标记,row_position
I don't want each thread to overlap each others row set so I know I will need to add a column within the table to act as a range marker, a row_position
逻辑
Get row_count of data set in query parameters
Get first_row_pos
While (row_count > 10,000)
{
Create thread with 10,000 rows starting from first_row_pos
row_count == row_count - 10,000
first_row_pos = first_row_pos + 10,000
}
create thread for remaining rows
all threads run their queries concurrently.
这是目前的基本逻辑,但是我不知道这是多么可行.
This is basic logic at the moment, however I do not know how feasible this is.
这是好方法还是更好的方法?
Is this a good way or is there a better way?
这可以通过与每个线程共享的一个数据库连接来完成,还是为每个线程建立一个单独的数据库连接更好呢?
Can this be done through one database connection with each thread sharing or is it better to have a seperate db connection for each thread?
还有其他建议吗?
注意:我只是意识到,如果这种情况下的行数少于10,000,则执行do while循环会更好.
Note: I just realised a do while loop would be better if there is less than 10,000 rows in this case.
谢谢
推荐答案
Oralce为诸如此类的缝合提供了并行提示,在这种情况下,您需要进行全表扫描或类似的问题,并希望利用多个内核来分配工作量. 此处.
Oralce provide a parallel hint for sutuations such as this where you have a full table scan or similar problem and want to make use of multiple cores to divide the workload. Further details here.
语法非常简单,您可以指定表(或别名)和核数(我通常默认为默认值),例如:
The syntax is very simple, you specify the table (or alias) and the number of cores (I usually leave as default) e.g.:
select /*+ parallel(a, default) */ *
from table_a a
您还可以将其用于多个表,例如
You can also use this with multiple tables e.g.
select /*+ parallel(a, default) parallel(b,default) */ *
from table_a a, table_b b
where a.some_id = b.some_id
这篇关于将大量的行分解成较小的查询?平行性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!