Multiprocessing/multithreading for database query in Python
Question
I have millions of records in a database, and I want to read them through Python and store them in a pandas DataFrame. The problem is that the SELECT query takes a very long time. To reduce the query time I tried multithreading: I created 3 threads and gave each thread its own query, like this:
Select * from ( select *, row_number() over (order by col1) rn from table) where rn%3=0
Select * from ( select *, row_number() over (order by col1) rn from table) where rn%3=1
Select * from ( select *, row_number() over (order by col1) rn from table) where rn%3=2
Then I run each query in its own thread using Python's threading package.
But that does not reduce the time much either.
Is there any other approach I can take to reduce the query read time? Note: I have tried both JDBC and ODBC connections.
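For reference, the setup described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: the standard-library sqlite3 module stands in for the real JDBC/ODBC connection, and the table name `records`, the `payload` column, and the helpers `make_demo_db`, `fetch_partition`, and `fetch_all` are hypothetical names, not part of the original post. It needs an SQLite build with window-function support (3.25 or later).

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

N_PARTS = 3  # one partition per thread, as in the question


def make_demo_db(path, n_rows=30):
    """Create a small stand-in table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE records (col1 INTEGER, payload TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(n_rows)],
    )
    conn.commit()
    conn.close()


def fetch_partition(path, part):
    """Fetch the rows whose row number modulo N_PARTS equals `part`."""
    conn = sqlite3.connect(path)  # each thread opens its own connection
    try:
        return conn.execute(
            "SELECT col1, payload FROM ("
            "  SELECT col1, payload,"
            "         ROW_NUMBER() OVER (ORDER BY col1) AS rn"
            "  FROM records"
            ") WHERE rn % ? = ?",
            (N_PARTS, part),
        ).fetchall()
    finally:
        conn.close()


def fetch_all(path):
    """Run the N_PARTS partition queries concurrently and merge the rows."""
    with ThreadPoolExecutor(max_workers=N_PARTS) as pool:
        parts = list(pool.map(lambda p: fetch_partition(path, p), range(N_PARTS)))
    return [row for part in parts for row in part]


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")
    make_demo_db(db_path)
    rows = fetch_all(db_path)

print(len(rows))
```

The merged `rows` list could then be passed to `pandas.DataFrame(rows, columns=["col1", "payload"])`. Note that each of the three queries still numbers every row of the table before filtering, so the database may end up doing the full scan and sort three times, which is one plausible reason the threaded version gains little.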
Recommended answer
The link below helped me: Multiprocessing with JDBC connection and pooling. With that approach I get around a 25% gain on my local machine.