loading huge XLS data into Oracle using python


Question

I have a 3+ million record XLS file which I need to dump into an Oracle 12C DB (direct dump) using Python 2.7.

I am using the cx_Oracle Python package to establish connectivity to Oracle, but reading and dumping the XLS (using the openpyxl package) is extremely slow, and performance degrades at hundreds of thousands to millions of records.

From a scripting standpoint I have used two approaches:

  1. I've tried bulk loading, by reading all the values into an array and then dumping it using cursor prepare (with bind variables) and executemany. This doesn't work well with huge data.

  2. Iterative loading of the data as it is being fetched. Even this way has performance issues.
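The bulk approach in step 1 can be made far less memory-hungry by streaming rows with openpyxl's read-only mode and inserting them in fixed-size `executemany` batches instead of one giant array. The sketch below is a minimal illustration, not the asker's actual code; the table `my_table`, its columns, and the connection parameters are all hypothetical placeholders.

```python
import itertools


def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk


def load_xlsx(path, user, password, dsn, batch_size=5000):
    """Stream an XLSX sheet into Oracle in batches (table/columns hypothetical)."""
    # Imported here so the batching helper above stays usable without Oracle.
    import cx_Oracle
    import openpyxl

    # read_only=True streams rows instead of loading the whole sheet in memory.
    wb = openpyxl.load_workbook(path, read_only=True)
    rows = (tuple(cell.value for cell in row) for row in wb.active.rows)

    conn = cx_Oracle.connect(user, password, dsn)
    cur = conn.cursor()
    cur.prepare("INSERT INTO my_table (col1, col2, col3) VALUES (:1, :2, :3)")
    for chunk in batches(rows, batch_size):
        cur.executemany(None, chunk)  # reuse prepared statement, one round trip per batch
        conn.commit()                 # commit per batch, not per row
    cur.close()
    conn.close()
```

Committing per batch rather than per row, and binding arrays via `executemany`, cuts the number of network round trips from one per record to one per batch, which is usually where the time goes at this volume.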

What options and techniques/packages can I deploy as a best practice to load this volume of data from XLS to Oracle DB? Is it advisable to load this volume of data via scripting, or should I necessarily use an ETL tool? As of now I only have the option of Python scripting, so please answer the former.

Answer

If it is possible to export your Excel file as a CSV, then all you need is to use sqlldr to load the file into the db.
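The sqlldr route needs a small control file describing the CSV layout. A minimal sketch follows, assuming a hypothetical table `my_table` with three columns and placeholder connection credentials; `direct=true` requests a direct-path load, which bypasses normal SQL processing for bulk speed.

```shell
# Control file and table/column names are hypothetical examples.
cat > load.ctl <<'EOF'
LOAD DATA
INFILE 'data.csv'
APPEND INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(col1, col2, col3)
EOF

sqlldr userid=user/password@host/service control=load.ctl log=load.log direct=true
```

Check `load.log` (and the generated `.bad` file, if any) afterwards for rejected rows.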

