Faster way to load huge data warehouse table


Problem Description

I have a table in Oracle 11g, 62 GB in size with 1.2 billion records. The table has 4 columns and is range-partitioned by month, with hash subpartitions on transaction number.
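The layout described above might be sketched in DDL like this. All table, column, and partition names are hypothetical; only the shape (4 columns, range partitions by month, hash subpartitions by transaction number) comes from the question:

```sql
-- Hypothetical DDL matching the described layout.
CREATE TABLE txn_fact (
  txn_no     NUMBER       NOT NULL,
  txn_month  DATE         NOT NULL,
  account_id NUMBER       NOT NULL,
  amount     NUMBER(18,2) NOT NULL
)
PARTITION BY RANGE (txn_month)
SUBPARTITION BY HASH (txn_no) SUBPARTITIONS 16
(
  PARTITION p_2012_01 VALUES LESS THAN (DATE '2012-02-01'),
  PARTITION p_2012_02 VALUES LESS THAN (DATE '2012-03-01')
);
```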

Around 80 million records are deleted and re-inserted into this table once every week. This delete & insert process takes ~4 hours when done with the NOAPPEND PARALLEL hint.

Is there any way I could speed up this process? One way I can think of is to replace NOAPPEND with APPEND, but that will lead to wasted space and a drastic increase in table size.
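The weekly reload described above could look roughly like this (table names are hypothetical). NOAPPEND forces a conventional insert that reuses free space in existing blocks; APPEND would do a direct-path load above the high-water mark, which is faster but never reuses the space freed by the deletes, hence the table growth the question worries about:

```sql
-- Sketch of the weekly delete & re-insert cycle (names hypothetical).
ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ PARALLEL(t 8) */ FROM txn_fact t
WHERE  t.txn_month >= DATE '2012-01-01'
AND    t.txn_month <  DATE '2012-02-01';

-- Conventional (NOAPPEND) parallel insert, reusing freed space.
INSERT /*+ NOAPPEND PARALLEL(txn_fact 8) */ INTO txn_fact
SELECT * FROM weekly_source;

COMMIT;
```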

Recommended Answer

When working with tables this large, you should already be batching your data so that a single batch touches only one partition. If at all possible, avoid physical deletes altogether: either update the existing records in place, or set a logical delete flag on the removed records and insert the new ones.
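The logical-delete variant could be sketched as follows, assuming a `deleted_flag` column is added to the table (the flag column and all names are assumptions, not from the question):

```sql
-- Mark removed records instead of physically deleting them,
-- then insert the replacement rows (names are hypothetical).
ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(t 8) */ txn_fact t
SET    t.deleted_flag = 'Y'
WHERE  t.txn_no IN (SELECT txn_no FROM txn_to_remove);

INSERT /*+ PARALLEL(txn_fact 8) */ INTO txn_fact
SELECT s.txn_no, s.txn_month, s.account_id, s.amount, 'N'
FROM   txn_new_rows s;

COMMIT;
```

Queries against the table then need a `WHERE deleted_flag = 'N'` predicate, which is the usual trade-off of logical deletes.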

The other way of optimizing the whole load process in this case would be:

  1. Create a separate staging table with the same structure and the same partitioning as the target table your current batch loads into.
  2. Load into it the existing data for that partition that is still valid (i.e. not a delete candidate), along with the new data to be inserted into the target table.
  3. Once the staging table holds all the data that needs to end up in the existing table, switch the partition from the staging table into the target table, as described on this page.
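The steps above are a partition exchange load. A sketch, with hypothetical names; note that to exchange a partition of a composite-partitioned table at the partition level, the staging table must be hash-partitioned the same way as the target's subpartitions:

```sql
-- 1. Staging table: same columns, hash-partitioned like the
--    subpartitions of the target's monthly partition.
CREATE TABLE txn_fact_stage
PARTITION BY HASH (txn_no) PARTITIONS 16
AS SELECT * FROM txn_fact WHERE 1 = 0;

-- 2. Direct-path load the surviving rows: space held by
--    "deleted" rows is simply never written.
INSERT /*+ APPEND PARALLEL(txn_fact_stage 8) */ INTO txn_fact_stage
SELECT * FROM txn_fact PARTITION (p_2012_01) t
WHERE  t.txn_no NOT IN (SELECT txn_no FROM txn_to_remove);
COMMIT;

-- ...then the new rows for the same month.
INSERT /*+ APPEND PARALLEL(txn_fact_stage 8) */ INTO txn_fact_stage
SELECT * FROM txn_new_rows;
COMMIT;

-- 3. Swap the staging table in as the month's partition.
--    This is a data-dictionary operation, not a data copy.
ALTER TABLE txn_fact
  EXCHANGE PARTITION p_2012_01
  WITH TABLE txn_fact_stage;
```

Local indexes can be handled with the `INCLUDING INDEXES` clause if the staging table carries matching indexes; global indexes will need maintenance after the exchange.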
