SQL Loader,触发饱和吗? [英] SQL Loader, Trigger saturation?

查看:83
本文介绍了SQL Loader,触发饱和吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个无法找到解释的情况,这是事实. (我会使用假设的信息,因为原始信息确实很大.)

I have a situation i can't find an explanation for, here it is. (I'll use hypothetical info since the original are really big.)

我有一张桌子,可以说:

I have a table, let's say:

table_a
-------------
name
last name
dept
status
notes

此表在插入时有一个触发器,该触发器会对信息进行大量验证,以根据验证结果更改新记录的状态字段,其中一些验证是:

And this table has a trigger on insert, which does a lot of validation to the info to change the status field of the new record according to the results of the validation, some of the validations are:

- check for the name existing in a dictionary
- check for the last name existing in a dictionary
- check that fields (name,last name,dept) aren't already inserted in table_b
- ... and so on

问题是,如果我通过查询在表上进行插入,例如

The thing is, if I do an insert on the table via query, like

insert into table_a 
(name,last_name,dept,status,notes) 
values
('john','smith',1,0,'new');

只需173毫秒即可完成所有验证过程,更新状态字段并将记录插入表中. (验证过程将通过索引进行所有搜索)

it takes only 173 ms to do all the validation process, update the status field and insert the record in the table. (the validation process does all the searches via indexes)

但是,如果我尝试通过SQLloader读取具有5000条记录的文件,那么验证和插入149条记录大约需要40分钟的时间(当然我杀死了它...)

But if I try this via SQLloader, reading a file with 5000 records, it takes like 40 minutes to validate and insert 149 records (of course i killed it...)

所以我尝试加载数据以禁用触发器(以检查速度) 我知道它可以在不到10秒的时间内加载所有记录.

So I tried loading the data disabling the trigger (to check speed) and I got that it loads like all the records in less than 10 seconds.

所以我的问题是,我该怎么做才能改善这一过程?我唯一的理论是我可能会饱和数据库,因为它加载得如此之快并启动了许多触发器实例,但是我真的不知道.

So my question is, what can I do to improve this process? My only theory is that I could be saturating the database because it loads so fast and launches many instances of the trigger, but I really don't know.

我的目标是加载约60个带有信息的文件,并通过触发器中的过程对其进行验证(尽管愿意尝试其他选项).

My objective is to load around 60 files with info and validate them through the process in the trigger (willing to try other options though).

我真的很感激您可以提供的任何帮助!

I would really appreciatte any help you can provide!!

COMPLEMENT- -------------------------------------- ------------------------------------------

COMPLEMENT---------------------------------------------------------------------------------

感谢您的回答,现在我将阅读所有有关此内容的信息,希望您能在这一部分帮助我.让我解释一下我需要的一些功能(并且我使用了触发器,因为我什么也没想到)

Thanks for the answer, now i'll read all about this, now hope you can help me with this part. let me explain some of the functionality i need (and i used a trigger cause i couldn't think of anything else)

因此表数据带有以下(重要)字段:

so the table data comes with this (important) fields:

pid name lastname birthdate dept cicle notes

数据是这样的

name lastname birthdate dept

现在,触发器将对数据执行此操作:

now, the trigger does this to the data:

  1. 调用一个函数来计算pid(通过算法使用名称,姓氏和生日来计算)

  1. Calls a function to calculate the pid (is calculated based on the name, lastname and birthdate with an algorithm)

调用一个函数来检查字典上的名称(那是因为在我的字典中我只有一个名字,这意味着如果一个人被命名为john aaron smith jones,该函数会将john aaron分为两部分,然后搜索john和aaron在字典中的单独查询中,这就是为什么我不使用外键[以避免有很多组合john aaron,john alan,john pierce..etc])->一个带键的键而不更改字典...也许带CHECK?键,姓氏的外键是个好主意.

Calls a function to check for the names on the dictionary (thats because in my dictionary i have single names, meaning if a person is named john aaron smith jones the function splits john aaron in two, and searches for john and aaron in the dictionary in separate querys, thats why i didn't use a foreign key [to avoid having lots of combinations john aaron,john alan,john pierce..etc]) --->kinda stuck on how to implement this one with keys without changing the dictionary...maybe with a CHECK?, the lastname foreign key would be good idea.

根据dept和当前日期从另一张表中获取该cicle的值(因为一个人可以在同一dept中但在不同cicle中的表中出现两次)--->如何获取此cicle值以更有效的方式进行正确的搜索?

Gets the cicle from another table according to the dept and the current date (because a person can appear twice in the table in the same dept but in different cicle) --->how could i get this cicle value in a more efficient way to do the correct search?

最后,在完成所有验证之后,我需要确切知道不满足哪个验证(因此请注意字段),以便触发器将所有失败的验证字符串连接起来,像这样:

And finally, after all this validation is done, i need to know exactly which validation wasn't met (thus the field notes) so the trigger concatenates all the strings of failed validations, like this:

lastname not in dictionary, cannot calculate pid (invalid date), name not in dictionary

我知道,如果不满足约束检查,我所能做的就是将记录插入具有约束失败错误消息的另一张表中,但这仅使我进行一次验证,对吗?但是我需要对所有这些进行验证,然后将报告发送给其他部门,以便他们可以查看数据并对其进行所有必要的调整.

i know that if the constraint check isn't met all i could do is insert the record in another table with the constraint-failed error message, but that only leaves me with one validation, am i right? but i need to validate all of them and send the report to other department so they can review the data and make all the necessary adjustments to it.

无论如何,这是我现在的情况,我将探索各种可能性,希望您能对整个过程有所了解,非常感谢.

Anyway, this is my situation right now, i'll explore possibilities and hope you can share some light on the overall process, Thank you very much for your time.

推荐答案

您已经是解决方案的一半了:

You're halfway to the solution already:

因此,我尝试加载禁用触发器的数据(以检查速度)……它像所有记录一样在不到10秒的时间内加载."

"So I tried loading the data disabling the trigger (to check speed) ... it loads like all the records in less than 10 seconds."

这不足为奇.当前的实现为插入表B的每一行执行很多单行SELECT语句.这将不可避免地给您带来较差的性能. SQL是一种基于集合的语言,在多行操作中表现更好.

This is not a surprise. Your current implementation executes a lot of single row SELECT statements for each row you insert into table B. That will inevitably give you a poor performance profile. SQL is a set-based language and performs better with multi-row operations.

因此,您需要做的是找到一种替换所有SELECT语句的方法,这是更有效的替代方法.然后,您将能够永久删除触发器.例如,用表A列与参考表之间的外键替换字典上的查找.关系完整性约束是Oracle的内部代码,其性能比我们可以编写的任何代码都要好(并且也可以在多用户环境中工作).

So, what you need to do is find a way to replace all the SELECT statements which more efficient alternatives. Then you'll be able to drop the triggers permanently. For instance, replace the look-ups on the dictionary with foreign keys between the table A columns and the reference table. Relational integrity constraints, being internal Oracle code, perform much better than any code we can write (and work in multi-user environments too).

如果表B中已经存在多个列的组合,则不插入表A的规则更成问题.不是因为很难做到,而是因为它听起来像是糟糕的关系设计.如果您不希望在表B中已经存在的记录已经加载到表B中,为什么不直接将其加载到表B中呢?或者,也许您有一个列子集,应该从表A 表B中提取这些子集,并将其形成表C(该表与A和B具有外键关系)?

The rule about not inserting into table A if a combination of columns already exists in table B is more problematic. Not because it's hard to do but because it sounds like poor relational design. If you don't want to load records in table A when they already exits in table B why aren't you loading into table B directly? Or perhaps you have a sub-set of columns which should be extracted from table A and table B and formed into table C (which would have foreign key relationships with A and B)?

无论如何,您可以将其放在一边,通过使用外部表替换SQL * Loader,可以使用基于集合的SQL.外部表使我们能够将CSV文件呈现为数据库,就像常规表一样.这意味着我们可以在普通的SQL语句中使用它. 了解更多信息.

Anyway, leaving that to one side, you can do this with set-based SQL by replacing SQL*Loader with an external table. An external table allows us to present a CSV file to the database as if it were a regular table. This means we can use it in normal SQL statements. Find out more.

因此,在字典和外部表上具有外键约束的情况下,您可以使用以下语句替换SQL Loader代码(受其他规则归并为"...等等"):

So, with foreign key constraints on dictionary and an external table you can replace teh SQL Loader code with this statement (subject to whatever other rules are subsumed into "...and so on"):

insert into table_a
select ext.* 
from external_table ext
     left outer join table_b b
     on (ext.name = b.name and ext.last_name = b.last_name and ext.dept=b.dept)
where b.name is null
log errors into err_table_a ('load_fail') ;

这使用DML错误日志记录语法以基于集合的方式捕获所有行的约束错误. 了解更多.对于表B中已经存在的行,它不会引发异常.您可以使用多表INSERT ALL 可将行路由到溢出表中,或在事件后使用MINUS set操作在外部表中查找不在表A中的行.取决于您的最终目标以及您需要如何报告事情.

This employs the DML error logging syntax to capture constraint errors for all rows in a set-based fashion. Find out more. It won't raise exceptions for rows which already exist in table B. You could either use the multi-table INSERT ALL to route rows into an overflow table or use a MINUS set operation after the event to find rows in the external table which aren't in table A. Depends on your end goal and how you need to report things.

也许答案比您预期的要复杂. Oracle SQL是非常广泛的SQL实现,它具有许多功能来提高批量操作的效率.阅读《概念指南》和《 SQL参考》以弄清楚我们可以对Oracle进行多少工作,这确实使我们付出了很多.

Perhaps a more complex answer than you were expecting. Oracle SQL is a very extensive SQL implementation, with a lot of functionality for improving the efficient of bulk operations. It really pays us to read the Concepts Guide and the SQL Reference to find out just how much we can do with Oracle.

这篇关于SQL Loader,触发饱和吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆