从Oracle跟踪文件中查找死锁错误的原因 [英] Finding cause of deadlock error from oracle trace file
问题描述
当多个用户正在使用该应用程序时,我的应用程序中经常会出现在等待资源时检测到ora-00060死锁"错误.我已经从oracle Admin获得了跟踪文件,但是在读取它时需要帮助.以下是跟踪文件中的一些数据,我希望这将有助于找到原因.
I have been getting this "ora-00060 deadlock detected while waiting for resource" error often now in my application when multiple users are using the application. I have got the trace file from the oracle Admin, but need help in reading it. Below is bits of data from the trace file, which i hope would help in locating the cause.
*** 2013-06-25 09:37:35.324
DEADLOCK DETECTED ( ORA-00060 )
[Transaction Deadlock]
The following deadlock is not an ORACLE error. It is a deadlock due
to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-000151a2-00000000 210 72 SX SSX 208 24 SX SSX
TM-000151a2-00000000 208 24 SX SSX 210 72 SX SSX
session 72: DID 0001-00D2-000000C6 session 24: DID 0001-00D0-00000043
session 24: DID 0001-00D0-00000043 session 72: DID 0001-00D2-000000C6
Rows waited on:
Session 72: no row
Session 24: no row
----- Information for the OTHER waiting sessions -----
Session 24:
sid: 24 ser: 45245 audsid: 31660323 user: 90/USER
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 208 O/S info: user: zgrid, term: UNKNOWN, ospid: 2439
image: oracle@xyz.local
client details:
O/S info: user: , term: , ospid: 1234
machine: xyz.local program:
current SQL:
delete from EMPLOYEE where EMP_ID=:1
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=dyfg1wd8xa9qt) -----
delete from EMPLOYEE where EMP_ID=:1
===================================================
如果有人能告诉我死锁图::"在说什么,我将不胜感激.同样,在该节上等待的行表示没有行.
I would appreciate if some one can tell me what the "Deadlock graph::" is saying. Also the rows waited on section says no rows.
我还在一些博客中读到,跟踪文件中的"sqltxt"部分可以建议原因.下面是我在该部分中看到的查询.
I also read in some blogs that "sqltxt" section from the trace file can suggest the cause. Below is the query i see in that section.
select /*+ all_rows */ count(1) from "USERS"."EMPLOYEE_SALARY" where EMPSAL_EMP_ID=:1
employee_salary表对EMPSAL_EMP_ID列具有外键约束.
The employee_salary table has foreignkey constraint on EMPSAL_EMP_ID column.
sql提示中显示"all_rows",这是否意味着从employee表中删除记录时该表将获得表级锁定?我目前在外键列上没有索引.在此列上添加索引会有所帮助吗?
The sql hint says "all_rows", so does it mean that this table gets table level lock when deleting records from employee table? i dont have an index on the foreign key column currently. Would adding an index on this column help?
如果需要更多信息,请张贴.
Kindly post, in case any more information is need.
谢谢
推荐答案
首先,select
语句从不锁定Oracle中的任何内容,仅使用最后可用的一致版本的数据.自Oracle 9i起,select ... for update
不会像update
那样锁定数据,但不是这种情况,但是问题查询中没有for update
子句.
First of all, select
statement never lock anything in Oracle, just uses last available consistent version of data. It's not a case for select ... for update
which locks data like update
since Oracle 9i, but there are no for update
clause in the query from question.
Resource Name process session holds waits process session holds waits
TM-000151a2-00000000 210 72 SX SSX 208 24 SX SSX
会话#72拥有行独占"类型(SX)的表级锁(TM),并希望在同一表上获取共享行独占"(SSX)锁.该会话已被会话#24阻止,该会话已经持有相同类型的表级锁(SX),并等待SSX锁可用.
Session #72 holds table-level lock (TM) with "Row Exclusive" type (SX) and want to acquire "Share Row Exclusive" (SSX) lock on same table. This session blocked by Session #24 which already holds table-level lock of a same type (SX) and waits while SSX lock would be available.
Resource Name process session holds waits process session holds waits
TM-000151a2-00000000 208 24 SX SSX 210 72 SX SSX
此(第二行)的情况完全相同,但方向相反:会话#24等待SSX锁可用,但被会话#72阻止,该会话#72已经在同一表上拥有SX锁.
This (second row) demonstrates exactly same situation, but in opposite direction: Session #24 waits for SSX lock become available, but blocked by Session #72 which already holds SX lock on same table.
因此,会话#24和会话#72相互阻塞:发生死锁.
So, Sessions #24 and Session #72 blocks each other: deadlock happens.
这两种锁类型(SX和SSX)都是表级锁.
为了了解这种情况,我建议阅读Franck Pachot的本文.
Both lock types (SX and SSX) are table-level locks.
To understand the situation I recommend to read this article by Franck Pachot.
以下是本文的引文,与您的情况直接相关(请注意,SSX和SRX的缩写是等效的):
Below is citation from this article, which directly relevant to your situation(note that SSX and SRX abbreviations are equivalent):
引用完整性还获得TM锁.例如,常见的 未索引外键的问题导致子表上的S锁 您在父表上发布删除或更新密钥.这是 因为没有索引,Oracle就没有单个较低级别的资源可用于 为了防止并发插入可能会违反 参照完整性.
当外键列为前导时 常规索引中的列,然后是父级的第一个索引条目 值可用作单个资源,并用行级别TX锁定 锁.
如果参照完整性具有on Delete级联,该怎么办?在 除了S模式外,还打算更新 子表,与行X(RX)模式相同.这是分享行 排除(SRX):S + RX = SRX.
Referential integrity also acquires TM locks. For example, the common issue with unindexed foreign keys leads to S locks on child table when you issue a delete, or update on the key, on the parent table. This is because without an index, Oracle has no single lower level resource to lock in order to prevent a concurrent insert that can violate the referential integrity.
When the foreign key columns are the leading columns in a regular index, then the first index entry with the parent value can be used as a single resource and locked with a row level TX lock.
And what if referential integrity has an on delete cascade? In addition to the S mode, there is the intention to update rows in the child table, as with Row X (RX) mode. This is where the share row exclusive (SRX) occurs: S+RX=SRX.
因此,最可能的变体是会话#72和会话#24同时删除EMPLOYEE
表中的某些行,并且on delete cascade
受到on delete cascade
约束,而EMPLOYEE_SALARY
上没有索引EMPSAL_EMP_ID
列首先列出的表格.
So, most probable variant is that Session #72 and Session #24 deletes some rows in EMPLOYEE
table at same time, and there are on delete cascade
constraint for EMPSAL_EMP_ID
in conjunction with absence of index on EMPLOYEE_SALARY
table in which EMPSAL_EMP_ID
column listed first.
这篇关于从Oracle跟踪文件中查找死锁错误的原因的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!