`

转载:enqueue deadlocks,library cache deadlocks.

阅读更多
http://yong321.freeshell.org/computer/deadlocks.txt


There're two common deadlocks. ORA-60, enqueue deadlocks; ORA-4020, library 

cache deadlocks.

********** ORA-60 **********

Reproduction:

This is very easy to reproduce. Just follow this procedure:

In sessionA, update tb set col = 'x' where pk = 'aa'

In sessionB, update tb set col = 'x' where pk = 'bb'

In sessionA, update tb set col = 'x' where pk = 'bb'

In sessionB, update tb set col = 'x' where pk = 'aa'

If neither session commits, 3 seconds after the last statement shown above is 

issued, sessionA will throw error ORA-60, with the corresponding deadlock graph 

generated in the udump trace file:

                       ---------Blocker(s)--------  ---------Waiter(s)---------

Resource Name          process session holds waits  process session holds waits

TX-00010017-000000ba         8       7     X              9       8           X

TX-00020012-000000af         9       8     X              8       7           X

session 7: DID 0001-0008-00000002       session 8: DID 0001-0009-00000002

session 8: DID 0001-0009-00000002       session 7: DID 0001-0008-00000002

Rows waited on:

Session 8: obj - rowid = 00000CBB - AAAAzjAADAAAAFDAAA

Session 7: obj - rowid = 00000CBB - AAAAzjAADAAAAFDAAB

Interpretation:

Resource here simply means lock as in v$resource (technically it should be the

locked resource). TX stands for transaction, usually row lock. The numbers that 

follow are v$lock.id1, v$lock.id2, respectively, in hex radix for TX enqueue.

E.g., 20012[hex] is 131090; ba[hex] is 175. Anjo Kolk's Wait Event paper

(Description of Oracle7 Wait Events and Enqueues, used to be at

www.dbatoolbox.com/WP2001/dbamisc/events.pdf) tells us that id1 is "undo 

segment number << 16 | slot". Indeed we find v$transaction.xidusn = 2 and 

x$transaction.xidslot = 18. So 10[bin] left shifted 16 bits ORed with 18 equals 

131090. id2 should be "sequence"; v$transaction.xidsqn = 175. Process is 

Oracle process ID (not to be confused with the OS level process ID in UNIX or 

thread ID in Windows); it is v$process.pid where v$process.addr matches 

v$session.paddr. Session is v$session.sid. X means it's holding or waiting for 

an exclusive lock. DID is deadlock ID (see Note:6145177.8); the first number is 

always 1; second number is Oracle process ID; I don't know the third number. 

For sessions waiting for row locks, additional

information about the rows is given. CBB[hex] is dba_objects.object_id from

which you'll know the table. Then the rowid is given so that you can quickly

find the row by SELECT * FROM thetable WHERE ROWID = 'rowid'. If the "Rows

waited on" is followed by "no rows", then it's waiting on a table lock, not a

row lock, like in this case:

                       ---------Blocker(s)--------  ---------Waiter(s)---------

Resource Name          process session holds waits  process session holds waits

TX-00060004-00003155        41      45    X            15        26          S

TX-0012000b-00002802        15      26    X            41        45          S

session 45: DID 0001-0029-00000002 session 26: DID 0001-000F-00000002

session 26: DID 0001-000F-00000002 session 45: DID 0001-0029-00000002

Rows waited on:

Session 26: no row

Session 45: no row

Following the deadlock graph is process state dump, which shows state object 

(SO) hierarchy in properly indented form, with the root being a process SO.

Remedy:

If the "Session <SID>" line gives row IDs, simply wait and rerun the 

application later. And consider application design to see if you have two 

programs updating the same table but in different order. [note1]

If the Session line says "no row", the error could be triggered when multiple 

sessions try to update the same block while the free space for the block is 

already too small for Oracle to automatically increase ITLs (interested 

transaction lists), regardless the value of MAXTRANS. You can try rebuilding 

the table with a higher INITRANS (initial ITL entries) and setting a higher 

value of PCTFREE. Beginning with 8i, the easiest way to do this is ALTER TABLE

yourtable MOVE STORAGE (INITRANS xxx).

Another scenario which does happen often is that you don't have indexes on 

child table foreign key columns and multiple sessions are updating the parent 

table primary key columns or deleting rows from the parent table. Both sessions 

will attempt to acquire a shared table lock on the child table and if they

already hold row locks (on different rows), a deadlock results. In this case, 

you may build indexes on foreign keys in the child table. Numerous sites 

provide scripts to find foreign keys that are missing indexes 

(osi.oracle.com/~tkyte/unindex/ and www.jlcomp.demon.co.uk/faq/fk_ind.html). If 

you don't do this (possibly for legitimate reason such as child table DML 

performance), make sure you never or rarely delete rows from the parent table

(updating parent table primary keys is very rare). Or as Steve Adams suggested,

ALTER TABLE childtable DISABLE TABLE LOCK.

Further reading: Metalink Note:62365.1

********** ORA-4020 **********

Reproduction:

SQL> create procedure p1 as begin p2; end;

  2  /

Warning: Procedure created with compilation errors.

SQL> create procedure p2 as begin p1; end;

  2  /

Warning: Procedure created with compilation errors.

SQL> alter procedure p1 compile;

alter procedure p1 compile

*

ERROR at line 1:

ORA-04020: deadlock detected while trying to lock object YONG.P1

The above case is modified from an example in Connor McDonald's "Mastering

Oracle PL/SQL: Practical Solutions". Obviously, you can't have mutual library

cache locks in each other's code.

More commonly, this error is thrown due to invalid objects especially triggers,

or due to Oracle bugs. E.g., Jonathan Lewis in "Practical Oracle8i" reported

that query throws ORA-4020 if partitioned IOT maintenance is going on. An

ORA-4020 deadlock graph looks like the following:

ORA-04020: deadlock detected while trying to lock object YONG.P1

--------------------------------------------------------

  object   waiting  waiting       blocking blocking

  handle   session     lock mode   session     lock mode

--------  -------- -------- ----  -------- -------- ----

797992F8  7AA06C48 79FA5554    X  7AA06C48 79F8B3D0    X

--------------------------------------------------------

Note that in this case, the waiting session is the same as the blocking session

because they have the same session address.

Remedy:

If the deadlock is caused by the same session that experiences the deadlock, 

make sure you don't have any invalid objects on the table reported in the 

deadlock trace file, such as invalid triggers. If you do, manually compile 

them. With this deadlock, it's helpful to get more insight by tracing the event 

4020 as follows: alter session set events '4020 trace name processstate 

forever, level 10'; or if you don't know or can't access the session, put 

event='4020 trace name processstate forever, level 10' in init.ora file and 

bounce the database. The trace file thus created when the deadlock occurs next 

time may have strings like "status=INVL". Look at the corresponding object name 

after "name=". If you do experience ORA-4020 and it happens fairly often, search 

on Metalink or open a tar.

********** Deadlocks in RAC **********

There's no error number reported but both ORA-60 and ORA-4020 are possible. In 

alert.log, we only see "Global Enqueue Services Deadlock detected. More info in 

filen /home/oracle/admin/<SID>/bdump/<SID>_lmd0_<pid>.trc." ["filen" is not 

typo; it means "file name".]

LMD trace file has:

user session for deadlock lock 0x7553ab14

  pid=33 serial=37657 audsid=6450105 user: 246/<LinuxUsername>

  O/S info: user: <WindowsUsername>, term: MISFBB10-54, ospid: 4148:5156, machine: <NTdomain>\<PCname>

            program: sqlplus.exe

  application name: SQL*Plus, hash value=3669949024

  Current SQL Statement:

  update test set a = :"SYS_B_0" where a = :"SYS_B_1"

Global Wait-For-Graph(WFG) at ddTS[0.28] :

BLOCKED 0x7553ab14 5 [0xf001d][0x8353],[TX] [2162689,7995] 0

BLOCKER 0x69cabb5c 5 [0xf001d][0x8353],[TX] [1769474,4478] 1

BLOCKED 0x69cabc30 5 [0x70014][0xe908],[TX] [1769474,4478] 1

BLOCKER 0x7e1424e8 5 [0x70014][0xe908],[TX] [2162689,7995] 0

That was Oracle 9.2.0.7.0 on Linux accessed from a Windows PC. According to 

Note:262226.1, the first process in instance 0 (on the first line) requested for 

a TX lock in mode 5 or exclusive [note2] and was blocked by the second process in 

instance 1 (on the second line), which requested for another TX lock (third 

line) and was blocked by the first process (fourth line). The note interprets 

the numbers after the lock name, e.g. [2162689,7995], as process ID. But if 

there're two numbers in square brackets, those two numbers are transaction_id0 

and transaction_id1 of v$ges_blocking_enqueue, respectively, or equivalently 

the same columns in v$dlm_locks. (But I don't know what the transaction number 

means; it doesn't seem to be concatenation of xidusn.xidslot.xidsqn of 

v$transaction.) In addition, the hex numbers following "BLOCKED" or "BLOCKER" 

are handles (lock pointers) of v$ges_blocking_enqueue, and the two hex numbers 

after lock mode are id1 and id2 in v$lock (refer to 

http://yong321.freeshell.org/oranotes/LocksWhere2FindMoreInfo.txt on how to 

find their meanings).

The above trace shows the SQL involved in the deadlock. But in some cases, the 

SQL is missing,[note3] as in this example:

Global Wait-For-Graph(WFG) at ddTS[0.15] :

BLOCKED 0x70d17058 5 [0xf4dd2a46][0x57e4077],[LB] 30268 0

BLOCKER 0x70d16594 5 [0xf4dd2a46][0x57e4077],[LB] 30268 0

This deadlock was caused by B-namespace library cache lock requested and 

blocked by itself. The process PID is 30268 (a single number, not in brackets). 

Unfortunately it's missing the SQL. Note:262226.1 acknowledges that it would be 

more helpful to add the SQL to the WFG trace. In the meantime, if the deadlock 

can be reproduced with one instance (as in this case because the last column, 

instance number, is the same number), then do so by setting cluster_database 

to false; a deadlock graph in a non-cluster database is much more useful than a 

global WFG, which is created even if all involved processes are on one node.



Yong Huang

yong321@yahoo.com

________________________

[note1]

It's often suggested that applications should be written in a way to ensure the 

same order in running DMLs on the table. This means ProgramI should update

tblX then tblY and ProgramII should also update them in this order. But

there're many other cases than this rule should apply. Consider this case:

if conditionA, update tblX

if conditionB, update tblY

if conditionC, update tblX

if conditionD, update tblY

If this single program (no ProgramII) is run by multiple processes, depending

on the conditions met in each process, a deadlock could occur.

[note2]

According to

http://www.rachelp.nl/index_kb.php?menu=articles&actie=show&id=15

these mode numbers start with 0. So mode 5 here means exclusive, which would 

be mode 6 in non-RAC lock mode.

[note3]

According to Dusan Bolek (message 10 of

http://groups.google.com/group/comp.databases.oracle.server/browse_frm/thread/a1e9aacfc4e16e25/), 

the SQL is present if the deadlock occurs between instances and missing if 

on the same instance.





ORA-12829

ORA-99

12345678901234567890123456789012345678901234567890123456789012345678901234567890


分享到:
评论

相关推荐

    library cache lock等待事件

    Oracle数据库的Library Cache Lock等待事件是数据库性能优化中常见的问题之一,主要涉及到Oracle数据库的缓存管理和并发控制机制。在Oracle中,Library Cache是存储PL/SQL代码、SQL语句及其解析结果的地方,用于减少...

    rspec-enqueue_sidekiq_job:enqueue_sidekiq_job Sidekiq的RSpec块匹配器

    enqueue_sidekiq_job RSpec块匹配器 安装 # Gemfile group :test do gem 'rspec-enqueue_sidekiq_job' end 用法 检查某个作业是否已排入队列。 expect { AwesomeWorker . perform_async } . to enqueue_sidekiq_...

    priority-navigation, Priority 导航—的Javascript实现无相关性.zip

    priority-navigation, Priority 导航—的Javascript实现无相关性 PriorityNavigation.js如果你的菜单项不适合它的父项,那么PriorityNav是一个纯javascript插件,它将移动你的菜单项。 我们将查看网站的收费。...

    OKhttp包下载

    3. **发送请求**:通过OkHttpClient的newCall(Request)方法获取Call对象,然后调用execute()或enqueue()发送同步或异步请求。 4. **处理响应**:同步请求返回Response对象,异步请求则在Callback中接收结果。 在...

    Oracle RAC Cache Fusion介绍.pdf

    在 RAC 环境中,Cache Fusion 机制主要包括三个部分:Global Cache Service (GCS)、Global Enqueue Service (GES) 和 Dynamic Remastering (DRM)。 1. Global Cache Service (GCS) GCS 负责管理数据库中的块,包括...

    Retrofit2简单使用1

    call.enqueue(new Callback() { /* 回调处理 */ }); // 或者 Response&lt;User&gt; response = call.execute(); // 同步调用 ``` 4. **OkHttp集成**:Retrofit2默认使用OkHttp作为网络堆栈,提供了强大的网络请求功能,如...

    CSharp.8.Queue

    Enqueue ( " one " ); numbers . Enqueue ( " two " ); numbers . Enqueue ( " three " ); numbers . Enqueue ( " four " ); numbers . Enqueue ( " five " ); numbers . Enqueue ( " six " ); Obtener y eliminar...

    OkGo基本使用整理

    2. **自定义缓存策略**:可以实现`CacheControl`接口,自定义缓存策略。 3. **自定义Cookie管理**:通过实现`CookieJar`接口,实现自定义的Cookie管理策略。 4. **自定义证书**:对于HTTPS请求,可以配置SSL证书,...

    okHttp 3.6.0 依赖包 jar包 内含okio jar包 下载

    在编写网络请求代码时,可以创建OkHttpClient实例,定义Request对象,最后通过client的execute或enqueue方法发起请求。对于Okio,你可以使用它来优化文件读写或数据处理的性能。 总之,OkHttp和Okio是Java和Android...

    Oracle_RAC知识汇总.docx

    如果某个节点的cache被修改,更改会通过集群软件传播到其他节点,最后写入磁盘以保持一致性。 四、ClusterWare组件 1. CRSd:集群资源服务,管理集群资源的启动、停止和监控。 2. Cssd:集群同步服务,保证集群节点...

    Chain-queue-validation-test.rar_site:www.pudn.com

    在链队列中,我们通常有两个关键操作:入队(enqueue)和出队(dequeue)。入队操作是在队尾添加新元素,而出队操作则是移除队首元素。在链队列实现中,队头和队尾通常用两个指针来标识,分别是front和rear。当front...

    duilie.rar_site:www.pudn.com

    队列的基本操作包括入队(enqueue)、出队(dequeue)、查看队首元素(front)和检查队列是否为空(isEmpty)。 1. **入队操作**:在队尾添加新的元素。如果队列满,可以考虑使用循环队列来避免溢出问题。在C语言中...

    Threadpool:轻便,快速,适用于C ++ 20的线程池

    // Enqueue and return future.auto result = pool.enqueue([]( int x) { return x; }, 42 );// Get result from future.std::cout &lt;&lt; result.get() &lt;&lt; std::endl; 另外, riften::Thiefpool提供了分离...

    AndroidStudioRetrofit:通过square library使用retrofit从json文件中加载图片和文本,并在listview中显示

    为了在ListView中显示结果,我们需要在主线程之外处理这个请求,可以使用`enqueue()`方法: ```java service.getData().enqueue(new Callback&lt;List&lt;DataModel&gt;&gt;() { @Override public void onResponse(Call...

    resque-delayed:Resque 的延迟作业排队

    Resque::延迟Resque 的作业排队延迟。 将仅在指定延迟后或将来特定时间出现以供处理的作业入队。... 调用Resque.enqueue_in或Resque.enqueue_at而不是Resque.enqueue例如: class User after_create :send_call_to

    基础算法-python实现二叉树遍历

    def enqueue(self, item): self.q.append(item) def dequeue(self): # if self.q != []: if len(self.q)&gt;0: return self.q.pop(0) else: return None def length(self): return len(self.q) def ...

    队列应用 入队 出队

    5. EnQueue 函数:EnQueue 函数用于将元素添加到队列的末尾。如果队列已经满了,EnQueue 函数将返回错误代码 ERROR。 6. DeQueue 函数:DeQueue 函数用于删除队列的队头元素,并将其返回给调用者。如果队列为空,...

    数据结构课件

    4. 队列的基本操作:enqueue(入队)、dequeue(出队)。 **第4章 串** 1. 定义:一串字符的集合,可实现字符串处理。 2. 字符串的基本操作:查找、替换、连接、截取等。 3. 串的两种存储方式:定长数组(固定大小...

Global site tag (gtag.js) - Google Analytics