- 浏览: 4416141 次
- 性别:
- 来自: 厦门
文章分类
- 全部博客 (634)
- Oracle日常管理 (142)
- Oracle体系架构 (45)
- Oracle Tuning (52)
- Oracle故障诊断 (35)
- RAC/DG/OGG (64)
- Oracle11g New Features (48)
- DataWarehouse (15)
- SQL, PL/SQL (14)
- DB2日常管理 (9)
- Weblogic (11)
- Shell (19)
- AIX (12)
- Linux/Unix高可用性 (11)
- Linux/Unix日常管理 (66)
- Linux桌面应用 (37)
- Windows (2)
- 生活和工作 (13)
- 私人记事 (0)
- Python (9)
- CBO (15)
- Cognos (2)
- ORACLE 12c New Feature (2)
- PL/SQL (2)
- SQL (1)
- C++ (2)
- Hadoop大数据 (5)
- 机器学习 (3)
- 非技术 (1)
最新评论
-
di1984HIT:
xuexilee!!!
Oracle 11g R2 RAC高可用连接特性 – SCAN详解 -
aneyes123:
谢谢非常有用那
PL/SQL的存储过程和函数(原创) -
jcjcjc:
写的很详细
Oracle中Hint深入理解(原创) -
di1984HIT:
学习了,学习了
Linux NTP配置详解 (Network Time Protocol) -
avalonzst:
大写的赞..
AIX内存概述(原创)
操作系统:Solaris 数据库环境:9i 9.2.0.8
不知道是Oracle的600bug太多还是最近笔者运气太好,经常遇到600bug,上礼拜5在处理完客户数据库无法正常启动后,立马出现了600bug,详细错误如下
ORA-00600: internal error code, arguments: [504], [0x38006F868], [160], [7], [shared pool], [2], [0], [0x38006F778]
刚开始笔者以为只有在数据库起停的时候才会出现这个错误,后来隔天客户给我打电话说,应用系统无法连接,通过查看日志推断出是由于上述bug导致内存溢出,从而使JBoss没有正常连接到数据库。
在网上找到以下资料
信息1
相关:matalink查到信息
Bug 5888835: ORA-600 [504] DURING FLUSH SHARED_POOL
--------------------------------------------------------------------------------
Bug 属性
--------------------------------------------------------------------------------
类型 B - Defect 已在产品版本中修复 -
严重性 2 - Severe Loss of Service 产品版本 9.2.0.8
状态 36 - Duplicate Bug. To Filer 平台 59 - HP-UX PA-RISC (64-bit)
创建时间 17-Feb-2007 平台版本 11.11
更新时间 18-Feb-2007 基本 Bug 5508574
数据库版本 9.2.0.8
影响平台 Generic
产品源 Oracle
相关产品
--------------------------------------------------------------------------------
产品线 Oracle Database Products 系列 Oracle Database
区域 Oracle Database 产品 5 - Oracle Server - Enterprise Edition
Hdr: 5888835 9.2.0.8 RDBMS 9.2.0.8 VOS HEAP MGMT PRODID-5 PORTID-59 ORA-600 5508574
Abstract: ORA-600 [504] DURING FLUSH SHARED_POOL
*** 02/17/07 03:49 am ***
TAR:
----
PROBLEM:
--------
Ct got the following error when upgrading 8.1.7.4 to 9.2.0.8. This seems
to occur when flushing shared_pool on the upgrade script.
ORA-600: internal error code, arguments: [504], [0xC00000038E5B2530],
[640], [7], [shared pool], [2], [0], [0xC00000038E5B2418]
This can also be reproduced by manually flushing shared pool.
The customer's system has 64 CPUs. The ORA-600 [504] was suppressed
when _kgl_latch_count=30 was se
t.
DIAGNOSTIC ANALYSIS:
--------------------
The heapdump level 2 information when ORA-600[504] occurred says the following.
The number of next slot reaches to 255.It seems to be the same issue as bug 5562921 (base bug 5508574).
Thanks.
====
/home/oracle/udump/dz00001_ora_22436.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0
System name: HP-UX
Node name: DC-0-001
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: DZ00001
Redo thread mounted by this instance: 1
Oracle process number: 17
Unix process pid: 22436, image: oracle@DC-0-001 (TNS V1-V3)
*** 16:17:31.085
*** ID:(20.280) 2007-02-17 16:17:31.083
KGH Latch Directory Information
ldir state: 2 next slot: 255
Slot [ 1] Latch: c0000001a654c5c0 Index: 2 Flags: 3 State: 2 next:
0000000000000000
Slot [ 2] Latch: c000000182a04468 Index: 1 Flags: 3 State: 2 next:
0000000000000000
Slot [ 3] Latch: c0000001a654dcb0 Index: 2 Flags: 3 State: 2 next:
0000000000000000
Slot [ 4] Latch: c000000182a047e8 Index: 1 Flags: 3 State: 2 next:
0000000000000000
Slot [ 5] Latch: c000000182a04940 Index: 2 Flags: 3 State: 2 next:
0000000000000000
Slot [ 6] Latch: c0000001a654e340 Index: 1 Flags: 3 State: 2 next:
0000000000000000
Slot [ 7] Latch: c0000001a6553528 Index: 2 Flags: 3 State: 2 next:
0000000000000000
Slot [ 8] Latch: c0000001a65557f8 Index: 1 Flags: 3 State: 2 next:
0000000000000000
Slot [ 9] Latch: c0000001a6556988 Index: 2 Flags: 3 State: 2 next:
c0000001a65aff98
Slot [ 10] Latch: c0000001a6558328 Index: 1 Flags: 3 State: 2 next:
c0000001a65affb0
WORKAROUND:
-----------
Setting _kgl_latch_count=30
RELATED BUGS:
-------------
bug 5562921
bug 5508574
REPRODUCIBILITY:
----------------
It can be reproduced by manually flushing shared pool.
(without _kgl_latch_count=30)
TEST CASE:
----------
none
STACK TRACE:
------------
ksedmp ksddoa ksdpcg ksdpec ksfpec kgeriv kgesiv ksesic7 kslgetl ksfglt
kghupr_flg kghupr kglrfcl kglobcl kglobfr kglobf0 kglhpd_internal kglhpd
kghfrx kghfrunp kghfsh_helper kghfsh kkyasy opiexe opiall0 kpoal8 opiodr
ttcpip opitsk opiino opiodr opidrv sou2o main
SUPPORTING INFORMATION:
-----------------------
I will trace files later.
24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------
DIAL-IN INFORMATION:
--------------------
IMPACT DATE:
------------
*** 02/17/07 03:58 am *** (CHG: Sta->16)
*** 02/17/07 03:58 am ***
*** 02/17/07 04:40 am *** (CHG: Asg->NEW OWNER OWNER)
*** 02/17/07 05:45 am ***
Well, this seems to be same issue with bug#5562921.
heapdump level 2 shows
-------
*** ID:(20.280) 2007-02-17 16:17:31.083
KGH Latch Directory Information
ldir state: 2 next slot: 255
:
Slot [253] Latch: c000000190c02cb0 Index: 2 Flags: 3 State: 2 next:
0000
01a65b1420
Slot [254] Latch: c000000190c02b98 Index: 1 Flags: 3 State: 2 next:
0000
01a65b1438
This means we have registered 255 latches to kgh latch directory, and it is full. (See bug#5508574 update by Joan at 09/08/06 12:24 pm, there is small bug in this print routine and 255 is not next slot, but last slot)
ORA-600 means latch hierarchy violation. We tried to get 2nd child shared
pool latch with wait get, that is not permitted.
This happend when freeing library cache object from shared pool, we need to get correct library cache latch to do it. However, probablly due to kgl latch is not registered correctly to kgh latch directory, we end up with requesting latch with wrong order, or strange latch. kgl latch is not registered correctly because kgh latch directory is full.It has 255 slots, and it seems full. We regsiter many latches to kgh latchdirectory, however, if kgl latch number is big, we will fill kgh latch
directory and some latch is not registered correctly.
*** 02/17/07 05:47 am ***
In my 9.2.0.8 env (with 2 cpu) we register 208 latches.
number of latches which will be allocated to kgh latch directory is not calculated easily. because we register many kind of latches.
heapdump level 2 will print latch address registered to kgh, so you can find the name of latch via v$latch, v$latch_children and v$latchname.
To see if an instance gets this problem, taking heapdump level 2 and check "next slot:" value. If it is 255, most likely that instance gets
this problem.
*** 02/17/07 05:48 am ***
*** 02/17/07 06:09 am ***
*** 02/17/07 06:35 am ***
*** 02/18/07 04:14 pm *** (CHG: Sta->36 SubComp->VOS HEAP MGMT)
*** 02/18/07 06:51 pm ***
From 9.2.0.8, we allocate many row cache latches compared to elder PSR.This is due to enhauncement introduced in 5040691.So, disable this fix is another workaround.If I set _more_rowcache_latches = false in init.ora,I see (only) 238 latches are registered to kgh latch directory even I set _kgl_latch_count=67 _kghdsidx_count=3
If I don't set _more_rowcache_latches = false,I can see ORA-600 simply startup and shutdown database
信息2
Symptoms
Mon Nov 22 08:50:45 2010
Errors in file /home/oracle/admin/orcl/udump/orcl_ora_127.trc:
ORA-00600: internal error code, arguments: [504], [0x380068D90], [160], [7], [shared pool], [2], [0], [0x380068CA0]
Mon Nov 22 08:51:03 2010
Errors in file /home/oracle/admin/orcl/udump/orcl_ora_444.trc:
ORA-00600: internal error code, arguments: [504], [0x380068E80], [160], [7], [shared pool], [3], [0], [0x380068D90]
The following query returns _kgl_latch_count > 31
SQL>
select a.ksppinm aa, b.ksppstvl bb
from x$ksppi a, x$ksppsv b
where a.indx=b.indx
and a.ksppinm='_kgl_latch_count';
OR
the query may return query return that _kgl_latch_count is 0, but when issuing a 'show parameter cpu_count', it returns a value of 32 or greater.
SQL> show parameter cpu_count;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 128
Cause
This is Bug 5888835 ORA-600 [504] DURING FLUSH SHARED_POOL
closed as a duplicate of
Base Bug 5508574 ORA-00600 [99999] , ORA-07445 [KGSCDUMP()+673]
The latch directory size exceeds 255 when _kgl_latch_count > 31.
However, even when the _kgl_latch_count is equal to 0 (default value), if the cpu_count is >=32 the bug still applies.
This is due to the as the _kgl_latch_count default value is calculated as next prime number after the value returned by cpu_count. So, this bug could still apply if the cpu_count=32 as the _kgl_latch_count would be calcuated to the next prime number would be 37.
Solution
1. Upgrade to the 10.2.0.4 patchset or the 11g release.
OR
2. You can use workaround of setting parameter _kgl_latch_count <= 31 in your pfile/spfile.
(please note that this may adversely affect the concurrency)
OR
3. If available for your platform/version, download and apply Patch 5508574
信息3
ora-00600 错误查询地址:http://metalink.oracle.com/metalink/plsql/ml2_documents.showNOT?p_id=153788.1&p_showHeader=1&p_showHelp=0
错误描述:ORA-00600: internal error code, arguments: [504], [0x38006BF08], [160], [7], [shared pool], [5], [0], [0x38006BE18]:
Description
An ORA-600 [504] can occur on the "shared pool" latch while
freeing a kglf heap. kglfall() and kghfrunp() will be in the call stack trace.
Workaround:
Set _kghdsidx_count=1 to disable multiple shared pool subpools
决定修改_kghdsidx_count。
select a.ksppinm,b.ksppstvl from x$ksppi a,x$ksppsv b where a.indx=b.indx and a.ksppinm='_kghdsidx_count';
KSPPINM KSPPSTVL
--------- ---------
_kghdsidx_count 7
alter system set "_kghdsidx_count"=1 scope=spfile;
重启数据库,实际效果如何有待检验。
基本上可以可以得出结论是由于Oracle 9.2.0.8在处理latch时出现的bug,解决方式
1、将Oracle升级到10.2.0.4或者11.1g
2、为Oracle打Patch 5508574
3、将_kghdsidx_count修改成较小的值(虽然资料上显示都是将_kghdsidx_count修改为1,但基于笔者对_kghdsidx_coun的理解,个人感觉应调小该参数即可,没有必要调整为1),将_kgl_latch_count调整为小等于31的值,例如30.
下面介绍下_kghdsidx_count和_kgl_latch_count这两个参数。
_kgl_latch_count
It sets the number of child library cache latches. The default is the least prime number greater than or equal to cpu_count. The maximum is 67. It can safely be increased to combat library cache latch contention, as long as you stick to prime numbers. However it is only effective if the activity across the existing child library cache latches is evenly distributed as shown in V$LATCH_CHILDREN.
adjusting _kgl_latch_count is normally effective to reduce library cache latch contention. But stick to prime numbers less than or equal to 67, and no larger than necessary.
oracle提供了多个library cache latch(这样,每个library cache latch都称为子latch)来保护library cache中的bucket。这些子latch的数量由一个隐藏参数决定:_kgl_latch_count。该参数缺省值为大于等于系统中CPU个数的最小的素数。比如在一个具有4个CPU的生产环境中,library cache latch的个数为5,如下所示。但是oracle内部(9i版本)规定了library cache latch的最大个数为67,即便将这个隐藏参数设置为100,library cache latch的数量也还是67个。
注意:我们去查询_kgl_latch_count有时候显示为0,这是一个bug。
._kghdsidx_count
Oracle 9 ( 大概是9204)之前,shared_pool太大的情况下,会由于shared_pool free list 或者 used list太长造成一些性能问题。Oracle 9204( 大概)之后,增加了shared_pool sub pool的概念,一个大的shared_pool(> 400M左右),会被分割成几个sub pool,每个sub poll有自己的free list, used list. 这样,就不会因shared_pool太大造成性能问题。sub pool的数量通常是默认的,Oracle根据shared pool大小决定的,也可以由 _kghdsidx_count 来决定。
在Oracle 9i中,为了增加对于大共享池的支持,Shared Pool Latch从原来的1个增加到现在的7个。如果系统有4个或4个以上的CPU,并且shared_pool_size大于250MB,Oracle可以把Shared Pool分割为多个子缓冲池进行管理,每个subpool都拥有独立的结构、LRU和Shared Pool Latch。以下查询显示的就是这些Latch:
SQL> select addr, name, gets, misses, spin_gets from v$latch_children where name = 'shared pool';
select * from x$ksppi a,x$ksppsv b where a.indx=b.indx and a.ksppinm='_kgl_latch_count'
参考至:http://www.ixora.com.au/q+a/library.htm
http://www.itpub.net/thread-913955-1-1.html
http://space.itpub.net/464838/viewspace-588908
http://tech.it168.com/oldarticle/2007-07-04/200707041137531_3.shtml
http://www.linuxidc.com/Linux/2011-07/39425p3.htm
http://cuuzhang.blog.163.com/blog/static/608115292008624101721121/
http://hi.baidu.com/lichangzai/blog/item/4d26c6fd0943770208244d53.html
http://blog.sina.com.cn/s/blog_4d22b9720100n4vu.html
http://zhangxu777.blog.163.com/blog/static/146290921200961655213/
本文原创,转载请注明出处、作者
如有错误,欢迎指正
邮箱:czmcj@163.com
发表评论
-
AIX平台下磁盘的PVID对ASM磁盘的破坏
2014-03-19 20:53 2745这篇文章将通过两篇MOS文章来讨论AIX平台下为磁盘分配 ... -
Bug 9020054,ORA-8103 BEING HIT DURING GATHERING OF STATISTICS ON TABLE PARTITION
2013-12-01 09:22 1892Bug 9020054 : ORA-8103 ... -
oracle数据库hanganalyze(原创)
2013-06-23 14:11 8342为什么要使用hanganalyzeOracle 数据库“真 ... -
Oracle:并行操作为什么无法执行(老白)
2013-06-23 10:30 5749在一次系统割接的时候,我们碰到一个十分奇怪的现象。由于进行 ... -
补写的2小节DBA日记
2013-06-05 21:52 13956月8日 ITL等待引发的RAC性能问题 从这几天的情况 ... -
ORA-27054故障排除
2013-03-08 17:57 12976在数据备份过程中,由于目标是使用NFS文件系统,因此在导入 ... -
长时间latch free等待——记一次系统异常的诊断过程
2013-01-09 19:17 3749今天发现一个报表数据库中SQL运行异常,简单记录一下问题的诊断 ... -
ORA-15063: ASM discovered an insufficient number of disks for diskgroup(原创)
2012-11-25 16:59 13105ORA-15063: ASM discovered an in ... -
libpthread.so.0: cannot open shared object file解决方法(原创)
2012-11-24 17:33 17778在linux 5上装10G RAC时,常常会碰到“libpth ... -
ora-02020故障诊断详解(原创)
2012-07-16 12:54 3217ORA-2020错发生在一个分布式事务使用的dblink数超过 ... -
Oracle数据库CPU 100%故障诊断实例(原创)
2012-07-05 13:55 16929前言 这两天一只对外提供查询的数据库CPU使用率 ... -
DBA手记:System State转储分析之问题定位
2012-04-19 22:20 2114在 Oracle 数据库的运行过程中,可能会因为一些异常遇 ... -
ORA-02050故障诊断一例
2012-04-05 17:20 8590昨天客户反映说在下午某时间段有几个事务失败了,让我查下当时数据 ... -
ORA-00308: cannot open archived log(原创)
2012-03-23 09:36 16272笔者在为客户配置DG时发现主备库日志无法正常传输,经仔细检查后 ... -
ORA-00600: internal error code, arguments: [kcblasm_1], [103]原创
2012-03-23 09:36 3121故障报错 Mon Mar 19 11:30:03 GMT ... -
ORA-00600: internal error code, arguments: [kglhdda-bad-free](原创)
2012-03-22 09:16 2968故障报错如下 Thu Mar 15 09:51:29 G ... -
ORA-27300,ORA-27301,ORA-27302: failure occurred at: skgpalive1(原创)
2012-03-22 08:58 3360故障报错如下 Wed Mar 07 16:4 ... -
ora-07445错误相关内容
2012-03-01 17:14 1837本文档主要介绍ora-07445 错误相关内容,并给出了对这 ... -
记一次Oracle 生产库还原归档日志经历(原创)
2012-02-17 10:12 4395中午刚去吃饭,就接到同事电话说急着要恢复生产库上的归档日志。系 ... -
SMON: Parallel transaction recovery tried 引发的问题
2012-01-04 11:26 2409SMON: Parallel transaction rec ...
相关推荐
NULL 博文链接:https://gembler.iteye.com/blog/346242
在oracle里面运行一下,解决Exception java.sql.SQLException ORA-00600 内部错误代码
描述中提到的 "ORA-00600: INTERNAL ERROR CODE, ARGUMENTS: [17059], [0X93953434]" 是一个Oracle数据库内部错误代码,表示遇到了无法处理的系统级异常。ORA-00600错误通常涉及到数据库的内部一致性问题,这可能是...
### 如何处理错误ORA-29275:部分多字节字符 #### 问题背景与描述 在Oracle数据库操作过程中,用户可能会遇到一个特定的错误提示——ORA-29275:部分多字节字符。这一错误通常出现在执行查询`SELECT * FROM V$...
### ora-01033: Oracle Initialization or Shutdown in Progress 解决方法 #### 一、问题背景及原因 **标题**: “ora-01033: Oracle initialization or shutdown in progress 解决方法” **描述**: “ora-01033: ...
ORA-00604: 递归SQL层1出现错误 ORA-03106: 致命的双工通信协议错误 ORA-02063: 紧接着line(源于dblink) 以及 ORA-04052: 在查找远程对象时出错 ORA-00604: 递归SQL层1出现错误 ORA-03120: 双工转换例行程序:整数...
在Oracle数据库管理中,"ORA-00990: 权限缺失或无效"是一个常见的错误代码,它表示用户尝试执行的操作没有足够的权限。这个错误通常发生在试图访问、修改或者管理数据库对象(如表、视图、索引等)时。在本篇文章中...
Oracle 11g 内部错误代码 ORA-00600 是一个非常通用的错误,它表示数据库遇到了一个无法处理的内部错误或异常情况。这个错误通常涉及到Oracle数据库的底层结构,如数据块、索引或者内存管理等,且参数列表可以提供...
ORA-32001:write to spfile requested but no spfile is in use请求写入spfile,但没有使用spfile的解决方法 在输入以下语句中报了这样的错误: SQL>alter system set control_files=’/u01/app/oracle/oradata/prod/...
ORA-12154: TNS: 无法解析指定的连接标识符的解决方法
在Oracle数据库系统中,"ORA-01036:非法的变量名/编号"是一个常见的错误,通常出现在PL/SQL代码或者SQL查询语句中,当你尝试使用一个不正确或者未定义的变量时,Oracle数据库会抛出这个错误。这个错误可能是由于...
在Oracle数据库系统中,"ORA-00060: 等待资源时检测到死锁" 是一个常见的错误提示,它表明两个或多个事务在执行过程中陷入了无法继续进行的状态,因为彼此都在等待对方释放资源。这种情况通常发生在并发操作中,比如...
#### ORA-00022: Error Getting Lock ID 获取锁ID时出错,通常是由于内部数据库错误或并发控制机制的问题。 #### ORA-00023: Unable to Lock Resource in Library Cache 当无法锁定库缓存中的资源时触发。这可能与...
Oracle数据库报错ORA-00600是一个内部错误,通常表示系统遇到了未预见的问题或已知的软件缺陷。这个错误通常与Oracle的内核代码有关,而不是用户的SQL操作。在描述中,我们看到错误与“kcblasm_1”和参数103相关联,...
在Oracle数据库中,"ORA-00904"是一个常见的错误代码,它表示尝试引用一个不存在或无效的标识符。在这个场景下,错误信息提到的是"WM_CONCAT"函数,这表明在Oracle 19c数据库环境中,用户尝试使用WM_CONCAT函数,但...
调整参数后引起,ORA-00838: Specified value of MEMORY_TARGET is too small, needs to be at least 3 2768M 的解决方案
Oracle 错误代码 ORA-00600 [16703] 分析和解决方案 Oracle 错误代码 ORA-00600 [16703] 是一种内部错误代码,通常是由于数据字典基表存在不一致或对象号不存在所致。在本文中,我们将对这个错误代码进行详细的分析...
oracle12c程序连接时异常: ORA-01017: 用户名/口令无效; 登录被拒绝 的解决方案。
ORA-07445: exception encountered: core dump [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x184CD17] ...ORA-00600: internal error code, arguments: [17147], [0x0641B8FE0], [], [], [], [], [], [], [], [], [], []
关于CRA-00600:内部代码错误解决,这一主题主要聚焦于Oracle数据库中一个常见的技术难题——ORA-00600错误的识别与处理。ORA-00600是一种内部错误,通常指向Oracle数据库内核中的某种异常情况,其参数列表如[19004]...