After installing RAC on Oracle 11gR2, starting CRS failed with the following error:
[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
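The output shows that CRS (Cluster Ready Services) failed to start. CRS is a critical process: if it cannot start, Clusterware cannot come up either. Many different causes can lead to this problem.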
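As a rough aid for reading this kind of output, the sketch below parses the text printed by `crsctl check crs` and lists the message codes that do not report "is online". The function name `parse_crsctl_check` is hypothetical, and the sample text is simply the output shown above; it assumes every healthy component line ends with "is online", as it does here.

```python
import re

# One "CRS-nnnn: message" entry per line, as in the output above.
LINE_RE = re.compile(r"^(CRS-\d+):\s*(.*)$")


def parse_crsctl_check(output):
    """Map each CRS-nnnn code to True if its message ends with 'is online'."""
    status = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            code, message = match.groups()
            status[code] = message.endswith("is online")
    return status


sample = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""

# Collect the codes of components that are not reported as online.
failed = [code for code, ok in parse_crsctl_check(sample).items() if not ok]
print(failed)
```

For the output in this post, only CRS-4535 is flagged, which matches the observation that crsd is the component that did not start.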
The log shows:
[root@rac1 rac1]# tail -50 /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log
ORA-15077: could not locate ASM instance serving a required diskgroup
2010-11-16 17:13:44.286: [ OCRASM][3046411024]proprasmo: kgfoCheckMount returned [7]
2010-11-16 17:13:44.286: [ OCRASM][3046411024]proprasmo: The ASM instance is down
2010-11-16 17:13:44.287: [ OCRRAW][3046411024]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2010-11-16 17:13:44.287: [ OCRRAW][3046411024]proprioo: No OCR/OLR devices are usable
2010-11-16 17:13:44.287: [ OCRASM][3046411024]proprasmcl: asmhandle is NULL
2010-11-16 17:13:44.287: [ OCRRAW][3046411024]proprinit: Could not open raw device
2010-11-16 17:13:44.287: [ OCRASM][3046411024]proprasmcl: asmhandle is NULL
2010-11-16 17:13:44.287: [ OCRAPI][3046411024]a_init:16!: Backend init unsuccessful : [26]
2010-11-16 17:13:44.288: [ CRSOCR][3046411024] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2010-11-16 17:13:44.288: [ CRSD][3046411024][PANIC] CRSD exiting: Could not init OCR, code: 26
2010-11-16 17:13:44.288: [ CRSD][3046411024] Done.
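The log indicates that the failure was caused by the ASM instance not being up (ORA-15077), so crsd could not open the OCR in the +CRS disk group and exited. The underlying issues are fairly complex.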
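When a crsd.log is long, it can help to pull out only the error codes and the [PANIC] lines first. The following is a minimal sketch of that idea; `summarize_crsd_log` is a hypothetical helper, the excerpt is copied from the log above, and the choice of prefixes (ORA, PROC, CRS) is an assumption based on the codes that appear in this post.

```python
import re

# Error-code prefixes seen in this post; extend the list as needed.
ERROR_CODE_RE = re.compile(r"\b(?:ORA|PROC|CRS)-\d+\b")


def summarize_crsd_log(text):
    """Return (sorted unique error codes, lines containing [PANIC])."""
    codes = sorted(set(ERROR_CODE_RE.findall(text)))
    panics = [line.strip() for line in text.splitlines() if "[PANIC]" in line]
    return codes, panics


excerpt = """\
ORA-15077: could not locate ASM instance serving a required diskgroup
2010-11-16 17:13:44.286: [ OCRASM][3046411024]proprasmo: The ASM instance is down
2010-11-16 17:13:44.288: [ CRSOCR][3046411024] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
2010-11-16 17:13:44.288: [ CRSD][3046411024][PANIC] CRSD exiting: Could not init OCR, code: 26
"""

codes, panics = summarize_crsd_log(excerpt)
print(codes)
for line in panics:
    print(line)
```

On this excerpt the summary contains ORA-15077 and PROC-26 plus the single "CRSD exiting" line, which is enough to see that the OCR could not be opened because ASM was unavailable.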
This post does not analyze the problem in detail. Oracle's support site has a note that explains it very thoroughly, and I have reposted it on my blog. See:
How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]
http://blog.csdn.net/tianlesoftware/archive/2010/11/17/6013763.aspx
In this Document
Goal
Solution
Start up sequence:
Cluster status
Case 1: OHASD.BIN does not start
Case 2: OHASD Agents does not start
Case 3: CSSD.BIN does not start
Case 4: CRSD.BIN does not start
Case 5: GPNPD.BIN does not start
Case 6: Various other daemons does not start
Case 7: CRSD Agents does not start
Network and Naming Resolution Verification
Log File Location, Ownership and Permission
Network Socket File Location, Ownership and Permission
Diagnostic file collection
References
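Here is my approach to analyzing the problem:
1. Check the logs to see whether they reveal the cause. If they do not pinpoint the problem clearly, keep digging.
2. Work through the CRS startup sequence.
The ASM instance has to start first, which brings storage into the picture. Check:
(1) Whether the network is working properly.
(2) Whether the storage is mapped to the expected location. My test setup uses multipath, which maps the storage under /dev/mapper/*. When I hit a problem, I check whether the expected mappings are present.
(3) Storage permissions. After mapping, the devices are owned by root by default. I added a script to /etc/rc.d/rc.local that changes the ownership, so that at boot the mapped device files are handed over to the Oracle user.
3. If all of the above is fine, try restarting CRS or rebooting the operating system.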
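Checks (2) and (3) above can be sketched as a small script that lists the mapped devices and reports any that are not owned by the expected user. This is only an illustration: the helper name `wrong_owner`, the user name `oracle`, and the pattern `/dev/mapper/*` are assumptions, and the actual owner in a given installation may be a different account (for example a separate grid user).

```python
import glob
import os
import pwd


def wrong_owner(pattern, expected_user):
    """Return (path, owner) for every path matching pattern not owned by expected_user."""
    mismatches = []
    for path in sorted(glob.glob(pattern)):
        try:
            # os.stat follows symlinks, so /dev/mapper links resolve to the device node.
            uid = os.stat(path).st_uid
        except OSError:
            continue  # skip dangling links or entries that vanished
        try:
            owner = pwd.getpwuid(uid).pw_name
        except KeyError:
            owner = str(uid)  # uid without a passwd entry
        if owner != expected_user:
            mismatches.append((path, owner))
    return mismatches


if __name__ == "__main__":
    # Assumed device pattern and user name; adjust to the actual environment.
    for path, owner in wrong_owner("/dev/mapper/*", "oracle"):
        print(f"{path}: owned by {owner}, expected oracle")
```

Note that a broad pattern such as `/dev/mapper/*` also matches entries that are not ASM disks (for example `/dev/mapper/control`), so those will be reported too; narrowing the pattern to the ASM device names avoids the noise. An empty result from the glob itself is a hint that the multipath mappings are missing.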
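Addendum:
While searching online I also found another cause of CSSD failing to start. What caught my attention is that it explains the purpose of the /tmp/.oracle and /var/tmp/.oracle directories. Each time the server reboots, lock (socket) information is written into these two directories. If, after some reboot, the old files in these directories cannot be removed, the lock information cannot be refreshed and the daemon fails to start.
This also explains why these two directories have to be deleted when removing Clusterware.
My post on removing RAC mentions that these two directories must be deleted when uninstalling RAC. See:
Notes on Uninstalling RAC (RAC 卸载 说明)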
http://blog.csdn.net/tianlesoftware/archive/2010/09/18/5892225.aspx
Contents of crs.log:
2007-04-11 14:37:34.020: [ COMMCRS][1693]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
2007-04-11 14:37:34.020: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
2007-04-11 14:37:34.021: [ CRSRTI][1] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2007-04-11 14:37:35.740: [ COMMCRS][1695]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
2007-04-11 14:37:35.740: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
When we checked ocssd.log, it contained the following:
[ CSSD]2007-04-11 12:53:56.211 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rdsk/c5t8d0s5)
[ CSSD]2007-04-11 12:53:56.211 [10] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rdsk/c5t9d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.211 [11] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rdsk/c5t8d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.228 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2007-04-11 12:53:56.269 [13] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=drdb1-priv)(PORT=49895))
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >ERROR: clssgmclientlsnr: listening failed for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1)) (3)
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
[ CSSD]2007-04-11 13:07:36.516 >USER: Oracle Database 10g CSS Release 10.2.0.2.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ clsdmt]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=drdb1DBG_CSSD))
[ CSSD]2007-04-11 13:07:36.516 >USER: CSS daemon log for node drdb1, number 1, in cluster crs
[ clsdmt]Terminating clsdm listening thread
[ CSSD]2007-04-11 13:07:36.536 [1] >TRACE: clssscmain: local-only set to false
[ CSSD]2007-04-11 13:07:36.545 [1] >TRACE: clssnmReadNodeInfo: added node 1 (drdb1) to cluster
[ CSSD]2007-04-11 13:07:36.588 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2007-04-11 13:07:36.588 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
Solution:
From the logs above we realised that the listener of the CSS daemon was unable to start.
The reason is that each time the server reboots, it creates sockets in the /tmp/.oracle or /var/tmp/.oracle directory.
If sockets from a previous run already exist in the .oracle directory, they cannot be reused or deleted automatically.
The problem was therefore solved by deleting all the files inside the .oracle directory in /var/tmp or /tmp.
After that, CRS started and the cluster came up.
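Before deleting anything, it may be worth listing what is actually in those directories. The sketch below is a dry run that only reports the entries and whether each one is a socket; it deletes nothing. The helper name `stale_socket_candidates` is hypothetical, and the two directory paths are the ones named above.

```python
import os
import stat

# The two directories named in the text above.
SOCKET_DIRS = ("/tmp/.oracle", "/var/tmp/.oracle")


def stale_socket_candidates(directories=SOCKET_DIRS):
    """List (path, kind) for entries in the given directories; nothing is deleted."""
    found = []
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # the directory may not exist on this platform
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            mode = os.lstat(path).st_mode
            kind = "socket" if stat.S_ISSOCK(mode) else "other"
            found.append((path, kind))
    return found


if __name__ == "__main__":
    for path, kind in stale_socket_candidates():
        print(f"{kind}\t{path}")
```

Running processes of the stack communicate through these files, so any cleanup based on this listing should be done only while Clusterware is stopped on that node.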
------------------------------------------------------------------------------
Blog: http://blog.csdn.net/tianlesoftware