`

Oracle 11g Database Recover Adviser(原创)

 
阅读更多

Data Recovery Advisor
The Data Recovery Advisor (DRA) is a new tool that automatically diagnoses failures in the database, determines the repair options and, optionally, executes the repairs if you want. A database failure could be a corruption or loss of data, including the loss of datafiles. You can access DRA through the RMAN client or Enterprise Manager.
The DRA uses Oracle Database 11g’s new diagnosability infrastructure and the new diagnostic concepts such as problems, incidents, and failures to automatically diagnose data failures. The DRA can help you fix problems such as lost datafiles,data block corruption, I/O failures, and a crashed database. Here are the main goals of the DRA:

  • Early detection of data failures, which limits damage
  • Automatic detection of failures, along with recommendations and even repair of the failures
    Reduced database downtime

The Data Recovery Advisor can diagnose and help fix problems such as the following:

  • Inaccessible datafiles and control files
  • Physical data corruption such as invalid block header field values
  • Inconsistencies such as one datafile being older than others
  • I/O failures such as hardware errors

A checker or check is a specific diagnostic procedure registered with the Health Monitor. You can proactively invoke these checkers, known as data integrity checks,yourself, and they are also run reactively by the database itself. The Health Monitor automatically runs a data integrity check whenever it finds an error in the database.
The data integrity check searches the database for any failures that caused the errors.A failure is defined as a problem such as data corruption that is diagnosed by the database. All diagnosed failures are stored in the ADR.
You can use the DRA to fix the data failures that caused the errors, but you can only do so after the failure diagnostics are stored in the ADR. You can use the Data Recovery Advisor to diagnose a failure and in most cases, fix it, either through a manual course of action or through an automated repair strategy. The failure data is stored outside the database. Consequently, you can view failures and fix them even when the database is in the nomount state.
All findings in the DRA are organized into groups pertaining to a specific failure.The DRA also assigns severity levels such as critical and high to each database failure it captures. Once the DBA requests the DRA for repair advice, the DRA provides all the automatic and manual repair choices available to you, along with its advice. You can choose to fix the problem yourself using the manual methods or to have the
DRA perform the repairs.
Note that the DRA consolidates related failures into a single failure. If 100 data blocks are corrupted in a datafile, the DRA will show a single failure. You can drill down to the level of an individual sub-failure.
The DRA doesn’t automatically tag a database error that is logged in the alert log file, such as an ORA-600 error, as a failure. To be considered a failure, a problem must be diagnosed by a check (or checker) registered with the Health Monitor.So, the chain of causation is this: an error leads to a Health Monitor data integrity check, which investigates the failures related to the error. If the database finds related failures, it lodges the information about the failure in the ADR. It is at this point that you can call on the DRA to generate repair advice.The preceding chain of causation relates to a reactive data integrity check, but the logic is the same when you launch your own proactive data integrity checks. A failure that your check reveals will be treated just as a failure that’s detected by a Health Monitor data integrity check. Also, any validate or backup commands you issue through RMAN will invoke an automatic data integrity check.
Failure
You must be aware of three failure characteristics: status, priority and grouping:
Failure status 

When the database first reports a failure, the failure has a status of open. Once the failure is repaired, the status changes to closed.When you run the list failure command, it automatically closes all failures that have been resolved. If, for some reason, the database doesn’t set the status of a fixed failure to closed, you can do so by using the change failure . . . closed command. But remember that if you try to set the status of an unresolved failure to closed, the Data Recovery Advisor will re-create the failure with a different ID following the next data integrity check, when the check encounters the unresolved failure.
Failure priority 

A failure can be assigned three levels of priority: critical,high, or low. A high priority level, such as the one assigned to a missing current control file, could bring the database down. Events such as a missing
archived red log are assigned the high level of priority. The DRA, however,assigns only the critical or high level of priority to a failure, and the list failure command shows you only the failures with these two levels
of priority. If, in your opinion, a failure with a high priority isn’t really a big deal and can wait for fixing, you can manually change the priority level of that failure to low, thus keeping that failure from appearing in the list
failure output. You can’t, however, change the priority level of a failure from critical to low.
Failure grouping 

The DRA always groups related failures together under a single parent failure. You can, however, use the detail option in the list failure command to view details about the individual failures in a group.

Repair Options
The DRA offers you two types of repair options in some cases, manual and automatic.It’s your responsibility to perform the manual repair actions, whereas the DRA performs the automatic repair actions itself. The DRA first checks to see if it can perform the automatic repair before offering the repair choice to you. For example, in the case of a failure repair that involves restoring and recovering a datafile, the DRA will offer to automatically perform this repair only after first verifying that the necessary datafile backups and archived redo logs do exist and that RMAN can access them. It may further divide the manual repair actions into mandatory or optional actions. If a manual option is easier to perform than undertaking a more drastic repair in the way of restoring and recovering datafiles, for example, the Data Recovery Advisor presents both the manual and the automatic repair choices. If the Data Recovery Advisor knows that a problem can’t be fixed with an automatic repair, the DRA will state that the manual repair action is mandatory. If a repair action can be performed either manually or automatically, such as restoring and replacing a missing datafile, the DRA will offer both options.
Whenever the DRA offers an automatic repair option, it shows you the directory location where the repair script it’s going to use is stored. You can then edit and execute this file yourself, if you so desire.Wherever it’s possible to do so, the DRA tries to consolidate its repairs for multiple failures into a single repair consisting of several individual steps. If it can’t repair a failure that results from, say, a missing controlfile, it’ll report that information to you and suggest that you first fix those other problems. In this case, you may create a new control file and issue the list failure, advise failure, and repair failure commands once again, so the DRA can fix the failure for you.

Oracle does inform you that a DRA failure repair sometimes may not do the trick because the DRA doesn’t check every single byte in a backup file or an archived redo log. This leaves room for the possibility that a corrupt block in either a backup file or an archived redo log may still keep you from fixing the database failure.

In Oracle Database 11g Release 1, the Data Recovery Advisor supports only a single instance and not Oracle RAC databases. If all the instances of an Oracle RAC setup crash, you can mount the database in the single instance mode and use the DRA to fix certain problems as well as invoke data recovery checks. You can’t, however, use the DRA to detect failures such as an inaccessible datafile, which is local to another instance in the Oracle RAC configuration.

You can invoke the DRA through RMAN as well as through Enterprise Manager.In Enterprise Manager, the DRA is integrated with both the Support Workbench and the Health Monitor.

Using RMAN to Manage the Data Recovery Advisor
You can use the RMAN interface to invoke the Data Recovery Advisor and manage and resolve various types of data failures. Before you actually walk through a data failure repair with RMAN, it’s important to review the key RMAN commands pertaining to the DRA.

Viewing Failures
Use the list failure command to view a list of database failures that were caught by invoking an automatic or manual database check prompted by an error in the database. Here’s an example:

RMAN> list failure;
The V$IR_FAILURE view stores the results of all the list failure commands you issue.The list failure command lists a failure only if the database has diagnosed a failure. The list failure command doesn’t initiate a data integrity check by itself—it merely reports an already diagnosed failure.

You can view individual sub-failures by issuing the list failure . . . detail command. Because the Data Recovery Advisor consolidates failures wherever possible, you use the detail option to list individual failures. You can provide options such as critical, high, low, or closed to limit the list failure command’s output to only those failures that belong to a specific priority level. By default, the list failure command shows all failures with a priority level of critical or high. Here are some list failure command variations:

RMAN> list failure critical;
RMAN> list failure open;
RMAN> list failure closed;
RMAN> list failure exclude failure 12345;

Dealing with Failures
Use the change failure command to change the priority level of a database failure. This command is useful when you want to close a failure that’s really trivial.Note that you can change a priority of high to low and vice versa, but you can’t change a critical priority level. The default output list of the list failure command shows all high-priority failures. If there are too many high-priority failures that you’re already aware of, you may want to change a failure’s status from high to low to avoid seeing those failures in the output of the list failure command.The following example shows how to change the priority level of a command
from high to low:
RMAN> change failure 1 priority low;

RMAN> change failure 102 closed;
You can change a problem’s status to closed any time you want, even if the failure wasn’t really fixed. However, if the DRA gets new checker data pertaining to that problem, it will re-create the failure with a different ID.
Getting Repair Advice
Use the advise failure command to get advice on a failure. The command prints a summary of the failure. It also implicitly closes all open problems that have been fixed. By default, the command reports on all critical and high priority repairs. The command lists all failures that have a critical or high level of priority.The command shows a summary of the repair options as well as a single repair option for each failure. The repair options can be manual as well as automated. The two repair options are a no-data-loss and a data-loss repair.

Manual repair options can be mandatory or optional. Often, a manual repair option may be the best option, as it avoids more extreme measures such as restoring and recovering entire datafiles, as in the automatic repair option. After showing the available repair options, RMAN generates a repair script to implement the repair option it recommends. You can then use this script, also called a manual checklist, after making any necessary changes to it, to repair the failure.

Using the RMAN-generated repair script may often help you fix a problem quicker than letting RMAN perform the repair. For example, it’s far easier to manually restore a lost table from another database rather than to let RMAN perform a laborious restore and recover operation to fix the problem.

Once you issue the advise failure command,

RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
82         HIGH     OPEN      05-OCT-13     One or more non-system datafiles are missing
analyzing automatic repair options; this may take some time
using channel ORA_DISK_1
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
1. If file /u01/app/oradata/orcl/users01.dbf was unintentionally renamed or moved, restore it
Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 4
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/app/diag/rdbms/orcl/orcl/hm/reco_1324291144.hm

you can query the V$IR_ MANUAL_CHECKLIST view to examine the recommended repair, as shown here:

SQL>select advise_id,rank,required,message from v$ir_manual_checklist;
 ADVISE_ID       RANK REQUIRED  MESSAGE
---------- ---------- --------- --------------------------------------------------------------------------------
       101          0 NO        If file /u01/app/oradata/orcl/users01.dbf was unintentionally renamed or moved,
                                restore it

In the V$IR_MANUAL_CHECKLIST view, the MESSAGE column shows a summary of the repair advice offered by the DRA.

Note that the DRA always tries to consolidate the repair actions so multiple failures can be fixed with a single repair job.
Repairing Failures

Once the advise failure command reports its repair recommendations, you can either manually repair the failures or choose the automatic repair option. If you want the Data Recovery Advisor to automatically fix the failures, simply issue the repair failure command so RMAN can fix the failure. Sometimes, the advise failure command may ask you to perform some manual repairs first before executing the repair failure command. Here’s the basic repair failure command:

RMAN> repair failure;

You must issue the advise failure command before you can fix problems with the repair failure command. Otherwise, you’ll get an error, as shown in this example:
RMAN> repair failure;
using target database control file instead of recovery catalog
RMAN-00571: =======================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ========
RMAN-00571:================================================================
RMAN-03002: failure of repair command at 10/21/2007 12:15:24
RMAN-06954: REPAIR command must be preceded by ADVISE command in same session
By default, the repair failure command will implement the single repair recommendation shown in the output for the advise failure command. You can also choose to merely view RMAN’s repair procedure without actually repairing the failure by using the preview option with the repair failure command:

RMAN> repair failure preview;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/diag/rdbms/orcl/orcl/hm/reco_1324291144.hm
contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore datafile 4;
   recover datafile 4;
   sql 'alter database datafile 4 online';

The repair failure preview command is helpful in ascertaining the actual steps of the RMAN repair procedure, without embarking on the repair process. The V$IR_REPAIR view shows the results of the repair failure command. Here’s the structure of the V$IR_REPAIR view:

SQL> select repair_id,advise_id,summary,rank
  2  from v$ir_repair;
 REPAIR_ID  ADVISE_ID SUMMARY                         RANK
---------- ---------- ------------------------- ----------
       102        101 NO DATA LOSS                       1

You can also check the v$ir_failure_set VIEW,it's a cross-reference of failure and advise identifiers.

SQL> select advise_id,failure_id,manual_repairs_only from v$ir_failure_set;
 ADVISE_ID FAILURE_ID MANUAL_RE
---------- ---------- ---------
       101         85 NO

Following is a sample to use repair failure command to automatically repaire failure.

RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/diag/rdbms/orcl/orcl/hm/reco_1324291144.hm
contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore datafile 4;
   recover datafile 4;
   sql 'alter database datafile 4 online';
Do you really want to execute the above repair (enter YES or NO)? Yes
executing repair script
sql statement: alter database datafile 4 offline
Starting restore at 05-OCT-13
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oradata/orcl/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/dbs/01olkkka_1_1
channel ORA_DISK_1: piece handle=/u01/app/oracle/dbs/01olkkka_1_1 tag=TAG20131005T084914
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 05-OCT-13
Starting recover at 05-OCT-13
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:00
Finished recover at 05-OCT-13
sql statement: alter database datafile 4 online
repair failure complete

Although RMAN helped us in this case by providing a repair option and easily let us fix the problem, the Data Recovery Advisor can’t fix every failure ,like data dictionary corruption. Here’s an example:

RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- --------------------
42725      CRITICAL OPEN      03-OCT-07     SQL dictionary health check:
dependency$.dobj# fk 126 on object DEPENDENCY$ failed
42722      CRITICAL OPEN      03-OCT-07     SQL dictionary health check:
dependency$.dobj# fk 126 on object DEPENDENCY$ failed

Issue the advise failure command, to view the repair options for the failures:

RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
42725      CRITICAL OPEN      03-OCT-07     SQL dictionary health check:
dependency$.dobj# fk 126 on object DEPENDENCY$ failed
42722      CRITICAL OPEN      03-OCT-07     SQL dictionary health check:
dependency$.dobj# fk 126 on object DEPENDENCY$ failed
Mandatory Manual Actions
========================
1. Please contact Oracle Support Services to resolve
failure 42725: SQLdictionary health check:
dependency$.dobj# fk 126 on object DEPENDENCY$ failed
2. Please contact Oracle Support Services to resolve failure
42722: SQLdictionary health check: dependency$.
dobj# fk 126 on object DEPENDENCY$ failed
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
no automatic repair options available

The Data Recovery Advisor doesn’t provide either a manual or an automated repair option in this case. Instead, it asks you to contact Oracle Support Services to resolve the failure reported by the SQL dictionary health check.

Proactive Checks

Although the database runs reactive checks on its own, Oracle recommends that your run proactive database health checks on a regular basis. It’s also best to run a proactive check to dig deeper into a problem when a reactive check indicates there is a database component failure. You can run these proactive checks through the Health Monitor or by executing the new validate database command.

The new validate database command in Oracle Database 11g lets you invoke a proactive health check for the database and its components. The validate database command is similar to the old validate backupset command, but is more sophisticated. Any time the validate database command detects a database failure, it initiates a failure assessment and logs the failure in the ADR. You can view these failure findings when you issue the list failure command. Here’s an example showing how to use the command:

RMAN> validate database;
Starting validate at 16-OCT-07
using target database control file instead of recovery catalog allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=109 device type=DISK
RMAN-06169: could not read file header for datafile 7 error reason 7
RMAN-00571: ===================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =======
RMAN-00571: ===================================================
RMAN-03002: failure of validate command at 10/16/2007 12:25:33
RMAN-06056: could not access datafile 7
You can employ the validate database command to validate at a fine level of granularity because you can validate individual backup sets or even individual data blocks now with the command. By default, the command checks for physical and not logical database corruption. However, you can make the command check for logical corruption by specifying the CHECK LOGICAL option. Of the two types of block corruption, intrablock and interblock, the validate database command checks for intrablock corruption only.

You can refer the VALIDATE command syntax from the following link:

http://docs.oracle.com/cd/E11882_01/backup.112/e10643/rcmsynta2024.htm#RCMRF162

 

参考至:《McGraw.Hill.OCP.Oracle.Database.11g.New.Features.for.Administrators.Exam.Guide.Apr.2008》

本文原创,转载请注明出处、作者

如有错误,欢迎指正

邮箱:czmcj@163.com

0
3
分享到:
评论

相关推荐

    Oracle11g通过DBF恢复数据

    在Oracle11g中,可以使用RMAN的`RESTORE DATABASE`或`RESTORE DATAFILE`命令来恢复数据文件。 4. **应用重做日志**:恢复数据文件后,必须应用从故障时间点到当前的重做记录,以使数据达到一致状态。`RECOVER ...

    Oracle11g 崩溃后-dbf数据库文件恢复

    当Oracle 11g数据库遭遇崩溃时,如何有效地恢复数据,特别是dbf(数据文件)变得尤为关键。Oracle 11g版本,即11.2.0,提供了多种恢复策略来应对这种情况。在这个过程中,log文件(redo logs)和ctl控制文件起着决定...

    Oracle 11g rman 异机还原再升级到 12c PDB

    Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production PL/SQL Release 11.2.0.4.0 - Production CORE 11.2.0.4.0 Production TNS for Linux: Version 11.2.0.4.0 - Production NLSRTL ...

    oracle 11g rman 全备份脚本

    ### Oracle 11g RMAN全备份脚本详解 #### 一、脚本环境配置 在脚本开始之前,我们看到一系列的环境变量配置。这些配置对于RMAN(Recovery Manager)工具来说至关重要,因为它们定义了Oracle数据库的运行环境。 - ...

    oracle10g或11g通过dblink访问sql server数据库

    ### Oracle 10g 或 11g 通过 DBLink 访问 SQL Server 数据库 #### 概述 本文档详细介绍了如何在Oracle 10g或11g环境中使用DBLink来访问SQL Server数据库的方法。这种方法对于那些习惯于使用Oracle进行数据库操作的...

    oracle11g12c断电数据文件介质损坏故障恢复方法.txt

    Oracle11g、12c因断电数据文件损坏,未开启归档模式,通过recover database until cancel;恢复损坏的介质文件失败,重建控制文件修复数据库过程的笔记。

    oracle 11g RAC<--&gt;单机 高可用dataguard 主备正确切换步骤

    Oracle 11g RAC高可用Dataguard 主备正确切换步骤 Oracle 11g RAC高可用Dataguard是Oracle公司推出的高可用解决方案,旨在提供高可用性和灾难恢复能力。本文将详细介绍Oracle 11g RAC高可用Dataguard 主备正确切换...

    perform oracle 11g physacal dataguard failover

    Oracle 11g 物理数据保护(Physical Data Guard)是一种强大的高可用性解决方案,它通过维护一个实时同步或接近实时同步的备用数据库,确保在主数据库发生故障时能够快速切换,最小化业务中断。执行Oracle 11g物理...

    oracle 11g dataguard步骤

    根据提供的文件信息,以下是对Oracle 11g Data Guard配置步骤的详细解析: ### 1. 判断DataGuard是否已安装 首先确认Oracle环境是否支持Data Guard功能。可以通过查询`v$option`视图来检查Oracle Data Guard是否...

    aix6.1 Oracle11g rac实施报告

    - Oracle Database 11g Release 2 ##### 2.2.7 调试操作系统参数 根据Oracle官方文档调整相关的操作系统参数,以确保最佳性能。 ##### 2.2.8 调整虚拟内存管理(VMM) 优化虚拟内存管理策略,以适应Oracle RAC的运行...

    OCPOCA认证考试指南全册:Oracle Database 11g(1Z0-051,1Z0-052,1Z0-053)--详细书签版(第2/2部分)

    OCPOCA认证考试指南全册:Oracle Database 11g(1Z0-051,1Z0-052,1Z0-053) 共2部分:此为第002部分 基本信息 原书名: OCA/OCP Oracle Database 11g All-in-One Exam Guide with CD-ROM: Exams 1Z0-051, 1Z0-052...

    ORACLE 11G 搭建DATAGUARD步骤.pdf

    Oracle 11g Data Guard是Oracle数据库的一种高可用性和灾难恢复解决方案,它通过创建和维护一个或多个备用数据库,来保护生产数据库免受数据丢失的影响。以下是对搭建Oracle 11g Data Guard步骤的详细说明: 1. **...

    Oracle 11g 基于RMAN的备份与恢复.docx

    Oracle 11g 的备份与恢复是数据库管理中的关键环节,确保数据的安全性和可用性。在Oracle数据库中,恢复管理器(RMAN)是进行备份和恢复的主要工具。以下是对Oracle 11g基于RMAN的备份与恢复的详细说明: 1. 备份...

    Oracle11g dbf数据库文件恢复经验.rar

    - **使用`ALTER DATABASE RECOVER`命令**:在某些情况下,可以尝试使用这个命令来恢复数据库,Oracle会尝试应用redo日志中的未提交更改。 - **跟踪和分析错误**:如果上述方法无效,可能需要深入分析Oracle的跟踪...

    Oracle 11GR2 dataguard日常管理

    在Oracle 11g Release 2 (11GR2) 中,Data Guard提供了多种管理和操作功能,确保数据的安全性和业务连续性。以下是关于Oracle 11GR2 Data Guard日常管理的详细知识点: 1. **关闭和开启数据库** - 关闭数据库:...

    McGraw.Oracle.RMAN.11g.Backup.and.Recovery.2010

    在Linux环境下部署Oracle Database 11g的Real Application Clusters (RAC)时,RMAN的备份和恢复策略需要考虑集群的特性。RAC环境下的备份通常需要协调所有节点,确保一致性。RMAN的并行执行能力在此时显得尤为重要,...

    oracle standby database

    ### Oracle Standby Database 创建与管理知识点详解 #### 一、Oracle Standby Database 概述 在Oracle数据库系统中,为了实现高可用性和灾难恢复能力,引入了Standby Database的概念。Standby Database是一种辅助...

    Oracle 11gR2迁移升级至12cR2 RAC

    - **数据文件的重命名与恢复**:通过`SET NEWNAME FOR DATAFILE`命令指定新的数据文件路径,并使用`RESTORE DATABASE`和`RECOVER DATABASE`命令来恢复数据库。 - **重命名重做日志与临时文件**:通过`ALTER DATABASE...

    windows平台下ORACLE11g DG同步

    ### Windows平台下Oracle 11g DG同步详细指南 #### 一、服务器环境 本教程适用于以下环境: - **主库**: Windows Server 2008 R2 64位 - **备库**: Windows Server 2008 R2 64位 #### 二、配置前准备 在进行...

    Oracle Database 10g: The Top 20 Features for DBAs

    ### Oracle Database 10g: The Top 20 Features for DBAs #### 1. Flashback Versions Query **Description:** This feature allows database administrators (DBAs) to query the history of data changes within ...

Global site tag (gtag.js) - Google Analytics