10G RMAN Testing Part 1
In part 1 of this 2 part series, I'll provide you with a few backup and recovery tips, tricks and techniques that I think will help both beginners and experts alike. Since we are RMAN aficionados here at Giant Eagle, I'll finish the blog with an introduction to RMAN. In part 2, I'll highlight a few of the 10G RMAN new features we are interested in and provide you with Giant Eagle's detailed 10G RMAN test plan.
It's the Little Things That Bite You
Most botched recoveries can be attributed to human error. Make sure all tapes have proper retention periods, verify that all backups are executing correctly and run test recoveries on a regular basis. Don't let missing tapes or backups cause you to lose data. You don't want to hear UNIX support say "the retention on that tape was supposed to be how long?" in the middle of a recovery. COMMUNICATE with others that are responsible for all other pieces of the recovery "pie" (system admins, operators) on a regular basis to ensure you have everything you need to recover a crashed database.
Keep Your Skills Sharp
Don't let your recovery skills get rusty. Create one database that you and your fellow administrators can trash on a regular basis. Take turns and make a game out of it. DBAs can be pretty creative when causing problems for others when it's all in fun. Spend dedicated time keeping your recovery skills sharp. If you are a senior-level DBA, make sure you keep the junior folks on their toes. A staff that has up-to-date recovery experience means less calls to you at 2AM.
RELAX and Plan Your Attack
When you are notified of a database failure, take a deep breath and relax. Don't immediately begin to paste the database back together without a plan. Create a recovery plan, put it on paper, have others review it if you can, and then execute it. You shouldn't be trying to determine what the next step is in the middle of the recovery process.
Don't Be Afraid to Ask Others
I have 18 years of experience using Oracle, am an ex-Oracle Instructor and have done my fair share of database recoveries. If possible, I still have others review my recovery strategy and recovery steps before I execute them. I'll be the first one to admit that I can still make a mistake during a recovery so I'm hapy to have others review my strategy. Don't be afraid to ask others and don't be afraid to call Oracle support if you have to. That's what they get paid by your company to do - support you. Don't make a database unrecoverable by guessing. I once viewed over 100 different commands in an alert log after a junior DBA botched a production database recovery. An ego that was too big to allow that person to ask questions created a database that was virtually unrecoverable.
V$RECOVERFILE
If you aren't using RMAN to perform the recovery, you can query the V$RECOVERFILE table to determine how many files are missing or corrupted before you have them restored. The operators will like you much better and the recovery process will be a lot faster if you restore all of the files that have been lost at the same time.
Watch the NOLOGGING Option
The NOLOGGING option is great for performance but it tends to complicate recoveries that require redo log entries to be applied. If you load or insert data using the NOLOGGING option and you don't immediately take a backup, you're asking for trouble. If you have to execute a database recovery, the database will be out of synch. During the application of the redo log entries, the data loaded or inserted using the NOLOGGING option will not be in the redo logs. This means that the data will not be replayed during the database recovery. If transactions that depend on the missing data are replayed during the recovery, they will be accessing data that's not there! Bad things will happen to your recovery as a result. Take a backup after a NOLOGGING statement or utility execution.
Striped File Systems
If mirroring is not used, exercise caution when placing redo logs and/or control files on the same disk stripes as data files. If you lose any disk contained in the stripe, you also lose your ability to recover the database to the current point-in-time. Call me old-fashioned, I still try to sneak a copy of the control files and redo logs on a non-striped drive. If the stripe containing the other copies becomes unusable, the copies on the non-stripe disk are available to use as input to a forward recovery. I just can't bring myself to place all of my multiplexed redo log files and control files on the same stripe (whether that stripe is mirrored or not).
Data Files and Redo Logs
If mirroring is not used, don't place redo logs and data files on the same disk. Once again, you may lose your ability to recover the database to the current point-in-time.
Redo Log and Control File Multiplexing
Make sure your redo log groups have two or more members in each group with each member being on a different disk drive. Multiple copies of the control file should also be maintained on different disks. There should be a very limited number of reasons why you have to recover a database due to the loss of all members of a redo log group or control file set.
Recovery Manager
Whether you love it or hate it, the Oracle Recovery Manager (RMAN) tool is Oracle's preferred strategy for backup and recovery. The tool really does simplify the recovery process. Recovery Manager allows administrators to write scripts, test them and store them in the recovery catalog. During a recovery scenario, the administrator executes one of the scripts stored in the catalog to recover the database. Running a script or two at 2AM in the morning is usually preferable to sitting at your PC trying to paste together a recovery procedure.
Here is some general information on the Recovery Manager toolset:
Recovery Manager is a utility that allows administrators to create database backups and use them as input to database recoveries. RMAN uses a special program interface to the server for invoking backup and restore operations.
Recovery Manager provides the following benefits:
- Automates backup operations.
- Creates printable logs of all backup and recovery operations.
- Automates both restore media and recovery operations.
- Automates backup and recovery processes.
- Configures backups for later execution.
- Automatically finds datafiles that need a backup based on user-specified limits on the amount of redo that must be applied if the datafiles were to be recovered.
Notice that the key word is automate! Many of the operations that once required manual intervention by the administrator are now automated. Recovery Manager automatically starts Oracle server processes to backup and recover the target databases.
Recovery Manager can be run in either interactive or batch mode. In interactive mode, Recovery Manager will immediately execute commands as you enter them at the prompt. Batch mode allows backups to be run by third party job schedulers.
Recovery Catalog
The recovery catalog can be loosely compared to a database catalog. Recovery manager uses the catalog's repository of information to keep track of backup and recovery operations. The recovery catalog must be contained in an Oracle database but does not require the creation of an additional database. The catalog can be placed in an existing database if desired. It is highly recommended that you create the RMAN repository in a separate database that is used only for RMAN operations.
The recovery catalog contains information on:
- Tablespaces and datafiles
- Archived redo logs (and copies of archived redo logs)
- Datafile copies
- Datafile and archivelog backup sets and backup pieces
- Stored scripts which contain user created sequences of commands that can be executed by the Recovery Manager
Administrators are able to create scripts and store them in the recovery catalog. Stored scripts allow administrators to chain a sequence of backup or recovery operations together and execute them as a single unit.
Channels
A channel is a data stream that is sent to a device. The most common devices are disk and tape. Administrators are able to automatically configure channels using the RMAN CONFIGURE command or they can explicity define the channel in the RMAN backup or recovery script.
The example below shows an RMAN script that allocates a device type of SBT (TAPE) and names the channel TAPE1. The next line tells RMAN to back up the entire database including all of the archive logs.
RUN
{
ALLOCATE CHANNEL tape1 DEVICE TYPE sbt;
BACKUP DATABASE PLUS ARCHIVELOG;
}
This example allocates a channel to disk with all files created by this channel having the format '/u01/rpprod1/backup/23005tools.bak'. The script backs up one database datafile.
RUN
{
ALLOCATE CHANNEL d1 DEVICE TYPE disk
FORMAT = '/u01/rpprod1/backup/23005tools.bak';
BACKUP DATAFILE '/u01/oracle/oradata/rpprod1/tools01.dbf';
}
Our last example allocates 3 disk channels, performs an incremental level 0 (base) backup and uses channel d1 to backup data files 1, 4 and 5, d2 to backup up data files 2, 3 and 6 and so on. The naming convention of the output files will begin with the characters "df_" followed by database name, set name and piece name.
RUN
{
ALLOCATE CHANNEL d1 DEVICE TYPE disk;
ALLOCATE CHANNEL d2 DEVICE TYPE disk;
ALLOCATE CHANNEL d3 DEVICE TYPE disk;
BACKUP
INCREMENTAL LEVEL = 0
FORMAT = '/u01/rpprod1/backup/df_%d_%s_%p.bak';
(DATAFILE 1, 4, 5 CHANNEL d1 tag=df1)
(DATAFILE 2, 3, 6 CHANNEL d2 tag=df2)
(DATAFILE 7, 8, 9 CHANNEL d3 tag=df3);
sql 'alter system archive log current';
}
Recovery Manager Backup Types
Recovery Manager supports two different types of backups: backup sets and image copies:
Backup Sets
Backup sets consist of datafiles or archivelogs. A single backup set cannot contain a combination of archivelogs and datafiles. A backup set can contain a combination of datafile and control file backups. Recovery Manager allows you to move archived logs from disk to tape. Backup sets containing moved archived logs are called archivelog backup sets.
Backup sets consist of one or more individual backup files. The individual files contained in a backup set are called backup pieces. Recovery Manager uses the backup sets as input for recovery operations. Backup sets can be written to disk or sequential output media (tape). The V$BACKUP_DEVICE contains a list of backup devices that are supported by your platform.
Backup sets can be full or incremental. A full backup is a backup of all of the blocks that make up a datafile or datafiles. Recovery Manager allows you to take full backups of datafiles, datafile copies, tablespaces, archive logs, control files and databases. Incremental backups copy blocks that have been changed since a previous backup. Incremental copes can be taken of datafiles, tablespaces and databases. Recovery Manager also provides cumulative backups. Cumulative backups copy all blocks that have been changed since the most recent incremental backup.
Image Copies
An image copy is a single datafile that you can use as input to a recovery. The Oracle server process validates the blocks in the file during backup and registers the copy in the recovery catalog. Image copies do not require the execution of a recovery operation, the datafile can be renamed to the image copy.
As a result, image copies:
- Do not store header or footer control information.
- Must be written to disk.
- Cannot be compressed.
- Cannot contain multiple input or output files.
- Cannot be multiplexed (discussed later).
Parallel Backup and Recovery
Recovery Manager is able to parallelize a single backup, recovery or restore operation, but is unable to process multiple commands in a stored script in parallel.
The RMAN script below uses three channels to back up three data files identified by their file number. This creates three separate server processes each one sending a data stream to the tape device.
RUN
{
ALLOCATE CHANNEL t1 DEVICE TYPE sbt;
ALLOCATE CHANNEL t2 DEVICE TYPE sbt;
ALLOCATE CHANNEL t3 DEVICE TYPE sbt;
BACKUP DATAFILE 3,4,5 ;
}
Multiplexed Backup Sets
A backup process is able to take concurrent backups of multiple data files, or one or more multi-file tablespaces and multiplex the output into a single stream.
Backup/Recovery Reports
Recovery Manager provides two commands to provide information relating to backups and image copies:
- Report - produces a report of files that require backups, files that are unrecoverable and backup files that can be deleted.
- List - produces a listing of the contents of the recovery catalog. The list command allows you to find out what backups or copies are available.
Database Restore and Recovery
The database restoration process is performed by executing the RMAN RESTORE and RECOVER commands.
The RESTORE command tells RMAN to retrieve files from RMAN backups based on the contents of the RMAN repository.
You use the RECOVER command in Recovery Manager to perform media recovery and apply incremental backups and redo logs (archived or online) if necessary. Recovery Manager uses the recovery catalog to select the backup sets or image copies to use as input to the recovery operation.
There are four variations of the recover command:
- RECOVER DATABASE - recovers an entire database.
- RECOVER TABLESPACE - recovers an entire tablespace consisting of one or more datafiles.
- RECOVER DATAFILE - recovers a single datafile or datafiles.
- RECOVER CONTROLFILE - recovers the database control files.
This sample recovery script allocates three disk channels, performs a point-in-time recovery to the specified time and resets the logs when opening the database:
RUN
{
ALLOCATE CHANNEL d1 DEVICE TYPE disk;
ALLOCATE CHANNEL d2 DEVICE TYPE disk;
ALLOCATE CHANNEL d3 DEVICE TYPE disk;
SET UNTIL TIME = '2005-02-28:11:00:00';
RESTORE DATABASE;
RECOVER DATABASE;
SQL 'alter database open resetlogs';
}
RMAN Examples
The examples below should give you a general understanding of how RMAN is used to back up and recover an Oracle database:
- rman - Operating system command to activate RMAN. There is also a GUI version available in later releases.
- CONNECT TARGET SYS/sysdba@prod1 - Connecting to the target database (the one that is to be backed up or recovered).
- CONNECT CATALOG rman/rman@rcat - Connecting to the database containing the RMAN catalog.
- rman TARGET SYS/target_pwd@target_str CATALOG rman/cat_pwd@cat_str - Getting high-tech and connecting to both the target and RMAN catalog at the same time.
- CREATE CATALOG; - Creating the catalog in the RMAN repository database.
- REGISTER DATABASE; - Registers the target database in he RMAN repository database.
- REPORT SCHEMA; - Displays the datafiles currently in the target database.
- LIST BACKUP SUMMARY; - Displays a summary of all backups recorded in the catalog.
- LIST COPY OF DATABASE ARCHIVELOG ALL; - Lists archived logs and copies of logs.
Sample RMAN Scripts
This command uses two backupSpec clauses to back up tablespaces and datafiles and lets RMAN perform automatic parallelization of the backup. A channel must be allocated when it accesses a disk or tape drive. The backup is identifying files by tablespace name and fileid.
RUN
{
ALLOCATE CHANNEL dev1 DEVICE TYPE DISK FORMAT '/fs1/%U';
ALLOCATE CHANNEL dev2 DEVICE TYPE DISK FORMAT '/fs2/%U';
BACKUP
(TABLESPACE SYSTEM,sales1,sales2,sales3 FILESPERSET 10)
(DATAFILE 12,14,15);
}
The following example takes the USERS tablespace offline, uses automatic channels to restore and recover it , then brings it back online. Notice that we don't care what the backupset file names are or where the archived logs are located - RMAN handles it all!
RUN
{
SQL "ALTER TABLESPACE users OFFLINE IMMEDIATE";
RESTORE TABLESPACE tools;
RECOVER TABLESPACE tools;
SQL "ALTER TABLESPACE tools ONLINE";
}
Next Up
In my next blog, we'll discuss a handful of new features that we are interested in and provide you with our detailed RMAN test plan.