The Non-Technical Art of Being a Successful DBA – Database Recovery Best Practices

We are going to cover a lot of different topics in the next few blogs. The information will span the spectrum, from keeping our environments organized and uncluttered to backup and recovery best practices.

This blog will focus on the most important responsibility we are charged with as DBAs - ensuring that our organization’s databases can be quickly and easily recovered.

Introduction
Let me begin by stressing how strongly I feel about this topic. As I stated in my first blog of this series, I always started my Oracle backup and recovery class with this statement: "The fastest way to lose your job in this profession is to lose data for your company. You can be a Tom Kyte and a Jonathan Lewis X 2, but if you can't recover a database, you aren't of any use to your employer."

That always seemed to ensure that my students paid attention during the remainder of the class. The backup and recovery class was one intense week of instruction that consisted of me pounding information and best practices into my students' collective heads. I would often stop after an important topic and bellow "DO YOU UNDERSTAND?!?!!" as loud as I could. By the end of the week, the class would immediately yell back "YES!!!" as loud as they could.

What can I tell you, that style worked for me. Trust me when I say that when my students left that class, they could all back up and recover an Oracle database. The backup and recovery classes were the ones responsible for my courses being labeled as "Foot Camp" by the Oracle student population. That was OK by me.

My first "mentor", if you could call him that, was an ex-Marine Corps drill instructor who went into IT after he retired. When I started my career, he was the first senior DBA I worked for. Every so often, he would walk up to the back of my chair as I was facing my terminal, lean in real close to my ear and say "You know what Foot, the next time I see you make a mistake, I'm not even going to tell you. I'm just gonna wait 5 minutes then come back and kick your *&^%$ up around your shoulder blades." Not motivational for sure, but I made few mistakes. Maybe some of that rubbed off...

Oracle Backup and Recovery
Recovering an Oracle database is a wonderfully complex task. Data files, log files, control files, full backups, hot backups, RMAN and point-in-time recoveries all combine to make many administrators lie awake nights thinking about whether their databases can be easily recovered (or not).

The next few sections will provide some useful information on the Oracle backup and recovery process. My intent is not to cover any technical topics in depth. You can get that information from a myriad of sources. My focus will be on the non-technical tips and tricks that will help you improve your recovery skills and ensure trouble-free recoveries.

It's the Little Things That Bite You
Most botched recoveries can be attributed to human error. Make sure all tapes have proper retention periods, verify that all backups are executing correctly and run test recoveries on a regular basis. Don't let missing tapes or backups cause you to lose data. You don't want to hear UNIX support say "the retention on that tape was supposed to be how long?" in the middle of a recovery. COMMUNICATE on a regular basis with the others who are responsible for the remaining pieces of the recovery "pie" (system admins, operators) to ensure you have everything you need to recover a crashed database. Pick a database, identify the backup output files and verify that they are available when you need them. Remember, YOU are the technician who is ultimately responsible for ensuring that your organization's databases can be recovered. Not O/S support, operations, application developers...
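The "pick a database, identify the backup output files and verify" check can be partly automated. What follows is only a minimal Python sketch, not a replacement for a real test recovery. It assumes the backup pieces land on disk as ordinary files, and the function name, file paths and one-day age limit are all hypothetical. It confirms only that each expected file exists, is not empty and is recent enough. It says nothing about whether the backup is actually restorable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import Path


@dataclass
class BackupCheck:
    path: Path
    ok: bool
    reason: str


def check_backup_files(paths, max_age, now=None):
    """Return one BackupCheck per expected backup file.

    A file fails if it is missing, empty, or was last modified
    longer ago than max_age (a datetime.timedelta).
    """
    now = now or datetime.now()
    results = []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            results.append(BackupCheck(path, False, "missing"))
            continue
        stat = path.stat()
        if stat.st_size == 0:
            results.append(BackupCheck(path, False, "empty"))
            continue
        age = now - datetime.fromtimestamp(stat.st_mtime)
        if age > max_age:
            results.append(BackupCheck(path, False, f"stale ({age.days} days old)"))
            continue
        results.append(BackupCheck(path, True, "ok"))
    return results


if __name__ == "__main__":
    # Hypothetical file names; substitute the output files of your own backup jobs.
    expected = [
        "/backup/PROD1/full_backup.bkp",
        "/backup/PROD1/controlfile.bkp",
    ]
    for check in check_backup_files(expected, max_age=timedelta(days=1)):
        status = "OK  " if check.ok else "FAIL"
        print(f"{status} {check.path}: {check.reason}")
```

Run something like this from a scheduler after the nightly backup window and treat any FAIL line as a reason to talk to whoever owns that piece of the recovery "pie". A clean report is still not proof that you can recover. Only a test recovery gives you that.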

Document Your Recovery Environment!
OK, I'm yelling already. I already covered the importance of good documentation in a previous blog. You know by now that I work for a remote database services provider. It is absolutely imperative for us to know EVERYTHING about our customers' existing backup and recovery strategies. Part of our assimilation process is to document each customer's environment. Here's a quick list of some of the questions we ask. The document we actually use is a standardized Word template that uses drop-downs, text boxes and help buttons, but this shortened text document should provide you with a starting point to help you build your own backup and recovery documentation library.

Keep Your Skills Sharp
Don't let your recovery skills get rusty. The more test recoveries you do the easier the production recoveries become. Create one database that you and your fellow administrators can trash on a regular basis. Take turns and make a game out of it. DBAs can be pretty creative when causing problems for others when it's all in fun.

It can actually become quite an interesting game competing for bragging rights over who has the current title of "the most devious database destroyer." During one of my test recoveries, I couldn't even bring up the monitor. I waltzed down to the server room and saw an open drive bay on our test server with a bundle of unconnected wires sticking out. There was a single note attached below the opening telling me to look for the next note. 15 notes later and I found the drive. Dumped it in, fired it up and found that the database was deleted. I recovered it from a tape backup. THAT was devious.

If you are a senior-level DBA, make sure you keep the junior folks on their toes. I have never personally seen the database make a mistake during the recovery process. That leaves incomplete backups and DBA error as the most likely causes of "good recoveries gone bad."

At RemoteDBAExperts, we have dozens of customers that we have to support. We test recoveries and failovers on a regular basis. Not four hours ago, we had three of our folks perform a cold database backup and recovery on a Linux platform. Ensuring our recovery skills are sharp is that important to us. I still do test recoveries. It's important to me to ensure that I am ready to go when the time comes. If it were up to me, I would have our receptionist test her recovery skills too.

RELAX and Plan Your Attack
When you are notified of a database failure, take a deep breath and relax. Don't immediately begin to paste the database back together without a plan. Create a recovery plan, put it on paper, have others review it if you can, and then execute it. You shouldn't be trying to determine what the next step is in the middle of the recovery process. I will plan my attack on paper for all recoveries, no matter how simple they are.

Don't Be Afraid to Ask Others
I have over 20 years of experience using Oracle and have done my fair share of database backups and recoveries. During my career as an Oracle instructor, I have assisted in hundreds (and hundreds) of database recoveries in Oracle's classroom environments. If possible, I still have others review my recovery strategy and recovery steps before I begin the recovery process. A second opinion may prevent you from making a mistake or overlooking a key part of the recovery process.

Don't be afraid to ask others and don't be afraid of calling Oracle support if you have to. That's what they get paid by your company to do - support you. Don't make a database unrecoverable by "guessing." When I first took over as the Database Group Manager for a large financial organization many years ago, I viewed the execution of over 70 different commands in an alert log after a botched recovery performed by a junior DBA. An ego that was too big to allow that person to ask questions created a database that was almost unrecoverable.

The Importance of Formal Education
Read the Oracle Backup and Recovery Guides before reading third-party books. The manuals will provide you with a firm foundation of knowledge on backup and recovery strategies and procedures. Then move on to third-party books (like this one) for helpful hints and tips that may assist you in the recovery process.

Take the Oracle classes! Oracle's instructors understand the importance of backups and recoveries. You will receive days of instruction and hours of hands-on labs. You'll learn everything from simple O/S cold backups to RMAN incomplete recoveries using backup control files.

Oh, and now that I'm retired from teaching, you won't have to worry about me yelling at you.

Thanks for Reading,

Chris Foot
Oracle Ace


Monday, August 21, 2006

What about..

Posted by cmullins at 2006-08-21 10:48 AM
Nice job... but are you going to touch on disaster recovery issues in a future posting?

Also, you mention a third-party book but not its title - I think you were going to provide a link to it but neglected to do so. Is it your book (which I think is excellent - http://www.amazon.com/exec/obidos/redirect?tag=mullinassoci-20&creative=9325&camp=211189&link_code=as2&path=ASIN/0974435538 ) or were you going to reference an Oracle Backup and Recovery book like this one - http://www.amazon.com/exec/obidos/ASIN/0072263172/mullinassoci-20/102-4833761-4300911?%5Fencoding=UTF8&camp=1789&link%5Fcode=xm2

Keep up the great work on this blog Chris... I always enjoy reading it!
 
