Ask the Exchange Pro 10-Minute Solution

Exchange Disaster Recovery Basics: Part I
By Ben M. Schorr

Exchange has gone from a "gee whiz" messaging system to a mission-critical application; in some companies it is the most mission-critical application. As administrators, we therefore need to be prepared for when servers fail: not if they fail, but when. In this article and the next, I discuss some best practices for making your Exchange server as fail-proof as possible in three steps.

The first step is to plan. Make your server as bulletproof as possible by using RAID 5 (striping with parity) disk arrays for your information stores and, if possible, RAID 1 (mirroring or duplexing) disk arrays for your transaction logs, applications, and operating system. The second step is to protect. Employ a solid backup strategy, sound configuration, active monitoring, and virus protection. The third step is to prepare. One day you may have to restore your server. Know the tools and techniques you'll need.

Step 1: Plan Your Server Setup
You can go a long way towards protecting your data simply by placing it on the most reliable hardware.

To start, place your information stores (pub.edb and priv.edb) on a RAID 5 array, which is striping with parity. You'll need at least three physical hard drives for this. The data will be striped across them, and parity information will be written alongside it for protection.

RAID 5's primary advantage (at least when it comes to disaster recovery) is that you can lose one drive entirely and the system will continue to run. How does it do that, you ask? With parity. Simply put, the system takes the data to be written and adds up the bits (0s and 1s). It then adds a parity bit to make the total come out the same way, odd or even, for each block of data. For example, if the byte is 11010110 you add up those bits (1+1+0+1+0+1+1+0) to get 5. If the parity is set to "odd," the machine sets the parity bit to 0 because 5+0 = 5, which is odd. If the parity is set to "even," the machine sets the parity bit to 1 because 5+1 = 6, which is even.

On a RAID 5 array the same sum is taken across the drives: for each stripe, the bits in the same position on each data drive are added up and the parity bit is stored on another drive. Every stripe uses the same rule (odd or even). So if a single drive is missing, the system can tell by simple addition what each bit on the missing drive must have been. If the remaining bits add up to an odd number when the total was supposed to be even, the missing bit must be a 1. If the remaining bits add up to an even number, and the total was supposed to be even, the missing bit must be a 0.
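To make the arithmetic concrete, here is a minimal Python sketch of parity-based recovery. It is an illustration only, not how any particular RAID controller is implemented: it assumes a hypothetical stripe of two data drives and one parity drive, uses even parity (which is the same as XOR-ing the bits together), and ignores the fact that real RAID 5 arrays rotate the parity across all the drives. The names parity_block, rebuild_missing, drive_a, and drive_b are made up for the example.

```python
from functools import reduce


def parity_block(blocks):
    """XOR equal-length blocks together, byte by byte (even parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the one missing block by XOR-ing all survivors, parity included."""
    return parity_block(surviving_blocks)


# Hypothetical stripe: two data drives plus one parity drive.
drive_a = bytes([0b11010110, 0b00001111])
drive_b = bytes([0b01100101, 0b11110000])
parity = parity_block([drive_a, drive_b])

# Pretend drive_b has failed and rebuild its contents from the other two.
recovered = rebuild_missing([drive_a, parity])
print(recovered == drive_b)
```

The same function serves for both writing parity and rebuilding a drive, because with even parity the missing block is simply whatever value makes every bit position add up to an even number again.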

Through that method, a RAID 5 array can compensate for the loss of a drive on the fly: the system keeps running, and when the lost drive is replaced the array automatically regenerates the data that belongs on it. Expect a performance penalty while the drive is missing, because the system has to reconstruct the lost data on the fly. But a slow server is better than a down server.

The next consideration is the log files for Exchange. The log files are vital, because all data to be written to the databases is first written to the log files. When database corruption stops the server, the log files might contain data that doesn't yet exist in the database. To recover your system fully, you need both the log files and the databases. Your first instinct might be to just place the log files on the same RAID 5 array as the databases, but placing them on their own, separate RAID 1 (mirrored or duplexed) array is actually preferable. You'll get better performance with the log files on their own physical drives, and keeping them separate can only improve your reliability.

Finally, the operating system itself is best placed on its own RAID 1 array.

So the optimal Exchange server configuration actually has at least seven physical hard drives: three for the RAID 5 array and two each for the RAID 1 arrays that host the log files and operating system. If you have to skimp somewhere, put the operating system on a single drive—you can always reinstall it and restore back to your previous configuration if you have to. The data in the log files and databases could be irreplaceable.

Placing the files on these drives is easy once the drives are in place: just run the Exchange Performance Optimizer. It will evaluate your hardware and recommend optimum placement for the files. In many cases its recommendation will already put the files on the proper drives. In other cases (such as if you're using hardware-based RAID) it might not realize which arrays are RAID 1 and which are RAID 5, so you'll have the option of manually specifying which files go to which drives. Once you've done that, the Performance Optimizer will move all the files and make the needed configuration changes for you.

Keep in mind that all hard drives will fail eventually—it's just a matter of time. Make sure you know how to recognize that one of your drives has failed. The downside of RAID 5, if you can call it a downside, is that you might not know you've lost a drive right away. If one drive fails, the system will continue to run. But you'll need to replace the failed drive as quickly as possible, because if a second drive fails while the first is still down, your server will go down. You then will have to rely upon your backups and log files to get your data back.

Is it possible just to place all of the files on a single RAID 5 array? Yes—if you don't have the option simply to add the other drives, you can run the whole server off a single RAID 5 array. But realize that this is not optimal for performance or protection. It will work, and it gives you some measure of safety, but the additional drives are worth fighting for—so pitch them to management vigorously.

The amount of RAM you give your Exchange server is largely an optimization rather than a reliability issue, but I would prefer to start with 128MB and scale up from there. Certainly, I wouldn't even consider installing Exchange with less than 64MB RAM.

Be aware of the latest service packs and hot fixes for Windows NT and Exchange. You don't need to rush out and install each one the day it ships, but be aware of what the current levels are, what each one fixes, and what kinds of experiences other users are having. As a general rule, it's best to have the current service packs applied. But if your server is running well, you can sometimes afford to lag a version or two—especially if others in the community are reporting problems with the latest pack.

If you can afford it, set up an extra Exchange server in a "lab" that you can use for testing. Install the latest fixes and patches on it first. Make sure that they don't break any of your applications and that you fully understand the installation and side effects before you deploy that fix or patch to your production servers.

The lab server doesn't have to be a full-sized server. You can use an older machine that's been rotated out of production, a lightweight server, or even a souped-up workstation in many cases. Because the lab server doesn't actually have to support (m)any users, its performance is not so critical and you can afford to use less of a machine.

The lab server may also prove an invaluable resource if you ever find yourself needing to rush a replacement server into place with little or no notice. Think of it as a limited-service spare tire—useful for getting your systems back on the road until you can get your regular server repaired or replaced.

Step 2: Protect Your Server
The single most important thing you can do to protect the data in your Exchange server is to have solid, regular, verified, online backups of your data. To do this you need to have an Exchange-aware backup product like Backup Exec (with the Exchange add-in) or NT's own NTBackup.exe.

Online Backups vs. Offline Backups
Perhaps you're confused about which kind of backup to use or even what the differences between them are. An online backup is a backup conducted while the Exchange server is running and all of the services are active. It backs up the Information Stores and Directory. An online backup requires backup software that is Exchange-aware, hence my earlier reference to Backup Exec with the Exchange add-in or NTBackup.exe.

An offline backup is one performed with all of the Exchange services stopped. It's essentially just a file-level backup, similar to what you perform on your file server to back up documents, spreadsheets, etc. An offline backup can be done with any NT-supported backup program because it's really just a file backup.

For your everyday backups, online backups are the best way to go, primarily because of the log files.

Log Files
As I mentioned previously, when data is written to the Exchange databases it is first written to the log files. When the conditions are right for optimal performance, it is committed to the actual databases. If there is data corruption or failure during the interval between the transaction being written to the logs and the information being committed to the database, the data can be recovered simply by restoring your most recently backed-up databases and "playing back" the log files since your most recent backup.
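The restore-then-replay idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Exchange's actual storage engine: the database is a plain dictionary, each log record is a hypothetical (key, value) write, and the function name restore is invented for the example.

```python
def restore(backup_db, log_records):
    """Start from the backed-up database and replay logged writes in order."""
    db = dict(backup_db)  # work on a copy so the backup itself stays untouched
    for key, value in log_records:
        db[key] = value
    return db


# Hypothetical state captured by the most recent backup.
backup_db = {"msg1": "hello"}

# Writes logged since that backup; some never reached the database before the failure.
log_records = [
    ("msg2", "status report"),
    ("msg1", "hello (edited)"),
    ("msg3", "lunch?"),
]

restored = restore(backup_db, log_records)
print(restored)
```

The order of the log records matters: replaying them in sequence is what brings the restored database forward to its state at the moment of the failure.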

After an online backup backs up the databases, it deletes the log files—in part to reset them so that if you have to do a restore your log files will begin where the last backup finished, and in part to release the disk space they were using for more log files. The alternative, which is not pretty, is circular logging.

Circular Logging
Circular logging is the default in Exchange 5.5. Put simply, Exchange keeps only a small, fixed set of log files and, once their contents have been committed to the database, reuses the oldest ones for new transactions. In other words, the logs will never fill the drive because the newest log files overwrite the oldest ones. The upside is the savings in disk space and assurance that your log files will never overflow your disk drive. The downside is the very real possibility that if you have a crash you will lose some data: overwritten logs can't be played back, so a restore may only get you as far as your last backup.
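Here is a rough Python sketch of why circular logging limits what you can play back after a restore. It is a simplified model, not Exchange's real behavior in detail: the limit of two retained log files, the two writes per log file, and the name replayable_writes are all assumptions chosen to keep the example small.

```python
from collections import deque


def replayable_writes(writes, max_logs=None, writes_per_log=2):
    """Return the writes still present in the log files.

    max_logs=None models circular logging disabled (every log is kept
    until a backup removes it); a number models circular logging.
    """
    logs = deque(maxlen=max_logs)  # a full deque silently drops its oldest entry
    for i in range(0, len(writes), writes_per_log):
        logs.append(writes[i:i + writes_per_log])
    return [w for log in logs for w in log]


writes_since_backup = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8"]

kept_all = replayable_writes(writes_since_backup)
kept_circular = replayable_writes(writes_since_backup, max_logs=2)
lost = [w for w in writes_since_backup if w not in kept_circular]

print(kept_circular)
print(lost)
```

With circular logging modeled this way, only the most recent writes survive in the logs, so the earlier ones cannot be replayed on top of a restored backup.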

Disabling circular logging is the smart move, which you can do by going into Exchange Administrator, selecting your server, choosing File | Properties | Advanced Tab and clearing the checkboxes for circular logging of the directory and information stores.

If you're doing regular online backups, as you should be, then disk space for the log files shouldn't be an issue. The log files (which are always 5MB in size, by the way) will be deleted each time you do a full or incremental online backup.
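If you want a rough feel for how much space the logs need between backups, the arithmetic is simple. The sketch below assumes a hypothetical 120MB of logged transactions per day; that figure and the function name log_disk_usage_mb are made up for illustration, so substitute numbers measured on your own server.

```python
import math

LOG_FILE_MB = 5  # log file size given in the article


def log_disk_usage_mb(logged_mb_per_day, days_between_backups):
    """Estimate the log files on disk just before the next backup deletes them."""
    total_mb = logged_mb_per_day * days_between_backups
    files = math.ceil(total_mb / LOG_FILE_MB)
    return files, files * LOG_FILE_MB


# Hypothetical load: 120MB of logged transactions per day.
daily = log_disk_usage_mb(logged_mb_per_day=120, days_between_backups=1)
weekly = log_disk_usage_mb(logged_mb_per_day=120, days_between_backups=7)
print(daily, weekly)
```

Leave yourself generous headroom above whatever estimate you get, since a missed backup means the logs keep growing.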

One thing to consider when laying out your backup plan is that current computer data, and e-mail in particular, increasingly is being demanded as evidence in legal proceedings. I encourage you to consult with your legal counsel to determine the optimum backup schedule that will protect your data without causing you undue liability or hardship should the unfortunate ever happen and your company be involved in legal proceedings.

Don't Forget the Registry
While you can always reinstall NT Server and Exchange Server from your CDs, and reinstalling the service packs is annoying but relatively trivial, there is one other component of the server that you absolutely will need to restore your Exchange data cleanly: the Registry. Why? A lot of service-related and connector configuration information rests in the Registry. Also, if you're a small shop and the Exchange server is also your only domain controller, then it contains all of your account and security information. It's important to keep not one but two recently updated Emergency Repair Diskettes handy (keep one of them off-site).

Backup Types
There are four different types of backups you can do with Exchange, and you can do them in combination:

  • Normal or Full: This backup gets all of the data currently on the system. Done online it will also delete the log files when it's finished. If possible, this is the preferred backup to do—though a full backup every day won't be practical in some instances.
  • Incremental: This backs up only the log files and then deletes them. Typically, you would do one full backup a week, perhaps on a weekend when the system is less busy, and then do incremental backups each day. An incremental backup, because it only backs up the log files, is quite fast, but restoring is lengthier and has more chances for error because you must restore the full backup and then every incremental backup since, in order. By the way, you can do incremental backups only if you have circular logging disabled.
  • Differential: Differential backups are similar to incremental backups in that they back up only the log files. The primary difference is that a differential backup does not delete the log files. As with incremental backups, you would typically do a weekly full backup, then do daily differential backups. The first differential backup after a full backup is quite fast because it acts only on a small set of log files. Unlike incremental backups, however, differential backups allow the log files to continue to build up (until the next full backup deletes them), so each day the backup will take longer. Restoring from a differential backup can be much faster and easier than restoring from incremental ones, because you need only the last full backup and the most recent differential. As with incremental backups, this method will work only if you have circular logging disabled.
  • Copy: This backup is essentially the same as a full backup except that a copy backup doesn't delete the log files or update the backup records. As far as the system is concerned, no backup occurred. The only time I recommend one of these is if you're doing an "extra" backup and don't want to disrupt your usual schedule of full/incremental/differential backups.
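
The practical difference between incremental and differential schedules shows up at restore time. The following Python sketch lists which backup sets you would need in each case. It assumes the weekly-full-plus-daily schedule described above, with day 0 as the full backup; the function name restore_sets and the labels it prints are invented for the example.

```python
def restore_sets(strategy, days_since_full):
    """List the backup sets needed for a restore, oldest first.

    strategy is "incremental" or "differential"; day 0 is the weekly full backup.
    """
    full = "full (day 0)"
    if days_since_full == 0:
        return [full]
    if strategy == "incremental":
        # Each incremental holds only that day's logs, so every one is needed.
        return [full] + [f"incremental (day {d})" for d in range(1, days_since_full + 1)]
    if strategy == "differential":
        # The latest differential holds all the logs written since the full backup.
        return [full, f"differential (day {days_since_full})"]
    raise ValueError(f"unknown strategy: {strategy}")


print(restore_sets("incremental", 4))
print(restore_sets("differential", 4))
```

Four days after the full backup, the incremental schedule needs five backup sets to restore while the differential schedule needs two, which is the trade you make for the slower daily differential backups.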

What About Brick-Level Backups?
Brick-level backups are backups that store each mailbox separately, allowing you to restore individual mailboxes or items. It sounds like a good idea, but in reality it tends to be quite slow, consumes a lot of resources, and is rather inefficient for mass backups of large servers. Also, you need to use backup software that is specially designed to do brick backups. I'd recommend brick-level backups only as an addition to your regular backup scheme—if you have frequent need to restore individual mailboxes or items. In most instances, deleted-item retention, regular backups, and user education are more than sufficient.

In Part II, I'll take a look at Step 3, restoring your backups in the event of a disaster, and also cover server monitoring and virus scanning.

 
Copyright 2004 Jupitermedia Corporation All Rights Reserved.