Backup & Recovery

Table of Contents

Preface
I Wish I’d Had This Book Only the Recovery Matters Products Change Backing Up Databases Is Not That Hard Bare-Metal Recovery Is Not That Hard How This Book Is Organized Part I Part II Part III Part IV Part V Part VI What’s New in This Book What’s Missing? Speaking of BackupCentral.com Conventions Used in This Book How to Contact Us Safari® Enabled This Book Was a Team Effort Contributors Technical Editors Horror Stories Special Mention I Don’t Know It All How Can I Say Thanks?

Part I. Introduction

1. The Philosophy of Backup
Champagne Backup on a Beer Budget Why Should I Read This Book? Schadenfreude You Never Want to Say These Words You’re Curious About Open-Source Backup Products You Want to Learn About Disk-Based Backup Why Back Up? What Will Lost Data Cost You? What Will Downtime Cost You? Wax On, Wax Off: Finding a Balance Don’t Go Overboard Get the Coverage That You Need Why the Word “Volume” Instead of “Tape”?

2. Backing It All Up
Don’t Skip This Chapter! The Impossible Job That No One Wants Deciding Why You Are Backing Up Deciding What to Back Up Plan for the Worst Take an Inventory Are You Backing Up What You Think You’re Backing Up? Back Up All or Part of the System? Deciding When to Back Up Backup Levels Which Levels Do You Run and When? “In the Middle of the Night...” Deciding How to Back Up Be Ready for Anything: 10 Types of Disasters Automate Your Backup Plan for Expansion Don’t Forget Unix mtime, atime, and ctime Don’t Forget ACLs Don’t Forget Mac OS Resource Forks Keep It Simple, SA Storing Your Backups Storage in General On-Site Storage Off-Site Storage Testing Your Backups Test Everything! Test Often Monitoring Your Backups You Can Always Make It Better If It’s Not Baroque, Don’t Fix It Following Proper Development Procedures Unrelated Miscellanea Protect Your Career Get the Money Your Backups Need Good Luck

Part II. Open-Source Backup Utilities

3. Basic Backup and Recovery Utilities
An Overview How Mac OS Filesystems Are Different cpio ditto dd dump and restore ntbackup rsync System Restore tar Other Utilities Backing Up and Restoring with ntbackup Creating a Simple Backup Configuration Executing Your Simple Backup Restoring with ntbackup Using System Restore in Windows Creating Restore Points Recovering Windows Using a Restore Point Backing Up with the dump Utility Syntax of the dump Command The Options to the dump Command What a dump Backup Looks Like Restoring with the restore Utility Is the Backup Volume Readable? Blocking Factor Byte-Order Differences Different Versions of dump Syntax of the restore Command The Options to the restore Command Limitations of dump and restore Features to Check For Backing Up and Restoring with the cpio Utility The Syntax of cpio When Backing Up The Options to the cpio Command Restoring with cpio cpio’s Restore Options Telling cpio Which Device to Use Examples of a cpio Restore Using cpio’s Directory Copy Feature Backing Up and Restoring with the tar Utility The Syntax of tar When Backing Up The Options to the tar Command Syntax of tar When Restoring Some Other Neat Things About tar Backing Up and Restoring with the dd Utility Basic dd Options Using dd to Copy a File or Raw Device Using dd to Convert Data Using dd to Determine the Block Size of a Tape Using dd to Figure out the Backup Format Using rsync Basic rsync Syntax Restoring with rsync Backing Up and Restoring with the ditto Utility Syntax of ditto When Backing Up The Options to the ditto Command Syntax of ditto when Restoring Comparing tar, cpio, and dump Using ssh or rsh as a Conduit Between Systems

4. Amanda
Summary of Important Features Client/Server Architecture Using Nonproprietary Tools Amanda Security Holding Disk Backup Scheduling Tape Management Device Management Configuring Amanda Backing Up Clients via NFS or Samba Backing Up Using NFS Backing Up via Samba Amanda Recovery Community and Support Options Future Plans

5. BackupPC
BackupPC Features How BackupPC Works Installation How-To Security Versus Ease of Use Basic Sizing Installing BackupPC Starting BackupPC Using the CGI Interface Configuration Files Per-Client Configuration The BackupPC Community The Future of BackupPC

6. Bacula
Bacula Architecture Bacula Components Interaction Between Components Bacula Features An Example Configuration Setting Up the Server Initial Backup (Linux Client) Initial Restore (Linux Client) Windows Backup Mac OS X Backup Advanced Features Bare-Metal Recovery Backup Traffic and Storage Encryption Python Script Support Client Script Support Autochanger Support ANSI and IBM Tape Labels File-Based Intrusion Detection Future Directions Pool Migration Tracking Deleted/Renamed Files Python-Based GUI Tool Base Job Support Client-Initiated Backups Plug-in Support for File Daemons

7. Open-Source Near-CDP
rsync with Snapshots An Example Beyond the Example Understanding Hard Links Hard-Link Copies Restoring from the Backup Things to Consider rsnapshot Platform Support When Not to Use rsnapshot Setting Up rsnapshot The rsnapshot Community rdiff-backup Advantages Disadvantages Quick Start Windows, Mac OS X, and the Future

Part III. Commercial Backup

8. Commercial Backup Utilities
What to Look For Full Support of Your Platforms Should You Back Up Special Files?
Backup of Raw Partitions Backup of Very Large Filesystems and Files Aggressive Requirements LAN-Free Backup Server-Free (or Serverless) Backup De-Duplication Backup Systems Snapshots Replication Near-Continuous Data Protection Systems Continuous Data Protection Systems Remote Office Backup Simultaneous Backup of Many Clients to One Drive Disk-to-Disk-to-Tape Backup Simultaneous Backup of One Client to Many Drives Data Requiring Special Treatment Network-Mounted Filesystems Custom User Scripts Databases Storage Management Features Archives Hierarchical Storage Management Information Lifecycle Management Reduction in Network Traffic Keep Backup Traffic at the Subnet Level Use Client-Side Compression Incorporate Throttling Storage Area Networks Support of a Standard or Custom Backup Format Standard Backup Formats Custom Backup Formats A Reality Check Ease of Administration Security Ease of Recovery Protection of the Backup Index Robustness Automation Volume Verification Cost Vendor Final Thoughts

9. Backup Hardware
Decision Factors Reliability Duty Cycle Transfer Speed Flexibility Time-to-Data Capacity Removability Cost Summary Using Backup Hardware Compression Density Versus Compression How Often Should I Change My Media? Cartridge Care Drive Care Nearline and Offline Storage Tape Drives Tape Drives Must Be Streamed Compression Makes It Harder to Stream Drives Variable Speed Tape Drives Helical and Linear Tape Drives Are Different Cartridges Versus Cassettes Midrange Tape Drive Types Optical Drives Optical Recording Methods Optical Recording Formats Automated Backup Hardware Disk Targets Disk-As-Disk Targets Disk-As-Tape: Virtual Tape Libraries Disk Features to Consider Disk-As-Tape: Virtual Tape Cartridges

Part IV. Bare-Metal Recovery

10. Solaris Bare-Metal Recovery
Using Flash Archive Backup and Recovery Overview Initial Considerations Preparing for an Interactive Restore Creating Flash Archive Images Bare-Metal Recovery with Flash Archive Setup of a Noninteractive Restore Noninteractive Setup Files Creating a Noninteractive Tape Image Creating a Noninteractive Disk Image Post-Recovery Procedures Final Thoughts

11. Linux and Windows
How It Works If Then GOTO Choosing Backup Methods The Steps in Theory Step 1: Back Up Important Metadata Step 2: Back Up the OS with a Native Utility Step 3: Boot the System from Alternate Media Step 4: Restore the Boot Block Information Step 5: Partition and Format the New Root Drive Step 6: Restore the OS to the New Root Drive Assumptions Alt-Boot Full Image Method Create the Bare-Metal Backup Perform a Bare-Metal Recovery Alt-Boot Partition Image Method Create the Bare-Metal Backup Perform a Bare-Metal Recovery Live Method Create the Bare-Metal Backup Perform a Bare-Metal Recovery Alt-Boot Filesystem Method Create the Bare-Metal Backup Perform a Bare-Metal Recovery Automate Bare-Metal Recovery with G4L Advantages of G4L Drawbacks of G4L Setting Up G4L Using G4L Customizing G4L Commercial Solutions

12. HP-UX Bare-Metal Recovery
System Recovery with Ignite-UX Ignite-UX Overview Network Services and Remote Boot Protocols Differences Between HP Integrity and HP9000 Clients Planning for Ignite-UX Archive Storage and Recovery Considerations for the Remote Booting of Clients Sizing the Recovery Archive Configuring an Ignite-UX Network Server Recovery Archive Management Implementation Example Command-Line Examples Verifying Archive Contents Troubleshooting Recovery Operations System Cloning Security System Recovery and Disk Mirroring

13. AIX Bare-Metal Recovery
IBM’s mksysb and savevg Utilities mksysb and savevg Format Preparing to Use mksysb and savevg Backing Up with mksysb mksysb Summary Backing Up rootvg to Locally Attached Tape Backing Up rootvg to a Remote Tape Drive Backing Up to Disk Making a Bootable DVD/CD from an Existing mksysb Creating a CD/DVD Backup in One Step Setting Up NIM Setting Up a NIM Server Adding a Client Definition to NIM Setting a mksysb Definition for a Client savevg Operations Using savevg to Back Up a Volume Group Verifying a mksysb or savevg Backup Restoring an AIX System with mksysb System Cloning AIX 4.x Operating System AIX 5.x Operating System

14. Mac OS X Bare-Metal Recovery
How It Works Preparing for a Bare-Metal Recovery Performing a Bare-Metal Recovery A Sample Bare-Metal Recovery Perform the Backup Recover the System

Part V. Database Backup

15. Backing Up Databases
Can It Be Done? Confusion: The Mysteries of Database Architecture The Muck Stops Here: Databases in Plain English What’s the Big Deal? Database Structure The Power User’s View: Logical Elements of a Database The DBA’s View: Physical Elements of a Database Environment An Overview of a Page Change ACID Compliance What Can Happen to an RDBMS? Backing Up an RDBMS Physical and Logical Backups Get Every Instance Transaction Log Dumps Are Not Incremental Backups Do It Yourself: Creating Your Own Backup Utility Calling a Professional Restoring an RDBMS Loss of Any Nondata Disk Loss of a Data Disk Online Partial Restores Documentation and Testing Unique Database Requirements

16. Oracle Backup and Recovery
Two Backup Methods rman User-Managed Backups Oracle Architecture The Power User’s View The DBA’s View Finding All Instances Physical Backups Without rman Cold Backup Hot Backup Debunking Hot-Backup Myths Physical Backups with rman Important New rman Features Automating rman Flashback Other Commercial Backup Methods Managing the Archived Redo Logs Recovering Oracle Using This Recovery Guide Seriously, Think About rman Step 1: Try Startup Mount Step 2: Are All Control Files Missing? Step 3: Replace Missing Control File Step 4: Are All Datafiles and Redo Logs OK? Step 5: Restore Damaged Datafiles or Redo Logs Step 6: Is There a “Backup to Trace” of the Control File? Step 7: Run the create controlfile Script Step 8: Restore Control Files and Prepare the Database for Recovery Step 9: Recover the Database Step 10: Does “alter database open” Work? Step 11: Are There Damaged Datafiles for Required Tablespaces? Step 12: Restore All Datafiles in Required Tablespaces Step 13: Damaged Nonrequired Datafile? Step 14: Take Damaged Datafile Offline Step 15: Were Any Datafiles Taken Offline? Step 16: Restore and Recover Offline Datafiles Step 17: Is There a Damaged Online Log Group? Step 18: Are Any Rollback Segments Unavailable? Step 19: Recover Tablespace Containing Unavailable Rollback Segment Step 20: Is the Current Online Log Damaged? Step 21: Restore and Recover All Database Files from Backup Step 22: Run alter database open resetlogs Step 23: Is an Active Online Redo Log Damaged? Step 24: Perform a Checkpoint Step 25: Is an Inactive Online Redo Log Damaged? Step 26: Drop/Add a Damaged, Inactive Log Group You’re Done! Logical Backups Performing a Logical Backup Recovering with a Logical Backup A Broken Record

17. Sybase Backup and Recovery
Sybase Architecture Overview of the Sybase Architecture Sybase Command-Line Utilities Required Environment Variables The Power User’s View Server Engine Database Transaction Table System Table Index Stored Procedures The DBA’s View Page Extent Datafiles and Devices Segment Configuration File Transaction Log What Happens When Transaction Logs Fill Up? The interfaces File The SYBASE.sh and SYBASE.csh Files Backup Server Dump Device Hot and Cold Backups Protecting Your Database dbcc: The Database Consistency Checker Reorgs Update Statistics Configuration Audits Implement Mirroring and Disk Striping How to Back Up Your Servers Have a Run Book Backup Automation Through Scripting Backup Automation Basics Logical Backups Physical Backups with a Storage Manager Recovering Your Database Recovering from a Disaster Restoring from Backups Common Sybase Procedures Procedure 1: How to Start Sybase Procedure 2: How to See Whether Your Server Is Alive Procedure 3: How to Shut Down Your Server Procedure 4: How to Set Server Configuration Options Procedure 5: How to Set Database-Level Options Procedure 6: How to Run a Query Sybase Recovery Procedure Step 1: Can You Connect to Your Server Using isql? Step 2: Run the Stored Procedure sp_who Step 3: Blocked Processes Step 4: Log Suspend Step 5: You Can’t Connect Using isql Step 6: Check the Sybase Server Error Log Step 7: Check Whether Your Server Is Running Step 8: Running Server but Can’t Connect Remotely Step 9: Restart Your Server Step 10: Startup Failure Step 11: Contact Sybase Support Immediately Step 12: Able to Get Shared Memory? Step 13: Master Device Failure Step 14: Disk Device Failure

18. IBM DB2 Backup and Recovery
DB2 Architecture The Power User’s View The DBA’s View The backup, restore, rollforward, and recover Commands The backup Command Recovery Types The restore Command The rollforward Command The recover Command Recovering Your Database Performing an In-Place Version Recovery Performing a Redirected Version Recovery Performing a Rollforward Recovery Reorganizing Data and Collecting Statistics

19. SQL Server
Overview of SQL Server Connecting to and Administering SQL Server SQL Server Authentication The Power User’s View Instance Databases Tables Stored Procedures Memory Management The DBA’s View Database Files Filegroups Transaction Log Pages Extents Partitions Table and Index Specifics Snapshot Backups (2005) Backups Backup Devices Recovery Models Backup Types Backup/Restore of System Databases Viewing Information About the Backup Verify Backups Backup Expiration Date How to Back Up Transaction Log Backups Master Database Backups Scheduling a Backup Logical (Table-Level) Backups Restore and Recovery Components of a Restore Recovery Roadmap Database Restore Master Database Restore

20. Exchange
Exchange Architecture Database Structure Extensible Storage Engine Stores Storage Groups Transaction Logfiles Checkpoint Files Reserve Logfiles General Logfile Info Circular Logging Other Files Single Instance Storage Automatic Database Maintenance Storage Limits Backup Backup Strategy Backup Types Determining What to Back Up Backup Methods Using ntbackup to Back Up Making a Basic Backup Verifying the Backup Restore Repair or Restore? Common Tasks for Repair or Restore Exchange Repair Exchange Restore Overview Restoring Exchange Mailbox or Public Folder Stores Offline Database Restore Recovery Storage Group Overlooked (and Often Easy) Restore Methods Using ntbackup to Restore

21. PostgreSQL
PostgreSQL Architecture Clusters Tablespace Pagefile/Datafile Startup Scripts System Tables Large Objects Rollback Process Write Ahead Log Backup and Recovery Using pg_dump with pg_restore Using pg_dump with psql Using pg_dumpall with psql Point-in-Time Recovery Creating a Backup to Use with Point-in-Time Recovery Restoring from a Point-in-Time Backup

22. MySQL
MySQL Architecture Shared Architectural Elements MyISAM Storage Engine InnoDB Storage Engine Other Storage Engines MySQL Backup and Recovery Methodologies SQL-Level Backup and Recovery File-Level Backup and Recovery Using Point-in-Time Recovery MySQL Cluster Hot Backup and Recovery

Part VI. Potpourri

23. VMware and Miscellanea
Backing Up VMware Servers VMware Architecture VMware Backups Using Bare-Metal Recovery to Migrate to VMware Volatile Filesystems Missing or Corrupted Files Referential Integrity Problems Corrupted or Unreadable Backup Torture-Testing Backup Programs Using Snapshots to Back Up a Volatile Filesystem Demystifying dump Dumpster Diving Answers to Our Questions A Final Analysis of dump How Do I Read This Volume? Prepare in Advance Wrong Media Type Bad or Dirty Drive or Tape Different Drive Types Wrong Compression Setting/Type The Little Endian That Couldn’t Block Size (Tape Volumes Only) Determine the Blocking Factor AIX and Its 512-Byte Block Size Unknown Backup Format Different Backup Format Damaged Volume Reading a “Flaky” Tape Multiple Partitions on a Tape If at First You Don’t Succeed... Gigabit Ethernet Disk Recovery Companies Yesterday Trust Me About the Backups

24. It’s All About Data Protection
Business Reasons for Data Protection Mitigating Risk Reducing Costs Improving Service Levels Technical Reasons for Data Protection Device Issues External Threats Backup and Archive What Needs to Be Backed Up? What Needs to Be Archived? Examples of Backup and Archive Can Open-Source Backup Do the Job?
Very Active Filesystems Very Large Filesystems Filesystems with Too Many Files Information Stored in Databases Information Stored on Shared Storage Disaster Recovery Everything Starts with the Business Define the Core Competency of the Organization Prioritize the Business Functions Necessary to Continue the Core Competency Correlate Each System to a Business Function, and Prioritize Define RPO and RTO for Each Critical System Create Consistency Groups Determine for Each Critical System What to Protect from Determine the Costs of an Outage Plan for All Types of Disasters Prepare for Cost Justification Storage Security Plain-Text Communication Poor Authentication and Authorization Systems Backup Flaws Conclusion

Index

Backup & Recovery
W. Curtis Preston
Editor Mike Loukides
Copyright © 2009 O'Reilly Media, Inc.
O'Reilly Media

Preface

I hope you learn half as much reading this book as I did writing it. This was quite an interesting project, where we took the original book and expanded its scope so much that we had to change its title. I wrote Unix Backup and Recovery seven years ago, and a lot has changed since then, both in the industry and in my life. The biggest change in the industry has been the proliferation of Windows, Mac OS, Exchange, and SQL Server in the data center. (I never saw the Apple Xserve coming.) The biggest change for me has been having my eyes opened to backup and recovery applications beyond those considered “traditional.”

It’s true that I spend most of my professional life consulting with large companies that spend enough on backup software and hardware to fund a small army. I enjoy doing that. It’s very rewarding to show a company how to save millions of dollars a year and make their backups and restores faster and more reliable in the process. (By the way, if you need help with your backup system, drop an email to [email protected]; that’s what I do for a living.)
I also spend a good deal of the time traveling the world speaking to users about how to do this themselves. And when I do, I always get questions like these:

I got a quote for backup software from XYZ, and they want $XXXX for backup software! Where am I supposed to get that kind of money!?
I couldn’t afford backup software from XYZ, so we bought ABC instead, and it stinks. Can you recommend something better?
None of the commercial utilities can back up my MySQL or PostgreSQL database. How do I do that?
How do I do bare-metal recovery on ABC operating system?
Aren’t there open-source utilities that do this kind of thing?

So while I’m actually preparing to write my next book on how to select, install, and manage commercial backup software systems, I felt that this book needed to come first. This book is aimed at the people who feel that the commercial software products aren’t meeting all their needs. Perhaps you’re a small shop that can’t spend $10,000 just to get decent backup software. Perhaps you’re already using a commercial backup software package, but you don’t want to spend thousands of dollars on their agent to back up your DB2 databases, or you can’t find anybody to back up your MySQL or PostgreSQL databases.

This book is about giving you options: free options. Almost everything I talk about in this book is either included with your operating system or application, or is available as an open-source project. (The commercial products I do mention cost only $99.) You may be amazed at what you can do for free or almost free.

I Wish I’d Had This Book

I wanted to write a book that would ensure that no one would ever have to start from scratch again, and I believe that my contributors and I have done just that. It contains every backup tool that I wish I had when I first entered the backup business and every lesson and trick that I’ve learned along the way.
It covers how to back up and recover everything from a basic Linux, Windows, or Mac OS workstation to a complicated DB2, Oracle, or Sybase database, and a lot of things in between. Whether your budget barely stretches to cover the cost of the backup media or allows you to buy a silo bigger than your house, this book has something for you. Whether your task is to figure out how to back up, with no commercial utilities, an environment such as the one I first encountered or to choose from among more than 50 commercial backup utilities, this book will tell you how to do it. With that in mind, let me mention a few things about this book that are unique.

Only the Recovery Matters

As my friend Joe Fitzpatrick used to tell me, “No one cares if you can back up, only if you can recover.” Yet how many backup chapters have you read that dedicate less than 10 percent to recovery? You won’t find that in this book. I have tried very hard to ensure that recovery is given equal treatment.

Products Change

Some people may be surprised that there are no product names mentioned in the commercial backup section. I did this for several reasons, the main one being that products change constantly. It would be impossible to keep this book up to date with more than 50 backup products that are available for Unix alone. In fact, the book would be out of date by the time it hit the shelves. Instead, this book explains the concepts of commercial backup and recovery software, allowing you to apply those concepts to the claims that the vendors are currently making. Up-to-date information about specific products is available on http://www.backupcentral.com.

Backing Up Databases Is Not That Hard

If you’re a database administrator (DBA), you may not be familiar with the commands necessary to back up your database. If you’re a system administrator (SA), you may not be familiar with the architecture of the database platform your DBA is using. Both concepts are explained in detail in this book.
I explain the backup utilities in plain language so that any DBA can understand them, and I explain database architecture in such a way that an SA, even one who has never before seen a database, can understand it.

Bare-Metal Recovery Is Not That Hard

One of these days you will lose the operating system disk for an important system, and you will need to recover it. This is called a bare-metal recovery. The standard recovery method described in many backup products’ documentation is to install a minimal operating system and restore on top of it. This is the worst possible method to do a bare-metal recovery of a system; among other problems, you end up overwriting some of the system files while the system is running from the very disk to which you are trying to restore. The best ways to do bare-metal recoveries for AIX, Solaris, HP-UX, Windows, Linux, and Mac OS are covered in detail in this book.

How This Book Is Organized

This book is divided into six parts, which are described in the following sections.

Part I

Part I of this book contains just enough information to whet your backup and recovery appetite.

Chapter 1, The Philosophy of Backup
  Describes my philosophy about backup, such as why you should back up, and a little bit about how to do it, too.
Chapter 2, Backing It All Up
  Goes into detail about the essential elements of a good backup and recovery system.

Part II

This section covers the basic backup utilities that are available to back up your system, and several open source backup systems to help you manage those backups.

Chapter 3, Basic Backup and Recovery Utilities
  Covers the basic backup and recovery utilities you’re likely to find in Unix, Windows, or Mac OS, including dump, tar, cpio, dd, ditto, ntbackup, and rsync.
Chapter 4, Amanda
  Covers the ever-popular Advanced Maryland Disk Archiver, or Amanda.
Chapter 5, BackupPC
  Explains the disk-only backup system called BackupPC, which can actually back up far more than just your PC.
Chapter 6, Bacula
  Covers Bacula. It roams the data center at night and sucks the vital essence from your computers.
Chapter 7, Open-Source Near-CDP
  Covers three near continuous data protection (near-CDP) products, including rsync with snapshots, rsnapshot, and rdiff-backup.

Part III

If you have outgrown the capabilities of free utilities or would just like to take advantage of new backup and recovery technologies, you’ll need to look at a commercial product. You should also know about the latest hardware that is on the market to assess your full range of backup and recovery options.

Chapter 8, Commercial Backup Utilities
  Is your guide to the hundreds of features available in the more than 50 commercial backup products on the market today, allowing you to make an educated purchase decision.
Chapter 9, Backup Hardware
  Explains the many different types of backup hardware available today, and provides criteria to help you decide which type of backup drive is right for you.

Part IV

A bare-metal recovery is the fastest way to bring a dead system back to life, even if its operating system drive is completely destroyed.

Chapter 10, Solaris Bare-Metal Recovery
  Explains Sun’s flash archive product, which is the Solaris equivalent of AIX’s mksysb.
Chapter 11, Linux and Windows
  Explains a number of procedures and tools that can be used to perform bare-metal recovery of both Linux and Windows systems. It includes a discussion of Ghost for Linux (G4L), an open-source ghosting product.
Chapter 12, HP-UX Bare-Metal Recovery
  Covers the make_net_recovery and make_tape_recovery tools, which now come with HP-UX to perform bare-metal recoveries.
Chapter 13, AIX Bare-Metal Recovery
  Discusses AIX’s mksysb, probably one of the oldest and best-known bare-metal recovery tools.
Chapter 14, Mac OS X Bare-Metal Recovery
  Covers how to perform your own bare-metal recovery of a Mac OS X machine.
Part V

This section explains in plain language an area that presents some of the greatest backup and recovery challenges that a system administrator or database administrator will face: backing up and recovering databases.

Chapter 15, Backing Up Databases
  Explains database architecture while relating each architectural element to the appropriate term in DB2, Exchange, Informix, MySQL, Oracle, PostgreSQL, SQL Server, and Sybase. This chapter will be your friend if you’re an SA who’s afraid of databases or a DBA learning a new database.
Chapter 16, Oracle Backup and Recovery
  Explains how to perform Oracle hot backups using rman or user-managed backup.
Chapter 17, Sybase Backup and Recovery
  Shows how to use the backup server to back up Sybase ASE.
Chapter 18, IBM DB2 Backup and Recovery
  Explains how to back up and recover DB2 databases.
Chapter 19, SQL Server
  Explains how to back up and recover SQL Server databases.
Chapter 20, Exchange
  Explains how to back up and recover Exchange databases using the built-in ntbackup plug-in for Exchange.
Chapter 21, PostgreSQL
  Explains how to back up and recover PostgreSQL databases.
Chapter 22, MySQL
  Provides an overview of the various backup and recovery options available for MySQL.

Part VI

The information contained in this part of the book is by no means unimportant; it simply wouldn’t fit anywhere else!

Chapter 23, VMware and Miscellanea
  Includes VMware backups, the oft-debated “live filesystem dumps” question, and even some backup poetry.
Chapter 24, It’s All About Data Protection
  Provides some food for thought, discussing the fact that backups are not the answer to all problems; you should also be thinking about other areas of data protection, such as archiving, disaster recovery, and storage security.

What’s New in This Book

See preceding section. Seriously, this book has about 75 percent new material when compared with Unix Backup & Recovery. Some chapters in the first book were completely rewritten for this book.
Here are the highlights of those changes:

A new philosophy
  This book reflects my new backup philosophy, which is that it’s all about disk, especially for smaller shops.
New backup commands
  We’ve added ntbackup, ditto, and rsync to the basic utilities chapter.
Amanda
  The Amanda chapter is completely updated to reflect the developments of the past seven years.
Commercial utilities
  The commercial utilities chapter has been updated to reflect the advances in backup and recovery in the past seven years.
HP-UX
  The make_net_recovery and make_tape_recovery tools have changed, and so has the chapter covering them.
Backup hardware
  Boy, has hardware changed in seven years! Disk targets, virtual tape libraries, and data de-duplication systems. I cover it all.

There are eleven completely new chapters, significantly expanding the scope of this book. Here are the topics covered in these new chapters:

DB2
  How to back up DB2 using its built-in capabilities
Exchange
  How to back up Exchange using ntbackup
SQL Server
  How to back up SQL Server using its built-in capabilities
MySQL
  How to back up and recover MySQL databases based on the MyISAM, InnoDB, and NDB storage engines
PostgreSQL
  How to back up and recover this popular open-source database using either pg_dump or pg_dumpall
BackupPC
  How to use BackupPC, a completely disk-based backup and recovery system with a web frontend
Bacula
  How to use Bacula, an open-source backup product that roams the datacenter at night and sucks the vital essence from your computers
Near-CDP
  How to use snapshots and replication to make backups
Solaris
  How to do bare-metal recovery using flash archive
Linux and Windows bare-metal recovery
  How to use a Linux LiveCD or Ghost for Linux to perform bare-metal recovery of Windows and Linux operating systems
Mac OS X
  How to use the built-in, bare-metal recovery in OS (it isn’t too hard)

What’s Missing?

For various reasons, some chapters from Unix Backup & Recovery did not make it into this book.
All of the following chapters are now available online at http://www.backupcentral.com. The one challenge, of course, is that these chapters have not been updated. Therefore, we’ve put them in our wiki so that anyone who wants to help us update them can do so.

Tru64 Bare-Metal Recovery
IRIX Bare-Metal Recovery
Informix Backup and Recovery
Clearcase Backup and Recovery
High Availability

Speaking of BackupCentral.com

We’ve completely redesigned http://www.backupcentral.com using a content management system, forums, and MediaWiki. My number one goal for the new Backup Central is to make it much easier to provide you dynamic content and to build a strong community around backup and recovery issues. The new Backup Central has some really great features:

phpBB forums for various backup-related topics, including one for discussing the book. Come join the discussions.

A mailing list for each forum, allowing you to follow the discussions via the forum or email. Any posts in the forum are sent to the mailing list, and emails sent to the mailing list result in posts or replies in the forum.

A multidirectional connection between backup-related Usenet newsgroups, mailing lists, and phpBB forums. One of the things I was reminded of while writing this book is that Usenet is alive and well, and I want to bring this great resource to the Backup Central community and to create another portal into this underutilized resource. Each relevant Usenet newsgroup has an associated mailing list and forum, and all messages to Usenet, the mailing list, or the forum go to the appropriate forum, mailing list, and newsgroup.

A wiki based on MediaWiki, the same software that drives Wikipedia. One of the things you will find there is a wiki entry for every chapter in this book. We’re going to use these entries to update and further the ideas you find in this book.
We’re doing this for two reasons: The first reason is that one of the problems with writing a technical book is that the second you go to press, something changes. While we’re in the process of getting this book printed, MySQL will come out with three more storage engines, QTParted will probably support NTFS, and Bacula’s Windows Server will become generally available. We’ll use the wiki to keep things like this up to date. The second reason is that my contributors and I don’t know all the answers, folks. We did our best to come out with a solid book for you, but we haven’t seen everything you’ve seen. We’d love it if you help us further the ideas mentioned in this book, help us to explain the scenarios under which a given procedure won’t work, or how a given procedure should be enhanced. (For example, right now a friend of mine is trying to help me understand how to get rsync to better handle millions of files. His testing won’t be done in time for press. Put it in the Wiki, Jason.)

Join the new Backup Central and save the world—or at least its data. I’ll see you at http://www.backupcentral.com!

Conventions Used in This Book

The following typographical conventions are used in this book:
Constant width: Indicates command-line computer output, computer-generated messages, and code examples. It is also used when referring to commands and parameters in text.
Constant-width italic: Indicates variables in text.
Constant width bold: Indicates user input in command-line examples.
Constant width italic bold: Indicates variables in command-line examples.
Italic: Introduces new terms and indicates URLs, files, directories, hostnames, and file extensions.

How to Contact Us

We have tested and verified all the information in this book to the best of our ability, but you may find that features have changed (or even that we have made mistakes!). Please let us know about any errors you find, as well as your suggestions for future editions, by writing to:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international/local)
707-829-0104 (fax)

We have a web page for the book that lists examples, errata, or any additional information. You can access this page at: http://www.oreilly.com/catalog/9780596102463/

To comment or ask technical questions about this book, send email to: [email protected]

For more information about books, conferences, Resource Centers, and the O’Reilly Network, see the O’Reilly web site at: http://www.oreilly.com

Safari® Enabled

When you see a Safari® Enabled icon on the cover of your favorite technology book, that means the book is available online through the O’Reilly Network Safari Bookshelf. Safari offers a solution that’s better than e-books. It’s a virtual library that lets you easily search thousands of top tech books, cut and paste code samples, download chapters, and find quick answers when you need the most accurate, current information. Try it for free at http://safari.oreilly.com.

This Book Was a Team Effort

It’s true; my license plate does say MRBAKUP. But that doesn’t mean I know everything about backup and recovery. In fact, I’ve never even used some of the operating systems or database platforms covered in this book! It would be a disservice to you, the reader, for me to write chapters on those products—but I wanted the chapters in the book. So I hired a team of experts to write the chapters for you. Approximately 250 pages of this book were written by others, and contributors are recognized at the beginning of the chapter(s) they wrote.

Contributors

It’s not an easy thing to write a chapter in someone else’s book. Not only do you have to write, but you have to write based on someone else’s design. There are also tight deadlines, and the process is nothing but hurry up and wait. I couldn’t have done it without them, so please allow me to formally thank all of my contributors.
Amanda: Contributed by Dmitri Joukovski and Stefan G. Weichinger. Thanks for seeing this one through.
BackupPC: Contributed by Don “Duck” Harper. Quack!
Bacula: Contributed by Adam Thornton. Thanks for bringing Bacula to the book.
Near-CDP: Contributed by Michael Rubel, Ben Escoto, and David Cantrell. This chapter morphed a few times, and I appreciate your patience as it gelled in my head.
AIX bare-metal recovery: Contributed by Mark Perino. I think you are the fastest writer on the team.
HP-UX bare-metal recovery: Contributed by Eric Stahl and Ron Goodwyn. Great collaborative effort, guys.
Linux and Windows bare-metal recovery: Contributed by Reed Robins. Maybe we can do it this way, or that way, or that way! Did I change the scope of the chapter enough? Thanks.
Mac OS X bare-metal recovery: Contributed by Leon Towns-von Stauber. Thanks, Leon. I sure am glad Mario told me to give you a call. Your chapter was perfect.
Solaris bare-metal recovery: Contributed by Aaron Gersztoff. Be careful what you ask for, right, Aaron?
DB2 backup and recovery: Contributed by Jeff Richardson, Kulvir S. Bhogal, and Kondal Yennaram. You guys all came through in a pinch, and I’m most grateful.
Exchange backup and recovery: Contributed by Scott Harris. More pictures! Fewer pictures! Make it like this, no make it like that! Isn’t it fun writing for me?
SQL Server backup and recovery: Contributed by Scott Harris. Look at that, Scooter! You’re the only one who was crazy enough to write two chapters for me. Thanks.
Sybase backup and recovery: Contributed by Edward Barlow, who updated a chapter originally written by Bryn Smith. Another contributor I couldn’t have done without. Thanks.
Dump internals: Contributed by David Young. When are you going to move out here with your Mom?

Without these folks, this book would contain substantially less information than what you find here.

Technical Editors

Another group of people I must thank is our technical reviewers, and we had a lot of them!
The problem with writing a book with a scope this big is that you need specialists and tech reviewers as well. Because of that, most technical editors reviewed only one or two chapters. I couldn’t have done it without them. I’m sure I’ve missed a few, but here’s my best attempt at listing them all (alphabetically by first name): Adrin Kow, Andy Shellam, Anthony Johnson, Axel Schwenke, Ben Garrett, Brian Eliassen, Brian Peasland, Charles Whealton, Chris Thomas, Christoph Haas, Craig Barratt, D.A. Morgan, Dana Diederich, Daniel Callahan, Dave Mehler, David Boyd, Edward Conba, Eric Gilmore, Eric Stahl, Finn Henningsen, Frank Sweetser, Greg Lehey, Ian Gorrie, Ian Herd, James Bougor, Jayesh Thakrar, Jeff Badger, Jeff Frost, Jeff Harbert, Jeff Richardson, Jeffrey P. Humes, John Haight, John Hurley, John Madden, Kern Sibbald, Kumar Sundaram, Lenz Grimmer, Marcel Lans, Mark D. Powell, Mark Dawson, Mark Perino, Massimiliano Daneri, Matthew Huff, Megan Restuccia, Mike Harrold, Mohammed Mehdi, Neal A. Lucier, Norbert Munkel, Patrick Matthews, Paul Muggeridge, Ralph R. Hirtler, Rob Worman, Rodrigo Real, Satyaprakash Pandey, Scott Boss, Scott Harris, Shane Seymour, Simon Riggs, Steve Hanson, Stewart Smith, Tammy Bednar, Todd Toles, Víctor A. Rodríguez, Vitalis Jerome, Wil Coulbourn, William Cole.

Horror Stories

Also giving this book some flavor are those who contributed horror stories. Even if I couldn’t use your story in the book, I want to thank you for sending one in: Brian O’Neill, Brian Sakovitch, Chris Pritchard, David Bregman, David J. Young, Harry Tirrell, Hywel Matthews, Jack Coats, James Hunt, Jason Frankovitz, Jason Shupe, Jim Damoulakis, Jim Donnellan, John Merryman, Jorgen Lie, Karl Langdon, Kevin Suttle, Mark Perino, Michael Rice, Michael Tobin, Natalie Meek, Richard Ackerman, Scott Boss, Theo Van Dinter, William Birch, William S. Duncanson.

Special Mention

There were a few people who were extremely helpful in one way or another throughout this project. I’d like to send a special thank you to them.
Anthony Johnson: It’s not every CEO who would volunteer to tech review a chapter that is essentially a free competitor with his own product. I hope you and Storix do very well. You’ve got quite the Linux and AIX bare-metal recovery tool there.
Brian Peasland: You beat the living crap out of the first draft of the Oracle chapter, and rightfully so. The new chapter is significantly better thanks to your thorough and honest review. (You made me rewrite half of it, dude!)
Deb Cameron: Isn’t it fun editing a book with 18 authors from 3 continents, several time zones, a few different native languages, using about 60 technical editors? Let’s do it again sometime!
Joshua D. Drake: Thanks, Joshua, for spending the time you did with me on the phone to help me better understand PostgreSQL. Next time, you should be more forthright with your own opinions. A guy really doesn’t know where he stands with you.
Lenz Grimmer: You were my liaison in the MySQL community, and I definitely needed the help. Thanks to you and the whole MySQL team.
Lynn Stone: Thank you for helping get this project off the ground. I couldn’t have done it without you. Only I know your secret identity.
Tammy Bednar: For such a busy woman, you gave me exactly what I needed for the Oracle part of this project. You’ve obviously done a lot of work on the Oracle backup products, and it shows. I hope you’ll see my newfound respect for your products in the Oracle chapter.
Zmanda: Thanks for the support on the Amanda chapter, and for bringing commercial support to a very popular open-source backup tool.
Mario Obejas: Thank you so much for referring me to Leon.

I Don’t Know It All

If there’s one thing I learned while writing this book, it’s that I do not know everything there is to know about backups. If you have a better way to do anything described in this book, have learned any special tricks, or have written any neat utilities that you think would help other people do backups and recoveries, let me know.
Email me at [email protected]. Your tricks or utilities may be included in the next edition of the book and listed immediately on http://www.backupcentral.com. How Can I Say Thanks? How can I begin to thank the hundreds of people who helped me? To God: May any praise for this book go to You alone. To my wife, Celynn: I say “thank you” for the many nights you spent alone while I pounded away at my keyboard somewhere around the globe. You’re a special woman who never gave up on me or my dream. I love you. To my daughter, Nina: You were only seven when the first book came out. Now you’re a beautiful young lady who is growing up so fast. I’m going to have to get a gun and sit on the porch. To my daughter, Marissa: You were only two when the first book came out. Now you’re a beautiful nine year old—my, how time has flown. Let’s go to the park and ride our bikes together. To my parents: What can I say? You always believed in me. You always used to tell me, “I don’t care if you’re a ditch digger. Just be the best darn ditch digger in the world.” Well, being a backup guy is as close as you can get to being a ditch digger in the computer business, and I “wrote the book” on that. To Bob Walker for helping me get my first job in backups, and Ron Rodriguez for being all too eager to give it to me. To Susan Davidson, who didn’t fire me when I couldn’t recover that purchasing database in 1992: that second chance was all I needed to become the expert in backup that I am today. If you had fired me (and I’m sure a few people wanted you to), who knows where I’d be today. (If you’re curious about the story, look for the sidebar “The One That Got Away” in Chapter 1.) To Collective Technologies for helping me round out my skills enough to see that I wanted to specialize in backup and recovery, and for supporting me when I wrote the first book. To Jason Stege, Robin Young, Jeff Williams, Reed Robins, and Elia Harris: Thanks for believing in me when I started my own company. 
I hope I did right by you. To Mark Shirman and all my friends at GlassHouse: Thanks for giving me a place where I finally feel like I’m using my talents. To my wife’s family: Thank you for raising such a wonderful lady. Thank you for treating me as one of your own and supporting us on our quest. Pahingi ng sinagang? To all the teachers who kept trying to get me to live up to my potential: You finally got through. To O’Reilly: Thank you for the opportunity to bring this much-needed book to market. To Deb Cameron and Michael Loukides, my editors: We’ll have to actually meet one of these days! I don’t know how you do this, reading the same book over and over, without letting your eyes just glaze over. You’re great editors, and I could really tell that you put your all into this project. Thank you, thank you, and thank you. (Now don’t edit that sentence, OK?) To the reader: Thank you for purchasing this book. I hope you learn as much reading it as I did writing it. Part 1. Introduction Part I consists of the following two chapters: Chapter 1, The Philosophy of Backup Describes why you should back up, and a little bit about how to do it. Chapter 2, Backing It All Up Goes into detail about the essential elements of a good backup and recovery system. Chapter 1. The Philosophy of Backup I back up; therefore, I will be. When I look at the title of this chapter, I think about the old Steve Martin stand-up routine in which he said that in philosophy class, “you learned just enough to screw you up for the rest of your life.” (Steve studied the important questions, like “Is it OK to yell ‘movie’ in a crowded fire house?” I promise not to do that.) However, “The Philosophy of Backup” did seem like an appropriate name for this chapter, since we’re going to talk about the why of backup. (We’ll also talk a little about the how, of course.) Champagne Backup on a Beer Budget A good backup and recovery system is essential for a company of any size. 
Unfortunately, IT doesn’t always get the budget it needs, and the backup system almost never gets the money that it needs. Well, if you agree that you need a very good backup system, but you don’t have enough money to pull that off, know that this book was written with you in mind. You need champagne backup on a beer budget. Welcome to the club. Just because you have a small budget doesn’t mean you have to do without backup. Most of the backup systems in this book can be implemented in small environments for a few hundred dollars—including hardware.

Tip
Don’t worry, enterprise customers—there’s plenty in here for you as well. The more you use the techniques taught in this book, the more money you can save for other IT projects. By the time you’re done implementing all the ideas in this book, hopefully my next book will be done, which will be right up your alley. It will cover nothing but commercial data protection solutions, including multiplatform commercial backup and recovery systems, continuous data protection, near continuous data protection, data de-duplication backup systems, replication, and the like.

Now that you’ve read this far, you may find yourself asking questions like these:
Why should I read this book?
Can I really back up with open-source backup software?
Why should I be using disk?
Why should I back up at all?
How do I find a balanced way to back up (wax on/wax off)?
Let’s get started answering these questions.

Why Should I Read This Book?

If you’ve been doing system administration for some time, you may be asking yourself this question. There are many answers. Perhaps self-preservation is your primary motivator. You’d like to make sure you don’t lose your job the next time a disk drive dies. Perhaps you’ve already got a decent backup system and you’d just like to make it better. Maybe you are looking for some new ideas about how to deal with upcoming backup and recovery needs.
What follows are some of the reasons I think you should read this book.

Schadenfreude

Schadenfreude is a German word that means to take joy in the misfortunes of others. It’s why we watch those weird videos on the Internet where some idiot tries to do something stupid and ends up hurting himself. Each of the sidebars in this book is a true horror story that really happened to someone I know. These are not urban legends or horror stories passed on from admin to admin. These are firsthand encounters with disaster. There’s a schadenfreude element to reading these stories, of course. But each story also makes a point, and it was not just made up to make that point. The things that I warn about in this book really happen. This can be a very tough job if you are not prepared, so read closely. You might want to start by reading the sidebar “The One That Got Away” later in this chapter. It’s the story of the defining moment in my career.

The One That Got Away

“You mean to tell me that we have absolutely no backups of paris whatsoever?” I will never forget those words. I had been in charge of backups for only two months, and I just knew my career was over. We had moved an Oracle application from one server to another about six weeks earlier, and there was one crucial part of the move that I missed. I knew very little about database backups in those days, and I didn’t realize that I needed to shut down an Oracle database before backing it up. This was accomplished on the old server by a cron job that I never knew existed. I discovered all of this after a disk on the new server went south. “Just give us the last full backup,” they said. I started looking through my logs. That’s when I started seeing the errors. “No problem,” I thought, “I’ll just use an older backup.” The older logs didn’t look any better. Frantic, I looked at log after log until I came to one that looked as if it were OK. It was just over six weeks old.
When I went to grab that volume, I realized that we had a six-week rotation cycle, and we had overwritten that volume two days before. That was it! At that moment, I knew that I’d be looking for another job. This was our purchasing database, and this data loss would amount to approximately two months of lost purchase orders for a multibillion-dollar company. So I told my boss the news. That’s when I heard, “You mean to tell me that we have absolutely no backups of paris whatsoever?” Isn’t it amazing that I haven’t forgotten its name? I don’t remember any other system names from that place, but I remember this one. I felt so small that I could have fit inside a 4 mm tape box. Fortunately, a system administrator worked what, at the time, I could only describe as magic. The dead disk was resurrected, and the data was recovered straight from the disk itself. We lost only a few days’ worth of data. Our department had to send a memo to the entire company saying that any purchase orders entered in the last two days had to be reentered. I should have framed a copy of that memo to remind me what can happen if you don’t take this job seriously enough. I didn’t need to though; its image is permanently etched in my brain. Some of this book’s reviewers said things like, “That’s pretty bold! You’re writing a book on backups, and you start it out with a story about how you messed up. Some authority you are!” Why did I include it? Through all the years, and all the outages, this one sticks in my mind. Perhaps that’s because it’s the only one that almost “got me.” Had it not been for the miraculous efforts of a wonderful administrator named Joe Fitzpatrick, my career might have been over before it started. I include this anecdote because:
It’s the one that changed the direction of my career.
There are several valuable lessons that I learned from it, which I discuss in this book.
It could have been avoided if I had had a book like this one.
You must admit that it’s pretty darn scary.

You Never Want to Say These Words

“We lost only a few days’ worth of data.” In the sidebar “The One That Got Away,” I said that we lost only a few days’ worth of data. I swore the day I said these words that I would never say them again. From that day forward, I was convinced of the importance of backups. I never again assumed anything, and I began to study everything I could about backup technology. This book represents my attempt to compile what I have learned about inexpensive backups into a single volume, and it is written so that no one who reads it should ever need to utter the preceding statement. In my opinion, no amount of data loss is acceptable. I would also wager that you would be hard-pressed to find an end user who would feel much different. Whether it’s a spreadsheet that one person created or a customer database representing hours or days of sales invoices and the efforts of hundreds of people—ask the person who needs the data how much data loss they think is acceptable. Every statement, every opinion, every story, and every chapter in this book is based on the premise that any data loss is unacceptable. Let me state that again for emphasis.

Tip
With the technology that is now available, there is no reason for any data to be lost—that is, if backups are given the proper attention and priority that they need.

You’re Curious About Open-Source Backup Products

Just a few years ago, you could perform your backups with a few scripts and dump, tar, cpio, or ntbackup. The demand for midrange computers grew astronomically, and the need for bigger databases, larger drives or filesystems, long filenames, and long pathnames grew proportionally. These large databases and filesystems started shipping, which then created a large market for commercial backup utilities, and one or two such products emerged; scores of others eventually followed.
Some of these early products were just GUIs and volume management built on top of existing native backup utilities to provide enhanced levels of functionality. Other companies felt that these native utilities had many limitations that could not be fixed without abandoning them altogether. Those companies chose to develop custom, even proprietary, backup methods. They attempted to overcome the limitations that products based on dump and tar could not overcome. In recent years, the demand for centralized backup and recovery has also given rise to a number of open source backup and recovery tools, six of which are covered in this book. The open-source backup market followed a pattern similar to the commercial products mentioned. The original open-source backup product, Amanda, is a wrapper around the native utility of your choice. BackupPC leaves data in its original format, and Bacula uses a custom format designed to overcome the limitations of GNU tar. There are now a number of choices in the open-source backup market. It’s quite possible that one or more of the open-source products covered in this book can meet your backup and recovery needs. This book is currently the only resource that covers all of these tools in a single place. You Want to Learn About Disk-Based Backup If you haven’t heard of disk-based backup or disk-to-disk-to-tape (D2D2T) backup, then it’s time to turn off the digital video recorder (DVR) and pick up a trade magazine or two. (Of course, your DVR is nothing more than disk-based backup of your TV. And if you’re occasionally making VHS tapes of your DVR shows, it’s even a D2D2T system.) The use of disk in backup and recovery systems has exploded in the last few years, and it’s really solving a lot of problems. Chapter 9 covers backup hardware and goes into much more detail about why disks have become a very attractive backup target. 
Here is a quick summary of some of those reasons:

Cost: The biggest reason that disk has become such an attractive backup target is that the cost of disk has been dramatically reduced in the last few years. A reasonably priced disk array now costs approximately the same as a similarly sized tape library filled with media. When you consider some of the things you can do with disk, such as eliminating full backups and redundant files, disk becomes even less expensive.

Reliability: Unlike tapes, disks are closed systems that aren’t susceptible to outside contaminants. In addition, the actual media of a hard drive is, well, hard when compared to a piece of tape media. The result is that an individual disk drive is inherently more reliable than a tape drive. Disk drives become even more reliable when you put them in a RAID array.

Flexibility: Generally speaking, tape drives can only go two speeds: stop and very fast. Yes, some tape drives support variable speeds. However, they can usually only slow down to about 40 percent of the rated speed of the drive. Disk drives, on the other hand, work at whatever speed you need them to go. If you need to go a few hundred megabytes per second, put a few drives in a RAID group, and blast away. Then if you need that same RAID group to write at 10 KB/s, go ahead. Unlike tape drives, disk drives have no problem writing slowly, then quickly, then slowly, then.... You get the picture. This makes disk a perfect match for unpredictable backup streams. Once all that random data has been written in a serial fashion on your disk device, the disks can easily stream backup data to tape—if that’s what you want to do. Some people are forgoing that step altogether and replacing it with replication. Try doing that with a tape drive.

Disk-based backups are also an extremely economical way to bring completely automated backups to small and medium businesses (SMBs).
While a large tape library can be very inexpensive (on a dollars-per-gigabyte basis) and very expandable, the same is not always true of smaller libraries aimed at the SMB market. The big challenge is expandability. The less expensive a tape library is, the less expandable it usually is. (There are always exceptions, of course.) By comparison, some of the completely automated open-source backup products mentioned in this book can be used with a single disk drive costing less than $100. If you need to expand beyond that, just buy another disk and add it to your volume manager. You can also buy RAID controllers that allow you to start with one disk and add more as your needs grow. You can use this method to expand from hundreds of gigabytes to many terabytes of capacity.

Why Back Up?

I’ve heard it all. I’ve been accused of caring only about backups. It’s been said that I think the whole world revolves around a cartridge reel. I’ve said that someday the world’s going to crash, and I’m going to have the backup. The question is: how serious are you about protecting your data? To help you come to a decision on this matter, let’s talk about what happens if you don’t have good backups.

That’s Not What I Meant!

I was administering a QA group for a major software company. When the software did its setup, it created a directory in $HOME/foo so the software could install the user portion. A QA person was doing the install under root. The software created the directory $HOME/foo; literally, $HOME was the directory name. He submitted the bug fix, and decided to get rid of the useless directory. You’ve probably guessed it:
# rm -rf $HOME
(This was on a standard Unix system where $HOME for root was still /.) Once I finally stopped laughing, I went and got the install media to rebuild the machine (no jumpstart image for that one, unfortunately, nor any backups for the various QA servers). Fortunately, most of the critical data was on the NFS server.
William Birch What Will Lost Data Cost You? To answer this question, you need to consider what kind of data you are backing up. This is a perfect time to include people who may not consider themselves computer people. Get input from other departments to answer this question. When all those 1s and 0s come together, just what kind of information are we talking about? Do you use manual accounting methods or are your company’s financial records stored in some accounting software somewhere? When a customer calls in and orders something, do you jot that down on a carbon-copied order form or do you enter it in some sort of order processing program? What about things like budgets, memoranda, inventories, and any other “paperwork” that you throw around from day to day? Do you keep copies of every important memo that you send, or do you depend on the computer for that? If you’re like most people, you have grown quite dependent on these things we call computers. You forget how much of your work has been saved in the form of little magnetized bits spread out across a bunch of spinning platters. Maybe you work in an environment in which you’ve never lost a disk, so you’ve never had to do a restore. Maybe you’ve never fat-fingered a key and deleted an important file. If that’s the case, remember what my dad used to say: “motorcycle riders come in two types—those who have fallen and those who will fall.” The same is true of disk drives. If you’ve never had a failed disk drive, trust me, your turn is coming! So what would you lose if you lost data? To quantify this, we need to examine the types of information that may reside in your environment and what would happen if you lost each type of information. Most of what you could lose is very tangible—and quantifiable in monetary terms—and it might surprise you. Lost customers This is quite possibly the most tangible and most devastating of all losses. 
If your entire customer database is on a computer somewhere, how will you know who they are if that computer dies? You might actually “lose” your customers and never find them again. You could also lose customers who depend on data that is on one or more of your computers; if the customer finds out that you have lost his data, he will undoubtedly be less than impressed with you. The degree to which this data loss affects him may not even be relevant to him; he knows that you lost his data, and he might leave just because he no longer feels your company is competent. Orders Whatever service or product your company provides, you have some way to keep track of requests for that product or service. Again, chances are that the method is computer-based. Data loss may mean several hours, days, or even weeks of lost orders. These may be orders that your salespeople worked very hard to get! Morale Think about how you would feel if you were one of the salespeople whose orders were lost. You spent days or weeks working on sales, and now they’re gone forever. Maybe you should go somewhere where your hard work doesn’t go to waste. The better the salesperson, the better the chance that she may jump ship if you lose her sales. What about the average employee? If your computers have a reputation for going down and a reputation for losing data, it gives the employees a feeling of helplessness. Maybe they should go somewhere where they have the proper equipment to do their jobs. Image What about your standing in the industry? News of a major data loss undoubtedly spreads. This news may get to competitors, whom you can trust to use it against you at any opportunity. The news may also get to a regulatory agency that is in charge of your type of company. For example, if you work for a U.S. bank, it would be a terrible thing for the Office of the Comptroller of the Currency (OCC) to find out that you had a major data loss. They may decide to take a really close look at your affairs. 
Nobody wants that kind of attention! Budget It takes only one story of lost data to give your computer department an internal reputation for data loss. Try as you might to get rid of it, that reputation may stay for a while. You’re only as good as your last restore. (A friend of mine said, “You’re only as good as your worst restore.”) If people don’t trust your backups, they will duplicate your backup efforts. Employees will spend time and money backing up their systems locally. Each person may decide to buy his own backup drive and backup software or even to come up with his own in-house script. Their backups will be inefficient and costly at best, and may subject them to further data loss at worst. When everybody takes matters into their own hands, you can lose quite a bit of money in people-hours and extra hardware. Time How many people are supporting your computers? How much of their efforts will you lose if your development system loses data? I know of many companies that have numerous contract programmers writing code all the time. If the system that houses their work loses their code, how much money will you have wasted? In fact, no matter what department you look at, if they do their work on a computer, and you lose that data, you can lose considerable time and money. What Will Downtime Cost You? When planning your backup and recovery program, you may have several options that affect the speed of the recovery. The faster the recovery, the more the backup system will cost you. What you must ask yourself before deciding on these types of options is, “What will downtime cost?” When thinking about this, I’m reminded of a copier machine commercial from a few years ago: “When your copier goes down, do people just say, ‘That’s all right, we’ll just use carbon paper!’?” If one of your main systems goes down, can your people continue working, or does your entire company come to a standstill? 
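One rough way to put a number on that question is to multiply the people idled by an outage by what they cost per hour, and then add whatever revenue stops coming in while the system is down. The sketch below does only that arithmetic. The function name, its parameters, and every figure in the example are hypothetical placeholders rather than numbers from this book, and a real estimate would also have to allow for costs that are harder to price.

```python
def downtime_cost(idle_employees, hourly_cost_per_employee,
                  lost_revenue_per_hour, hours_down):
    """Rough cost of an outage: idle salaried staff plus lost revenue.

    Every input is an assumption you supply for your own environment.
    """
    idle_labor = idle_employees * hourly_cost_per_employee * hours_down
    lost_revenue = lost_revenue_per_hour * hours_down
    return idle_labor + lost_revenue


if __name__ == "__main__":
    # Hypothetical example: 50 salaried people idle for 4 hours at $40/hour,
    # plus $1,000/hour in orders that cannot be taken.
    cost = downtime_cost(50, 40, 1000, 4)
    print(f"Estimated cost of outage: ${cost:,}")
```

Running the same calculation for a four-hour recovery and for a one-day recovery gives a starting point for deciding how much a faster recovery option is worth paying for.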
If it comes to a standstill, are your people salaried, so that sending them home saves you no money? Here are some additional costs to consider:

Customer perception
A customer hates to hear, “Please call back; our computers are down,” or “Connection not responding.” Depending on your type of business, they might just decide to go elsewhere. The longer your systems are down, the more customers will hear this message.

Employee perception
Nobody wants to work at a company where the computers are always going down. The more your employees depend on your systems, the truer this becomes. If you were a salesperson who couldn’t use your contact database for a day or so, how happy would you be?

Time
Again, you lose time. You lose headway, and your salaried employees who depend on the system that is down are effectively being paid to do nothing.

Wax On, Wax Off: Finding a Balance

Using a system that has no backups is like driving a car 100 miles an hour down a busy road the day after your insurance policy expires. Likewise, having a three-node, highly available cluster for a noncritical application is like having full coverage on your 20-year-old fifth car. Just as insurance plans have different levels of coverage and riders to cover various types of damage, different backup methodologies provide different levels of recoverability.

That Was Close
One memorable moment was when we had a 600 GB file server that hadn’t been properly backed up in a while. During a particularly hot weekend, both A/Cs handling the room failed, and temperatures soared. We shut everything down, waited for the A/Cs to be fixed, and started things back up after it cooled off a bit. Sure enough, two disks, physically next to each other in the same RAID4 array, failed. We were narrowly able to avoid total data loss by finding a spare disk and swapping control boards between it and one of the failed drives, which let it spin up and be accessed.
We had the vendor courier us replacement drives the next day and then spent a lot of time fixing the backup server.
Theo Van Dinter

Don’t Go Overboard

Not all environments need up-to-the-minute data recoverability. For many environments, recovering the systems up to last night’s backups is acceptable. For some environments, recovering the system even up to last week or month is OK. Spending thousands of dollars and hundreds of hours implementing the greatest backup solution in the world is a waste if you don’t need that level of coverage. This usually is not the problem for most sites; on the contrary, most sites don’t spend nearly enough money or effort on their backup and recovery systems. In other cases, however, money may be wasted on unnecessarily elaborate systems.

Recoverability requirements also vary from machine to machine within the same company. The amount of work that would be lost, or the possibility of adversely affecting a customer, may determine these requirements. For example, it may be considered acceptable for an employee or two to lose a day’s work spent on a few word processing documents. That is, unless it was the Senior Vice President’s assistant who was working on the departmental budget, in which case your mileage may vary. And, it would probably be totally unacceptable for you to lose even one hour’s worth of entries into a companywide sales database used by hundreds of people.

The point is that your backup requirements are determined by your recovery requirements. The difficulty comes in finding and using a tool capable of providing the level of recovery that you need.

Consider users’ home directories for a minute. If they are local to each user’s workstation, a loss of one user’s disk in the afternoon would mean that one user would lose a few hours of work.
However, if user directories are located on an NFS file server that serves thousands of users, you could potentially lose several thousand hours of work if you use only traditional backup tools.

Tip
If the loss of a networked file server is unacceptable, you might want to consider snapshot technology. Snapshot software allows you to take a “picture” of your drive or filesystem at a single point in time and then use that picture to back up that drive or filesystem. If the backup references the drive or filesystem via this snapshot, it will back up a consistent picture of the drive or filesystem as it looked at the time the snapshot was taken. If this kind of functionality is interesting to you, you might consider reading Chapter 7, which describes emulating snapshot functionality with rsync and hard links.

Sometimes the tool you need comes with your operating system or database platform, but it’s just not being used properly. Sometimes backup tools aren’t being used at all. For example, if you have a production Oracle database, combining nightly hot backups with archived redo logs provides you up-to-the-minute recoverability. However, if you lose a disk that is part of a database that doesn’t back up its transaction logs, you will lose all work since the last cold backup. See Part V for more information.

Warning
If you have a production instance of any kind and are not using the transaction logging feature of your database engine, turn on logging as soon as possible!

Therefore, while it is necessary to find the appropriate utility to give you the degree of recoverability that you require, it is also necessary to use it.

Get the Coverage That You Need

Some environments cannot afford even one minute of downtime, and they should pay for the best backup coverage, whatever it costs.
This is because of the great loss that they will incur if they ever lose their systems for even a short period (I know of one company that claims that it loses over $1 million a minute when its systems are down). On the other hand, if you are in an environment that can afford downtime, then spending huge amounts of money for an immediately available hot site[1] is a complete waste of money. Consider Table 1-1.

No one should depend on a car, or a computer, without having at least the basic level of coverage. If the only car that you own is uninsured and a drunk driver runs into you and totals it, how would you recover from such a loss? Similarly, if your computer systems have critical information stored on them, how will you recover when a hard drive crashes and all that data is lost?

What some people forget is that the opposite of this equation is true as well. If you have a third car that happens to be a 20-year-old (nonclassic) car, you will probably get only liability coverage on it; you could live without that car if it were destroyed today. Spending hundreds of extra dollars a year to insure a $50 car just doesn’t make sense. Likewise, if the computers that you are managing are in an environment in which you can do without them for a few days, do you really need hot-swappable, mirrored drives? Pick an appropriate level of protection for your environment.

Table 1-1. Comparing automobile insurance and data protection

Types of coverage | Automobile insurance | Computer backups
Minimum coverage | Collision and liability (just keeps you from losing your shirt if you run into someone). | Regular nightly backups (keeps you from losing your job when a disk drive dies)
Unexpected disasters | Comprehensive coverage (vandalism, acts of God, etc.). | Journaling filesystems; Uninterruptible Power Supplies (UPSs)
Get me driving now | Rental car coverage (you get a car if your car is in the shop due to an accident). | RAID; Mirroring; Using hot-swap drives; High availability (HA) system
Major disasters | Another company will pick up your policy and replace your car if both your car and your insurance company are destroyed in an earthquake. | Sending copies of your backup volumes to off-site storage, in case both your computer room and media library are destroyed; Sending your backups via a dedicated network to a large storage system at your off-site storage vendor
Maximum protection | The insurance company not only agrees to the conditions listed earlier, but also agrees to store another car of the same model in another state that you can use at any time if all cars in your state are destroyed. | Real-time mirroring to a hot-swappable system at another of your sites; Sending your backups via either network or courier to a hot-site vendor

You need to balance the cost of a particular backup implementation against the projected monetary loss of the outage from which it protects you. For example, assume that you are evaluating two backup choices. The first option involves sending copies of your backup volumes to an off-site vendor for storage at a cost of $500 a month. The second option is an immediately available standby machine in another city that receives up-to-the-minute replication data from your production machine; let’s say this option costs you $5,000 a month.

Your company is located in Utopia, where no natural disasters ever occur, your disks are all mirrored, and you have determined that a day’s worth of downtime would cost only $500. Do you really want to spend $60,000 a year to protect against something that will probably never occur? If something catastrophic happened to your datacenter, wouldn’t the day-old, off-site copies serve just as well? Your company would suffer an extra day or so of downtime, but you have already determined that this is affordable. The $6,000-a-year solution is probably much more appropriate for this environment.

However, are you protecting yourself from everything that you should be?
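Before answering that question, it may help to see the arithmetic behind the Utopia example written out. The sketch below uses the monthly fees and the daily downtime cost given in the text. The disaster frequency (one datacenter loss every ten years) is a made-up assumption for illustration only, and the function and constant names are hypothetical.

```python
# Figures taken from the Utopia example in the text.
OFFSITE_MONTHLY = 500        # off-site volume storage, dollars per month
STANDBY_MONTHLY = 5_000      # replicated standby machine, dollars per month
DOWNTIME_COST_PER_DAY = 500  # cost of one day of downtime, dollars
EXTRA_DOWNTIME_DAYS = 1      # "an extra day or so" when restoring day-old copies

# Hypothetical assumption, NOT from the text: one datacenter loss per decade.
YEARS_BETWEEN_DISASTERS = 10


def annual_cost(monthly_cost):
    """Convert a monthly fee into a yearly one."""
    return monthly_cost * 12


def expected_annual_loss(cost_per_day, days, years_between_events):
    """Average yearly downtime loss if an event occurs once every N years."""
    return cost_per_day * days / years_between_events


offsite_annual = annual_cost(OFFSITE_MONTHLY)
standby_annual = annual_cost(STANDBY_MONTHLY)

# What the standby machine costs over and above off-site storage.
extra_premium = standby_annual - offsite_annual

# The downtime loss the standby machine would save in an average year.
avoided_loss = expected_annual_loss(
    DOWNTIME_COST_PER_DAY, EXTRA_DOWNTIME_DAYS, YEARS_BETWEEN_DISASTERS
)

print(f"Off-site storage: ${offsite_annual:,} a year")
print(f"Standby machine:  ${standby_annual:,} a year")
print(f"Extra premium:    ${extra_premium:,} a year")
print(f"Avoided loss:     ${avoided_loss:,.0f} a year (assumed frequency)")
```

Under these assumptions, the standby machine costs $54,000 a year more than off-site storage in order to avoid about $50 a year in expected downtime. Changing the assumed frequency or the daily downtime cost shows how quickly the balance shifts for a less fortunate environment.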
Are you in an area that is prone to natural disasters and yet have no protection against that sort of event? Maybe you need to consider a different type of off-site storage. If you have a customer base that needs the data on your computers on a regular basis, have you provided for quick recovery in case of a failure? Perhaps you should be considering a hot site or multiple-site mirroring of your database servers. Table 1-1 provides a good overview of the various levels of coverage.

Why the Word “Volume” Instead of “Tape”?

Most backup utilities were originally written to back up to tape. Therefore, most books and online manuals talk about backing up to tape. However, many people are backing up to CDs, magneto-optical disks, or even disk drives. These media types have many advantages, because they act more like disk drives than tape drives. Random access of backup data is easier and you can read them using any block size you wish, because they do not record interrecord gaps like tape drives do.

Since many people no longer use tape, this book uses the more generic word volume whenever appropriate. You’ll also find the term backup drive instead of tape drive. Again, that is because the backup drive could be a CD burner or a disk drive. The book uses the words tape and tape drive only when they are necessary and appropriate.

Tip
BackupCentral.com has a wiki page for every chapter in this book. Read or contribute updated information about this chapter at http://www.backupcentral.com.

[1] A hot site is a place where you have computers standing by to do an immediate recovery of your environment.

Chapter 2. Backing It All Up

Now that the philosophy lessons of Chapter 1 are over, it’s time to look at some of the important concepts behind backup and recovery, such as what to include, when to back it up, and more.