Transcript for:
Data Backup Strategies

As we've mentioned a number of times in this course, it's important that you have a strategy for backing up all of your systems. This can really save the day when a document goes missing or a spreadsheet gets corrupted, and if you lose all of your data on a system, you can simply restore from your latest backup.

There are a number of things to consider when you're deciding on a backup solution. One of them would be the total amount of data that you want to back up; it's very different backing up 100 megabytes versus 100 terabytes. What type of backup do you want to use? We'll go through a number of options in this video today. What backup media do you plan on using? Where will this information be stored? What type of software will be used for this backup and, perhaps more importantly, for the restoration of this data? And should you perform this backup every day of the week, or should it be once a week? We'll talk about different strategies for backups and more in this video.

If you simply want to back up every bit of data on a system to one single backup set, then you're going to perform a full backup. As the name implies, this backs up everything on your computer, so if you lose all of your data, you can go back to your full backup and restore everything to your system. Since you are backing up every bit of data on that computer, including the operating system, your user documents, and your applications, this is one of the longest processes for backups. You have to be able to transfer all of that data off of your system and onto some type of backup media. Because of the long time frames associated with a full backup, it's difficult to perform one of these every single day. Not only does it take a long time to perform this backup, it's also taking up a large amount of storage space every time you perform this full backup.

Of course, there are other ways to back up data instead of performing a full backup every day. One of these methods would be a differential backup. The first day of a differential
backup looks identical to performing a full backup. On that first day, you take that full backup of the entire system. On the second day, however, you do not take a full backup. Every backup that occurs after that full backup contains only information that was changed since the full backup. This means on the second day there will probably be a small amount of data that had changed in the last 24 hours. On the third day, you're backing up everything that has changed since your full backup. On the next day, you're also backing up everything that has changed since that full backup. So you'll notice each day your differential backup gets a little bit larger as you begin to back up more and more changed information.

If you need to restore from this differential backup, you will need two different backup sets. You will need the full backup that you originally did, and then you will need your last differential backup. Together, those contain everything that was there when you started the backup process and everything that had changed since that time.

Let's say that we take a backup every single day, and we'd like to implement a differential backup. We'll start our first backup day on Monday with a full backup. On Tuesday we'll take a differential backup, which will only be the information that's changed since Monday. On Wednesday we take another differential backup. This will obviously include the information that was changed on Tuesday, and will certainly contain everything that has changed since the original Monday full backup. On Thursday we take another differential backup, which is everything that has changed since Monday. And then, later on Thursday, we might decide that we need to restore everything on the system. To be able to restore using a differential backup, you will need the original full backup that we made on Monday and the last differential backup, which in this case was made on Thursday.

Another popular backup type is an incremental backup. An incremental backup
starts the same way that a differential backup starts, by taking a full backup on the first day. But that's where things change. With an incremental backup, we back up only what has changed since the last backup, whether that was the first full backup or the last incremental backup. This means every day the backup size will be a little bit different. It might be larger or smaller than the previous day, depending on what may have changed during that time frame. To restore all of the data on the system, you would need the full backup, and then you would need every incremental backup that was made since the full backup.

Let's take the same scenario where we will perform a backup every day, but in this case we're going to do an incremental backup. On Monday we start with our full backup. On Tuesday we take a backup of everything that's changed since the full backup. On Wednesday we back up all of the data that's changed since the last incremental backup. On Thursday we also take a backup of everything that's changed since the last incremental backup. Restoring the Monday full backup and then the Tuesday, Wednesday, and Thursday incremental backups in order will allow us to restore all of the most recent data to our system.

Earlier in this video, we described a full backup as taking a great deal of time. But what if you could create a full backup without actually taking a full backup? That's the idea behind a synthetic backup. As we've already seen, our differential and incremental backups take a full backup on the first day and then take a different amount of data on the subsequent days. A synthetic backup will take all of this information and combine it to create a full backup from everything that you already have. This means that you don't have to spend more time creating a brand new full backup on Monday. You can simply take the backups that you already have and create a full backup synthetically. Since we're not taking a complete full backup, where we're transferring all of this information across the network, we are saving a lot
of bandwidth, and we're certainly saving a lot of time. This is a very easy way to create a full backup from all of the incremental or differential backups that you've already done.

Let's now use our same scenario to create a synthetic backup. On Monday, the first day, we need to take a full backup. Let's assume in this case that this organization uses incremental backups; you could, of course, create a synthetic backup from differential backups as well. On Tuesday we take an incremental backup of everything that's changed since the last full backup. On Wednesday we take another incremental backup that takes everything that's changed since Tuesday. On Thursday we take another incremental backup, which takes everything that has changed since Wednesday. And then on Friday we put all of this information together. We only take the latest version of each file that we've backed up, and we create this synthetic full backup by simply combining all of the backup sets that we already have.

Here's a summary of these different backup types. We'll start with the full backup, which obviously takes a backup of all of the data. Backing up this data takes quite a bit of time, so we'll put that in the high category. But restoring everything means that we're simply copying over all of the data, so it has a relatively low restore time due to that single backup set.

A differential backup is going to back up all data changed since the last full backup. Since each day is only backing up the information that was changed since the last full backup, we can mark this as a moderate backup time and a moderate restore time. One of the advantages of differential, of course, is that restore time is kept fairly low because we only need two backup sets: the full backup and the last differential backup.

An incremental backup copies all new files and files that have been modified since the last backup. This means that our backup time frames every day are relatively low, but when we need to restore, the restore time is relatively high. That's
because we will need the full backup and all of the incremental backups to be able to perform a full restoration.

And since a synthetic backup contains exactly the same information as a full backup, all data on that system is backed up with a synthetic backup process. Since a synthetic backup is creating a full backup from backup sets that already exist, it has a relatively low backup time, and a relatively low restore time because everything is in one single synthetic full backup.

You may think that after performing your backups your job is done, but in reality your job is only half done. We know that we've backed up the data, but are we able to restore the data? That is an important step of any backup process. It might be a good idea to perform some type of recovery testing. You can pick a particular document that is stored on a particular server and go through the process of restoring just that document to your recovery system. This will either confirm that our backup and restoration process is working, or it might uncover a problem with the restoration process that we need to resolve. Once we have our restored file, we can test it and make sure that everything in that document is exactly the same as the original document that was backed up.

It's usually a good idea to perform audits of your backup process. You never know what documents may have changed or what part of the backup process may have been altered, so checking this on a regular basis should be part of your normal processes.

Now that we've backed up our data and we've tested the restoration process, we need to think about how we're going to perform that restoration. There are different options that you can choose during the restoration process. One of these would be to restore everything to exactly the same location where it originated. This is an in-place restoration, where you are overwriting any data that might already be on that system with the information that's contained in the backup. For
example, if you were performing a re-imaging of a system using those backup files, you are very often performing an in-place restoration. But you might also be concerned that your in-place restoration might overwrite important data that has been changed since the backup was made. In that case, you might choose to perform your restoration to an alternate location. This restoration option restores the files to a separate location instead of overwriting anything that might already be on that original system. That way you can maintain all of your existing files and have the restored backup data stored in a different location that you can then copy over later if you need to.

Many organizations will perform on-site backups. This means both the backup systems and the data are contained within the same facility. This gives you high bandwidth between the backup system and the data itself, and if you need to have access to those backup tapes for restoration, all of that information is locally available. Since all of your data and systems are already running at this location, you usually don't have to pay anything extra to maintain your backup systems at that same site.

Some organizations will opt for an off-site backup, where the information is stored somewhere other than your location. You're transferring data, usually over an internet connection or a high-speed wide area network link. Since your data is located somewhere outside of your current location, you're now protected against any disaster that might occur to your existing building. If your building becomes a victim of a fire or a flood, you can simply move everything to a different site and restore all of that off-site backup data from anywhere in the world.

There are obviously advantages and disadvantages to both on-site backups and off-site backups, and many organizations will use both of these to some degree so that they can take advantage of those backups in either of those scenarios.

We often think of backups as a single
monolithic piece of data, and that piece of data has no relationship with any other data or other backups that you're doing. But in reality, you're often taking different backups that contain different data, and you're taking these backups on different days or different weeks. One common strategy for timing and layering these backups is called GFS. That stands for grandfather-father-son.

We start with creating the grandfather backup. This would be a single full backup that occurs once a month. This would be 12 monthly full backups in a single year, and these might be the backups that we send off to offline storage so that we can retrieve them if something happens to our building. Now we can focus on a weekly backup. If you're doing weekly backups, you might need four or five of those in a single month. We refer to these backups as the father backup. So if our grandfather backups are once a month and our father backups are once a week, then our son backups are going to occur every single day. So we might have up to 31 daily incremental or 31 daily differential backups in a month that we refer to as the son backup.

Here's an example of a month, and you can see where I've overlaid son backups, father backups, and grandfather backups. You can also change the time frames associated with these rotations. Perhaps the son backup is taken every hour, your father backup is taken every day, and the grandfather backup is taken every week. On my calendar, though, I've created a different schedule, where the grandfather is taken on the 31st of each month, the father backup takes place every Monday, and the son backups are taken every Monday through Friday.

Regardless of the type or interval between these backups, we also need to think about where these backups will be stored. One good strategy that many organizations follow is the 3-2-1 backup rule. With this rule, three copies of your backup data should always be available. This means you could have one primary copy and two backups, or any other
combination, so that we get to three separate and unique copies of this backup data. The number two in the 3-2-1 backup rule is two different types of media. Your backup could be taken on a local storage drive, on tape backup, or on a NAS. So if your organization is following the 3-2-1 backup rule, they may be storing one copy of the backup data on a local drive and other copies of the backup data on tape. And the one in the 3-2-1 backup rule says that at least one copy of these backups should not be on-site; it should be stored somewhere else, off-site. This means you could store the information in an off-site storage facility, or you might store it as part of your cloud backup. Either way, that information is stored outside of your local building, so that if you ever need to access that data in a disaster, you know that it is stored somewhere safe.
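
The three parts of the 3-2-1 backup rule can be written down as a short check. This is only a minimal sketch in Python, not something shown in the video: the BackupCopy class, the check_3_2_1 function, and the media labels are hypothetical names chosen for illustration, and the sample copies loosely follow the local drive and tape example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data. All names and values here are illustrative."""
    name: str
    media: str      # hypothetical labels, e.g. "local-disk", "tape", "nas", "cloud"
    offsite: bool   # True if the copy is stored outside the local building


def check_3_2_1(copies):
    """Return a list of 3-2-1 rule violations (an empty list means compliant)."""
    problems = []
    # 3: at least three copies of the data
    if len(copies) < 3:
        problems.append("fewer than three copies of the data")
    # 2: at least two different types of media
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than two types of media")
    # 1: at least one copy stored off-site
    if not any(c.offsite for c in copies):
        problems.append("no copy stored off-site")
    return problems


copies = [
    BackupCopy("primary", "local-disk", offsite=False),
    BackupCopy("nightly", "tape", offsite=False),
    BackupCopy("monthly", "tape", offsite=True),
]
print(check_3_2_1(copies))  # prints [] because all three conditions are met
```

If the off-site tape is removed from that list, the same check reports both the missing third copy and the missing off-site copy, which is the kind of gap a regular backup audit is meant to catch.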