Today I would like to discuss RAID, and more specifically how it relates to backup. In the title of this post I proposed a daring thesis, which I would like to develop and, as far as I’m able, defend. First, a few definitions and terms related to RAID have to be introduced so that readers who aren’t familiar with it won’t feel lost.
What is RAID?
RAID technology extends the capabilities of drives by combining individual disks into a group (or groups). Such groups are called arrays. Combining disks into arrays makes extra capabilities available: for example, tolerance of the failure of one or more disks, higher read or write throughput (or both) than a single disk can deliver, or the possibility of expanding an array with additional drives.
But let’s take it one step at a time. There are different types of arrays, and their features depend on which type is used. In this article I would like to focus on just one property of RAID, namely redundancy, or the array’s resistance to hard disk failures. Now let the fun with backup begin.
Proof No. 1 – All copies are not created equal
Imagine, dear reader, that you are the administrator of a data server whose drives work in RAID 1. To make matters clear, this type of array is known as a ‘mirror’, because it saves the same data on every disk in the array. Saving SOMETHING on your array means that you will find SOMETHING on each of its disks.
Suppose that one beautiful day you find a corrupted file on your data server and, to add a dash of drama, assume that this file is a document that belongs to your boss. Because of how RAID 1 works, the corrupted version was instantly written to every disk in the array, and there is no way to retrieve the original from the array. In other words, although copies of the data are located on all disks, when a file is corrupted, all copies on all disks are damaged. That is the first difference between a copy on a RAID array and a backup copy. With a backup, an earlier copy of the file, taken before the corruption, is still safe on the backup server. Thus, all copies are not created equal.
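To make the difference concrete, here is a toy sketch in Python. It is not how a real RAID controller or backup product works: the `Mirror` class, the `take_backup` function and the file name are all made up for illustration, and disks are modelled as plain dictionaries. It only shows that a mirror repeats every write, good or bad, while a backup taken earlier keeps the old content.

```python
class Mirror:
    """Toy model of RAID 1: every write goes to every disk (hypothetical class)."""

    def __init__(self, disk_count):
        # Each "disk" is just a dict mapping file name -> file content.
        self.disks = [dict() for _ in range(disk_count)]

    def write(self, name, data):
        # The mirror cannot tell good data from bad; it copies whatever it gets.
        for disk in self.disks:
            disk[name] = data

    def read(self, name, disk_index=0):
        return self.disks[disk_index][name]


def take_backup(mirror):
    """Point-in-time copy of the array's content, kept apart from the array."""
    return dict(mirror.disks[0])


mirror = Mirror(disk_count=2)
mirror.write("boss_report.doc", b"quarterly figures")

# The backup runs while the file is still healthy.
backup = take_backup(mirror)

# Later the file gets corrupted, and the mirror faithfully replicates that.
mirror.write("boss_report.doc", b"\x00garbage")

copies_on_array = [mirror.read("boss_report.doc", i) for i in range(2)]
restored = backup["boss_report.doc"]

print(copies_on_array)  # both disks hold the corrupted bytes
print(restored)         # the backup still holds the original content
```

The same reasoning applies if the last write is replaced by a deletion: the array would drop the file from every disk, while the earlier backup would still contain it.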
Proof No. 2 – Attention, virus!
Let’s go back in time about two paragraphs, to the moment when your boss’s file had not yet been damaged (same file server, same array). This time your boss downloaded something from a torrent site. It was supposed to be a free program for VAT, but it turned out to be a virus that deleted all the data. Just as in the first example (see Proof No. 1), your boss’s data has been deleted from every disk in the array. A backup system should protect the data against deletion and give you a chance to recover it.
Proof No. 3 – File system bites the dust
File systems also tend to fail. If the file system on a mirrored RAID array is damaged, the damage is replicated x times, where x is the number of disks in the array. But there’s also the other side of the coin: if you manage to fix the file system, it is repaired on all of the array’s disks as well. However, if some data is lost during the file system repair and it was not protected by a backup system, you can write that data off.
Proof No. 4 – Fire!
I would like to devote my last proof to a particularly unpleasant topic: the complete destruction of the data server. A fire in the server room, or any other event that destroys the whole server, means data loss. Yes, yes, I understand that the backup server can be (but doesn’t have to be) in the same burning server room. However, there are techniques that deal with this problem even when there’s a fire: a fireproof cabinet for storing magnetic tapes with data, or storing the data outside the building that houses the data server, or even outside the company altogether (so-called offsite backup).
Now it will only get better
I have outlined four arguments that show why RAID is not the same as backup, focusing on the weak points of RAID as protection against data loss. Luckily, there are also many positive aspects of using RAID. Unfortunately, apart from those mentioned in the introduction, I’m not going to discuss them, as this is a very broad topic and not directly connected with the subject of my post.
If you’re interested, try finding some material on RAID on the Internet. The ‘data protection’ feature of certain types of RAID protects the data when one or more disks fail; how many failures an array survives depends on the RAID type used and the number of disks in the array. And in my view, that’s the strong point of some types of RAID: fault tolerance.
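As a rough illustration of how fault tolerance depends on the RAID type and the number of disks, here is a small Python sketch. It covers only the common levels 0, 1, 5 and 6, assumes all disks are the same size, and ignores controller and file system overhead. The function name `raid_summary` and its parameters are my own invention, not part of any RAID tool.

```python
def raid_summary(level, disk_count, disk_size_tb):
    """Return (usable capacity in TB, disk failures the array is guaranteed to survive).

    Simplified textbook figures for equal-size disks; a sketch, not a sizing tool.
    """
    # level: (minimum disks, disks' worth of space spent on redundancy, failures tolerated)
    rules = {
        0: (2, lambda n: 0, lambda n: 0),          # striping only, no redundancy
        1: (2, lambda n: n - 1, lambda n: n - 1),  # every disk holds a full copy
        5: (3, lambda n: 1, lambda n: 1),          # single parity
        6: (4, lambda n: 2, lambda n: 2),          # double parity
    }
    if level not in rules:
        raise ValueError(f"RAID level {level} is not covered by this sketch")

    min_disks, redundancy_disks, failures_tolerated = rules[level]
    if disk_count < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")

    usable_tb = (disk_count - redundancy_disks(disk_count)) * disk_size_tb
    return usable_tb, failures_tolerated(disk_count)


print(raid_summary(1, 2, 2.0))  # two mirrored 2 TB disks
print(raid_summary(5, 4, 2.0))  # four 2 TB disks with single parity
print(raid_summary(6, 6, 4.0))  # six 4 TB disks with double parity
```

For example, four 2 TB disks in RAID 5 give 6 TB of usable space and survive the loss of one disk, while the same disks in RAID 0 give 8 TB and survive none. None of these numbers changes the four proofs above: surviving a disk failure is not the same as having a backup.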
RAID together with backup
RAID technology is often used on servers that store data backups. Such an environment is usually supervised by a backup server application that uses RAID arrays as its storage. Personally, I consider this good practice, provided that a suitable RAID type is chosen.