Redundancy in the IT field describes any hardware, software, or network setup in which components or functions are duplicated so that the system can keep operating in the event of a failure. Many kinds of redundant components and systems can help servers, networks, and businesses continue functioning uninterrupted even when a normally critical piece of the puzzle is broken. Did we mention that redundancy allows for continued operation in the event of a failure? In the IT field, being redundant can be very beneficial.
One of the most common redundant technologies is RAID, which stands for Redundant Array of Independent Disks. The options available for a RAID setup on a server or computer depend on the controller. The most common RAID levels are RAID 0, 1, 5, and 10. Each has its own fault tolerance and performance characteristics. In general, RAID levels above RAID 0 allow a server or computer to continue functioning normally with no data loss even if one or more hard drives fail, depending on the RAID level.
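To make the trade-offs between these levels concrete, here is a minimal Python sketch that reports usable capacity and worst-case fault tolerance for the four common levels. It assumes identical drives and the conventional layouts (striping for RAID 0, mirroring for RAID 1, single parity for RAID 5, striped two-drive mirrors for RAID 10). The names `RaidSummary` and `summarize_raid` are our own illustration and do not belong to any controller or vendor tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidSummary:
    level: str
    usable_tb: float          # usable capacity in TB
    guaranteed_failures: int  # drive failures survivable in the worst case


def summarize_raid(level: str, drives: int, drive_tb: float) -> RaidSummary:
    """Return usable capacity and worst-case fault tolerance for one array.

    Hypothetical helper: assumes identical drives and conventional layouts.
    """
    if level == "0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        # Pure striping: all capacity is usable, nothing is redundant.
        return RaidSummary(level, drives * drive_tb, 0)
    if level == "1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        # Every drive holds a full copy, so capacity equals one drive.
        return RaidSummary(level, drive_tb, drives - 1)
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space is consumed by parity.
        return RaidSummary(level, (drives - 1) * drive_tb, 1)
    if level == "10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        # Half the drives are mirrors. Worst case, a second failure hits
        # the same pair, so only one failure is guaranteed survivable.
        return RaidSummary(level, (drives // 2) * drive_tb, 1)
    raise ValueError(f"unsupported RAID level: {level}")
```

For example, `summarize_raid("10", 4, 2.0)` reports 4.0 TB usable from four 2 TB drives with one guaranteed survivable failure, while `summarize_raid("0", 4, 2.0)` reports 8.0 TB with none.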
RAID 0, for example, can be set up with two or more hard drives. Usable storage is the combined capacity of all the drives, and read and write throughput can scale up to roughly the number of drives. This means that there is more storage space and the data can be read and written faster, resulting in a quicker server or computer. However, RAID 0 has zero fault tolerance or redundancy: just one failed drive means all data in the array is lost, and an attempt must then be made to restore from a backup (RAID 0 is not recommended for servers for this reason).
The higher RAID levels, by contrast, allow for fault tolerance. In a RAID 1 setup, there are generally two identical hard drives in the array, but the total storage space is equivalent to just one hard drive. In addition to increasing read performance, a RAID 1 array allows for one of the hard drives to fail: the computer or server does not lose any data even with a single failed hard drive! RAID 10 essentially combines the best of both worlds from RAID 1 and RAID 0, providing faster read/write performance and allowing for up to two failed drives in a four-drive array (one failed drive per “pair”). Although the storage space is cut in half, this is often a good choice for servers.
At Machado Consulting, we set up hardware “health” monitoring on every client server and, if the server supports RAID monitoring (as virtually all newer Dell/HP servers do), we are alerted to a failed drive and replace it before there is any noticeable impact. It is important to replace failed drives promptly because no RAID array allows for all drives to fail without data loss. Although RAID is not a complete replacement for a good backup solution, it often allows for hard drive replacements before there are any noticeable problems and is a great example of redundancy.
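The “one failed drive per pair” rule for RAID 10 described above can be checked with a short Python sketch. It enumerates every possible two-drive failure in a four-drive array and reports whether the array survives. The drive numbering and the names `MIRROR_PAIRS`, `raid10_survives`, and `two_drive_outcomes` are assumptions made for illustration; a real controller decides which drives are paired.

```python
from __future__ import annotations

from itertools import combinations

# Hypothetical layout: drives 0 and 1 mirror each other, as do drives 2 and 3.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def raid10_survives(failed: set[int], pairs=MIRROR_PAIRS) -> bool:
    """Return True if every mirrored pair still has a working drive."""
    return all(not (a in failed and b in failed) for a, b in pairs)


def two_drive_outcomes(pairs=MIRROR_PAIRS) -> dict[tuple[int, ...], bool]:
    """Map each possible two-drive failure to whether the array survives."""
    drives = sorted(d for pair in pairs for d in pair)
    return {
        combo: raid10_survives(set(combo), pairs)
        for combo in combinations(drives, 2)
    }
```

Calling `two_drive_outcomes()` covers six possible two-drive failures. The array survives the four in which the failures land in different pairs and loses data in the two in which both drives of the same pair fail, which is why replacing the first failed drive quickly matters.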
There are many other examples of redundancy, including dual internet lines to a router (in case one of the ISPs has an outage), dual power supplies on servers (usually plugged into two different UPSs, or with one plugged into the wall, so that the server continues running even with a power failure on one supply), and even redundant “clusters” of servers (e.g., setting up two or more hosts running “virtual servers” with VMware “Fault Tolerance”, which allows for continuous availability of the “virtual servers” even if one of the physical servers goes offline). Feel free to drop us a line any time to see how these (and more) redundant solutions can be implemented at your company!