Top 10 features that increase server availability

Availability: the probability that a server will perform its intended function under normal operating conditions when needed, usually expressed as a percentage.

Examples of Server Availability

The table below shows the relationship between single-server availability and downtime for commonly used availability levels. The downtime column lists the maximum single-server downtime permitted in a year that still satisfies the availability goal. For example, to achieve 99.9% availability on a single server, the server cannot be down for more than 8.76 hours per year.

Single Server Availability (%)    Downtime per year
99.9%                             8.76 hours
99.95%                            4.38 hours
99.99%                            52 minutes
99.999%                           5 minutes
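
The figures in the table follow from simple arithmetic: the allowed downtime is the unavailable fraction of the year. Here is a minimal sketch of that calculation. The function name and the 365-day year are my own assumptions, not something the table states.

```python
# Assumption: a year is 365 days (8760 hours); leap years are ignored.
HOURS_PER_YEAR = 365 * 24


def max_downtime_hours(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in hours, for an availability goal."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return HOURS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Print hours and minutes for the availability levels used in the table.
    for level in (99.9, 99.95, 99.99, 99.999):
        hours = max_downtime_hours(level)
        print(f"{level}% -> {hours:.2f} hours ({hours * 60:.1f} minutes)")
```

The minute values in the table are rounded down, so they still work as upper limits on downtime.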


The standard approach to increasing server availability is to add redundancy, so that if one component fails, another can take over. Higher-end servers also have hot-plug replaceable components.
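
To get a rough feel for why redundancy helps, here is a small sketch that estimates the availability of a group of identical redundant components when one working unit is enough to keep the service up. It assumes the components fail independently, which real hardware only approximates (shared power or a shared backplane breaks that assumption). The function name and the example numbers are hypothetical.

```python
def combined_availability(component_availability: float, count: int) -> float:
    """Availability of `count` redundant components when one survivor suffices.

    Assumes independent failures: the group is down only if every unit is down.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    probability_all_down = (1.0 - component_availability) ** count
    return 1.0 - probability_all_down


if __name__ == "__main__":
    # Hypothetical example: each power supply is available 99% of the time.
    for units in (1, 2, 3):
        print(units, "unit(s):", combined_availability(0.99, units))
```

Under these assumptions, every extra redundant unit multiplies the chance of a total outage by the failure probability of a single unit.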

  1. RAID: RAID keeps data accessible when a single hard drive fails, increasing the fault tolerance of a drive array. Two common RAID levels are RAID level 1 (mirroring) and RAID level 5 (striping with parity).
  2. Hot-plug hard drives: These hard drives can be removed from or added to a system while the system is operating. This reduces system downtime by avoiding the need to power off the server.
  3. Hot-plug PCI cards: Similar to hot-plug hard drives, hot-plug PCI cards can be deactivated, allowing for removal and replacement of a PCI card while the system is in operation.
  4. Redundant and Hot-plug power supplies: Hot-plug power supplies provide redundant power to the server in the case of component failure. Two or more redundant power supplies are required to ensure zero downtime from power supply failures.
  5. Redundant NICs: Redundant NICs let you build redundant switched networks with automatic failover should one NIC or network switch fail.
  6. Redundant Cooling Fans: A server is generally cooled by internal fans, which pull in cool air from outside the server and expel heated air. Failure of a cooling fan can lead to heat buildup, which in most cases causes the server to shut down when its heat sensors trip. Many server vendors offer hot-pluggable redundant cooling fans, meaning zero downtime due to a cooling fan failure.
  7. Uninterruptible Power Supply: An uninterruptible power supply (UPS) is a battery backup system that supplies power to the server in the event of an electrical power outage. The UPS is designed to supply power to the server just long enough for the server to be shut down gracefully. The shutdown process is usually done by software running on the server that is monitoring the UPS by means of a serial cable running from the server to the UPS. A UPS also conditions the power (eliminates spikes and sags) before it gets to the server. If a server has multiple power supplies, each of them should be connected to a different UPS to provide the ultimate in power protection and server reliability.
  8. Emergency Generator: When true full-time (24 x 7) operation of the server is required, an emergency generator, which is usually diesel powered, will start up soon after a power failure and be able to supply power to the server (probably the entire server room), before the battery power of the UPS is fully exhausted.
  9. Server Location: A server should be housed in a room designed to support this very special and very expensive piece of equipment. The server room should be secured and only authorized personnel should be allowed into the server room. The server room should be environmentally controlled. Network servers do not function well in rooms that are too hot or too humid.
  10. Redundant Network Feeds: Make sure your server is connected to the Internet via redundant network feeds, ideally provided by different vendors. Make sure that these feeds have independent physical routes to the Internet nodes where they originate.
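
The parity idea behind RAID level 5 (item 1 above) can be illustrated with a few lines of code. This is only a toy sketch, not how a real RAID controller is implemented: it treats each "drive" as a short byte string, computes a parity block with XOR, and then rebuilds one lost block from the survivors. The helper name and sample data are made up for illustration.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte position by byte position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive, plus a parity block.
data = [b"disk", b"zero", b"one!"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding it from the rest.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Because XOR cancels itself out, combining the parity block with the surviving data blocks gives back the missing block. That is why a single drive failure does not lose data in this scheme, while two simultaneous failures would.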

Other options are to maintain an inventory of spare components and to have a contract with the server vendor for on-site service within a specified time interval (e.g. same-day 4-hour service).
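
Spares and fast service contracts matter because they shorten repair time. A common way to express this is the steady-state formula availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair. The sketch below uses that formula with made-up numbers; the 10,000-hour MTBF and the two repair times are assumptions for illustration only.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    mtbf = 10_000.0  # hypothetical mean time between failures, in hours
    # Hypothetical repair times: 4-hour service contract vs. waiting 48 hours for parts.
    for label, mttr in (("4-hour service", 4.0), ("48-hour wait", 48.0)):
        print(f"{label}: {steady_state_availability(mtbf, mttr):.5f}")
```

With the failure rate held fixed, a shorter repair time translates directly into higher availability.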

More explanation of two major factors for availability: disk reliability and redundant NICs.
