Linux High Availability Cluster

By hantonov

Our application uses an NFS server. Since we use it only for serving static pages, we keep it on a virtual server running on VMware ESX 3.5. This week the server froze and, after a cold restart, didn’t boot. Luckily, the file systems were intact, so I mounted them on a temporary server and exported them over NFS. The downtime was short but noticeable. After restoring the service I started to investigate options for providing a High Availability (HA) NFS service.
From a pure OSS perspective, the only viable option on the Ubuntu distribution is an HA/DRBD configuration. It is not unusual for an OSS solution to be a combination of several applications that complement each other’s functionality. I assume that, for historical reasons, cluster solutions for Linux are mostly shared-nothing configurations. In the past, when shared storage was expensive, this was a great advantage that outweighed the complexity of the setup and the duplication of the data.
DRBD replicates block devices in real time. This is an Active/Passive configuration in which changes to the file system are replicated in real time to the passive node. HA (the Linux-HA Heartbeat package) is responsible for the heartbeat and for failing the resources over from the Active to the Passive node. For NFS there are three resources: the IP address, the file system (replicated by DRBD), and the NFS service, which must be started only after the file system has been mounted following the failover.
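The ordering constraint between the three resources can be sketched in a few lines of Python. This is only an illustration, not how Heartbeat is implemented: the resource names and the `make_resource`, `take_over` and `release` helpers are hypothetical, the actions just write to a log instead of running commands, and placing the IP address last is my assumption (the only ordering stated above is that NFS starts after the mount).

```python
from typing import Callable, List, Tuple

# A resource is (name, start action, stop action). The actions below only
# record what would happen; a real cluster would run commands instead.
Resource = Tuple[str, Callable[[], None], Callable[[], None]]


def make_resource(name: str, log: List[str]) -> Resource:
    """Build a hypothetical resource whose actions append to a shared log."""
    return (
        name,
        lambda: log.append(f"start {name}"),
        lambda: log.append(f"stop {name}"),
    )


def take_over(resources: List[Resource]) -> None:
    """Start resources in listed order on the node that becomes active."""
    for _name, start, _stop in resources:
        start()


def release(resources: List[Resource]) -> None:
    """Stop resources in reverse order on the node that gives up the service."""
    for _name, _start, stop in reversed(resources):
        stop()


log: List[str] = []
nfs_group = [
    make_resource("filesystem", log),   # mount the DRBD-backed file system
    make_resource("nfs-server", log),   # export only once the mount exists
    make_resource("service-ip", log),   # assumed last, so clients reconnect
                                        # to a working export
]
take_over(nfs_group)
print(log)
```

Calling `release(nfs_group)` afterwards appends the stop actions in the opposite order, so the NFS service is always stopped before the file system is unmounted.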
After completing the setup and deploying it to production, here are some thoughts.

  1. The products are not integrated properly. Although HA has a script to mount the DRBD device after the failover, this is where the integration stops. HA has no knowledge of the status of DRBD, so if there is a problem with the replication, the failover will still occur.
  2. Both products require their own monitoring, especially DRBD: if there is a problem with the replication, there is no way to know about it without checking DRBD directly.
  3. The replication is one way. Although HA supports fallback, the DRBD doesn’t out of the box and needs some tweaking of the scripts. If the Master server needs rebuilding, a full replication must be performed before fallback is possible. This takes time, depending on the size of the block device, and during this time the cluster is not redundant.
  4. There are some concerns about the latency that DRBD introduces into the I/O path, since the replication is synchronous. This is not that important in my NFS implementation, but it can be a problem for MySQL, for example.
  5. HA is relatively primitive, and has no knowledge of the resources. For example, if NFS service will fail, HA will not initiate fail over. Scripts must be written to monitor the services (NFS, MySQL, etc.), or a monitoring application such as Munin must be used.
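
As an illustration of the separate DRBD monitoring mentioned in points 2 and 5, here is a minimal Python sketch of a replication check. It assumes the DRBD 8.x layout of `/proc/drbd`, where each device line carries a connection state (`cs:`) and a local/peer disk state (`ds:`). The function names are mine, and `SAMPLE` is hand-written for the example, not captured from a real node.

```python
import re
from typing import Dict, List

# Matches device lines such as (assumed DRBD 8.x format):
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+).*?\bds:(\S+)")


def parse_drbd_status(text: str) -> List[Dict[str, str]]:
    """Extract minor number, connection state and disk states per device."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            devices.append(
                {"minor": match.group(1), "cs": match.group(2), "ds": match.group(3)}
            )
    return devices


def replication_problems(text: str) -> List[str]:
    """Return one message per device that is not connected and fully in sync."""
    problems = []
    for dev in parse_drbd_status(text):
        if dev["cs"] != "Connected" or dev["ds"] != "UpToDate/UpToDate":
            problems.append(f"drbd{dev['minor']}: cs={dev['cs']} ds={dev['ds']}")
    return problems


# Hand-written sample: device 0 is healthy, device 1 has lost its peer.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r---
"""
print(replication_problems(SAMPLE))
```

On a real node the same function could be fed the contents of `/proc/drbd` from a cron job or a monitoring plugin, raising an alert whenever the returned list is not empty.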

I would really like to see a cluster product that is more like Windows cluster: after the initial shock of the price, it is very easy to install and support, and it has an API that, if used, can make any application cluster aware.
I saw that Red Hat has a cluster offering that works with shared storage and GFS. I think I will give it a try.


One Response to “Linux High Availability Cluster”

  1. Florian Haas Says:

    Hillel,

    can I address some of your concerns here?

    1. “[Linux-HA and DRBD] are not integrated properly.” What makes you say so?

    3. “Although HA supports fallback, the DRBD doesn’t out of the box”. Excuse me?

    4. Latency concerns. Perfectly valid in the Gigabit Ethernet domain. However, we already have experimental (soon-to-be fully supported) support for DRBD for Dolphin Express, which dramatically cuts down on latency.

    5. “HA is relatively primitive, and has no knowledge of the resources. For example, if NFS service will fail, HA will not initiate fail over.” True only if you use Heartbeat in legacy R1 configuration mode, which is deprecated and will no longer be supported as of Heartbeat 3. Heartbeat 2 in CRM configuration mode (a.k.a. Pacemaker) does support resource monitoring.

    Disclosure: I work for LINBIT, the company behind DRBD.
