 
Auto-failovers
Auto-failovers have an n+1 redundancy scheme. That is, iControl doesn’t limit the number of Main Application Servers. However, iControl does limit the number of Application Servers acting in a Backup capacity to only one. When two or more Application Servers are configured as a Redundancy Group on the iControl—Redundancy configuration page, any one of several conditions can trigger an Auto-failover of service from a Main Application Server to the Backup Application Server. In such cases, the Backup assumes the full role and identity of the original (including its IP address if the Take over the main IP address after failover check box is selected), and becomes the new Main. This process does not require user intervention except in the initial Redundancy Group configuration and when performing failover recovery tasks such as Reverse Takeovers or Replace operations.
 

IMPORTANT: If you configure your Redundancy Group NOT to take over the Main’s IP address upon failover or takeover, make sure you keep the Backup Application Server’s IP address configured in System tools | Edit service locations | Service and alarm discovery on all Application Servers that belong to the Redundancy Group.
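As a rough illustration of what the Take over the main IP address after failover check box changes, the sketch below computes the address at which the promoted Backup would be reachable. The function name, its arguments, and the sample addresses are hypothetical and are not part of iControl.

```python
import ipaddress


def address_after_failover(main_ip: str, backup_ip: str, take_over_main_ip: bool) -> str:
    """Return the address of the promoted Backup (hypothetical helper).

    Models only the behaviour described in the text above.
    """
    # Validate both inputs as IPv4 addresses; raises ValueError on bad input.
    main = ipaddress.IPv4Address(main_ip)
    backup = ipaddress.IPv4Address(backup_ip)
    # With the option selected, the Backup assumes the Main's address;
    # otherwise it keeps its own configured address.
    return str(main if take_over_main_ip else backup)
```

When the option is not selected, the second case applies, which is why the Backup's own address must remain listed in the service locations.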

In a Redundancy Group topology configured for Auto-failover, there may be multiple Main Application Servers but only one Backup in an n+1 redundancy scheme.
To enable the Auto-failover feature, you must first configure your Redundancy Group on the iControl—Redundancy configuration page and then manually enable the Auto-failover function on one or more Main Application Servers.
The Backup Application Server monitors the health of the Main Application Server and its connection to devices and the network, through the use of a heartbeat trigger. As long as both of the following conditions are met, no Auto-failover will occur:
There is a heartbeat from the Main to the Backup.
The Main can communicate with other devices (besides the Backup Application Server) over its eth0 interface.
 

IMPORTANT: Ethernet Port Label Considerations

If your Application Server is a Dell PowerEdge R200, R210, R310 or R320, please read the section regarding Ethernet port labels.

The heartbeat is carried on a Main network cable which connects all Main Application Servers in a Redundancy Group to the Backup Application Server.
The heartbeat cabling between the Main and the Backup has two cable paths: the Main network and the Heartbeat network. The Backup Application Server uses the Main network but switches to using the Heartbeat network if the Main network is unresponsive.
 

IMPORTANT: The Heartbeat network and the Main network use cables and equipment that are distinct from one another to avoid single points of failure.

[ Graphic ] 

The Main network serves as the medium through which replication occurs between the Main Application Server and the Backup, as well as being the primary path the Backup Application Server uses to test the heartbeat of the Main. Only if the Backup does not receive the Main Application Server’s heartbeat signal through the Main network will the Backup resort to the Heartbeat network to listen for the Main Application Server’s heartbeat.
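The order in which the Backup listens on the two networks can be sketched as follows. This is a minimal illustration, not iControl code: the function and the two probe callables are hypothetical stand-ins for whatever mechanism actually detects the heartbeat.

```python
from typing import Callable, Optional


def heartbeat_path(heard_on_main_net: Callable[[], bool],
                   heard_on_heartbeat_net: Callable[[], bool]) -> Optional[str]:
    """Return the network on which the Main's heartbeat was heard, if any."""
    # The Main network is always tried first.
    if heard_on_main_net():
        return "main"
    # Only when nothing is heard there does the Backup resort to the
    # Heartbeat network.
    if heard_on_heartbeat_net():
        return "heartbeat"
    # No heartbeat on either path.
    return None
```

Because the probes are passed as callables, the Heartbeat network is not consulted at all while the Main network still carries the heartbeat.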
 

NOTE: When connecting your Application Servers to the networks, use the eth0 port to connect to the Main network. Use eth1 to connect to the Heartbeat network.

When a Redundancy Group is configured and online, any one of the following conditions can cause a Main Application Server to automatically fail over to its Backup:

| Condition | Details |
| --- | --- |
| The Main Application Server loses network connectivity on the Main network. | The Main Application Server loses its network connection and is unable to service its clients. |
| The Backup Application Server loses network connectivity with the Main Application Server via both the Main and the Heartbeat network (and the Backup Application Server has connectivity on the Main network). | In this case, the Backup Application Server has no way of knowing if the Main server has connectivity on the Main network. It therefore takes over the Main’s responsibilities. |
| The Main Application Server stops responding or is too overloaded to answer the Heartbeat request quickly enough. | In this case, pinging between the Backup and the Main Application Servers is still possible but not quick enough. |
| The Main Application Server loses power. | The Main Application Server shuts down because of a power loss to that Application Server. |
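The trigger conditions above can be read as a decision made from the Backup's point of view. The sketch below is one possible reading under stated assumptions: the BackupView fields, the failover_reason function, and the 2-second reply limit are all hypothetical, since the source gives no timeout value and does not describe how the Backup learns of each condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackupView:
    """What the Backup can observe about one Main (hypothetical model)."""
    heard_on_main_net: bool        # heartbeat received over the Main network
    heard_on_heartbeat_net: bool   # heartbeat received over the Heartbeat network
    main_reports_eth0_up: bool     # Main can reach other devices on eth0
    backup_has_main_net: bool      # Backup itself has Main network connectivity
    reply_seconds: Optional[float] = None  # heartbeat reply time, if any


def failover_reason(view: BackupView, max_reply_seconds: float = 2.0) -> Optional[str]:
    """Return why a failover would be triggered, or None if it would not."""
    heard = view.heard_on_main_net or view.heard_on_heartbeat_net
    if not heard:
        # Loss of both links and loss of power look the same from the Backup.
        if view.backup_has_main_net:
            return "main unreachable on both networks"
        # The Backup lacks Main network connectivity, so the table's
        # condition is not met.
        return None
    if not view.main_reports_eth0_up:
        return "main lost Main network connectivity"
    # Assumed threshold: the source only says "not quick enough".
    if view.reply_seconds is not None and view.reply_seconds > max_reply_seconds:
        return "heartbeat reply too slow"
    return None
```

For example, a Main that still answers over the Heartbeat network but reports its eth0 link as down yields "main lost Main network connectivity".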
 

IMPORTANT: Make sure the Main Application Server’s resource usage is within acceptable parameters  

Prior to enabling the Auto-failover feature, the operator should make sure that the Application resource usage on the Main Application Server (e.g. CPU usage, RAM usage) is within acceptable limits so that it can respond to Heartbeat requests from the Backup Application Server monitoring it.

 

IMPORTANT: When configuring a Redundancy Group, make sure virtual machines are not mixed with physical machines. Additionally, if both Main and Backup devices are virtual machines, ensure they are running operating systems at the same bit-processing level (i.e. Main and Backup should have operating systems that are either BOTH 32-bit or BOTH 64-bit).
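The composition rules stated so far (any number of Mains, one Backup, no mixing of virtual and physical machines, matching operating system bit level among virtual machines) could be checked before configuration with something like the sketch below. The Server record and the group_problems function are hypothetical; iControl does not expose them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical description of one Application Server."""
    name: str
    role: str      # "Main" or "Backup"
    virtual: bool  # True for a virtual machine, False for a physical one
    os_bits: int   # 32 or 64


def group_problems(servers: List[Server]) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    mains = [s for s in servers if s.role == "Main"]
    backups = [s for s in servers if s.role == "Backup"]
    if not mains:
        problems.append("no Main Application Server")
    if len(backups) != 1:
        problems.append("expected exactly one Backup, found %d" % len(backups))
    if len({s.virtual for s in servers}) > 1:
        problems.append("virtual and physical machines are mixed")
    elif servers and all(s.virtual for s in servers):
        # The bit-level rule is stated for virtual machines only.
        if len({s.os_bits for s in servers}) > 1:
            problems.append("virtual machines differ in OS bit level")
    return problems
```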

The iControl—Redundancy configuration page contains the following information about the Redundancy Group and its Application Servers.

| Parameter | Description | Parameter range | User editable? | Visible on Main Application Server? | Visible on Backup Application Server? |
| --- | --- | --- | --- | --- | --- |
| Role | The redundancy role of an Application Server | Main, Backup | Yes | Yes | Yes |
| Host name | Host name of the Application Server | Alphanumeric | Yes, from elsewhere in iControl | Yes | Yes |
| Configured IP | Configured IP address of the Application Server (retained after an Auto-failover or Manual Takeover has changed the current IP) | IPv4 address (xxx.xxx.xxx.xxx) | Yes, from elsewhere in iControl | Yes | Yes |
| Current IP | Current IP address of the Application Server | IPv4 address (xxx.xxx.xxx.xxx), or Unknown if the Application Server is unreachable | No | Yes | Yes |
| Operational state | The operational state of an Application Server | Main: Offline, Online; Backup: Standby, Online | No | Yes | Yes |
| Auto-failover function state | If enabled, the corresponding Main Application Server is monitored by the Backup Application Server through the heartbeat mechanism. If disabled, the Application Server will not Auto-failover to a Backup Application Server. | Enabled, Disabled | Yes | Yes | Yes |
| Take over the main IP address after failover | A function that, when selected, causes the Backup Application Server to take on the IP address of the Main during a failover or takeover. When disabled, the Backup keeps its own configured IP address. (See note 1.) | Enabled, Disabled | Yes | Yes | Yes |
| Auto-failover status | Running status message indicating the current Auto-failover status | Manual (note 2), Automatic (note 3), Takeover (note 4) | No | Yes | Yes |
| Extra IP (current free IP address in the floating address pool) | IP address currently available in the floating IP address pool (see note 5) | IPv4 address (xxx.xxx.xxx.xxx) | Yes | Yes | Yes |
| Last replication result | Timestamp for the most recent replication of each Main Application Server | N/A | No | No | Yes |
| Backup used for Auto-failover | The Backup Application Server currently assigned as the Auto-failover Backup | Host name and MAC address (alphanumeric) | Selectable list | No | Yes |
| Replication frequency | List of preset replication frequencies | never; every 5 min; every 15 min; every 30 min; every 1 hour; every 2 hours; every 3 hours; every 6 hours; every day (default) | Selectable list | No | Yes |

1 Grass Valley recommends you enable this function.

2 Manual: The heartbeat mechanism is disabled (therefore, not in Automatic or Takeover state).

3 Automatic: A valid Redundancy Group exists and an Auto-failover Backup is in Standby mode.

4 Takeover: A failover or a switchover is in progress. While this is occurring, no additional switchover or failover can be triggered.

5 This IP address is given to any Application Server coming back online, provided the Extra IP isn’t already being used by a Main. If it is already in use, the Main falls back to the factory IP address: 10.0.3.6.
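The Extra IP rule in note 5 amounts to a simple fallback. The sketch below illustrates it; the function name and the set of in-use addresses are hypothetical, and only the factory address 10.0.3.6 comes from the note.

```python
from typing import Set

FACTORY_IP = "10.0.3.6"  # factory IP address quoted in note 5


def ip_for_returning_server(extra_ip: str, ips_in_use_by_mains: Set[str]) -> str:
    """Pick the address for an Application Server coming back online."""
    # The Extra IP is handed out only if no Main is already using it.
    if extra_ip in ips_in_use_by_mains:
        return FACTORY_IP
    return extra_ip
```

For instance, with an Extra IP of 10.0.3.50 (an example value) that is not in use, the returning server receives 10.0.3.50; if a Main already holds it, the result is 10.0.3.6.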

All the information on a Main Application Server’s iControl—Redundancy configuration page is also visible on that of a Backup Application Server. The page on the Backup Application Server additionally displays the following:
timestamps for the most recent replication of each Main Application Server
a list allowing operators to choose which Main Application Server to perform a Manual Takeover on
the name of the Backup Application Server designated as the Auto-failover Backup, and the option of putting this Backup in Auto-failover Backup mode
a list allowing operators to choose the replication frequency (from every 5 minutes to every day, or never)
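The replication frequency presets mix minutes, hours, and days, so it can help to normalize them to intervals. The mapping below is an illustrative sketch: the preset labels are taken from the Replication frequency list, while the dictionary, the constant, and the helper function are hypothetical.

```python
from datetime import timedelta

# Preset labels as listed for the Replication frequency parameter.
PRESETS = {
    "never": None,
    "every 5 min": timedelta(minutes=5),
    "every 15 min": timedelta(minutes=15),
    "every 30 min": timedelta(minutes=30),
    "every 1 hour": timedelta(hours=1),
    "every 2 hours": timedelta(hours=2),
    "every 3 hours": timedelta(hours=3),
    "every 6 hours": timedelta(hours=6),
    "every day": timedelta(days=1),
}
DEFAULT_PRESET = "every day"  # marked as the default in the list


def replications_per_day(preset: str) -> int:
    """Return how many replications a preset implies in 24 hours."""
    interval = PRESETS[preset]
    if interval is None:
        return 0  # "never" disables replication
    # Floor division of two timedeltas yields an integer count.
    return timedelta(days=1) // interval
```

This makes the trade-off explicit: the most frequent preset implies 288 replications a day, while the default implies one.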
In addition, only from a Backup Application Server can the following operations be done:
Trigger a Manual Takeover
Perform a Reverse Takeover
Perform a Replace Takeover
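Because the Auto-failover status notes state that no additional switchover or failover can be triggered while a takeover is in progress, a tool driving these operations might guard them as sketched below. The enum, both functions, and their arguments are hypothetical. Giving Takeover precedence over the other states is an assumption, and the combination the notes do not describe returns None rather than a guessed value.

```python
from enum import Enum
from typing import Optional


class AutoFailoverStatus(Enum):
    MANUAL = "Manual"
    AUTOMATIC = "Automatic"
    TAKEOVER = "Takeover"


def derive_status(heartbeat_enabled: bool,
                  valid_group_with_standby_backup: bool,
                  transition_in_progress: bool) -> Optional[AutoFailoverStatus]:
    """Derive the status from the conditions named in the notes."""
    # Assumption: an ongoing failover or switchover takes precedence.
    if transition_in_progress:
        return AutoFailoverStatus.TAKEOVER
    # Manual: the heartbeat mechanism is disabled.
    if not heartbeat_enabled:
        return AutoFailoverStatus.MANUAL
    # Automatic: valid Redundancy Group and a Backup in Standby mode.
    if valid_group_with_standby_backup:
        return AutoFailoverStatus.AUTOMATIC
    # Heartbeat enabled without a valid group: not described in the source.
    return None


def may_trigger_transition(status: AutoFailoverStatus) -> bool:
    """A new switchover or failover is blocked only during Takeover."""
    return status is not AutoFailoverStatus.TAKEOVER
```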
 
 
