Load Balancing and Failover with Gateway Groups

A Gateway Group is necessary to set up a Load Balancing or Failover configuration. The group itself does not cause any action to be taken. When the group is used later, such as in policy routing firewall rules, it defines how the items using the group will behave.

The same gateway may be included in multiple groups so that several different scenarios can be configured at the same time. For example, some traffic can be load balanced, and other traffic can use failover, and the same WAN can be used in both capacities by using different gateway groups.

A very common example setup for a two WAN firewall contains three groups:

  • LoadBalance - Gateways for WAN1 and WAN2 both on Tier 1
  • PreferWAN1 - Gateway for WAN1 on Tier 1, and WAN2 on Tier 2
  • PreferWAN2 - Gateway for WAN1 on Tier 2, and WAN2 on Tier 1

Configuring a Gateway Group for Load Balancing or Failover

To create a gateway group for Load Balancing or Failover:

  • Navigate to System > Routing, Groups tab
  • Click Add to create a new gateway group
  • Fill in the options on the page as needed:
Group Name:

A name for the gateway group. The name must be less than 32 characters in length, and may only contain letters a-z, digits 0-9, and an underscore. This will be the name used to refer to this gateway group in the Gateway field in firewall rules. This field is required.

Tier:

Choose the priority for gateways within the group. Inside gateway groups, gateways are arranged in Tiers. Tiers are numbered 1 through 5, and lower numbers are used first. For example, gateways on Tier 1 are used before gateways on Tier 2, and so on. See the next sections for more detail on how to use Tiers.

Virtual IP:

Optionally specifies a virtual IP address to use for an interface, if one exists. This option is used for features such as OpenVPN, allowing a specific virtual address to be chosen, rather than using only the Interface address directly when a specific gateway is active in the group. In most cases, this is left at the default value Interface Address.

Trigger Level:

Decides when to mark a gateway as down.

Member Down: Marks the gateway as down only when it is completely down, past one or both of the higher thresholds configured for the gateway. This catches the worst sort of failures, when the gateway is completely unresponsive, but may miss more subtle issues with the circuit that can make it unusable long before the gateway reaches that level.

Packet Loss: Marks the gateway as down when packet loss crosses the lower alert threshold (See Packet Loss Thresholds).

High Latency: Marks the gateway as down when latency crosses the lower alert threshold (See Latency Thresholds).

Packet Loss or High Latency: Marks the gateway as down for either type of alert.

Description:

Text describing the purpose of this gateway group.

  • Click Save
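
The Group Name rule and the Trigger Level choices above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the firewall's own code. The names is_valid_group_name, GatewayStatus, and treated_as_down are hypothetical, and two points are assumptions: that uppercase letters are accepted in a group name, and that a gateway which is completely down counts as down under every trigger level.

```python
import re
from dataclasses import dataclass

# "Less than 32 characters" means 1 to 31 characters.
# Assumption: uppercase letters are accepted as well as lowercase.
GROUP_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,31}")


def is_valid_group_name(name: str) -> bool:
    """Check a candidate gateway group name against the documented rule."""
    return GROUP_NAME_RE.fullmatch(name) is not None


@dataclass
class GatewayStatus:
    """Hypothetical snapshot of one gateway's monitoring state."""
    down: bool           # past the higher (down) thresholds
    loss_alert: bool     # packet loss crossed the lower alert threshold
    latency_alert: bool  # latency crossed the lower alert threshold


def treated_as_down(status: GatewayStatus, trigger_level: str) -> bool:
    """Decide whether a group treats this gateway as down."""
    # Assumption: a completely down gateway is down for every trigger level.
    if status.down:
        return True
    if trigger_level == "Member Down":
        return False
    if trigger_level == "Packet Loss":
        return status.loss_alert
    if trigger_level == "High Latency":
        return status.latency_alert
    if trigger_level == "Packet Loss or High Latency":
        return status.loss_alert or status.latency_alert
    raise ValueError(f"unknown trigger level: {trigger_level!r}")


# A gateway that is reachable but losing packets.
lossy = GatewayStatus(down=False, loss_alert=True, latency_alert=False)
for level in ("Member Down", "Packet Loss", "High Latency",
              "Packet Loss or High Latency"):
    print(level, "->", treated_as_down(lossy, level))
```

With Member Down, the lossy gateway above stays in the group, which is the "may miss more subtle issues" case described for that trigger level.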

Load Balancing

Gateways on the same tier are load balanced. For example, if Gateway A, Gateway B, and Gateway C are all on Tier 1, connections would be balanced between them. Gateways that are load balanced automatically fail over between each other. When a gateway fails, it is removed from the group, so in this case if any one of A, B, or C went down, the firewall would load balance between the remaining online gateways.
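
As a rough illustration of this behavior, the Python sketch below spreads connections across the online gateways of a single tier and drops a failed gateway from the pool. The balance function is hypothetical, and the simple round-robin order is an assumption made for the example; the actual per-connection distribution is handled by the firewall.

```python
def balance(connections, gateways, online):
    """Assign each connection to one of the online same-tier gateways.

    A gateway that is not online is removed from the pool, so the
    remaining gateways share the connections between them.
    """
    pool = [gw for gw in gateways if gw in online]
    if not pool:
        raise RuntimeError("no online gateway in this tier")
    # Illustrative round-robin: connection i goes to pool[i mod len(pool)].
    return {conn: pool[i % len(pool)] for i, conn in enumerate(connections)}


tier1 = ["A", "B", "C"]
conns = ["conn1", "conn2", "conn3", "conn4"]

all_up = balance(conns, tier1, online={"A", "B", "C"})
b_down = balance(conns, tier1, online={"A", "C"})
print(all_up)
print(b_down)
```

With all three gateways online, the four connections rotate through A, B, and C. With Gateway B down, they alternate between A and C only.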

Weighted Balancing

If two WANs need to be balanced in a weighted fashion due to differing amounts of bandwidth between them, that can be accommodated by adjusting the Weight parameter on the gateway as described in Unequal Cost Load Balancing and Weight.
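
To make the effect of weights concrete, the following Python sketch assumes that a gateway's Weight acts as a simple ratio of connections, so a gateway with weight 2 receives about twice as many connections as one with weight 1. The function names and the example weights are hypothetical, and the real distribution mechanism may differ in detail.

```python
from fractions import Fraction


def connection_shares(weights):
    """Return the fraction of connections each gateway would receive,
    assuming weights behave as a plain ratio."""
    total = sum(weights.values())
    return {gw: Fraction(w, total) for gw, w in weights.items()}


def weighted_pool(weights):
    """Build a rotation list in which each gateway appears once per
    unit of weight, so cycling through it honors the ratio."""
    pool = []
    for gw, w in weights.items():
        pool.extend([gw] * w)
    return pool


# Hypothetical example: WAN1 has twice the bandwidth of WAN2.
weights = {"WAN1": 2, "WAN2": 1}
shares = connection_shares(weights)
pool = weighted_pool(weights)
print(shares)
print(pool)
```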

Failover

Gateways on a lower numbered tier are preferred, and if they are down then gateways on a higher numbered tier are used. For example, if Gateway A is on Tier 1, Gateway B is on Tier 2, and Gateway C is on Tier 3, then Gateway A would be used first. If Gateway A goes down, then Gateway B would be used. If both Gateway A and Gateway B are down, then Gateway C would be used.

Complex/Combined Scenarios

By extending the concepts above for Load Balancing and Failover, many complicated scenarios are possible that combine both load balancing and failover. For example, if Gateway A is on Tier 1, and Gateway B and Gateway C are on Tier 2, then Gateway D on Tier 3, the following behavior occurs: Gateway A is preferred on its own. If Gateway A is down, then traffic would be load balanced between Gateway B and Gateway C. Should either Gateway B or Gateway C go down, the remaining online gateway in that tier would still be used. If Gateway A, Gateway B, and Gateway C are all down, traffic would fail over to Gateway D.

Any other combination of the above can be used, so long as it can be arranged within the limit of 5 tiers.
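
The tier logic for failover and combined scenarios can be summarized as "use every online gateway on the lowest numbered tier that still has one". The Python sketch below walks through the four-gateway example under that reading. The active_gateways function and the dictionary layout are hypothetical and exist only for illustration.

```python
MAX_TIER = 5  # tiers are numbered 1 through 5


def active_gateways(tiers, online):
    """Return the gateways that would carry traffic.

    tiers  maps gateway name -> tier number
    online is the set of gateways currently considered up
    """
    for gw, tier in tiers.items():
        if not 1 <= tier <= MAX_TIER:
            raise ValueError(f"{gw}: tier {tier} is outside 1..{MAX_TIER}")
    # Lower numbered tiers are preferred, so check them first.
    for tier in range(1, MAX_TIER + 1):
        members = sorted(gw for gw, t in tiers.items()
                         if t == tier and gw in online)
        if members:
            return members  # more than one member means load balancing
    return []  # every gateway in the group is down


group = {"A": 1, "B": 2, "C": 2, "D": 3}

print(active_gateways(group, {"A", "B", "C", "D"}))  # A preferred on its own
print(active_gateways(group, {"B", "C", "D"}))       # B and C load balanced
print(active_gateways(group, {"C", "D"}))            # C alone in Tier 2
print(active_gateways(group, {"D"}))                 # fail over to D
```

A plain failover group is the special case where each tier holds a single gateway, and a plain load balancing group is the case where every gateway is on the same tier.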

Problems with Load Balancing

Some websites store session information including the client IP address, and if a subsequent connection to that site is routed out a different WAN interface using a different public IP address, the website will not function properly. This is becoming more common with banks and other security-minded sites. The suggested means of working around this is to create a failover group and direct traffic destined to these sites to the failover group rather than a load balancing group. Alternatively, perform failover for all HTTPS traffic.

The sticky connections feature of pf is intended to resolve this problem, but it has historically been problematic. It is safe to use, and should alleviate this, but there is also a downside to using the sticky option. When using sticky connections, an association is held between the client IP address and a given gateway. The association is not based on the destination. When the sticky connections option is enabled, any given client would not load balance its connections between multiple WANs, but would instead be associated with whichever gateway it happened to use for its first connection. Once all of the client states have expired, the client may exit a different WAN for its next connection, resulting in a new gateway pairing.
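
The following Python sketch models the client-to-gateway association described above. It is a simplified simulation, not how pf implements the feature: the StickyBalancer class, its methods, and the round-robin choice for a client's first connection are all assumptions made for the example.

```python
import itertools


class StickyBalancer:
    """Toy model: a client stays on one gateway while it has open states."""

    def __init__(self, gateways):
        self._next_gw = itertools.cycle(gateways)  # illustrative round-robin
        self._assoc = {}   # client IP -> gateway it is paired with
        self._states = {}  # client IP -> number of open states

    def connect(self, client_ip):
        """Open a connection and return the gateway it uses."""
        gw = self._assoc.get(client_ip)
        if gw is None:
            # First connection for this client: pick a gateway and pair it.
            gw = next(self._next_gw)
            self._assoc[client_ip] = gw
        self._states[client_ip] = self._states.get(client_ip, 0) + 1
        return gw

    def expire(self, client_ip):
        """Expire one state; drop the pairing when none remain."""
        remaining = self._states.get(client_ip, 0) - 1
        if remaining <= 0:
            self._states.pop(client_ip, None)
            self._assoc.pop(client_ip, None)
        else:
            self._states[client_ip] = remaining


lb = StickyBalancer(["WAN1", "WAN2"])
first = lb.connect("10.0.0.5")    # paired with WAN1
second = lb.connect("10.0.0.5")   # same client, same gateway
other = lb.connect("10.0.0.6")    # a different client gets WAN2
lb.expire("10.0.0.5")
lb.expire("10.0.0.5")             # all states gone, pairing removed
third = lb.connect("10.0.0.7")    # takes the next gateway, WAN1
again = lb.connect("10.0.0.5")    # new pairing, this time WAN2
print(first, second, other, third, again)
```

The pairing depends only on the client address, so every destination the client reaches uses the same WAN until its states expire.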