Netgate Releases High Availability for pfSense Plus on AWS®

Executive Summary

Netgate's pfSense® Plus software is gaining traction among enterprise and government clients for its robust features, especially in AWS deployments. Recognizing the mission-critical need for uninterrupted services, Netgate introduces High Availability (HA) for pfSense Plus on AWS.

This innovative solution employs a dynamic primary/secondary firewall setup across different AWS availability zones. Both firewalls maintain identical settings through seamless configuration synchronization, ensuring consistent performance.

The firewalls exchange keepalives using the CARP protocol, which enables fast failover if the primary firewall fails. When the secondary takes over, it updates AWS routing and Elastic IP associations through REST API calls, so external users retain access.

With Netgate's HA solution in AWS, enterprise and government clients can rest assured that their uptime requirements and internal SLAs are met, safeguarding mission-critical operations within AWS.

Introduction

pfSense Plus HA uses a pair of firewall appliances in a primary/secondary relationship to provide fast failover in case of AWS infrastructure or appliance failure. Customers can minimize risk by instantiating the firewalls in separate AWS availability zones (AZs). Each AZ consists of one or more discrete data centers with redundant infrastructure. AZs have high-bandwidth, low-latency connectivity to each other and are typically less than 100 km apart. Subnets in different availability zones have separate Layer 3 IP address ranges, which creates challenges for traditional HA. pfSense Plus AWS HA solves these challenges.

The pfSense Plus AWS HA solution builds upon standard HA features that customers have leveraged in data centers, branches, and remote offices worldwide. Additional AWS-specific features complete the solution, enabling fast failover and maintaining connectivity to critical cloud workloads and services. This article focuses on pfSense Plus at the edge with cloud ingress and egress. The details of workload configurations, use of load balancers, DNS, etc. are not discussed.

Standard pfSense Plus HA

pfSense Plus HA leverages a pair of firewalls that form a primary/secondary relationship: one device is active and the other is passive. Some customers may think that active-passive under-utilizes resources. However, to avoid degraded service after a failover, each firewall must be provisioned to handle all traffic in either design, so active-passive carries no cost penalty compared to active-active. Standard pfSense Plus HA is built on three pillars:
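The provisioning argument can be checked with simple arithmetic. The sketch below is only an illustration: the function names and the peak load figure are hypothetical and do not come from any Netgate tool.

```python
def total_provisioned(peak_gbps: float, firewalls: int = 2) -> float:
    """Total capacity bought when every firewall is sized for the full peak."""
    # After a failure a single firewall remains, so each one must carry the
    # whole peak load whether the pair ran active-passive or active-active.
    return firewalls * peak_gbps


def degraded_fraction(peak_gbps: float, per_firewall_gbps: float) -> float:
    """Fraction of peak traffic that cannot be served after one firewall fails."""
    shortfall = max(0.0, peak_gbps - per_firewall_gbps)
    return shortfall / peak_gbps


peak = 4.0  # hypothetical peak load in Gbps
print(total_provisioned(peak))            # capacity bought for a full-size pair
print(degraded_fraction(peak, peak))      # each firewall sized for the full peak
print(degraded_fraction(peak, peak / 2))  # active-active pair sized at half the peak
```

An active-active pair sized at half the peak per firewall loses half its capacity on failure, which is why non-degraded failover requires full-size firewalls in both designs.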

The CARP protocol permits the firewalls to form a primary/secondary relationship. Designated CARP interfaces exchange keepalive messages to detect failures and trigger events.
State table synchronization is achieved using pfsync. When the firewalls are deployed in separate AWS availability zones there is no shared subnet, so state information will not match and pfsync is not used. pfSense Plus HA on AWS addresses this as shown in the next section.
The XMLRPC protocol synchronizes configuration from the primary to the secondary. Most configuration tasks only need to be entered on the primary device.
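To make the keepalive idea concrete, here is a minimal sketch of a secondary that promotes itself when the primary goes silent. It is a simplified model and not the CARP implementation; the class name, the interval, and the missed-keepalive threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class KeepaliveMonitor:
    """Toy model of a secondary firewall watching for primary keepalives."""
    interval_s: float = 1.0   # assumed keepalive interval
    max_missed: int = 3       # assumed number of missed keepalives tolerated
    last_seen_s: float = 0.0
    role: str = "secondary"

    def on_keepalive(self, now_s: float) -> None:
        # Record the arrival time of a keepalive from the primary.
        self.last_seen_s = now_s

    def tick(self, now_s: float) -> bool:
        """Return True if this call promoted the secondary to primary."""
        silent_for = now_s - self.last_seen_s
        if self.role == "secondary" and silent_for > self.interval_s * self.max_missed:
            self.role = "primary"  # take over; failover events would fire here
            return True
        return False


monitor = KeepaliveMonitor()
monitor.on_keepalive(now_s=10.0)
print(monitor.tick(now_s=12.0), monitor.role)  # silence still within tolerance
print(monitor.tick(now_s=13.5), monitor.role)  # primary silent for too long
```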

Standard HA

pfSense Plus with HA on AWS

AWS availability zone subnets have separate Layer 3 IP addresses. CARP, a First Hop Redundancy Protocol (FHRP), normally uses a floating IP address from a shared subnet, and that address is assigned to the primary device. When the firewalls sit in separate AWS availability zones, there is no shared subnet. pfSense Plus AWS HA resolves this issue by leveraging AWS cloud native services: the AWS Elastic IP, which is the WAN IP address of the firewall, serves as the floating address. On failover, the Elastic IP re-associates with the secondary firewall's WAN interface.
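The article does not show how pfSense Plus issues the re-association internally. As a rough illustration of the underlying AWS operation, the sketch below uses the `associate_address` call as exposed by a boto3-style EC2 client. The function name, the stand-in client, and all resource IDs are hypothetical.

```python
def move_elastic_ip(ec2_client, allocation_id: str, wan_eni_id: str) -> str:
    """Re-associate an Elastic IP with the WAN ENI of the new primary.

    `ec2_client` is expected to behave like a boto3 EC2 client, for example
    one created with boto3.client("ec2", region_name=...).
    """
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        NetworkInterfaceId=wan_eni_id,
        AllowReassociation=True,  # allow moving an EIP that is already associated
    )
    return response["AssociationId"]


class FakeEc2Client:
    """Stand-in used only to exercise the function without AWS credentials."""

    def __init__(self):
        self.calls = []

    def associate_address(self, **kwargs):
        self.calls.append(kwargs)
        return {"AssociationId": "eipassoc-example"}


fake = FakeEc2Client()
print(move_elastic_ip(fake, "eipalloc-example", "eni-secondary-wan"))
```

Passing the client in as a parameter keeps the sketch testable offline; with a real client, the same call moves the Elastic IP in a single request.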

Another critical element of the solution is route manipulation, which sends traffic to the acting primary firewall. The WAN subnets of both devices share one AWS route table, and the LAN subnets of both devices share another. Traffic entering the cloud is directed to the WAN interface of the primary device, and cloud egress traffic is directed to the LAN interface of the primary device. On failure, the routes are updated to point traffic to the new primary (former secondary) firewall.
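The route change can be pictured as replacing the target of each affected route with the ENI of the new primary. The following sketch assumes a boto3-style EC2 client and its `replace_route` call; the helper name, the stand-in client, the route table ID, and the CIDR are hypothetical, and the article does not state which routes pfSense Plus actually rewrites.

```python
from typing import Iterable


def point_routes_at(ec2_client, route_table_id: str,
                    destination_cidrs: Iterable[str], eni_id: str) -> int:
    """Retarget existing routes in one route table to the given ENI."""
    updated = 0
    for cidr in destination_cidrs:
        ec2_client.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=cidr,
            NetworkInterfaceId=eni_id,  # LAN or WAN ENI of the new primary
        )
        updated += 1
    return updated


class FakeEc2Client:
    """Stand-in used only to exercise the function without AWS credentials."""

    def __init__(self):
        self.routes = {}

    def replace_route(self, RouteTableId, DestinationCidrBlock, NetworkInterfaceId):
        self.routes[(RouteTableId, DestinationCidrBlock)] = NetworkInterfaceId


fake = FakeEc2Client()
print(point_routes_at(fake, "rtb-lan-example", ["0.0.0.0/0"], "eni-secondary-lan"))
print(fake.routes)
```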

Both the Elastic IP reassignment and the route changes are performed through REST API calls to the regional AWS API server. The requests are sourced from the pfSense Plus management interface, which needs access to ec2.amazonaws.com. AWS service roles assigned to the instances set the permissions, and administrators can follow the principle of least privilege. For more details, please see pfSense Plus on AWS, including the section on HA.
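As an illustration of least privilege, the sketch below builds an IAM policy document that allows only the two EC2 actions matching the operations described above. The action list is an assumption derived from this article, not Netgate's published policy; the pfSense Plus on AWS documentation is the authority on the permissions actually required, which may include additional actions.

```python
import json


def ha_failover_policy() -> dict:
    """Build a narrowly scoped IAM policy document (assumed action list)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:AssociateAddress",  # move the Elastic IP
                    "ec2:ReplaceRoute",      # retarget routes to the new primary
                ],
                # "*" keeps the sketch short; a real policy could be narrowed
                # further to specific resources.
                "Resource": "*",
            }
        ],
    }


print(json.dumps(ha_failover_policy(), indent=2))
```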

Note: There are many options for integrating firewalls in AWS. We do not detail back-end connectivity from the FW to workloads using ALB, ELBs, single VPC, transit VPCs, TGW, or Cloud WAN. We will also not address all NAT and SNAT options. For more information, please visit Amazon’s How to integrate third-party firewall appliances into an AWS Environment guide.

The following diagram details North-South packet flow during steady-state operations, where the configured primary pfSense Plus HA firewall is active and sending keepalives to the secondary firewall.

Steady state HA packet flow ingress with pfSense Plus performing SNAT

Traffic destined for AWS Cloud workloads ingresses into the VPC through the Internet Gateway using the AWS Elastic IP (EIP).

The EIP is associated with the WAN ENI of the primary firewall.

Traffic is sent to the WAN ENI of the primary firewall.

The firewall inspects, processes, and routes the traffic out of the LAN interface to the AWS route table associated with both LAN subnets.

The AWS route table assigned to both LAN subnets has a route pointing to the workload.

Traffic is sent to the workloads.

Steady state HA ingress return traffic with pfSense Plus performing SNAT

The return traffic is destined for the primary firewall's WAN ENI because that was the SNAT source.

The AWS route table associated with both LAN subnets has a route pointing to the LAN ENI of the primary firewall.

Traffic is sent to the LAN ENI of the primary firewall.

The firewall inspects, processes, and routes the traffic out of the WAN interface to the AWS route table associated with both WAN subnets.

The AWS route table associated with both WAN subnets has a route pointing to the IGW for egress to the internet.

After a failover, CARP and AWS HA will make the secondary FW primary.

HA packet flow ingress after failure with pfSense Plus performing SNAT

Traffic destined for AWS Cloud workloads enters the VPC through the Internet Gateway using the AWS Elastic IP (EIP).

The EIP is associated with the WAN ENI of the secondary (now primary) firewall.

Traffic is sent to the WAN ENI of the secondary (now primary) firewall.

The firewall inspects, processes, and routes the traffic out of the LAN interface to the AWS route table associated with both LAN subnets.

The AWS route table assigned to both LAN subnets has a route pointing to the workload.

Traffic is sent to the workloads.

HA return packet flow after failure with pfSense Plus performing SNAT

The return traffic is destined for the secondary (now primary) firewall's WAN ENI because that was the SNAT source.

The AWS route table associated with both LAN subnets has a route pointing to the LAN ENI of the secondary (now primary) firewall.

Traffic is sent to the LAN ENI of the secondary (now primary) firewall.

The firewall inspects, processes, and routes the traffic out of the WAN interface to the AWS route table associated with both WAN subnets.

The AWS route table associated with both WAN subnets has a route pointing to the IGW for egress to the internet.

The traffic is sent out to the IGW. The IGW will NAT and source the traffic with EIP and send it to the Internet.
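The steady-state and post-failover flows above differ only in which ENIs the Elastic IP and the shared LAN route table point to. The toy model below makes that explicit; the hop labels and names are hypothetical and it does not model real packet processing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VpcState:
    """The two AWS settings that failover rewrites in this model."""
    eip_eni: str         # WAN ENI currently holding the Elastic IP
    lan_return_eni: str  # LAN ENI targeted by the shared LAN route table


def ingress_path(state: VpcState) -> List[str]:
    # Internet -> IGW -> WAN ENI holding the EIP -> LAN route table -> workload
    return ["internet", "igw", state.eip_eni, "lan-route-table", "workload"]


def return_path(state: VpcState) -> List[str]:
    # Workload -> LAN route table -> active LAN ENI -> WAN route table -> IGW
    return ["workload", "lan-route-table", state.lan_return_eni,
            "wan-route-table", "igw", "internet"]


def fail_over(wan_eni: str, lan_eni: str) -> VpcState:
    # Failover re-associates the EIP and retargets the LAN route.
    return VpcState(eip_eni=wan_eni, lan_return_eni=lan_eni)


steady = VpcState(eip_eni="fw1-wan-eni", lan_return_eni="fw1-lan-eni")
failed = fail_over(wan_eni="fw2-wan-eni", lan_eni="fw2-lan-eni")
print(ingress_path(steady))
print(ingress_path(failed))
print(return_path(failed))
```

Every hop other than the firewall ENI is identical in both states, which is why external users keep using the same Elastic IP after failover.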

For VPC-to-VPC traffic inspection (East-West), we can leverage AWS Transit Gateway (TGW) Connect attachments to send traffic to the pfSense Plus firewall first. We can use BGP for dynamic routing. The secondary does not participate in the BGP neighbor relationship until it becomes primary. For this scenario, we can also leverage AWS Cloud WAN.

Example of VPC to VPC traffic flow using the TGW and pfSense Plus AWS HA

ubuntu1 VPC has a default route pointing to the TGW.

Traffic is sent to the TGW.

The TGW Connect attachment has a BGP relationship with pfsense1 and has a route pointing to it.

Traffic is sent to the pfsense1 LAN ENI.

pfsense1 inspects the traffic and routes it back out the same interface to the TGW.

TGW sees the next hop as ubuntu2 VPC.

Failover

Conclusion

Netgate is excited to launch this free-of-charge feature in pfSense Plus for our AWS customers who require increased availability in the cloud. For more information on the step-by-step configuration, refer to this video. Netgate continues to listen to our customers, enhancing the pfSense Plus software experience by adding capabilities while maintaining the industry’s best price/performance ratio and the lowest TCO. For more information or to do a fully funded AWS POC, please contact sales@netgate.com.