Recent developments in pfSense

I thought I’d announce a few recent developments that may have escaped general attention. Here are updates from the team.

Renato has been busy converting both our build infrastructure and pfSense to use FreeBSD’s pkg(ng). Renato has also backported the native-xtools pkg to stable/10. (Renato is a FreeBSD ports contributor, and this code has been accepted into FreeBSD 10-STABLE, so it should show up in FreeBSD 10.2-RELEASE.) Is pfSense on ARM “on the way”? Read it yourself:

There has been a lot more work in this area to align our source repo with that of upstream FreeBSD, making moves from one base to the next much easier than the one from pfSense 2.1 (FreeBSD 8.3) to pfSense 2.2 (FreeBSD 10.1). pfSense on MIPS is a bit more problematic, and will be a bit farther behind. It may require an experimental release of pfSense based on -CURRENT.

Jim Pingle built and released a package for ftp-proxy, which should mostly solve the issues for those who need FTP running through the firewall.

Ermal Luci is working on porting the Linux QuickAssist driver for 895x to FreeBSD with direct support from Intel. QuickAssist is the crypto/compression accelerator that Intel produces. We’ve seen 40Gbps of AES-256 + SHA1 on Linux using “openssl speed”. Ermal is also including support for cryptdev(4) on FreeBSD.

Since not everyone will have a QuickAssist unit to leverage, we’re continuing work on software crypto (including AES-NI). Our internal testing of IPsec performance relative to a FreeBSD baseline, Linux and OpenBSD showed that Linux was a bit faster at everything, and even OpenBSD 5.6 is faster than pfSense 2.2 at AES-CBC-256 + HMAC-SHA1, while pfSense is faster than OpenBSD 5.6 using AES-GCM. After investigating the issue, Ermal has responded with some preliminary work on a patch to cryptdev(4) to even the score. This patch will be submitted back to FreeBSD when ready, as will the changes to make AES-GCM work with IPsec in the FreeBSD baseline starting with FreeBSD 10.2-RELEASE.

We’ve backported the 802.11 parts from -CURRENT, and these will be available in pfSense software 2.2.1. The result is much improved WiFi support, especially for 802.11n modes. The results of this testing and adoption have spurred Adrian Chadd to find and likely solve the last of the “missed beacon” issues as well, and the whole thing will likely be MFC’d (merged from -CURRENT) such that it, too, is available in FreeBSD 10.2. To be clear, Adrian doesn’t work here, but we’ve known each other a long time, and my interest in WiFi is well-understood.

Speaking of Adrian Chadd, he has been working on RSS support for FreeBSD. While this will eventually be useful to the lower layers, there are real issues with the locking in rtentry that prevent any additional performance gains from the multi-threaded pf in FreeBSD 10+. Once these rtentry locking issues are fixed, teaching ipfw and pf to use RSS buckets when doing keep-state will be straightforward. We’re also investigating having netisr do hashing in software to distribute the load across all available cores. The first implementation will use XXXHASH (which we submitted as a faster hash for the threaded pf in FreeBSD 10, and which is available in FreeBSD 10.1), and we will then move to more performant ones, such as Cuckoo hash. So we’re not abandoning the in-kernel stack, but are in fact working to make it better in parallel with the next-gen work.
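
To make the idea of hashing flows onto cores concrete, here is a minimal sketch. It is not the netisr or RSS code. It assumes IPv4 flows, uses CRC32 from the Python standard library purely as a stand-in hash (not the hash named above), and the names flow_bucket and bucket_to_core, as well as the bucket count of 128, are hypothetical. The key is built symmetrically so that both directions of a connection land in the same bucket, which is one plausible choice when a stateful filter wants a flow's state touched by a single core.

```python
import struct
import zlib
from ipaddress import ip_address


def flow_bucket(src: str, dst: str, sport: int, dport: int, proto: int,
                n_buckets: int = 128) -> int:
    """Map an IPv4 flow to a bucket; both directions give the same result."""
    a = (int(ip_address(src)), sport)
    b = (int(ip_address(dst)), dport)
    # Sort the two endpoints so (A -> B) and (B -> A) build the same key.
    low, high = sorted((a, b))
    key = struct.pack("!IHIHB", low[0], low[1], high[0], high[1], proto)
    # CRC32 is only a placeholder hash for illustration.
    return zlib.crc32(key) % n_buckets


def bucket_to_core(bucket: int, n_cores: int) -> int:
    """Assign buckets to cores round-robin (hypothetical policy)."""
    return bucket % n_cores


if __name__ == "__main__":
    fwd = flow_bucket("10.0.0.1", "10.0.0.2", 1234, 80, 6)
    rev = flow_bucket("10.0.0.2", "10.0.0.1", 80, 1234, 6)
    print(fwd == rev, bucket_to_core(fwd, 8))
```

Using more buckets than cores leaves room to rebalance by remapping buckets without rehashing every flow.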

Speaking of the next-gen work: Preliminary results from Matt Smith have yielded 2.8Mpps on a C2758, and 14.88Mpps on a 12-core X5680 Xeon box. Note that these are (millions of) packets per second, not (billions of) bits per second, and that 14.88Mpps is “line rate” on a 10G card for minimum-size (64-byte) frames. This is with reassembly, packet filtering, forwarding and (re)-fragmentation running in a “fast-forwarding” kind of way.
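
For readers wondering where the 14.88Mpps figure comes from, the arithmetic can be sketched as follows. The framing overheads are standard Ethernet values rather than numbers taken from the test setup, and the function name line_rate_pps is only illustrative.

```python
# Standard Ethernet framing values (assumed here, not stated in the post).
PREAMBLE_SFD_BYTES = 8      # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum idle gap between frames, in byte times
MIN_FRAME_BYTES = 64        # smallest legal Ethernet frame, including the FCS


def line_rate_pps(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Return the maximum number of frames per second a link can carry."""
    wire_bytes = frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    pps = line_rate_pps(10e9)
    print(f"{pps / 1e6:.2f} Mpps")  # prints "14.88 Mpps"
```

Each 64-byte frame occupies 84 bytes (672 bits) on the wire, so a 10Gbps link carries at most about 14.88 million of them per second.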

Part of the difference between the two platforms is that the packet filtering code performs an N-tuple search over a set of rules with multiple categories and finds the best match (highest priority) for each category. (Succinctly, it is not ‘pf’, though it is designed to implement something a lot like ‘pf’.) On platforms which support AVX/AVX2, this code runs in vector registers, but the C2758 doesn’t support these, so the code has to run ‘scalar’.
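
As a rough illustration of what “best match per category” means, here is a naive scalar reference sketch. It is not the next-gen filtering code, which is not shown in this post; the Rule and Packet types, the category names and the example rules are all hypothetical, and a linear scan like this is only meant to show the semantics, not the performance.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_address, ip_network
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int
    proto: int


@dataclass(frozen=True)
class Rule:
    name: str
    category: str              # e.g. "filter" or "qos" (hypothetical names)
    priority: int              # higher number wins within a category
    src: IPv4Network
    dst: IPv4Network
    dports: Tuple[int, int]    # inclusive port range
    proto: Optional[int]       # None matches any protocol

    def matches(self, pkt: Packet) -> bool:
        return (ip_address(pkt.src) in self.src
                and ip_address(pkt.dst) in self.dst
                and self.dports[0] <= pkt.dport <= self.dports[1]
                and (self.proto is None or self.proto == pkt.proto))


def classify(rules: List[Rule], pkt: Packet,
             categories: List[str]) -> Dict[str, Optional[Rule]]:
    """Return the highest-priority matching rule for each category."""
    best: Dict[str, Optional[Rule]] = {c: None for c in categories}
    for rule in rules:
        if rule.category in best and rule.matches(pkt):
            current = best[rule.category]
            if current is None or rule.priority > current.priority:
                best[rule.category] = rule
    return best


ANY = ip_network("0.0.0.0/0")
RULES = [
    Rule("allow-web", "filter", 10, ANY, ip_network("192.0.2.0/24"), (80, 443), 6),
    Rule("block-all", "filter", 1, ANY, ANY, (0, 65535), None),
    Rule("mark-bulk", "qos", 5, ANY, ANY, (0, 65535), None),
]

if __name__ == "__main__":
    pkt = Packet("198.51.100.7", "192.0.2.10", 443, 6)
    for category, rule in classify(RULES, pkt, ["filter", "qos"]).items():
        print(category, rule.name if rule else None)
```

A vectorized implementation would evaluate many rules or many packets per instruction, but it would be expected to return the same per-category answers as this scalar loop.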

In both cases, we’re only using 6 cores to produce these results. Intel announced Xeon-D yesterday with 8 cores and on-board dual-port 10G Ethernet. I’ve discussed QuickAssist above. Link the pieces and you start to get a picture of the path we all travel together. When the paper that George Neville-Neil and I are giving later this week at AsiaBSDcon is out, you can compare these results to those obtained with Linux, 11-CURRENT, pfSense 2.2 and even OpenBSD 5.6, as they use the same open source test harness. As the paper’s title directs, “Measure twice, Code once.”

In Open Source, co-operation is key. You support your upstream. You submit patches to upstream. You employ committers when you can. You review and accept both patches and ideas from your community. You look around for other projects with ideas and code you can leverage. You attend conferences to discuss your work and the work of others. In general, you work with the whole community around your project.