**FlexTOE: Flexible TCP Offload with Fine-Grained Parallelism**

[Rajath Shashidhara](https://homes.cs.washington.edu/~rajaths)^1, [Tim Stamler](https://www.cs.utexas.edu/lasr/profile.php?uid=187)^2, [Antoine Kaufmann](https://people.mpi-sws.org/~antoinek/)^3, [Simon Peter](https://homes.cs.washington.edu/~simpeter/)^1

^1 University of Washington, ^2 UT Austin, ^3 MPI-SWS

Published at [USENIX NSDI 22](https://www.usenix.org/conference/nsdi22/presentation/shashidhara). Open source on [GitHub](https://github.com/tcp-acceleration-service/FlexTOE) under the BSD 3-Clause license.

FlexTOE is a *flexible*, yet *high-performance* TCP offload to SmartNICs.

* Eliminates almost all host data-path TCP processing.
* Enables full customization via flexible C & eBPF-XDP extensions.
* Supports POSIX sockets and interoperates well with other stacks.
* Robust under packet losses and congestion.

Our prototype on the [Netronome Agilio-CX40](https://www.netronome.com/media/documents/PB_Agilio_CX_2x40GbE-7-20.pdf) SmartNIC shows:

* Memcached scales up to 38% better on FlexTOE versus [TAS](https://doi.org/10.1145/3302424.3303985).
* Saves up to 81% host CPU cycles versus [Chelsio Terminator TOE](https://www.chelsio.com/wp-content/uploads/resources/Chelsio-Terminator-6-Brief.pdf).
* Competitive performance for RPCs, even with wimpy SmartNICs.
* Cuts 99.99th-percentile RPC RTT by 3.2× versus TAS.
* C & eBPF-XDP extensions: TCP tracing, VLAN stripping, flow classification, firewalling, and splicing.
* Generalizes across architectures: single-connection speedup of 2.4× on x86 and 4× on [BlueField](http://www.mellanox.com/related-docs/npu-multicore-processors/PB_BlueField_Ref_Platform.pdf).