A Multi-FPGA Networking Architecture and Its Implementation

Date

2015-05-12

Abstract

FPGAs show great promise in accelerating compute-bound, parallelizable applications by offloading kernels into programmable logic. However, FPGAs currently face significant hurdles to becoming a viable technology, owing both to the capital outlay required for specialized hardware and to the logic required to support the offloaded kernels on the FPGA. This thesis seeks to change that by making it easy to communicate with clusters of FPGAs over IP networks and by providing infrastructure for common application use cases, allowing authors to focus on their application rather than on procuring and interacting with a specific FPGA.

Our approach is twofold. First, we develop an FPGA IP network stack and a bitfile management system that let users upload their logic to a server and have it run on FPGAs accessible through the Internet. Second, we engineer a programmable logic interface that authors can use to move data to their application kernels. Beyond communication over the Internet, this interface supplies the scaffolding typically re-invented for each application: I/O between pieces of application logic, even when they are spread across different FPGAs.

We use Partial Reconfiguration to divide the FPGAs into regions, each of which can host a different application from a different user. We then provide a web service through which users can upload their FPGA logic. The service finds a spot for the logic on the FPGAs, reconfigures them to contain the logic, then sends the corresponding IP addresses back to the user.
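
The placement step described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Region` type, the `place` function, the one-address-per-region layout, and the first-fit policy are all hypothetical and are not taken from the thesis, which does not describe its placement policy in this abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """One partially reconfigurable region on one FPGA (hypothetical model)."""
    fpga: str
    index: int
    ip_address: str               # assumption: each region has its own address
    owner: Optional[str] = None   # None means the region is free


def place(regions: List[Region], user: str, pieces: int) -> List[str]:
    """Claim `pieces` free regions for `user` and return their IP addresses.

    First-fit is an assumption made for illustration only.
    """
    free = [r for r in regions if r.owner is None]
    if len(free) < pieces:
        raise RuntimeError("not enough free regions")
    chosen = free[:pieces]
    for region in chosen:
        # Stands in for reconfiguring the region with the user's logic.
        region.owner = user
    return [r.ip_address for r in chosen]
```

In this model, a request for two regions on a pool with one occupied region simply skips the occupied one and returns the addresses of the first two free regions.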

To ease development of the application pieces themselves, our framework abstracts away the complexity of communicating over IP networks as well as between different FPGAs. Instead, we provide applications with an interface consisting simply of a RAM port. Applications write packets of data into the port, and they appear at the other end, whether that other end is across an IP network or on another FPGA.
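
To make the RAM-port idea concrete, the following is a toy software model, not the hardware interface itself. The `RamPort` class, its method names, the `depth` limit, and the `transport` label are all assumptions introduced for illustration; the point is only that application code writes and reads packets the same way regardless of what carries them.

```python
from collections import deque
from typing import Optional


class RamPort:
    """Toy model of a packet port (hypothetical, not the thesis design)."""

    def __init__(self, depth: int = 256):
        self.depth = depth      # assumed maximum packet size in bytes
        self.inbox = deque()    # packets delivered to this port
        self.peer = None        # port at the other end
        self.transport = None   # "ip" or "serial"; never used by the application

    def connect(self, peer: "RamPort", transport: str) -> None:
        """Link two ports; the transport is recorded but hidden from users."""
        self.peer, peer.peer = peer, self
        self.transport = peer.transport = transport

    def write_packet(self, payload: bytes) -> None:
        """Write one packet; it appears in the peer's inbox."""
        if self.peer is None:
            raise RuntimeError("port is not connected")
        if len(payload) > self.depth:
            raise ValueError("packet larger than port buffer")
        self.peer.inbox.append(bytes(payload))

    def read_packet(self) -> Optional[bytes]:
        """Return the oldest delivered packet, or None if none is waiting."""
        return self.inbox.popleft() if self.inbox else None
```

Connecting two ports with `transport="ip"` or `transport="serial"` leaves the calls to `write_packet` and `read_packet` unchanged, which mirrors the abstraction the paragraph describes.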

Finally, we demonstrate the feasibility and utility of our approach by implementing it on an array of Xilinx Virtex 5 FPGAs, linked together with GTP serial links and connected via Gigabit Ethernet. We port a compute-bound application based on regular expression string matching to the framework, showing that our approach can support a realistic application.
