L Stornaiuolo, M Rabozzi, MD Santambrogio, D Sciuto, CB Ciobanu, G Stramondo, AL Varbanescu
IFIP/IEEE International Conference on Very Large Scale Integration-System on a Chip
With the increased interest in energy efficiency, a lot of application domains experiment with Field Programmable Gate Arrays (FPGAs), which promise customized hardware accelerators with high-performance and low power consumption. These experiments possible due to the development of High-Level Languages (HLLs) for FPGAs, which permit non-experts in hardware design languages (HDLs) to program reconfigurable hardware for general purpose computing.
However, some of the expert knowledge remains difficult to integrate in HLLs, eventually leading to performance loss for HLL-based applications. One example of such a missing feature is the efficient exploitation of the local memories on FPGAs. A solution to address this challenge is PolyMem, an easy-to-use polymorphic parallel memory that uses BRAMs. In this work, we present HLS-PolyMem, the first complete implementation and in-depth evaluation of PolyMem optimized for the Xilinx Design Suite. Our evaluation demonstrates that HLS-PolyMem is a viable alternative to HLS memory partitioning, the current approach for memory parallelism in Vivado HLS. Specifically, we show that PolyMem offers the same performance as HLS partitioning for simple access patterns, and outperforms partitioning as much as 13x when combining multiple access patterns for the same data structure. We further demonstrate the use of PolyMem for two different case studies, highlighting the superior capabilities of HLS-PolyMem in terms of performance, resource utilization, flexibility, and usability.
Based on all the evidence provided in this work, we conclude that HLS-PolyMem enables the efficient use of BRAMs as parallel memories, without compromising the HLS level or the achievable performance.