Anna Maria Nestorov, Alberto Scolari, Enrico Reggiani, Luca Stornaiuolo, Marco D Santambrogio
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Face Detection (FD) recently became the base of multiple applications requiring low latency but also with limited resources and energy budgets. Deep Convolutional Neural Networks (DCNNs) are especially accurate in FD, but latency requirements and energy budgets call for Field Programmable Gate Arrays (FPGAs)-based solutions, trading flexibility and efficiency. Nonetheless, the offer of FPGAs solutions is limited and different chips often require expensive re-design phases, while developers desire solutions whose resources can scale proportionally to the demands. Therefore, this work presents an FD solution based on a DCNN on a distributed, embedded system with FPGAs, proposing a general approach to reduce the DCNN size and to design its FPGA cores and investigating its accuracy, performance, and energy efficiency.