Cloud computing is growing fast and plays an increasingly important role in the development and maintenance of large infrastructures. The vast majority of everyday web-based applications rely on it; video chat, widely used during the lockdown, is one example.
By Samuele Barbieri,
Student in Computer Science and Engineering, Politecnico di Milano
Not all computations, however, run efficiently on general-purpose processors. Although the processors offered by cloud providers are far more powerful than those in personal computers, they are often insufficient, and other technological solutions are needed to overcome this limitation. For this reason, heterogeneous computing systems that include dedicated accelerators have proved to be a valid way to meet more stringent latency requirements. In this context, FPGAs (Field Programmable Gate Arrays) are a possible option because they improve the performance-to-power-consumption ratio. From the infrastructure owner’s point of view, however, a resource that is allocated but only partially used is a waste: an unnecessary cost proportional to the unused portion. Since the number and frequency of requests are unpredictable, this problem is often almost impossible to eliminate.
For FPGAs, a possible solution is to share resources among users.
At NECSTLab we therefore developed BlastFunction: a serverless solution that accelerates functions by implementing them on FPGAs and also manages the sharing of machines and FPGAs. Sharing is based on a time-sharing paradigm: each user has access to the resources for a limited period of time. This technique optimizes FPGA utilization while still granting each user the required performance.
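To make the time-sharing idea concrete, here is a minimal sketch in Python of a round-robin policy for a single accelerator. It is only an illustration of the general model, not BlastFunction's actual scheduler: the function name `time_share`, the notion of "work units", and the fixed `time_slice` are all assumptions introduced for the example.

```python
from collections import deque


def time_share(requests, time_slice):
    """Round-robin time-sharing of one accelerator (hypothetical model).

    requests:   dict mapping a user name to the work units that user needs.
    time_slice: maximum work units a user may run before yielding the device.
    Returns the ordered list of (user, units_run) slots.
    """
    if time_slice <= 0:
        raise ValueError("time_slice must be positive")

    # Users wait in a FIFO queue; users with no work are skipped.
    queue = deque((user, units) for user, units in requests.items() if units > 0)
    schedule = []

    while queue:
        user, remaining = queue.popleft()
        run = min(time_slice, remaining)  # use the device for at most one slice
        schedule.append((user, run))
        if remaining > run:
            # Unfinished work goes back to the end of the queue.
            queue.append((user, remaining - run))

    return schedule


if __name__ == "__main__":
    for user, units in time_share({"alice": 5, "bob": 2}, time_slice=2):
        print(user, units)
```

In this sketch no user can hold the device for longer than one slice while others are waiting, which is the property the time-sharing model relies on; a real system would also have to account for the cost of switching between users.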
BlastFunction also lets users select which functions need to be accelerated; users then send the data required for the computation. Finally, after collecting this information, the system computes the best configuration of services and divides the data to be processed among the available resources.
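The last step, dividing data among the available resources, can be pictured with a short Python sketch. How BlastFunction actually computes its configuration is not described here, so the example assumes the simplest possible policy: an even, contiguous split of the input across identical devices. The function name `split_among_devices` is hypothetical.

```python
def split_among_devices(items, num_devices):
    """Split a list into num_devices contiguous, near-equal chunks.

    Assumes identical devices (an illustrative simplification); when the
    items do not divide evenly, the first chunks receive one extra item.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")

    base, extra = divmod(len(items), num_devices)
    chunks = []
    start = 0
    for index in range(num_devices):
        size = base + (1 if index < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    print(split_among_devices(list(range(7)), 3))
```

A scheduler that knows the load or speed of each device could weight the chunk sizes instead of splitting evenly; the even split is just the baseline case.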
We are currently also working on dynamically adjusting the number of machines and FPGAs instantiated according to the number of incoming requests. To do so, we are relying on Amazon Web Services, so that BlastFunction can scale to more resources thanks to the virtually unlimited number and variety of FPGAs available.
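As a rough illustration of scaling with the number of input requests, the Python sketch below computes how many devices to keep running for a given backlog. It is an assumption-laden example rather than the policy under development: `desired_devices`, the per-device capacity, and the `min_devices` and `max_devices` bounds are hypothetical, and no cloud provider API is involved.

```python
import math


def desired_devices(pending_requests, requests_per_device,
                    min_devices=1, max_devices=8):
    """Return how many devices to run for the current backlog.

    requests_per_device is the assumed number of requests one device can
    serve in the scaling interval; the result is clamped to the given bounds.
    """
    if requests_per_device <= 0:
        raise ValueError("requests_per_device must be positive")

    needed = math.ceil(pending_requests / requests_per_device)
    return max(min_devices, min(max_devices, needed))


if __name__ == "__main__":
    for backlog in (0, 25, 1000):
        print(backlog, desired_devices(backlog, requests_per_device=10))
```

The lower bound keeps at least one device ready for new requests, while the upper bound caps the cost when demand spikes.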