Tracking latency in microservice environments with DEEP-mon

In our previous article we introduced HyPPO, our project aimed at optimizing workloads in Docker- and Kubernetes-based environments in order to minimize power consumption. To do so, HyPPO is equipped with a monitoring agent, DEEP-mon, which exploits Linux eBPF to collect various metrics at runtime from Docker containers and assigns a power consumption level to each container.

By Tommaso Sardelli
Student at Politecnico di Milano

With this follow-up, we would like to talk about the recent changes we made to DEEP-mon in order to add a new metric to our arsenal: latency. As we anticipated in the first article about DEEP-mon, our target is a specific kind of application referred to as On-Line Data-Intensive (OLDI) applications. Web search, advertising, and machine translation are some examples of this workload type. A rapid response time is particularly important for these workloads: it affects the experience of users interacting with those systems and, consequently, has an economic impact on the service owners.

In this context latency becomes a fundamental indicator. It measures how the workloads behave when we apply our power optimizations, and we can use it to drive our choices when defining the new power profiles to be applied. The idea is to integrate this kind of information into the general flow of HyPPO, so that the controller can put this metric to use and provide a better experience to the users.
To collect latency metrics we once again employed the tools provided by the Linux kernel through eBPF. This time we did not use them to analyze the server workload, but rather to inspect the network traffic passing through each container. Going into the details, we catch each packet as it enters the container and extract useful information from it, such as IP addresses, TCP ports, HTTP request URIs, and the current timestamp. All this information is temporarily stored in a table so that it can be used later on. In the same way, we also trace every packet going out of the container and correlate the information of those packets with what we previously saved in the table. With this approach we are able to match each request with its response and find the computational latency required to produce the response, simply by subtracting the two packets' timestamps.
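The request/response matching logic can be sketched in plain Python. This is only an illustration of the idea: in DEEP-mon the table lives in an eBPF map inside the kernel, and the function and variable names below are our own, not the actual DEEP-mon code.

```python
# Stand-in for the in-kernel eBPF map: keyed by the TCP 4-tuple of a
# request packet, it stores the timestamp (in nanoseconds) at which
# the request entered the container.
pending_requests = {}

def on_ingress_packet(src_ip, src_port, dst_ip, dst_port, ts_ns):
    """Record the arrival time of a request entering the container."""
    key = (src_ip, src_port, dst_ip, dst_port)
    pending_requests[key] = ts_ns

def on_egress_packet(src_ip, src_port, dst_ip, dst_port, ts_ns):
    """Match a response leaving the container with its stored request.

    The response travels in the opposite direction, so we look up the
    request's 4-tuple with source and destination swapped. Returns the
    computational latency in nanoseconds, or None if no request matches.
    """
    key = (dst_ip, dst_port, src_ip, src_port)
    request_ts = pending_requests.pop(key, None)
    if request_ts is None:
        return None
    return ts_ns - request_ts
```

Keying on the swapped 4-tuple is what makes the match direction-aware: a response is recognized precisely because it mirrors the addresses and ports of a pending request.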

This approach has the advantage of cutting network latency out of the equation. In this way we can see how different power profiles affect latency, without any noise coming from the network.
The collected latency metrics are then sent to our backend, together with the power consumption metrics, so that they can be correlated and analysed by the HyPPO controller.
We can now use these insights to understand how we are impacting user experience but, above all, to drive HyPPO towards better results during the creation of future power profiles.

Tommaso Sardelli on Twitter

Tommaso Sardelli on LinkedIn