High Performance Computing – Graph and data analytics

Universities face growing challenges posed by new technologies, techniques, and tools. In technical fields such as computer science and computer engineering, one of the university's key roles is to help students acquire the hard and soft skills that will enable them to keep learning throughout their lives. This is fundamental because technology keeps changing, and students should be able to apply the concepts they learn, and to test themselves, independently of any specific technology.

By Rolando Brondolin and Alberto Parravicini
PhD Students at NECSTLab, Politecnico di Milano

Alongside traditional lectures, Politecnico di Milano added new classes to the “Passion In Action” program to support the development of transversal, soft, and social skills, and to encourage and help students enrich their personal, cultural, and professional experience. These classes are optional and based on innovative teaching techniques.

Within this context, our laboratory proposed, in collaboration with Oracle Labs, the “High Performance Computing – graph and data analytics” course. The class ranges from the fundamentals of graph-based computing to advanced topics in graph computations and graph data structures. These topics are of paramount importance today, as the internet and social media have completely changed how we work, live, and interact with each other. At the base of this internet revolution, we now have billions of entities (e.g. web pages, users, servers) connected to each other by trillions of edges (e.g. hypertext links, friendships, network links). Graphs are the best way to represent and describe these continuously growing scenarios efficiently.

This year’s edition started on the 26th of October with 100 students, and the theoretical lessons will continue until the 15th of November. After the theoretical lessons, students who are interested in practical activities and in better understanding the topics of the class can enroll in a contest that will last until the 15th of December. During the contest, students can put the concepts from the classes into practice in two different tracks: “Efficient Query Planning” and “Entity Resolution using Machine Learning Approaches”. The former requires students to develop heuristic techniques to find a good trade-off between query planning and query execution over graph instances of different types and shapes. The latter requires students to find techniques based on ML and AI to detect whether two entities have the same identity, a task widely used in fraud detection.

Students can team up with a friend to participate in the contest. The winners will have the chance to join the NECST Group Conference 2020 and to be selected for internship opportunities within Oracle Labs.

This year marks the second edition of the “High Performance Computing – graph and data analytics” course. In the first edition, 70 students attended the theoretical lessons, and 20 groups of 2 students decided to continue with the contest. In the end, we received 8 submissions in total, with 2 teams winning the 2 tracks of the contest. Afterwards, the students continued to improve their solutions and had the chance to collaborate with the NECSTLab and with Oracle Labs on selected research topics.
