Asterios Katsifodimos | Dataflow Engines for Executing Cloud Applications: a Maslow's Hammer?

23 JUNE 2022

FaaS is currently marketed as the silver bullet of abstractions for developing scalable applications in the cloud. Although very popular, current FaaS offerings provide poor support for the management of application state: managing state and keeping it consistent at large scale is very challenging. As a result, the serverless model is inadequate for executing stateful, consistent, and latency-sensitive applications. In response, a new breed of systems and programming models is currently in the making, termed Stateful Functions as a Service (SFaaS). Surprisingly, recent results in both academia and industry point to a common pattern: stateful functions at scale can be modeled as dataflow graphs and executed on top of existing dataflow systems, such as Apache Flink or Beam. In this talk, I analyse the requirements of scalable cloud applications and how they affect the design choices of an imaginary “universal execution engine” for the cloud. I then try to answer the question: will dataflow engines be the answer in the quest for the universal cloud execution engine? I will conclude with a set of requirements and possible directions for future dataflow engines, and with what we have been up to in my research group at TU Delft.

Asterios Katsifodimos is an Associate Professor at the Delft University of Technology. His research interests span the areas of parallel (streaming) data processing, cloud computing, optimization of ML systems, and data integration. His research on fault tolerance, aggregation methods and benchmarking has influenced open-source stream processing engines such as Apache Flink and Hazelcast Jet, while his research group develops and maintains the dataset discovery system Valentine. Asterios received the ACM SIGMOD Research Highlights Award in 2016, as well as best paper awards at EDBT 2019 and ACM DEBS 2021. He is the instructor of the MOOC “Taming Massive Data Streams” and regularly serves as an associate editor or program committee member for data management conferences such as VLDB, ICDE, SIGMOD and EDBT. More info: https://asterios.katsifodimos.com/