The Academic Fringe Festival - Tim Kraska: Towards Instance-optimized Data Systems

02 May 2022 16:00 till 17:00 - Location: Online

by Tim Kraska | MIT

Abstract

Recently, there has been a lot of excitement around ML-enhanced (or learned) algorithms and data structures. For example, there has been work on applying machine learning to improve query optimization, indexing, storage layouts, scheduling, log-structured merge trees, sorting, compression, and sketches, among many other data management tasks. Arguably, the ideas behind these techniques are similar: machine learning is used to model the data and/or workload in order to derive a more efficient algorithm or data structure. Ultimately, these techniques will allow us to build “instance-optimized” systems: systems that self-adjust to a given workload and data distribution to provide unprecedented performance and avoid the need for tuning by an administrator.

In this talk, I will first provide an overview of the opportunities and limitations of current ML-enhanced algorithms and data structures, present initial results of SageDB, a first instance-optimized system we are building as part of DSAIL@CSAIL at MIT, and finally outline remaining challenges and future directions.

Speaker Biography

I am an Associate Professor of Electrical Engineering and Computer Science in MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), founding co-director of the Data Systems and AI Lab (DSAIL) at MIT, and co-founder of einblick analytics, inc.

My group aims to dramatically increase the efficiency of data-intensive systems and democratize data science by enabling a broader range of users to unfold the potential of (their) data through the development of a new generation of algorithms and systems. This entails exploring how we can build systems to better support the recent advances in machine learning (Systems for ML) and how we can leverage machine learning to improve systems (ML for Systems). For example, with our work on SageDB we started to explore how we can enhance or even replace core systems components using machine learning models, and early results suggest that we can improve on the state of the art by more than an order of magnitude in performance. On the other hand, with Northstar we are exploring new user interfaces and infrastructure to democratize data science by enabling visual, interactive, and assisted data exploration and model building. One particular focus of this work is to help all types of users analyze data and build models faster, but also to make data exploration and model building safer by automatically steering the user away from common pitfalls.

Our work has been featured several times by the media (TechCrunch, Science, O'Reilly, among others) and we are proud to have had significant impact on academia and industry. For example, Northstar is now being commercialized by einblick analytics, backed by venture capital, and our ML for Systems work is being extended by countless researchers around the world (for a slightly outdated overview see our SIGMOD 2019 tutorial on Learned Data Structures and Algorithms) and is even finding its way into some cloud products of leading internet companies.

I am fortunate to be working with an outstanding team of grad students, undergraduates, and post-docs, and with numerous collaborators from academia and industry, and I am grateful for the research funding we have been receiving from NSF, DARPA, the Air Force, Google, Microsoft, and Intel.

Homepage: https://people.csail.mit.edu/kraska/index.html

More information

In this second edition on the topic of "Responsible Use of Data", we take a multi-disciplinary view and explore further lessons learned from success stories and examples in which the irresponsible use of data can create and foster inequality and inequity, perpetuate bias and prejudice, or produce unlawful or unethical outcomes. Our aim is to discuss and draw certain guidelines to make the use of data a responsible practice.

Join us

To receive announcements of upcoming presentations and events organized by TAFF and get the Zoom link to join the presentations, join our mailing list.
