I joined Microsoft Azure Research – Systems (AzRS), focusing on systems for AI efficiency:
➪ Reliable and efficient agentic workflow serving: Sherlock, Murakkab
➪ Fine-grained autoscaling for generative model serving at scale: OpScale
➪ Efficient multimodal model serving: ModServe (input) and StreamWise (generation)
➪ Long-context LLM serving at multi-million token scale: Medha
I completed a PhD in Computer Science at UIUC, advised by Prof. Ravishankar K. Iyer. My dissertation focuses on designing and integrating learning-based solutions seamlessly (i.e., with efficiency, robustness, and reliability) into production cloud systems:
➪ [ATC24, AIOps24, SoCC24] Energy-efficient, SLO-aware interactive LLM serving
➪ [ATC23, NeurIPS23, MLSys24] Robust, at-scale ML model deployment in cloud systems
➪ [SoCC22, NeurIPS22] Multi-tenant serverless computing resource management
➪ [OSDI20, DSN24] Sustainable, intelligent, and resilient microservices resource management
I was a recipient of the inaugural 2023 ML Systems Rising Stars award from MLCommons.
In the past, I have worked with IBM Research, Google SRG, Google DeepMind, and Microsoft Research. Before joining UIUC, I received my bachelor's degree in Computer Science from the University of Hong Kong, where I worked with Prof. Heming Cui in the HKU Systems Software Group.
You can find my CV here (Aug. 2024).
Ph.D. in Computer Science, 2024
University of Illinois, Urbana-Champaign
B.Eng in Computer Science, 2019
University of Hong Kong
Visiting Student, 2018
University of Wisconsin-Madison
Review Services:
Organizing Committee Leadership:
Community/Outreach Services: