I joined Microsoft Azure Research – Systems (AzRS), focusing on systems for AI efficiency.
- ➪ Reliable and efficient agentic workflow serving: Sherlock, Murakkab
- ➪ Fine-grained provisioning and scaling for generative model serving at scale: OpScale
- ➪ Efficient multimodal model serving: ModServe (input) and StreamWise (generation)
- ➪ Long-context LLM serving at multi-million-token scale: Medha
- ➪ [ATC24, AIOps24, SoCC24] Energy-efficient, SLO-aware interactive LLM serving
- ➪ [ATC23, NeurIPS23, MLSys24] Robust, at-scale ML model deployment in cloud systems
- ➪ [SoCC22, NeurIPS22] Multi-tenant serverless computing resource management
- ➪ [OSDI20, DSN24] Sustainable, intelligent, and resilient microservices resource management
I was named one of the inaugural 2023 ML Systems Rising Stars by MLCommons.
In the past, I have worked with IBM Research, Google SRG, Google DeepMind, and Microsoft Research. Before joining UIUC, I received my bachelor's degree in Computer Science from the University of Hong Kong, where I worked with Prof. Heming Cui in the HKU Systems Software Group.
You can find my CV here (Aug. 2024).
Interests
- ML/LLM/Agent Systems
- ML for Systems / AIOps
- Serverless Computing
- Distributed Systems
Education
Ph.D. in Computer Science, 2024
University of Illinois, Urbana-Champaign
B.Eng. in Computer Science, 2019
University of Hong Kong
Visiting Student, 2018
University of Wisconsin-Madison