Haoran Qiu | Microsoft AzRS

Shengkun Cui

Latest

Power-aware Deep Learning Model Serving with µ-Serve
FLASH: Fast Model Adaptation in ML-Centric Cloud Platforms
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
QLM: Queue Management for Large Language Model Serving

© 2026 Haoran Qiu · Powered by the Academic theme for Hugo.
