Haoran Qiu | Microsoft AzRS
Chen Wang
Latest
Power-aware Deep Learning Model Serving with µ-Serve
FLASH: Fast Model Adaptation in ML-Centric Cloud Platforms
When Green Computing Meets Performance and Resilience SLOs
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
On the Promise and Challenges of Foundation Models for Learning-based Cloud Systems Management
PARM: Adaptive Resource Allocation for Datacenter Power Capping
AWARE: Automate Workload Autoscaling with Reinforcement Learning in Production Cloud Systems
Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity
SIMPPO: A Scalable and Incremental Online Learning Framework for Serverless Resource Management
A Mean-Field Game Approach to Cloud Resource Management with Function Approximation
Reinforcement Learning for Resource Management in Multi-tenant Serverless Platforms
Is Function-as-a-Service a Good Fit for Latency-Critical Services?