Haoran Qiu | Microsoft AzRS
Rodrigo Fonseca
Latest
ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving
Towards Efficient Large Multimodal Model Serving
TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms