Resource Management

PARM: Adaptive Resource Allocation for Datacenter Power Capping

Energy efficiency is a pressing concern in today's cloud datacenters. Various power management strategies, such as oversubscription, power capping, and dynamic voltage and frequency scaling, have been proposed and are in use by datacenter operators to better …

AWARE: Automate Workload Autoscaling with Reinforcement Learning in Production Cloud Systems

Workload autoscaling is widely used in public and private cloud systems to maintain stable service performance and save resources. However, it remains challenging to set the optimal resource limits and dynamically scale each workload at runtime. …

SIMPPO: A Scalable and Incremental Online Learning Framework for Serverless Resource Management

Serverless Function-as-a-Service (FaaS) offers improved programmability for customers, yet it is not server-"less" and comes at the cost of more complex infrastructure management (e.g., resource provisioning and scheduling) for cloud providers. To …

A Mean-Field Game Approach to Cloud Resource Management with Function Approximation

Reinforcement learning (RL) has gained increasing popularity for resource management in cloud services such as serverless computing. As self-interested users compete for shared resources in a cluster, the multi-tenant nature of serverless platforms …

Reinforcement Learning for Resource Management in Multi-tenant Serverless Platforms

Serverless Function-as-a-Service (FaaS) is an emerging cloud computing paradigm that frees application developers from infrastructure management tasks such as resource provisioning and scaling. To reduce the tail latency of functions and improve …