Reliability

AWARE: Automate Workload Autoscaling with Reinforcement Learning in Production Cloud Systems

Workload autoscaling is widely used in public and private cloud systems to maintain stable service performance and save resources. However, it remains challenging to set the optimal resource limits and dynamically scale each workload at runtime. …

Evaluating Hardware Memory Disaggregation under Delay and Contention

Hardware memory disaggregation is an emerging trend in datacenters that provides access to remote memory as part of a shared pool or unused memory on machines across the network. Memory disaggregation aims to improve memory utilization and scale …