This paper makes a first attempt to formally quantify datacenter operational costs and proposes a new class of topologies called FatClique.
Most recent datacenter topology designs have focused on performance properties such as latency and throughput. In this paper, we explore a new dimension, life cycle management, which attempts to capture the operational costs of topologies. Specifically, we consider the costs associated with deploying and expanding topologies, and we explore how the structural properties of two topology families (Clos and expander graphs, the latter exemplified by Xpander) affect these costs. Using the insights from our study, we also develop a new topology that combines the wiring simplicity of Clos with the expandability of expander graphs.
This paper won the Best Paper award at NSDI 2019 and has been featured in The Morning Paper: https://blog.acolyer.org/2019/03/20/understanding-lifecycle-management-complexity-of-datacenter-topologies/