Data center networks are designed with a large number of criteria in mind: performance, cost, and resilience being the obvious ones. But beyond these, there are several criteria, currently largely unstudied in the networking community, that determine the feasibility of a network topology.
For example, the wiring and deployment complexity of a topology determines its time-to-readiness: a topology that is hard or expensive to deploy is infeasible. A second important factor is the cost and complexity of modifying the topology over its rather long lifetime. Modifications that contribute to operational costs include expansions to an existing topology (e.g., adding more servers or more bandwidth) and hardware replacements (e.g., moving from 10G to 20G servers, or from 32-port switch chassis to 64-port switch chassis). Beyond these, there are other factors such as the complexity of configuration, routing and traffic engineering, control architecture, and debuggability.
Some of these dimensions are hard to measure systematically (e.g., the complexity of a control architecture); others lack established metrics altogether (e.g., quantifying the wiring complexity of a topology, or the ease of traffic engineering). In our work, we aim to demystify some of these dimensions and propose ways to design networks that cater to such criteria.
Our work on Understanding Lifecycle Management Complexity of Datacenter Topologies won the Best Paper Award at NSDI 2019.