TY - JOUR
T1 - MatrixDCN
T2 - A high performance network architecture for large-scale cloud data centers
AU - Sun, Yantao
AU - Chen, Min
AU - Peng, Limei
AU - Hassan, Mohammad Mehedi
AU - Alelaiwi, Abdulhameed
N1 - Publisher Copyright: © Copyright 2015 John Wiley & Sons, Ltd.
PY - 2016/6/1
Y1 - 2016/6/1
AB - With the widespread deployment of cloud services, data center networks are evolving toward large-scale, multi-path networks. Conventional switching-oriented data center networks struggle to provide the scalability and flexibility needed to support the increasing bandwidth requirements of cloud services. To solve this problem, this paper proposes MatrixDCN, a simple and scalable architecture. MatrixDCN is an approximately non-blocking network in which switches and servers are arranged in rows and columns that form a matrix structure. A MatrixDCN network can accommodate up to hundreds of thousands of servers without bandwidth bottlenecks. Furthermore, the physical topology of a MatrixDCN network can be designed to be consistent with its logical topology, which reduces the complexity of managing and maintaining a data center. An efficient routing algorithm, fault-avoidance routing (FAR), is designed for MatrixDCN to fully leverage the regularity of the topology. FAR builds two routing tables for each router: a basic routing table (BRT), built from the local topology, and a novel negative routing table (NRT), built incrementally from learned partial network failures. This approach avoids the problem of network convergence and shortens the time needed to calculate routing tables. By introducing NRTs at routers, FAR also greatly reduces the size of routing tables. Theoretical analysis and simulations show that MatrixDCN has advantages in topology scalability, network throughput, and the performance of FAR.
KW - data center network
KW - network architecture
KW - routing method
UR - http://www.scopus.com/inward/record.url?scp=84923253906&partnerID=8YFLogxK
U2 - 10.1002/wcm.2579
DO - 10.1002/wcm.2579
M3 - Article
AN - SCOPUS:84923253906
SN - 1530-8669
VL - 16
SP - 942
EP - 959
JO - Wireless Communications and Mobile Computing
JF - Wireless Communications and Mobile Computing
IS - 8
ER -