Modular AWG-based Interconnection for Large-Scale Data Center...


Page 1: Modular AWG-based Interconnection for Large-Scale Data Center …bblab.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a... · [2] Sushant Jain et. al., “B4: Experience with

Modular AWG-based Interconnection

for Large-Scale Data Center Networks

Tong Ye, Tony T. Lee, Mao Ge, and Weisheng Hu

State Key Lab of Advanced Optical Communications and Networks

Shanghai Jiao Tong University

Page 2

Outline

Background

AWG-based Interconnection

Modular AWG-based Interconnection

Application to Data Center Networks

Conclusion

Page 3

Data Centers Play Important Roles

World-wide information service infrastructure

[1] http://datacenterfrontier.com/regional-data-center-clusters-power-amazons-cloud/

[2] Sushant Jain et al., “B4: Experience with a Globally-deployed Software Defined WAN”, ACM SIGCOMM, Oct. 2013, pp. 3-14.

[Figures: Amazon Web Services' global infrastructure [1]; Google's world-wide data center map [2]]

Page 4

Footprint of Data Centers (DCs)

A mega DC requires a large number of long cables with very high capacity

[Figure: small and medium DCs, large-scale DCs, and mega DCs compared by area (m^2), number of server racks, and link rate (10G, 100G, 1T bits/s)]

OIDA/CIAN Data center Workshop, “Quantitative metrics for data centers,” in Proc. OFC, 2012.

Page 5

Cabling Problem

Cable maintenance is extremely difficult when

network connections change

line failures occur

[1] N. Farrington, E. Rubow, and A. Vahdat, “Data center switch architecture in the age of merchant silicon,” in Proc. IEEE HOTI, Aug. 2009.

[2] www.hpl.hp.com/techreports/2015/HPL-2015-8.html

[3] J. Mudigonda, P. Yalagandula, and J. C. Mogul, “Taming the flying cable monster: A topology design and optimization framework for data-center networks,” in Proc. ATC, Jun. 2011.

Eventually, cables become a terrible monster…

Page 6

Solution: Wireless Links

Pros: reduce number of cables

Cons:

Low bandwidth (~Gb/s)

Serious radio interference

[1] K. Ramachandran, R. Kokku, R. Mahindra, and S. Rangarajan, “60 GHz data-center networking: Wireless => worry less?” Technical Report, NEC, 2008.

[2] Y. Cui, H. Wang, X. Cheng, and B. Chen, “Wireless data center networking,” IEEE Wireless Commun. Mag., vol. 18, no. 6, pp. 46–53, Dec. 2011.

[3] N. Hamedazimi et al., “Firefly: A reconfigurable wireless data center fabric using free-space optics,” in Proc. ACM SIGCOMM, Oct. 2014.

Page 7

Solution: Optimal Device Allocation

Idea: combine several switches to form a high-radix switch, but the method

is specific to butterfly networks (not universal)

reduces the number of cables only by half (not scalable)

J. Kim, W. J. Dally, and D. Abts, “Flattened butterfly: a cost-effective topology for high-radix networks,” in Proc. ACM ISCA, Jun. 2007.

[Figure: switches 1 ~ 9 combined into one high-radix switch]

Page 8

Solution: Optical Method

Replace the links of each full mesh by an arrayed waveguide grating (AWG)

Pros: reduces cabling complexity + bandwidth guaranteed

Cons: AWG is not scalable if the network is very large

M. Csernai, F. Ciucu, R. P. Braun, and A. Gulyas, Towards 48-Fold Cabling Complexity Reduction in Large Flattened Butterfly Networks, in Proc. INFOCOM 2015.

[Figure: a flattened butterfly with nodes 0 ~ N^2-1 and N^3 links; each full mesh is drawn as nodes 0 ~ N-1 connected to mirror nodes 0 ~ N-1 by N^2 links, and with an AWG in between only N links remain on each side]

Page 9

The Goal of Our Work

Achieve modular AWG-based interconnection:

Substantially reduce cabling complexity, while preserving

function of original DC networks

Scalable even when size of DC networks is very large

Can be applied to different DC networks

Page 10

Outline

Background

AWG-based Interconnection

Modular AWG-based Interconnection

Application to Data Center Networks

Conclusion

Page 11

Topology of Existing Networks

Different networks share a similar subnetwork

[Figure: multi-root network, fat-tree, and flattened butterfly topologies, with labels core, aggregate, pod, core node, and mirror node]

[1] M. F. Bari et al., “Data center network virtualization: a survey,” IEEE Commun. Surveys Tuts., vol. 15, no. 2, pp. 909–928, May 2013.

[2] M. Al-Fares, A. Loukissas, and A. Vahdat, “A scalable, commodity data center network architecture,” in Proc. ACM SIGCOMM, Aug. 2008.

[3] Z. Zhu, S. Zhong, L. Chen, and K. Chen, “Fully programmable and scalable optical switching fabric for petabyte data center”, Opt. Express, vol. 23, no. 3, pp. 3563-3580, Feb. 2015.

Page 12

Banyan-Type Subnetwork: $\mathcal{N}_A$

Two disjoint node sets, of sizes $N_1$ and $N_2$

Exactly one fiber link from each node in one set to each node in the other set

$N_1 N_2$ links in total

Page 13

$N_1 \times N_2$ AWG

Passive => consumes no power

Provides $N_1 N_2$ links between inputs and outputs

Wavelength ($\lambda$) set $\Lambda = \{\lambda_0, \lambda_1, \lambda_2, \lambda_3\}$

[Figure: a 3 × 4 AWG. Inputs 0 ~ 2 each carry λ0, λ1, λ2, λ3. Output 0 receives λ0, λ1, λ2; output 1 receives λ1, λ2, λ3; output 2 receives λ2, λ3, λ0; output 3 receives λ3, λ0, λ1]

Page 14

Non-blocking Routing Property

Output# ($j$) is determined by input# ($i$) & $\lambda$# ($k$):

$j = [k - i]_{|\Lambda|} \triangleq (k - i) \bmod |\Lambda|$

        OUT 0   OUT 1   OUT 2   OUT 3
IN 0    λ0      λ1      λ2      λ3
IN 1    λ1      λ2      λ3      λ0
IN 2    λ2      λ3      λ0      λ1

Cyclic Latin Square if $N_1 = N_2$

[Figure: the 3 × 4 AWG with λ0, λ1, λ2, λ3 on every input]
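The routing rule on this page can be checked with a few lines of Python. This is only a sketch: the names `awg_output` and `wavelength_table` are illustrative and do not come from the talk, and taking $|\Lambda| = \max(N_1, N_2)$ is an assumption that matches the 3 × 4 example.

```python
def awg_output(i: int, k: int, num_wavelengths: int) -> int:
    """Output port reached from input i on wavelength index k: j = (k - i) mod |Λ|."""
    return (k - i) % num_wavelengths


def wavelength_table(n_in: int, n_out: int) -> list[list[int]]:
    """table[i][j] = wavelength index k that connects input i to output j."""
    # Assumption: |Λ| = max(N1, N2), as in the 3 x 4 example with four wavelengths.
    num_wavelengths = max(n_in, n_out)
    return [[(i + j) % num_wavelengths for j in range(n_out)] for i in range(n_in)]


table = wavelength_table(3, 4)
for i, row in enumerate(table):
    print(f"IN {i}:", " ".join(f"λ{k}" for k in row))
    # Each table entry agrees with the routing rule solved for j.
    assert all(awg_output(i, k, 4) == j for j, k in enumerate(row))
```

The printed rows reproduce the IN/OUT table of the 3 × 4 AWG.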

Page 15

$N \times 1$ AWG: $\lambda$ Mux/DeMux

[Figure: an N × 1 Mux combining inputs 0 ~ N-1, and a 1 × N DeMux splitting onto outputs 0 ~ N-1]

Page 16

AWG-based Interconnection

Replacing fiber links in $\mathcal{N}_A$ by an AWG yields a network $\mathcal{N}_B$

$\mathcal{N}_A$: $N_1 N_2$ fiber links

$\mathcal{N}_B$: $N_1 + N_2$ fiber links

M. Csernai, F. Ciucu, R. P. Braun, and A. Gulyas, Towards 48-Fold Cabling Complexity Reduction in Large Flattened Butterfly Networks, in Proc. INFOCOM 2015.

Page 17

Limitations of 𝑁 × 𝑁 AWG

If 𝑁 is very large:

In-band crosstalk is prominent

(bad physical-layer performance)

Synthesis is very difficult

A large number of 𝜆s are required

R. Gaudino, G. G. Castillo, F. Neri, and J. Finochietto, in Proc. IEEE ICC, May 2008, pp. 5331–5337.

[Figure: an N × N AWG with λ0 ~ λN-1 on ports 0 ~ N-1, and a 4 × 4 example in which signals at the same λ interfere with each other]

Page 18

Modular AWG-based Interconnection

Phase 1: AWG decomposition

Suppress in-band crosstalk

Cut down synthesis difficulty

Phase 2: Wavelength reuse

Reduce number of required wavelengths

Page 19

Outline

Background

AWG-based Interconnection

Modular AWG-based Interconnection

Phase 1: AWG Decomposition

Phase 2: Wavelength Reuse

Application to Data Center Networks

Conclusion

Page 20

AWG Decomposition

[Figure: an N × N AWG (ports 0 ~ N-1) replaced by an N × N network built from r × r AWGs]

Tong Ye, T. T. Lee, and Weisheng Hu, IEEE Journal of Lightwave Technology, vol. 30, no. 13, pp. 2125-2133, Jul. 2012.

$N \times N$ AWG ⇒ $N \times N$ network of $r \times r$ AWGs, $r < N$:

same routing property

<=> output# is uniquely determined by input# and $\lambda$#

Page 21

Example of Decomposition

[Figure: a 6 × 6 AWG with ports P0 ~ P5 and Q0 ~ Q5 on λ0 ~ λ5, its 6 × 6 cyclic Latin square (rows P0 ~ P5, columns Q0 ~ Q5), and a 6 × 6 Latin square $\mathbf{A}$ with the same rows and columns]

Page 22

Example: Observation 1

$\mathbf{A}$ consists of $2^2$ $3 \times 3$ cyclic Latin squares

Each square is associated with a $3 \times 3$ AWG

[Figure: $\mathbf{A}$ (rows P0 ~ P5, columns Q0 ~ Q5) split into blocks A00, A01, A10, A11, each mapped to one 3 × 3 AWG between P0 ~ P5 and Q0 ~ Q5]

Page 23

Example: Observation 2

$\mathbf{M}_0$ is defined on $\lambda$-set $\{\lambda_0, \lambda_1, \lambda_2\}$

$\mathbf{M}_1$ is defined on $\lambda$-set $\{\lambda_3, \lambda_4, \lambda_5\}$

[Figure: $\mathbf{A}$ (rows P0 ~ P5, columns Q0 ~ Q5) with blocks A00 = A11 = M0 and A01 = A10 = M1; $\mathbf{A}$ is recursively cyclic]

Page 24

Matrix-based AWG Decomposition

Initialization:

Define $n$ $r \times r$ cyclic Latin squares ($nr = N$): $\mathbf{M}_0, \mathbf{M}_1, \cdots, \mathbf{M}_{n-1}$

Specify an $N \times N$ Latin square $\mathbf{A}$ with $n^2$ $r \times r$ blocks: $\mathbf{A}_{ab} = \mathbf{M}_{[a+b]_n}$

Construct an AWG network according to 𝐀:

S1. Central stage construction

S2. Upper-layer stage construction

S3. Lower-layer stage construction
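The initialization step can be sketched in Python as follows. The helper names are hypothetical, and the entry formula for $\mathbf{M}_m$ (wavelength index $mr + (\alpha + \beta) \bmod r$) is an assumption chosen to agree with the cyclic table of the 3 × 4 AWG and with the λ-sets used in the 6 × 6 example; the talk itself does not spell it out.

```python
def cyclic_latin_square(m: int, r: int) -> list[list[int]]:
    """M_m: r x r cyclic Latin square on wavelength indices m*r .. m*r + r - 1 (assumed form)."""
    return [[m * r + (alpha + beta) % r for beta in range(r)] for alpha in range(r)]


def build_A(n: int, r: int) -> list[list[int]]:
    """N x N matrix (N = n*r) whose (a, b) block is M_{(a+b) mod n}."""
    N = n * r
    blocks = [cyclic_latin_square(m, r) for m in range(n)]
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        a, alpha = divmod(i, r)      # block-row a, row alpha inside the block
        for j in range(N):
            b, beta = divmod(j, r)   # block-column b, column beta inside the block
            A[i][j] = blocks[(a + b) % n][alpha][beta]
    return A


def is_latin_square(A: list[list[int]]) -> bool:
    """True if every row and every column contains each symbol 0 .. N-1 exactly once."""
    N = len(A)
    symbols = set(range(N))
    rows_ok = all(set(row) == symbols for row in A)
    cols_ok = all({A[i][j] for i in range(N)} == symbols for j in range(N))
    return rows_ok and cols_ok


A = build_A(n=2, r=3)
for row in A:
    print(" ".join(f"λ{k}" for k in row))
print("Latin square:", is_latin_square(A))
```

For $n = 2$, $r = 3$ this prints a 6 × 6 matrix whose diagonal blocks use λ0 ~ λ2 and whose off-diagonal blocks use λ3 ~ λ5.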

Page 25

Initialization: $\mathbf{M}_0, \mathbf{M}_1, \cdots, \mathbf{M}_{n-1}$

Specify $n$ $r \times r$ cyclic Latin squares, where

$N = 6$, $n = 2$, $r = 3$

$\mathbf{M}_0$ is defined on $\lambda$-set $\Lambda_0 = \{\lambda_0, \lambda_1, \lambda_2\}$

$\mathbf{M}_1$ is defined on $\lambda$-set $\Lambda_1 = \{\lambda_3, \lambda_4, \lambda_5\}$

[Figure: $\mathbf{M}_0$ on λ0, λ1, λ2 and $\mathbf{M}_1$ on λ3, λ4, λ5]

Page 26

Initialization: Specify $\mathbf{A}$

$\mathbf{A}_{ab} = \mathbf{M}_{[a+b]_n}$

[Figure: for n = 2, $\mathbf{A}$ (rows P0 ~ P5, columns Q0 ~ Q5) has blocks A00 = M0, A01 = M1, A10 = M1, A11 = M0]

Page 27

S1. Central Stage Construction

Lay out $n^2$ $r \times r$ AWGs from left to right

Label the $k$th AWG by $A(a, b)$ and associate it with $\mathbf{A}_{ab}$, where

$a = \lfloor k/n \rfloor$, $b = [k]_n$

[Figure: for r = 3, n = 2: DeMuxes D(0,0) ~ D(1,2) at P0 ~ P5, central AWGs A(0,0), A(0,1), A(1,0), A(1,1), and Muxes M(0,0) ~ M(1,2) at Q0 ~ Q5, shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]

Page 28

S1. Central Stage Construction

Example ($r = 3$, $n = 2$): for $k = 2$, $a = \lfloor 2/2 \rfloor = 1$ and $b = [2]_2 = 0$, so the AWG is $A(1, 0)$
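The labeling rule of step S1 is a quotient and remainder, which the following short sketch illustrates. The function name `awg_label` is made up for this example.

```python
def awg_label(k: int, n: int) -> tuple[int, int]:
    """Label (a, b) of the k-th central AWG: a = floor(k / n), b = k mod n."""
    return divmod(k, n)


n = 2
labels = [awg_label(k, n) for k in range(n * n)]
for k, (a, b) in enumerate(labels):
    print(f"k = {k}: A({a},{b})")
```

With $n = 2$ the four AWGs are labeled A(0,0), A(0,1), A(1,0), A(1,1) from left to right, in line with the $k = 2$ example.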

Page 29

S2. Upper-layer Stage Construction

Lay out $N$ DeMuxes at the upper layer

If the $i$th row of $\mathbf{A}$ is the $\alpha$th row of $\mathbf{A}_{ab}$ ($b = 0 \sim n-1$):

output $b$ of DeMux $i$ ↔ upper port $\alpha$ of $A(a, b)$

[Figure: the r = 3, n = 2 network with DeMuxes D(0,0) ~ D(1,2), central AWGs A(0,0) ~ A(1,1), and Muxes M(0,0) ~ M(1,2), shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]


Page 33

S3. Lower-layer Stage Construction

Lay out $N$ Muxes at the lower layer

If the $j$th column of $\mathbf{A}$ is the $\beta$th column of $\mathbf{A}_{ab}$ ($a = 0 \sim n-1$):

input $a$ of Mux $j$ ↔ lower port $\beta$ of $A(a, b)$

[Figure: the r = 3, n = 2 network with DeMuxes D(0,0) ~ D(1,2), central AWGs A(0,0) ~ A(1,1), and Muxes M(0,0) ~ M(1,2), shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]
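The wiring rules of steps S2 and S3 can be written out as two small lookup tables. This is a sketch under the assumption that the blocks of $\mathbf{A}$ are contiguous, so row $i$ lies in block-row $a = \lfloor i/r \rfloor$ at offset $\alpha = i \bmod r$ (and likewise for columns); the function names are illustrative only.

```python
def upper_wiring(n: int, r: int) -> dict:
    """S2: (DeMux i, output b) -> (a, b, alpha), i.e. upper port alpha of AWG A(a, b)."""
    wiring = {}
    for i in range(n * r):
        a, alpha = divmod(i, r)   # row i of A is row alpha of blocks A_a0 .. A_a(n-1)
        for b in range(n):
            wiring[(i, b)] = (a, b, alpha)
    return wiring


def lower_wiring(n: int, r: int) -> dict:
    """S3: (Mux j, input a) -> (a, b, beta), i.e. lower port beta of AWG A(a, b)."""
    wiring = {}
    for j in range(n * r):
        b, beta = divmod(j, r)    # column j of A is column beta of blocks A_0b .. A_(n-1)b
        for a in range(n):
            wiring[(j, a)] = (a, b, beta)
    return wiring


n, r = 2, 3
up, low = upper_wiring(n, r), lower_wiring(n, r)
for (i, b), (a, _, alpha) in sorted(up.items()):
    print(f"DeMux {i} output {b} <-> upper port {alpha} of A({a},{b})")
for (j, a), (_, b, beta) in sorted(low.items()):
    print(f"Mux {j} input {a} <-> lower port {beta} of A({a},{b})")
```

Every upper port and every lower port of the $n^2$ central AWGs is used exactly once, which is what the one-to-one checks in a test of this sketch would confirm.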


Page 37

Network After AWG Decomposition

$N \times N$ $\mathcal{N}_B$ => $N \times N$ $\mathcal{N}_C(n, r)$, where $nr = N$

[Figure: $\mathcal{N}_B$ beside $\mathcal{N}_C(2, 3)$, which connects nodes R0 ~ R5 through central AWGs A(0,0), A(0,1), A(1,0), A(1,1)]
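To see why the decomposed network keeps the routing property, one can trace a wavelength through the three stages. The sketch below assumes contiguous blocks and the cyclic form of each $\mathbf{M}_m$ (wavelength index $mr + (\alpha + \beta) \bmod r$); `route_in_NC` is a hypothetical helper, not a function from the paper.

```python
def route_in_NC(i: int, k: int, n: int, r: int) -> int:
    """Follow wavelength index k from input i through DeMux, central AWG, and Mux."""
    a, alpha = divmod(i, r)   # DeMux i feeds block-row a, upper port alpha
    m, x = divmod(k, r)       # wavelength k is member x of wavelength set m
    b = (m - a) % n           # DeMux output b: the block with M_{(a+b) mod n} = M_m
    beta = (x - alpha) % r    # cyclic routing inside the r x r AWG A(a, b)
    return b * r + beta       # lower port beta of A(a, b) leads to Mux j = b*r + beta


n, r = 2, 3
N = n * r
used = {}                     # (output j, wavelength k) -> input i
for i in range(N):
    outputs = [route_in_NC(i, k, n, r) for k in range(N)]
    # Each input reaches every output, one wavelength per output.
    assert sorted(outputs) == list(range(N))
    for k, j in enumerate(outputs):
        # No two inputs reach the same output on the same wavelength.
        assert (j, k) not in used
        used[(j, k)] = i
print("output# is uniquely determined by input# and λ# for all", N * N, "connections")
```

The check passes for $n = 2$, $r = 3$; the same loop can be rerun with other values of $n$ and $r$.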

Page 38

Outline

Background

Preliminaries

Modular AWG-based Interconnection

Phase 1: AWG Decomposition

Phase 2: Wavelength Reuse

Application to Data Center Networks

Conclusion

Page 39

Mux/DeMux Replacement

[Figure: network with nodes R0 ~ R5 and central AWGs A(0,0), A(0,1), A(1,0), A(1,1); links are labeled with the λ-sets {λ0, λ1, λ2} and {λ3, λ4, λ5}; shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]

Page 40

Network After Mux Replacement

Connections passing through different AWGs are link-disjoint => reuse the same $\lambda$-set

[Figure: nodes R0 ~ R5 connected through central AWGs A(0,0), A(0,1), A(1,0), A(1,1); shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]

Page 41

Network After Wavelength Reuse

$\mathcal{N}_D(n, r)$ where $n = 2$ and $r = 3$

[Figure: nodes R0 ~ R5 connected through central AWGs A(0,0), A(0,1), A(1,0), A(1,1); shown beside $\mathbf{A}$ with blocks A00, A01, A10, A11]

Page 42

$\mathcal{N}_D(n, r)$ in General

$(u_i, v_j)$ => path, $\lambda_{x'}$

[Figure: node $u_i$ reaches node $v_j$ through central AWG A(a,b), with ports 0 ~ r-1 on each side; labels range over D(a,0,0) ~ D(a,r-1,n-1) and M(b,0,0) ~ M(b,r-1,n-1)]
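The mapping from a node pair to a path and a reused wavelength can be sketched as below. Two assumptions are made loudly here: each node keeps one fiber to every central AWG in its block-row or block-column, and the reused wavelength index is $x' = (\alpha + \beta) \bmod r$, which follows from the cyclic form assumed for each block. The name `connection_in_ND` is illustrative.

```python
def connection_in_ND(i: int, j: int, n: int, r: int) -> tuple[tuple[int, int], int]:
    """Return (AWG label (a, b), reused wavelength index x') for connection u_i -> v_j."""
    a, alpha = divmod(i, r)
    b, beta = divmod(j, r)
    return (a, b), (alpha + beta) % r   # assumed cyclic rule inside A(a, b)


n, r = 2, 3
N = n * r
upper_links = {}   # (node u_i, AWG label) -> wavelengths carried on that fiber
lower_links = {}   # (node v_j, AWG label) -> wavelengths carried on that fiber
for i in range(N):
    for j in range(N):
        awg, x = connection_in_ND(i, j, n, r)
        for links, key in ((upper_links, (i, awg)), (lower_links, (j, awg))):
            wavelengths = links.setdefault(key, set())
            assert x not in wavelengths   # no wavelength clash on any fiber
            wavelengths.add(x)

print(len(upper_links), "upper fibers,", len(lower_links), "lower fibers,",
      "each carrying", r, "wavelengths")
```

Only $r$ distinct wavelengths appear, and no fiber carries the same wavelength twice, which is the link-disjointness argument behind wavelength reuse.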

Page 43

Comparison

                                  $\mathcal{N}_A$    $\mathcal{N}_B$    $\mathcal{N}_D$
Cabling complexity                $O(N^2)$           $O(N)$             $O(N^2/r)$
Number of required wavelengths    $O(1)$             $O(N)$             $O(r)$

[Figure: $\mathcal{N}_D$ with nodes R0 ~ R5 and central AWGs A(0,0), A(0,1), A(1,0), A(1,1)]
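The orders of growth in the comparison can be turned into concrete counts. The sketch below takes $N_1 = N_2 = N$ and counts $2Nn = 2N^2/r$ fibers for $\mathcal{N}_D(n, r)$, assuming one fiber from each node to each of the $n$ AWGs it attaches to; the function name and dictionary layout are made up for illustration.

```python
def compare(N: int, r: int) -> dict:
    """Fiber links and wavelengths for N_A, N_B, and N_D(n, r) with n = N / r."""
    if N % r != 0:
        raise ValueError("r must divide N")
    n = N // r
    return {
        "N_A": {"links": N * N, "wavelengths": 1},      # one fiber per node pair
        "N_B": {"links": 2 * N, "wavelengths": N},      # N1 + N2 fibers, single N x N AWG
        "N_D": {"links": 2 * N * n, "wavelengths": r},  # assumed: n fibers per node, per side
    }


for name, cost in compare(N=128, r=32).items():
    print(f"{name}: {cost['links']:>6} links, {cost['wavelengths']:>3} wavelengths")
```

Larger $r$ lowers the cabling of $\mathcal{N}_D$ but raises the wavelength count and the AWG size, which is the trade-off the table summarizes.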

Page 44

Outline

Background

Preliminaries

Modular AWG-based Interconnection

AWG Decomposition

Wavelength Reuse

Application to Data Center Networks

Conclusion

Page 45

2-D FB Network

A 2-D 16,384-node DC network

[Figure: a 128 × 128 array of nodes 0 ~ 16383 with 2,080,768 links; one row, nodes 128 ~ 255, is drawn as nodes 128 ~ 255 connected to mirror nodes 128 ~ 255 by 16,256 links]

Z. Zhu, S. Zhong, L. Chen, and K. Chen, “Fully programmable and scalable optical switching fabric for petabyte data center”, Opt. Express, vol. 23, no. 3, pp. 3563-3580, Feb. 2015.

Page 46

AWG-based Interconnection Scheme

$\mathcal{N}_D(4, 32)$: $32 \times 32$ AWGs in the central stage

[Figure: nodes 128 ~ 255 and mirror nodes 128 ~ 255, in groups of 32 (128 ~ 159, 160 ~ 191, 192 ~ 223, 224 ~ 255), connected through central AWGs A(0,0) ~ A(3,3), each on λ0 ~ λ31; 512 links on each side replace the original 16,256 links]
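The link counts quoted for the 16,384-node example can be reproduced with simple arithmetic. In this sketch the reading of 2,080,768 as 128 meshes of 16,256 links each is an assumption inferred from the numbers, and the variable names are illustrative.

```python
side = 128                        # nodes per row of the 2-D flattened butterfly
n, r = 4, 32                      # N_D(4, 32), with n * r = 128

mesh_links = side * (side - 1)    # node-to-mirror-node links in one full mesh, no self-links
total_links = mesh_links * side   # assumption: the network total is 128 such meshes
awg_links_per_side = side * n     # one fiber from each node to each of its n AWGs
central_awgs = n * n              # number of 32 x 32 AWGs in the central stage

print(f"links per full mesh:       {mesh_links:,}")
print(f"links in the whole fabric: {total_links:,}")
print(f"links per side with AWGs:  {awg_links_per_side}")
print(f"central 32 x 32 AWGs:      {central_awgs}")
```

The first three values match the 16,256, 2,080,768, and 512 shown in the figures.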

Page 47

Physical Layer Performance

Power penalty (~0.7 dB) is very small

[Figure: experimental setup. CW lasers (1552.52 nm) with MZMs driven by 10G PRBS feed a 1 × 32 AWG (upper layer), a 32 × 32 AWG, and a 32 × 1 AWG (lower layer), followed by an APD and a BER tester (BER-T); one connection is under test]

[Plot: -log(BER) versus received optical power (dBm, -36 to -27) for a direct optical link and for a connection through AWGs; the two curves are about 0.7 dB apart]

Page 48

If Network is very Large …

Each central AWG is replaced by an integrated AWG-network module

Page 49

Conclusions

A modular AWG-based interconnection network is proposed for DC networks

Substantially reduces cabling complexity

Employs only small-size AWG modules, to avoid

serious in-band crosstalk

difficult synthesis technology

Reuses the same wavelength set, so that the number of required wavelengths is small

Feasibility is confirmed by physical-layer performance evaluations

Tong Ye, Tony T. Lee, Mao Ge, and Weisheng Hu, Modular AWG-based Interconnection for Large-Scale Data Center Networks, IEEE Trans. on Cloud Computing, Accepted.

Page 50

Q & A

Thank you for your attention!