1 Area Exam: Theory and Practice of Reconfigurable Optical ...configurable optical networks. While...

20
1 Area Exam: Theory and Practice of Reconfigurable Optical Networks Matthew Nance Hall Department of Computer and Information Science University of Oregon Fall 2020 CONTENTS I Introduction 2 II Network Architectures 3 II-A IP-over-Optical Transport Network ...................................... 3 II-B Data Center Architecture ........................................... 3 III Theoretical Models 4 III-A Data centers .................................................. 4 III-B Cost ...................................................... 4 III-C Blocking Probability .............................................. 5 III-D Service Velocity ................................................ 5 III-E Competition .................................................. 5 III-F Open Challenges ................................................ 5 IV Practical Implementations 5 IV-A Enabling Hardware Technologies ....................................... 5 IV-A1 Wavelength Selective Switching (WSS) ............................. 5 IV-A2 ROADMs ............................................. 6 IV-A3 Bandwidth-variable Transponders ................................. 7 IV-A4 Silicon Photonics ......................................... 7 IV-A5 Open Challenges .......................................... 7 IV-B Optically Reconfigurable Data Centers .................................... 8 IV-B1 DCN-specific Technologies .................................... 8 IV-B2 Algorithms ............................................. 9 IV-B3 Systems Implementations ..................................... 10 IV-B4 Open Challenges .......................................... 10 IV-C Reconfigurable Optical WAN ......................................... 11 IV-C1 WAN-Specific Challenges and Solutions ............................. 11 IV-C2 Algorithms ............................................. 12 IV-C3 Systems Implementations ..................................... 14 IV-C4 Open Challenges .......................................... 15 V Future Work and Applications 15 V-A Vertically Programmable Network Simulator ................................ 15 V-B Network Measurement ............................................ 15 V-C Traffic Engineering (TE) ........................................... 15 V-D Cybersecurity ................................................. 16 VI Conclusion and General Open Challenges 16 References 16

Transcript of 1 Area Exam: Theory and Practice of Reconfigurable Optical ...configurable optical networks. While...

  • 1

    Area Exam: Theory and Practice ofReconfigurable Optical Networks

    Matthew Nance HallDepartment of Computer and Information Science

    University of OregonFall 2020

    CONTENTS

    I Introduction 2

    II Network Architectures 3II-A IP-over-Optical Transport Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3II-B Data Center Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    III Theoretical Models 4III-A Data centers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4III-B Cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4III-C Blocking Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5III-D Service Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5III-E Competition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5III-F Open Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    IV Practical Implementations 5IV-A Enabling Hardware Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    IV-A1 Wavelength Selective Switching (WSS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5IV-A2 ROADMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6IV-A3 Bandwidth-variable Transponders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7IV-A4 Silicon Photonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7IV-A5 Open Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    IV-B Optically Reconfigurable Data Centers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8IV-B1 DCN-specific Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8IV-B2 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9IV-B3 Systems Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10IV-B4 Open Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    IV-C Reconfigurable Optical WAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11IV-C1 WAN-Specific Challenges and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11IV-C2 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12IV-C3 Systems Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14IV-C4 Open Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    V Future Work and Applications 15V-A Vertically Programmable Network Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15V-B Network Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15V-C Traffic Engineering (TE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15V-D Cybersecurity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    VI Conclusion and General Open Challenges 16

    References 16

  • 2

    Abstract—Reconfigurable optical networks have emerged as apromising technology to serve the fast-growing traffic producedby the digital society. In this area exam we consider reconfig-urable optical networks and their interfaces to higher layers ofthe networking stack. To this end, we explore the challengesfor implementing a vertically programmable network. First, wesurvey modeling work which is essential for efficiently utilizingreconfigurable optical networks given limited resources. Then,we discuss practical implementations for reconfigurable opticalnetworks including hardware technologies and systems imple-mentations. Finally, we explore exciting applications for futurework in this field, including network simulation, measurement,traffic engineering, and cybersecurity.

    I. INTRODUCTION

    We are amidst an explosive growth in communicationtraffic driven by data-centric applications related to business,science, social networking, and entertainment. Further, therise of machine learning and artificial intelligence creates yetmore demand for data-hungry applications. This trend affectsboth data center and wide-area networks. Researchers haveresponded by innovating at different layers of the networkstack. For example, software-defined networking controllersthat run traffic engineering applications are replacing ad-hocrouting [1], [2]. Software-based systems are also replacinghardware appliances such as load balancers [3], [4]. Pro-grammable switches are replacing proprietary and vendor-specific switch APIs [5], [6]. Finally, even server NICs arebeing re-imagined as cloud network operators look to FPGAimplementations as an avenue for scaling bandwidth [7].Reconfigurable optical networks are the final frontier in thistrend, where programmability is embedding itself deeper intothe network stack.

    Reconfigurable optical technologies are an incredibly excit-ing innovation. They enable adaptation of both the topologyitself and of the network capacity on links. Such adaptationsmay be exploited by next-generation systems to improveperformance and efficiency (e.g., by making the networktopology demand aware). For example, recent technologiesbased on free-space optics or optical circuit switches supportvery fast topology adaptations in data centers. Meanwhile,technologies based on reconfigurable optical add-drop multi-plexers (ROADMs) can add or drop wavelengths carrying datachannels from a transport fiber without the need to convert thesignals to electronic signals and back. In both cases, operatorsneed not carry out the entire bandwidth assignment and opticalroute planning during the initial deployment.

    While research gains momentum in virtualizing the networkstack, we are coming closer to a vertically programmablenetwork. Ideally, in such an ecosystem, every aspect of con-nectivity in a network is programmable. A vertically pro-grammable network yields unparalleled flexibility for routing,topology adaptation, and bandwidth assignment. However, thevirtualization of the physical layer is uniquely challenging.With reconfigurable optical networks being a relatively newtechnology, the community is still discussing their conceptualfundamentals, benefits, and limitations. There is a massivedisconnect, or chasm, between the optical communications

    layer and higher layers of the network stack. This chasm is atthe core of challenges for vertically programmable networks.

    This area exam is an up-to-date survey on emerging re-configurable optical networks. While there are surveys withbroad overviews of reconfigurable optical technologies in thesesettings (e.g., software-defined optical networks [8], routingand spectrum allocation [9], wavelength switching hardwarearchitecture [10]), the functioning of such systems (e.g.,reconfigurable metropolitan networks [11] and data centernetworks [12]) requires a full-stack perspective on opticalnetworks. We address this requirement by presenting an end-to-end perspective on reconfigurable optical networks by (a)emphasizing the interdependence of optical technologies withalgorithms and systems and (b) identifying the open challengesand future work at the intersection of optics, theory, algo-rithms, and systems communities.

    This work is especially timely, as interest in dynamic opticallayer networking technologies is gaining attention from thenetworked systems community. Upon reviewing the last fiveyears of publications from five optical and systems networkjournals, we ran a clustering analysis to see how much overlapthere has been between the two fields. Table I shows thejournals, and their raw publication counts since 2015. Figure 1shows the clustering analysis results for the ten largest clusters.Connections between papers are first-order citations. Basedon the most massive cluster, 1, it appears there is a strongrelationship between the Journal of Lightwave Technology(JLT) and the Journal of Optical Communications and Net-working (JOCN) publications. Cluster 2 is mainly composed ofIEEE Transactions on Communications (TCOM) papers, witha few IEEE/ACM Transactions on Networking (TON) papers.Cluster 3 shows a strong relationship between TON and IEEETransactions on Network and Service Management (TNSM)papers. The three predominantly physical-layer journals (JLT,TCOM, and JOCN) appear together in cluster 4. Cluster 5shows a relation between physical-layer and systems journals,TCOM, and TON. Clusters 6 and beyond are mostly singletonclusters, comprising predominantly one journal. Our analysisunderscores a division between optical and higher layer pub-lishing venues.

    We address the apparent disconnect between networkedsystems and optical layer communities by surveying the mostcutting-edge research in the last decade. Figure 2 shows thevolume of work covered by this exam. The overwhelmingmajority of work cited is from the past five years. The paperscovered here include works from the top networking confer-ences (e.g., SIGCOMM, NSDI, CoNEXT) as well as opticalnetworking and networked systems journals highlighted by ourclustering analysis.

    There already exist some excellent surveys on optical net-works which at least partially cover reconfigurable aspects,both in the context of data centers [13]–[15] and (to alesser extent) in the context of wide-area networks [16]. Weextend these surveys while providing an up-to-date overviewof the literature. Our paper aims to provide an understand-ing of the underlying fundamental issues and concepts inreconfigurable optical networks and identify commonalitiesand differences (spanning both data center and wide-area

  • 3

    Journal PapersIEEE-OSA Journal of Lightwave Technology 3,988IEEE Transactions on Communications 2,703IEEE-ACM Transactions on Networking 1,226IEEE-OSA Journal of Optical Communications 817IEEE Transactions on Network and Service Management 533

    TABLE I: Papers published since 2015 in various networkingjournals.

    0

    400

    800

    1200

    1600

    1 2 3 4 5

    Pu

    bli

    cati

    on

    s b

    y J

    ou

    rnal

    Citation Cluster

    TON TNSM TCOM JLT JOCN

    Fig. 1: Paper clusters among five networking and opticalsystems journals.

    9 11 11

    31

    78

    Year≤ 2000 (2000, 2005] (2005, 2010] (2010, 2015] (2015, 2020]

    Pape

    rs

    0

    10

    20

    30

    40

    50

    60

    70

    80

    Fig. 2: Papers covered in this Area Exam by year.

    networks). To this end, we proceed from theoretical models topractical technological constraints and implementations. Wefinish by exploring exciting applications for reconfigurableoptical networks, including network simulation, measurement,traffic engineering, and cybersecurity.

    The remainder of this paper is organized as follows. Sec-tion II defines the network architecture model for opticalnetworks and its connection to the packet-switched networkmodel. Section III illuminates modeling work in reconfig-urable optical networks. Section IV is a broad overview ofpractical implementations of reconfigurable optical networks,first showcasing hardware technologies in Section IV-A. Then,Section IV-B showcases research on reconfigurable opticaldata center networks (DCNs) by highlighting DCN-specifichardware capabilities, algorithms, and systems implementa-tions. Section IV-C looks directly at reconfigurable wide-areanetworks (WANs) with an emphasis on WAN-specific chal-lenges in addition to algorithms and systems implementations.In section V we look at future work and applications forreconfigurable optical networks. Section VI concludes withthe overarching open challenges for the field of reconfigurableoptical networks spanning hardware, data centers, and wide-area networks.

    LTE/4G/

    5G

    DSL/Cable

    FTTH/FTTO

    Metropolitan Area

    Fiber Network

    Long-Haul Transit

    Optical Network

    Local

    PoP Regional

    IXP

    Fig. 3: Network architecture model, showing the connectionbetween IP and optical layers.

    II. NETWORK ARCHITECTURES

    A. IP-over-Optical Transport Network

    We discuss IP-over-Optical Transport Networks (IP-over-OTN) when referring to the network’s IP and the Opticallayers. In IP-over-OTN, hosts (e.g., data centers, points-of-presence or PoPs, servers, etc.) connect to routers, and theserouters are connected through the optical transport network(OTN) as shown in Figure 3. A node in the optical layer isan Optical Cross-Connect (OXC). An OXC transmits data onmodulated light through the optical fiber. The modulated lightis called a lambda, wavelength, or a circuit. The OXC canalso act as a relay for other OXC nodes to transparently routewavelengths. When acting as a relay for remote hosts, an OXCprovides optical switching capabilities, thus giving the networkflexibility in choosing where to send transmitting lambdas overthe OXC node.

    IP-over-OTN networks are not new. However, they are builtat a great cost. Historically network planners have engineeredthem to accommodate the worst-case expected demand by (1)over-provisioning of dense wavelength division multiplexing(DWDM) optical channels and (2) laying redundant fiber spansas a fail-safe for unexpected traffic surges. These surges couldcome from user behavior changes or failures elsewhere in thenetwork that forces traffic onto a given path. Only recentlyhave reconfigurable optical systems begun to gain attentionin the data center and wide-area network settings. For moreinformation about early IP-over-OTN, we defer to Bannisteret al. [17] and references therein, where the authors presentwork on optimizing WDM networks for node placement, fiberplacement, and wavelength allocation.

    B. Data Center Architecture

    Historically, data centers relied on packet-switched net-works to connect their servers; however, as scale and demandincreased, the cost to build and manage these packet-switchednetworks became too large. As a result of this change, newreconfigurable network topologies gained more attention fromresearches and large cloud providers. Many novel data centerarchitectures with reconfigurable optical topologies have beenproposed over the last decade. These architectures have incommon that they reduce the static network provisioning re-quirements, thereby reducing the network’s cost by presentinga means for bandwidth between hosts to change periodically.

  • 4

    Servers

    ToR Switches

    Aggregation

    Switches

    Core Switch

    Optical Switch

    Fig. 4: Data center architecture proposed in C-Through [19]

    Figure 4 shows one such example of a hybrid electrical-opticaldata center architecture. These architectures reduce cost andcomplexity via scheduling methods, which change bandwidthon optical paths in the data center. Various approaches havebeen demonstrated. Notable architectures employ fixed, anddeterministic scheduling approaches [12], [18] or demand-aware changes that prioritize establishing optical paths be-tween servers with mutual connectivity requests [19], [20].Switching fabrics are also diverse for data center opticalsystems. These include fabrics based on nanosecond tunablelasers [21], digital micromirror devices (DMD) [22], andliquid crystal on silicon (LCOS) wavelength selective switches(WSS) [23].

    III. THEORETICAL MODELS

    A. Data centers

    Momentum has been building for data centers to moveto optically switched and electrical/optical hybrid networks.However, there is a general reluctance to walk away fromthe old paradigm of a packet-switched-only network due tothe additional complexity of optical circuit switching (e.g.,the control plane management of optical circuits with shiftingdemand, and the variety of optical switching architecturesavailable). Further, without a quantitative measure of value-added by optical switching over packet-switched-only, datacenter network operators are understandably reluctant to spendcapital on an unvetted system.

    To address the concerns surrounding complexity and valuewhile raising awareness for the necessity of optically switchedinterconnects, researchers have constructed cost models todemonstrate the benefit of optical switching and hybrid archi-tectures. Wang et al. [24] developed one such model. Theyconducted intra-DC traffic measurements, which consistedof mixed workloads (e.g., Map-Reduce, MPI, and scientificapplications). They then played the traces back in simulation,assuming that three optical circuits could be created andreconfigured between racks every 30 seconds. Their analysisin a data center with seven racks showed that rack-to-racktraffic over the packet-switched network could be reduced by50% with circuit switching.

    More theoretical modeling work is presented in Sec-tion IV-B, which considers practical implementations of re-configurable data center networks. These papers also includeevaluations of prototype systems, therefore we leave them outof this section on pure theoretical models.

    B. Cost

    Fiber infrastructure for wide-area networks is incrediblycostly. Provisioning of fiber in the ground requires legalpermitting processes through various governing bodies. Asthe length of the span grows beyond metropolitan areas,to connect cities or continents, the number of governingbodies with whom to acquire the legal rights to lay the fibergrows [25]. Then, keeping the fiber lit also incurs high cost;power requirements are a vital consideration for wide-areanetwork providers [26]. Therefore, reliable cost models arenecessary for deploying and managing wide-area networks.In this section, we look at cost modeling efforts particularlysuited for reconfigurable optical networks.

    An early study on the cost comparison of IP/WDM vsIP/OTN networks (in particular: European backbone networks)was conducted by Tsirilakis et al. in [27]. The IP/WDMnetwork consists of core routers connected directly over point-to-point WDM links in their study. In contrast, the IP/OTNnetwork connects the core routers through a reconfigurableoptical backbone consisting of electro-optical cross-connects(OXCs) interconnected in a mesh WDM network.

    Capacity planning is a core responsibility of a networkoperator in which they assess the needs of a backbone networkbased on the projected growth of network usage. Gerstel etal. [28] relates the capacity planning process in detail, whichincludes finding links that require more transponders andfinding shared-risk-link-groups that need to be broken-up,among other things. They note that in this process, theIP and optical network topologies are historically optimizedseparately. They propose an improvement to the process viamulti-layer optimization, considering the connection betweenIP and optical layers. They save 40 to 60% of the requiredtransponders in the network with this multi-layer approach.The networks they looked at were Deutsche Telekom [29]and Telefonica Spain core networks. These authors’ workprovides a strong motivation for jointly optimizing IP andoptical network layers and sharing of information betweenthe two.

    Papanikolaou et. at. [30] propose a cost model for jointmulti-layer planning for optical networks. Their paper presentsthree network planning solutions; dual-plane network design,failure-driven network design, and integrated multi-layer sur-vivable network design. They show that dual-plane and failure-driven designs over-provision the IP layer, leaving resourceson the table that are only used if link failures occur. They showthat integrated multi-layer survivable network design enablesa significant reduction in CapEx and that the cost savingsincreases beyond dual plane and failure driven designs.

    Cost models for evaluating colorless (C)-ROADM vs. color-less, directionless, contentionless (CDC)-ROADM network ar-chitectures are described by Kozdrowski et al. [31]. They showthat for three regional optical networks (Germany, Poland,USA), CDC-ROADM based networks can offer 2 to 3× moreaggregate capacity over C-ROADM based networks. Theyevaluate their model with uniform traffic matrices (TMs)and apply various scalar multipliers to the TM. Their modelaccounts for many optical hardware related constraints, in-

  • 5

    cluding the number of available wavelengths and cost factorsassociated with manual-(re)configuration of C-ROADM ele-ments. However, their model doesn’t include an optical-reachconstraint. They limit solver computation time to 20 hours andpresent the best feasible solution determined in that amountof time.

    C. Blocking Probability

    Blocking probability is a crucial metric for assessing theflexibility of an optical network. It is the probability that arequest for an end-to-end lightpath in the network cannot beprovisioned. Turkcu et al. [32] provides analytical probabilitymodels to predict the blocking probability in ROADM basednetworks with tunable transceivers and validate their modelswith simulation considering two types of ROADM architecturein their analysis, namely share-per-node and share-per-link.In share-per-link, each end of a link has a fixed number oftransponders that can use it. In share-per-node, a node has afixed set of transponders that may use any incident links. Theauthors show that low tunable range (4 to 8 channels, out of32 possible) is sufficient for reducing blocking probability intwo topologies, NSF Net (14 Nodes), and a ring topology with14 nodes. As the tunable range moves beyond 8 and up to 32,there is little to no benefit for split-per-node and share-per-linkarchitectures. As the load on the network increases, blockingprobability increases, as well as the gap between blockingprobability of split-per-node and split-per-link decreases.

    D. Service Velocity

    Service velocity refers to the speed with which operatorsmay grow their network as demand for capacity grows. Wood-ward et al. [33] tackles the problem of increasing servicevelocity for WANs. In this context, they assume a networkof colorless non-directional ROADMS (CN-ROADMs)1, inwhich any incoming wavelength can be routed on any outgoingfiber. They claim that one of the largest impedances fornetwork growth in these networks is the availability of regener-ators. To solve this problem, they present three algorithms fordetermining regenerators’ placement in a network as servicedemand grows. The algorithms are: locally aware, neighboraware, and globally aware. Each algorithm essentially con-siders a broader scope of the network, which a node uses todetermine if an additional regenerator is needed at the site ata particular time. They show, via Monte Carlo simulations,varying optical reach and traffic matrices. The broadest scopealgorithm performs the best and allocates enough regeneratorsat the relevant sites without over-provisioning. This workshows that service velocity is improved with demand fore-casting, enabling infrastructure to be placed to meet thoseprojected demands.

    E. Competition

    Programmable and elastic optical networks can also worktogether with Network Function Virtualization (NFV) to offer

    1CN-ROADMs are also called CD-ROADMs in other papers. These bothrefer to the same ROADM architecture.

    lower-cost service-chaining to users. Optimal strategies havebeen demonstrated, with heuristic algorithms, to quickly findnear-optimal solutions for users and service brokers by Chen etal. [34]. In their work, they take a game-theoretic approachto modeling the competition among service brokers—whocomplete offering the lowest cost optical routs and servicechains, and between users—who compete to find the lowestcost and highest utility service chains among the brokers. Theydemonstrate both parties’ strategies, which converge on low-latency service chain solutions with low blocking probabilityfor optical paths.

    Modeling opportunity cost of optically switched paths isexplored by Zhang et al. [35]. In their work, they presentan algorithm for quickly evaluating the opportunity cost of awavelength-switched path. Given a request and a set of futurerequests, the opportunity cost for accommodating the initialrequest is the number of future requests blocked as a resultof the accommodation. Thus, the network operator’s goal isto minimize opportunity cost by permitting connections thatinterfere with the fewest future requests.

    F. Open Challenges

    The research literature is clear, that reconfigurable opticalnetworks can save costs over static topologies. Further, withthe right hardware in the network, we can construct networkswith low blocking probability, i.e., networks that can use thefull capability of wavelength switching to improve throughput.Also , by constructing networks with reconfigurability in mind,we can make sure that redundancies are in place that allowlight paths to switch fiber paths more quickly, thereby givingeven greater performance, especially in troubling scenariossuch as fiber cuts.

    Given these notions, we still currently lack a holistic model-ing and simulation framework for prototyping and managingreconfigurable optical networks. Network simulators such asMininet [36] exist for prototyping packet switched networks.Similarly, YATES [37] is a useful simulator for comparingtraffic engineering applications. However, there is yet nosimulation framework that integrates optical layer switchingcapability with the higher layers of the network stack. Sucha framework could serve to be invaluable in designing nextgeneration networks, whose underlying core topology is flexi-ble and adaptable to change. In our ongoing work we seek tobuild such a system.

    IV. PRACTICAL IMPLEMENTATIONS

    A. Enabling Hardware Technologies

    1) Wavelength Selective Switching (WSS): In contrast topacket-switched networks, optically circuit-switched systemsoperate at a more coarse granularity. The transmission ofinformation over a circuit requires an end-to-end path forthe communicating parties. Although packet switching hasgenerally prevailed in today’s Internet, recent research hasrevitalized the prospect of circuit switching for data centersand wide-area networks by illuminating areas in which flex-ible bandwidth benefits outweigh the start-up cost of circuitbuilding.

  • 6

    Technological advancements for optical hardware, primarilydriven by physics and electrical engineering research, havebeen instrumental in making circuit-switched networks a vi-able model for data center networks. Among these technolo-gies are low-cost/low-loss hardware architectures. Here wegive a brief overview of technological advancements in thisdomain that have had the most significant impact on networkedsystems.

    Kachris et al. [38] have an in-depth look at optical switchingarchitectures in data centers from 2012. In their survey, theyprimarily look at competing data center architectures andswitch models. In this section, we choose to focus insteadon those architectures’ physical manifestations (i.e., the basecomponents that make them up).

    Polymer waveguides are a low-cost architecture for opticalcircuit switches. These have been fabricated and studied indepth over the last 20 years, including work by Taboada etal. [39] in 1999, Yeniay et al. [40] in 2004, and Felipe etal. [41] in 2018. Early implementations such as Taboada etal. [39] showed fabrication techniques for simple polymerwaveguide taps. Multiple waveguide taps can be combinedto form an Array Waveguide Grating (AWG), and the signalstraversing the AWGs can then be blocked or unblocked to cre-ate an optical circuit switch. A major inhibitor of the polymerwaveguide architecture was signal-loss, which was as high as0.2 dB/cm until Yeniay et al. [40] discovered an improvementon the state-of-the-art with ultra low-loss waveguides in 2004.Their waveguides, made with fluorocarbons, have 4× less loss(0.05 dB/cm) than the next best waveguides at the time, madefrom hydrocarbons. Felipe et al. [41] demonstrate the effec-tiveness of a polymer waveguide-based switching architecturefor reconfiguring groups of optical flows of up to 1 Tbps,proving that that AWG is a viable and competitive switchingarchitecture for data centers. More recently, in 2020, AWGswere demonstrated to work in conjunction with nanosecondtunable transmitters to create flat topologies, significantlyreducing power consumption for data center networks due tothe passive—no power required—nature AWGs [42].

    Microelectromechanical Systems (MEMs), introduced byToshiyoshi et al. [43] in 1996, offered a lower-loss and moreflexible alternative to polymer waveguide systems of the day.These MEMs devices are made up of small mirrors, which canbe triggered between states (i.e., on and off ). Therefore, in aMEMs system light is reflected rather than guided (as in thepolymer waveguide systems). This distinction between reflec-tion and guiding implies generally slower switching speedsfor MEMs based systems, as the mirror must be physicallyturned to steer light out of the desired switch-port. Despitethis limitation, MEMs systems evolved to be competitive withpolymer waveguides in modern systems. Data center solutionsleveraging MEMs based switches include Helios [44].

    Liquid Crystal on Silicon (LCOS) was demonstrated as an-other viable optical switching architecture by Baxter et al. [45]in 2006. In LCOS switches, multiple frequencies of light,which have been multiplexed together, are spatially separatedand guided to a liquid crystal on silicon array. Each cell inthe array corresponds to an input frequency, and the outputport for that frequency is determined by applying a specific

    Fiber Array

    Conventional Grating

    Imaging Optics

    LCOS Switching Element

    Imaging

    Mirror

    Fig. 5: Liquid crystal on silicon wavelength selective switch.

    voltage to the silica in the cell. Switches based on the LCOSarchitecture are commercially available and are recognized asa key enabler for reconfigurable optical networks [23].

    Figure 5 shows an abstracted LCOS WSS. Multiplexedoptical signals from an array of fiber enter the system froma fiber array. These signals are directed to a conventionaldiffraction grating where the different colors or light areseparated from each signal. These colors are then projectedonto a unique position in the LCOS switching element. Thevoltage on the cell in the switching element determines whichoutput fiber the channel will leave through. From there, thesignal travels back through the system and into a differentfiber in the array.

    2) ROADMs: Reconfigurable add-drop multiplexers, orROADMs, are an integral component of IP-over-OTN net-works. These devices have evolved over the years to providegreater functionality and flexibility to optical transport net-work operators. We briefly describe the evolution of ROADMarchitectures. Figure 6 shows a broadcast and select ROADMarchitecture. Please refer to [10] for more information aboutROADM architectures.

    Colorless (C). Early ROADMs were effectively pro-grammable wavelength splitter-and-blockers, or broadcast-and-select devices. A wavelength splitter-and-blocker can beplaced before an IP-layer switch. If the switch is intendedto add/drop a wavelength (i.e., transceive data on it), thenthe blocker prohibits light on the upstream path and enableslight on the path to the switch. These splitter-and-blockersystems are better known as Colorless, or C-ROADMs, asthe splitter-and-blocker architecture is independent of anyspecific frequency of light. To receive the maximum benefitfrom C-ROADMs, operators should deploy their networkswith tunable transceivers as they allow more flexibility forthe end hosts when connecting to remote hosts.

    Colorless, Directionless (CD). The CD-ROADMs extendthe architecture of C-ROADMs by pairing multiple C-ROADMs together in the same unit to allow for a waveto travel in one of many directions. One shortfall of thisarchitecture is that the drop ports from each direction arefixed, and therefore if all of the drop ports are used from onedirection, the remaining points from other directions cannot beused. Due to the limitation of drop ports in different directions,the CD architecture is not contentionless.

    Colorless, Directionless, Contentionless (CDC). The CDC-ROADM solves the contention problem by providing a sharedadd/drop port for each direction of the ROADM. This allowscontentionless reconfiguration of the ROADM as any drop-

  • 7

    Fig. 6: Broadcast and Select colorful ROADM. The add/dropnode, R1, has ports for two optical channels. These channelsare directed at the ROADM. The ROADM uses a splitterto broadcast the channels onto two outbound ports, wherea wavelength blocker selects the appropriate channel for thenext router.

    signal is routed to a common port regardless of the directionfrom which the wave begins/terminates.

    Colorless, Directionless, Contentionless w. Flexible Grid(CDC-F). Flexi-grid, or elastic optical networks, are networkscarrying optical channels with non-uniform grid alignment.This contrasts with a fixed-grid network, where differentwavelengths are spaced with a fixed distance (e.g., 50 GHzspacing). Wideband spacing allows signals to travel fartherbefore becoming incoherent due to chromatic dispersion. Thus,CDC-Flex or CDC-F ROADMs enable the reconfiguration ofwavelengths with heterogeneous grid alignments. These aremost useful for wide area networks, with combinations of sub-sea and terrestrial circuits.

    3) Bandwidth-variable Transponders: Before we discussbandwidth-variable transponders, we must first take a momentto illuminate a common concept to all physical communica-tions systems, not only optical fiber. This concept is modu-lation formats. Modulation formats determine the number ofbinary bits that a signal carries in one symbol. Two parties,a sender and receiver, agree on a symbol rate (baud), whichdetermines a clock-speed to which the receiver is tuned when itinterprets a symbol from the sender. The simplest modulationformat is on-off keying (OOK), which transmits one-bit-per-symbol. In OOK, the symbol is sent via a high or lowpower level, as shown in figure 7A. A higher-order modulationtechnique is Quadrature Phase Shift Keying (QPSK), in whichthe symbol is a sinusoidal wave whose phase-offset relatesthe symbol. In this QPSK, there are four phase shifts agreedupon by the communicating parties, and therefore the systemachieves two bits per symbol, or two baud, seen in figure 7B.A constellation diagram for QPSK is shown in figure 7C.As modulations become more complex, it is more useful tovisualize them in the phase plane shown by their constel-lation diagram. Higher-order modulation formats are of thetype, #-Quadrature Amplitude Modulation (QAM) techniques(Figure 7D), and these permit ;>62 (#) bits per symbol2. InQAM, the symbol is denoted by phase and amplitude changes.Figure 7D shows an example of a constellation diagram for16-QAM modulation, which offers 4 bits per symbol, or twicethe baud of QPSK.

    Fiber optic communications are subject to noise. The noiselevel is Signal to Noise Ratio (SNR), and this metric deter-mines the highest possible modulation format. In turn, the

    2where # is generally a power of 2

    modulation format yields a potential capacity (Gbps) for anoptical channel. For example, in [46], the authors claim thatSNR of just 6 dB is sufficient to carry a 100 Gbps signal,while a circuit with an SNR of 13 dB can transmit 200 Gbps.

    Bandwidth Variable Transponders (BVTs) [47] have re-cently proven to have significant applications for wide-areanetworks. These devices are programmable, allowing for theoperator to choose from two or more different modulationformats, baud rates, and the number of subcarriers whenoperating an optical circuit. For example, the same transpondermay be used for high-capacity/short-reach transmission (16-QAM or greater) or lower-capacity/longer-reach transmission(e.g., QPSK). Higher modulation formats offer higher datarates. They are also more sensitive to the optical SNR, whichdecreases in a step-wise manner with distance, as illustratedin Figure 8. BVTs enable network operators to meet the ever-growing demand in backbone traffic by increasing optical cir-cuits’ spectral efficiency.

    Low spectrum utilization, or waste, can be an issue forBVT circuits. For example, a BVT configured for a low-modulation circuit such as QPSK instead of 16-QAM has apotential for untapped bandwidth. Sambo et al. [48] introducedan improvement to the BVT architecture, known as Sliceable-BVT (S-BVT), which addresses this issue. They describe anarchitecture that allows a transponder to propagate numerousBVT channels simultaneously. Channels in the S-BVT archi-tecture are sliceable in that they can adapt to offer higher orlower modulation in any number of the given subchannels.

    4) Silicon Photonics: Various materials (e.g., GaAs, Si,SiGe) can be used to make photonics hardware requiredfor data transmission. These devices include photodetectors,modulators, amplifiers, waveguides, and others. Silicon isthe preferred material for these devices due to its low cost.However, there are challenges to manufacturing these silicondevices, such as optical power loss and free carrier absorption.Other materials, notably GaAs, have better properties for prop-agating light; however, GaAs is more costly to manufacture.Despite these challenges, research into efficient and qualitytransmission using silicon-based photonic devices has boomedin the last decade. Early advances were made towards siliconphotonics in the 80s, particularly for waveguides, which arethe basis for circuit switches and multiplexers. Today, siliconphotonics is an integral part of almost all optical hardware,including lasers, photodetectors, modulators, and amplifiers.Costs are falling for optical hardware, thus enabling networkoperators to deploy newer technology into their systems at amore advanced pace as the devices’ quality and guaranteeshave continued to improve. For more information on siliconphotonics, see the survey by Thomson et al. [49].

    5) Open Challenges: The development of hardware forreconfigurable optical networking is a burgeoning field inengineering and research. While CDC-F ROADMs exist to-day, they are costly to produce, and their capabilities arefound lacking. In particular, the benefit of integrating CDC-F ROADMs with optical transport networks is limited bycascading fiber impairments, signal loss at WSS modules,and wavelength and fiber collision [50]. We expect siliconphotonics to bring down the cost of transport hardware,

  • 8

    Po

    wer

    Time

    1 Baud

    1 0 1

    OOK QPSK

    00 01 10 11

    QPSK

    Constellation Diagram

    Phase θ

    0001

    1110

    Amplitude

    0101 0111

    0100 0110

    16-QAM

    Constellation Diagram

    1 Baud

    (A) (B) (C) (D)

    Amplitude

    Phase θ

    Fig. 7: Modulation examples of on-off keying, quadrature phase shift keying (QPSK), quadrature amplitude modulation (QAM),and constellation diagrams for QPSK and 16-QAM.

    Mo

    du

    lati

    on

    / D

    ata

    Rat

    e

    Distance / Noise

    Fig. 8: Trade off between modulation/data rate and dis-tance/noise with BVT.

    thereby increasing access to such devices and lowering entrybarriers for research and development.

    Another great challenge here is that most optical networkingcomponents are invisible to higher layers of the networkingstack. This consequence is a result of the passive nature ofthe devices. Unfortunately, it implies difficulty for efficientlymanaging and programmatically reconfiguring such devices.More critically, it creates a hard disconnect between thelogical network topology and the physical, optical-wavelengthtopology. This means that it is nearly impossible to map opticalnetworks with active measurement strategies.

    B. Optically Reconfigurable Data Centers

    A key challenge for data centers is to optimize the utilizationof the data center network (DCN). In a DCN, many differentservices are running and competing for shared bandwidth.Communication patterns between top-of-rack (ToR) switchesvary with the underlying applications that are running (e.g.,map-reduce, video stream processing, physics simulations,etc.). Thus, as future applications and user’s needs change,it is challenging to predict where bandwidth will be needed.

    Static and reconfigurable network solutions have been posedby research and industry to address this challenge. There is anassumption that the connectivity graph of the network cannotchange in static network solutions. These solutions also as-sume fixed capacity (or bandwidth) on links. In reconfigurablenetwork solutions, by contrast, these assumptions regardingconnectivity and bandwidth are relaxed. Servers and switches(collectively referred to as nodes) may connect some subset of

    the other nodes in the network, and the nodes to which theyare adjacent may change over time. Further, the bandwidth ofa connection may also change over time.

    Under the assumption of a static physical topology, differentnetwork architectures and best practices have been established.Some of these architectures include Clos, fat-tree, and torustopologies. Best practices include (over-)provisioning all linkssuch that the expected utilization is a small fraction of thetotal bandwidth for all connections. These solutions can incurhigh cabling costs and are inefficient.

    Reconfigurable network solutions circumvent the limitationsof the static network solutions by reducing cabling costsor reducing the need to over-provision links. The flexibilityof light primarily empowers these reconfigurable solutions.Some of these flexibilities include the steering of light (e.g.,with MEMs or polymer waveguides) and the high capacityof fiber-optics as a medium (e.g., dense wavelength divisionmultiplexing, or DWDM, enables transmitting O()1/B) on asingle fiber).

    In this section, we illuminate efforts to improve DCNswith reconfigurable optics. Related surveys on this subjectinclude Foerster et al. [13] and Lu et al. [15]. We dividethe state of reconfigurable optical DCNs into technology,algorithms, and systems. In technology, we supplement thediscussion from section IV-A with hardware capabilities thatcurrently exist only for DCNs. Such features include free-space optics and sub-second switching. Next, we highlight costmodeling research, whose goal is to derive formal estimatesor guarantees on the benefit of reconfigurable optical networksover static topologies for DCNs. Then, we survey the relevantalgorithms for managing and optimizing reconfigurable opticalnetworks in the data center and some systems implementationsthat leverage those algorithms.

    1) DCN-specific Technologies: Innovations in reconfig-urable optical networks are enabled by hardware’s evolution,as discussed in section II. There is a subset of innovationsthat are well-suited for data centers only. These are free-spaceoptics and sub-second switching. Although we have separatedthese below, there may be overlaps between free-space opticsand sub-second switching systems as well.

  • 9

    Laser

    Collimation Lens

    DMD

    Aperture

    Mirror

    Assembly

    Focusing Lens

    Photodetector

    Diffracted

    Beams

    Fig. 9: Free-space optics switching architecture for data cen-ters [52]

    Free-space Optics. In free-space optics systems, light prop-agates through the air from one transceiver to another. Free-space optics enables operators to reduce their network’s com-plexity (a function of cabling cost). These closed environ-ments and their highly variable nature of intra-data centertraffic make such solutions appealing. Recent works suchas Firefly [51] have demonstrated that free space optics arecapable of reducing latency for time-sensitive applicationsby routing high-volume/low-priority traffic over the wirelessoptical network while persistently serving low-volume/high-priority traffic on a packet-switched network. High fan-out (1-to-thousands) for free-space optics is enabled with DMDs, orDense Micro-mirror Devices, as shown by ProjecToR [52].The DMDs are placed near Top-of-Rack (ToR) switches andpair with disco-balls, fixed to the ceiling above the racks.The DMD is programmed to target a specific mirror on thedisco-ball, guiding the light to another ToR in the data center.Figure 9 illustrates the main properties of the free space opticsdeployment proposed in [52]. The deployment and operationof a free-space optics data center are fraught with uniquechallenges, particularly for keeping the air clear betweentransceivers and DMDs. Any particulate matter that the lightcomes into contact with can severely degrade performance andcause link failures should they persist. This phenomenon isknown as atmospheric attenuation [53].

    Sub-second Switching. In data centers, distances are shortbetween hosts, and therefore optical signals do not lose theirstrength to such a degree that mid-line devices such as am-plifiers are necessary. Therefore, applications can benefit fromall of the agility of optical layer devices without accountingfor physical-layer impairments, which can slow down recon-figuration times in wide-area networks. Research has shownthat micro-second switching of application traffic is possiblein data center environments [54]–[56]. The ability to conductcircuit switching at microsecond timescales has illuminatedfurther intrigue, particularly for transport protocols runningon top of these networks. In C-Through [19], the authorsobserved that throughput for TCP applications dropped whentheir traffic migrated to the optical network. They showed howto mitigate this by increasing the queue size for optical circuitswitches and adjusting the host behaviors. Mukerjee et al. [57]augmented their solution by expanding TCP for reconfigurabledata center networks.

    2) Algorithms: The capability of optical circuit switchingfor data center networks comes with the need to define

    new algorithms for optimizing utilization, bandwidth, fairness,latency, or any other metric of interest. Research has presentedmany different approaches for optimizing the metric relevantto the network operator in static networks. Traffic Engineer-ing (TE) generally refers to the determination of paths forflows through the network, and the proportion of bandwidthlevied for any particular flow. If the data center has a staticnetwork topology (e.g., fat-tree), then TE is simple enoughthat switches can conclude how to route flows. However,introducing reconfigurable paths complicates the process ofTE significantly: network elements (e.g., switches) must nowalso determine with whom and when to establish optical paths,and when to change optical neighbors.

    Matchings. Choosing where to establish optical circuits canbe computed quickly via matching algorithms [58]. Matchingsoften provide a good approximation, especially in settingswhere the goal is to maximize single-hop throughput alongwith reconfigurable links. Matching algorithms hence fre-quently form the basis of reconfigurable optical networks,e.g., Helios [59] and c-Through [19] rely on maximum match-ing algorithms. If there exist multiple reconfigurable links (say1 many), it can be useful to directly work with a generalizationof matching called 1-matching [60]: 1-matchings are forexample used in Proteus [61] and its extension OSA [62]. Insome scenarios, for example, when minimizing the averageweighted path length under segregated routing, maximum 1-matching algorithms even provide optimal results [63], [64].This however is not always true, e.g., when considering non-segregated routing policies [63], [64], which require heuris-tics [51, §5.1], [65].

    Oblivious Approaches. Matchings also play a role in re-configurable networks which do not account for the trafficthey serve, i.e., in demand-oblivious networks. The primeexample here is Rotornet [18] which relies on a small set ofmatchings through which the network cycles endlessly: sincethese reconfigurations are “dumb”, they are fast (compared todemand-aware networks) and provide frequent and periodicdirect connections between nodes, which can significantlyreduce infrastructure cost (also known as “bandwidth tax”)compared to multihop routing. In case of uniform (delay-tolerant) traffic, such single-hop forwarding can saturate thenetwork’s bisection bandwidth [18]; for skewed traffic matri-ces, it can be useful to employ Valiant load balancing [66] toavoid underutilized direct connections, an idea recently alsoleveraged in Sirius [12] via Chang et al. [67]. Opera [68] ex-tends Rotornet by maintaining expander graphs in its periodicreconfigurations. Even though the reconfiguration schedulingof Opera is deterministic and oblivious, the precomputation ofthe topology layouts in its current form is still randomized. Ex-pander graphs (and their variants, such as random graphs [69])are generally considered very powerful in data center contexts.An example of a demand-aware expander topology was pro-posed in Tale of Two Topologies [70], where the topologylocally converts between Clos and random graphs.

    Traffic Matrix Scheduling. Another general algorithmic ap-proach is known as traffic matrix scheduling: the algorith-mic optimizations are performed based on a snapshot ofthe demand, i.e., based on a traffic matrix. For example,

  • 10

    Mordia [71] is based on an algorithm that reconfigures thenetwork multiple times for a single (traffic demand) snap-shot. To this end, the traffic demand matrix is scaled intoa bandwidth allocation matrix, which represents the fractionof bandwidth every possible matching edge should be al-located in an ideal schedule. Next, the allocation matrix isdecomposed into a schedule, employing a computationallyefficient [72] Birkhoff-von-Neumann decomposition, resultingin $ (=2) reconfigurations and durations. This technique alsoapplies to scheduling in hybrid data center networks whichcombine optical components with electrical ones, see e.g.,the heuristic used by Solstice [73]. Eclipse [74] uses trafficmatrix scheduling to achieve a (1 − 1/4 (1−Y) )-approximationfor throughput in the hybrid switch architecture with re-configuration delay, but only for direct routing along withsingle-hop reconfigurable connections. While Eclipse is anoffline algorithm, Schwartz et al. [75] recently presentedonline greedy algorithms for this problem, achieving a prov-able competitive ratio over time; both algorithms allow toaccount for reconfiguration costs. Another example of trafficmatrix scheduling is DANs [76]–[79] (short for demand-awarenetworks, which are optimized toward a given snapshot ofthe demand). DANs rely on concepts of demand-optimizeddata structures (such as biased binary search trees) and coding(such as Huffman coding) and typically aim to minimize theexpected path length [76]–[79], or congestion [77]. In general,the problem features exciting connections to the schedulingliterature, e.g., the work by Anand et al. [80], and morerecently, Dinitz et al. [81]; the latter, however, is not based onmatchings or bipartite graphs. Rater, the demands are the edgesof a general graph, and a vertex cover can be communicatedin each round. Each node can only send a certain number ofpackets in one round.

    Self-Adjusting Datastructures. A potential drawback of traf-fic matrix scheduling algorithms is that without countermea-sures, the optimal topology may change significantly from onetraffic matrix snapshot to the next, even though the matrixis similar. There is a series of algorithms for reconfigurablenetworks which account for reconfiguration costs, by makinga connection to self-adjusting data structures (such as splaytrees) and coding (such as dynamic Huffman coding) [78],[82]–[85]. These networks react quickly and locally two newcommunication requests, aiming to strike an optimal trade-offbetween the benefits of reconfigurations (e.g., shorter routes)and their costs (e.g., reconfiguration latency, energy, packetreorderings, etc.).

    To be more specific, the idea of both the traffic matrixscheduling algorithms and the self-adjusting data structurebased algorithms is to organize the communication partners(i.e., the destinations) of a given communication source ineither a static binary search or Huffman tree (if the demandis known), or in a dynamic tree (if the demand is not knownor if the distribution changes over time). The tree optimizedfor a single source is sometimes called the ego-tree, and theapproach relies on combining these ego-trees of the differentsources into a network while keeping the resulting node degreeconstant and preserving distances (i.e., low distortion).

    Machine Learning. Another natural approach to devise

    algorithms for reconfigurable optical networks is to use ma-chine learning. To just give two examples, xWeaver [20]and DeepConf [86] use neural networks to provide traffic-driven topology adaptation. Another approach is taken byKalmbach et al. [87], who aim to strike a balance betweentopology optimization and “keeping flexibilities”, leveragingself-driving networks. Finally, Truong-Huu et al. [88] pro-posed an algorithm which uses a probabilistic, Markov-chainbased model to rank ToR nodes in data centers as candidatesfor light-path creation.

    Additional Aspects. Last but not least, several algorithmsaccount for additional and practical aspects. In the contextof shared mediums (e.g., non-beamformed wireless broadcast,fiber3 (rings)), contention and interference of signals can beavoided by using different channels and wavelengths. Thealgorithmic challenge is then to find (optimal) edge-coloringson multi-graphs, an NP-hard problem for which fast heuristicsexist [90]. However, on specialized topologies, optimal solu-tions can be found in polynomial time, e.g., in WaveCube [91].Shared mediums also have the benefit that it is easier todistribute data in a one-to-many setting [92]. For example, onfiber rings, all nodes on the ring can intercept the signal [89,§3.1]. One-to-many paradigms4 such as multicast can also beimplemented in other technologies, using e.g., optical splittersfor optical circuit switches or half-reflection mirrors for free-space optics [95]–[99].

    3) Systems Implementations: There have been manydemonstrations of systems for reconfigurable optics in datacenters. Many of the papers that we discuss in Section IV-B2are fully operational systems. Another notable research de-velopment that does not fit into algorithms is the work byMukerjee et al. [57]. They describe amendments to theTCP protocol to increase the efficiency of reconfigurable datacenter networks. These amendments include dynamic bufferre-sizing for switches and sharing explicit network feedbackwith hosts. Much of the other work on reconfigurable DCNsare summarized in Table II.

    4) Open Challenges: Our understanding of algorithmsand topologies in reconfigurable networks is still early, butfirst insights into efficient designs are being published. Onefront where much more research is required concerns themodeling (and dealing with) reconfiguration costs. Indeed,existing works differ significantly in their assumptions, evenfor the same technology, making it challenging to comparealgorithms. Related to this is also the question of how re-configurations affect other layers in the networking stack, andhow to design (distributed) controllers. In terms of algorithms,even though a majority of problems are intractable to solveoptimally, due to integral connection constraints, the questionof approximation guarantees is mostly open. For example, con-sider designing a data center with minimum average weightedpath length. A logarithmic approximation is easy to achieve bysimply minimizing the diameter of a (constant-degree) statictopology. However, computing an optimal solution is NP-hard.So, can we obtain polynomial approximation algorithms with

    3In the context of data center proposals, shared fiber is the more popularmedium, e.g., in [62], [71], [89].

    4Conceptually similar challenges arise for coflows [93], [94].

  • 11

    Fabric Dem.-Aw.

    Novelty

    Helios [59] Hybrid X First hybrid system using WDM for busty low-latency trafficc-Through [19] Hybrid X Enlarged buffers for optical ports increases utilizationProjecToR [52] Hybrid/FSO X Introduces DMDs for free-space switching thus enabling a fan-out

    potential to thousands of nodesProteus [61] All-optical X Design of an all-optical and reconfigurable DCN.

    OSA [62] All-optical X Demonstrates greater reconfiguration flexibility and bisectionbandwidth than hybrid architectures

    Rotornet [18] Hybrid × An all-optical demand-oblivious DCN architecture for simplifiednetwork management

    Opera [68] All-optical × Extends Rotornet to include expander graphs rotationsFlat-tree [70] Hybrid X A hybrid of random graphs and Clos topologies brings reconfig-

    urable optics closer to existing DCNs.Solstice [73] Hybrid X Exploits sparse traffic patterns in DCNs to achieve fast scheduling

    of reconfigurable networks.Eclipse [100] Hybrid X Outperforms Solstice by applying sub-modular optimization theory

    to hybrid network scheduling.xWeaver [20] Hybrid X Trains neural networks to construct performant topologies based

    on training data from historic traffic traces.DeepConf [86] Hybrid X Presents a generic model for constructing learning systems of

    dynamic optical networksWaveCube [91] Hybrid X A modular network architecture for supporting diverse traffic

    patterns.Sirius [12] All-optical × Achieves nanosecond-granularity reconfiguration for thousands of

    nodes

    TABLE II: Summary of systems implementations of reconfigurable data center networks

    constant performance trade-offs? Similarly, do good (fixed)parameter characterizations enable efficient run times, andwhat can we expect from e.g., linear time and distributedalgorithms? Moreover, beyond general settings, how do spe-cific (oblivious) network designs enable better algorithms, andhow does their design interplay with topologies of the sameequipment cost?

    Next, going beyond scheduling, how can the framework ofonline algorithms be leveraged in this context? Ideally, wewant a reconfigurable link to exist before the traffic appears.How can we balance this from a worst-case perspective?In this context, traffic prediction techniques might reducethe possible solution space massively, but we will still needextremely rapid reaction times to new traffic information.

    Another open challenge is the efficient interplay betweenreconfigurable and non-reconfigurable network parts. The-ory for specific reconfigurable topologies (e.g., traffic matrixscheduling for a single optical switch) has seen much progress.However, more general settings, particularly non-segregatedrouting onto both network parts, are still an open issue, beyondan abstract view of the combination with a single packetswitch.

    C. Reconfigurable Optical WAN

    In this section, we survey recent research in reconfigurableoptics in wide-area networks (WAN). Reconfigurable opticsrefers to dynamism in the physical-layer technology that en-ables high-speed and high-throughput WAN communications,fiber optics. We divide reconfigurable optical innovations intotwo sub-categories, rate-adaptive transceivers and dynamic op-tical paths. Rate adaptive transceivers are optical transceiversthat can change their modulation format to adapt to physical

    layer impairments such as span-loss and noise. Dynamicoptical paths refer to the ability to steer light, thus allowingthe edges of the network graph to change (e.g., to avoid a linkthat has failed).

    Many groups have studied the programmability and au-tonomy of optical networks. Gringeri et al. [101] wrote aconcise and illuminating introduction to the topic. In it, theauthors propose extending Software Defined Network (SDN)principles to optical transport networks. They highlight chal-lenges, such as reconfiguration latency in long-haul networks,and provide a trade-off characterization of distributed vs.centralized control for an optical SDN system. They claimthat a tiered hierarchy of control for a multi-regional network(e.g., segregated optical and network control loops) will offerthe best quality solution. Further, they argue that centralizedcontrol should work best to optimize competing demandsacross the network, but that the controller’s latency willbe too slow to react to network events, e.g., link outagesquickly. Therefore, the network devices should keep somefunctionality in their control plane to respond to link failures ina decentralized manner, e.g., reallocating the lost wavelengthsby negotiating an alternative path between the endpoints.

    1) WAN-Specific Challenges and Solutions: There are manyreasons for the prevalence of optical fiber as the de-facto leaderfor long-distance communications. First, it has incredible reachcompared to copper—optical signals can propagate 80 to100 km before being amplified. Second, it has an incrediblyhigh bandwidth compared to the radio spectrum. Third, opticalfiber itself has proved to be a robust medium over decades,as improvements to the transponders at the ends of the fiberhave enabled operators to gain better value out of the samefiber year after year.

    To design a WAN, the network architect must solve several

  • 12

    difficult challenges, such as estimating the demand on thenetwork now and into the future, optimal placement of routersand quantity of ports on those routers within the network, andoptimal placement of amplifiers in the network.

    Many design challenges solve more easily in a static WAN,where optical channels are initialized once and maintainedfor the network’s life. For example, amplifiers carrying thechannel must have their gain set in such a way that thesignal is transmitted while maximizing the signal-to-noise ratio(SNR). This calculation can take minutes or hours dependingon the network’s characteristics (e.g., the number of interme-diate nodes and the number of distinct channels on sharedamplifiers).

    Dynamic optical networks must rapidly address these chal-lenges (in sub-second time frames) to achieve the highestpossible utilization, posing a significant challenge. For ex-ample, it requires multiple orders of magnitude increasesin the provisioning time for optical circuits beyond whatis typically offered by hardware vendors. Therefore, severalresearch efforts have explored ways to automate WAN networkelements’ configuration concerning physical layer impairmentsin a robust and time-efficient manner.

    Chromatic Dispersion. DWDM makes efficient use ofoptical fiber by putting as many distinct optical channels, eachidentified by a frequency (or lambda _) onto the shared fiber.Each of these lambdas travels at a different speed relative tothe speed of light. Therefore, two bits of information trans-mitted simultaneously via two different lambdas will arriveat the destination at two different times. Further, chromaticdispersion is also responsible for pulse-broadening, whichreduces channel spacing between WDM channels and cancause FEC errors. Therefore, DWDM systems must handlethis physical impairment.

    Amplified Spontaneous Emission (ASE) Noise. A significantlimitation of circuit switching is the latency of establishing thecircuit due to ASE noise constraints [102]. Although SDNprinciples can apply to ROADMs and WSSs (to automatethe control plane of these devices), physical layer properties,such as Noise Figure (NF) and Gain Flatness (GF) complicatethe picture. When adding or removing optical channels to orfrom a long-haul span of fiber, traversing multiple amplifiers,the amplifiers on that path must adjust their gain settings toaccommodate the new set of channels. To this end, researchershave worked to address the challenge of dynamically config-uring amplifiers. Oliveira et al. [103] demonstrated how tocontrol gain on EDFAs using GMPLS. They evaluated theirsolution on heterogeneous optical connections (10, 100, 200,and 400 Gbps) and modulations (OOK, QPSK, and 16-QAM).They used attenuators to disturb connections and allow theirGMPLS control loop to adjust the amplifier’s gains. Theyshow that their control loop helps amplifiers to adjust whiletransmitting bits with BER below the FEC threshold for up to6 dB of added attenuation.

    Moura et al. [104] present a machine learning approach forconfiguring amplifier gain on optical circuits. Their approachuses case-based reasoning (CBR) as a foundation. The intu-ition behind CBR is that the gain setting for a set of circuitswill be similar if similar circuits are present on a shared fiber.

    They present a genetic algorithm for configuring amplifiersbased on their case-based reasoning assumption. They showthat their methodology is suitable for configuring multipleamplifiers on a span with multiple optical channels. In afollow-up study, they present FAcCBR [105], an optimizationof their genetic algorithm, which yields gain recommendationsmore quickly by limiting the number of data-points recordedby their algorithm.

    Synchronization. Managing a WAN requires coordinatingservices (e.g., end-to-end connections) among diverse setsof hardware appliances (transponders, amplifiers, routers),logically and consistently. The Internet Engineering Task Force(IETF) has defined protocols and standards for configuringWAN networks. As the needs and capabilities of networks haveevolved, so have the protocols. Over the years, new protocolshave been defined to bring more control and automation to thenetwork operator’s domain. These protocols are Simple Net-work Management Protocol (SNMP) [106] and Network Con-figuration Protocol (NETCONF) [107]. Additionally, networkoperators and hardware vendors have been working to define aset of generalized data models and configuration practices forautomating WAN networks under the name OpenConfig [108].Although OpenConfig is not currently standardized with theIETF, it is deployed and has demonstrated its value in severalunique settings.

    In addition to the standardized and proposed protocolsfor general-purpose WAN (re)configuration, there has been apush by various independent research groups to design andtest protocols specifically for reserving and allocating opticalchannels in WAN networks.

    One protocol was developed in conjunction with the CORO-NET [109] program, whose body of research has led toseveral other developments in reconfigurable optical WANs.The proposal, by Skoog et al. [110], describes a three-wayhandshake (3WHS) for reserving and establishing optical pathsin single and multi-domain networks. In the 3WHS, messagesare exchanged over an optical supervisory channel (OSC)—an out-of-band connection between devices isolated from usertraffic. The transaction is initiated by one Optical Cross-Connect (OXC�) and directed at a remote OXC, OXC/ . Ateach hop along the way, the intermediate nodes append theavailable channels to the message. Then, OXC/ chooses achannel via the first-fit strategy [111] and sends a message toOXC� describing the chosen channel. Finally, OXC� activatesthe chosen channel and beings sending data over it to OXC/ .This protocol is claimed to meet the CORONET projectstandard for a setup time of 50 ms + RTT between nodes. Bitarrays are used to communicate the various potential channelsbetween nodes and are processed in hardware. The blockingprobability is 10−3 if there is one channel reserved betweenany two OXC elements so long as there are at least 28 totalchannels possible between OXCs [110].

    2) Algorithms: Jointly optimizing both the optical and thenetwork layer in wide-area networks leads to new opportuni-ties to improve performance and efficiency, while introducingnew algorithmic challenges. In contrast to the previouslydiscussed data center networks, it is impossible to createnew topological connections in a wide-area network (without

  • 13

    deploying more fiber. Free space optics solutions don’t applyhere). Instead, reconfigurability is possible by adjusting andshifting bandwidth capacities along the fiber edges, possiblyover multiple hops. Hence, we need a different set of algorith-mic ideas that optimize standard metrics such as throughput,completion time, blocking probability, and resilience. In thissection, we discuss recent papers that tackle these issues,starting with some earlier ones. Moreover, there is the need forsome central control to apply the routing, policy, lightpath etc.changes, for which we refer to recent surveys [112], [113].

    Routing aspects are explored intensively in this context.Algorithmic approaches to managing reconfigurable opticaltopologies have been studied for decade, but are recentlygaining new attention. In early work by Kodialam et al. [114]explores IP and optical wavelength routing for a series ofconnection requests. Their algorithm determines whether arequest should be routed over the existing IP topology, or ifa new optical path should be provisioned for it. Interestingwork by Brzezinski and Modiano [115] who leverage matchingalgorithms and Birkhoff–von Neumann matrix decompositionsand evaluate multi- versus single-hop routing5 in WDM net-works under stochastic traffic. However, the authors mostlyconsider relatively small networks, e.g., with three to sixnodes. For larger networks, shortest lightpath routing is apopular choice [117]. Another fundamental aspect frequentlyconsidered in the literature regards resilience [118]–[120]. Forexample, Xu et al. [118] investigate resilience in the contextof shared risk link groups (SLRGs) and propose a methodon how to provision the circuits in a WAN. To this end, theyconstruct Integer Linear Programs to obtain maximally SLRG-diverse routes, which they then augment with post-processingfor DWDM system selection and network design issues.

    We now introduce further selected algorithmic works, start-ing with the topic of bulk transfers [121]. In OWAN [122],Jin et al. optimize bulk transfers in a cross-layer approach,which leverages both the optical and the network layer. Theirmain objective is to improve completion time; while an integerlinear program formulation would be too slow, the authorsrely on a simulated annealing approach. A local search shiftswavelength allocations, allowing heuristic improvements to becomputed at a sub-second scale. The scheduling of the bulktransfer then follows the standard shortest job first approaches.When updating the network state, if desired, OWAN can extendprior consistent network update solutions [123] by introducingcircuit nodes in the corresponding dependency graphs. OWANalso considers deadline constrained traffic, implementing theearliest deadline first policy. Follow-up work extended OWANin two directions, via theoretic scheduling results and forimprovements on deadline-constrained transfers.

    In DaRTree [124], Luo et al. develop an appropriate relax-ation of the cross-layer optimization problem for bulk transfersunder deadlines. Their approach relies on a non-greedy allo-cation in an online setting, which allows future transfers to bescheduled efficiently without needing to reallocate currentlyutilized wavelengths. To enhance multicast transfers (e.g., forreplication), they develop load-adaptive Steiner Tree heuristics.

    5See also the idea of lightpath splitting in Elastic Optical Networks [116].

    Jia et al. [125] design various online scheduling algo-rithms and prove their competitiveness in the setting ofOWAN [122]. The authors consider the minimum makespanand sum completion time, analyzing and extending greedycross-layer scheduling algorithms, achieving small competitiveratios. Dinitz and Moseley [126] extend the work of Jia et al.by considering a different objective, the sum of flow timesin an online setting. They show that resource augmentationis necessary for acceptable competitive bounds in this setting,leading to nearly (offline) optimal competitive ratios. Whiletheir algorithms are easy to implement (e.g., relying on order-ing by release time or by job density), the analysis is com-plicated and relies on linear program relaxations. Moreover,their algorithm also allows for constant approximations in theweighted completion time setting, without augmentations.

    Another (algorithmic) challenge is the integration of cross-layer algorithms into current traffic engineering systems. SuchTEs are tried and tested, and hence service providers arereluctant to adapt their designs. To this end, Singh et al. [127]propose an abstraction on how dynamic link capacities (e.g.,via bandwidth variable transceivers) can be inserted into clas-sic TEs. Even though the TE is oblivious to the optical layer,an augmentation of the IP layer with fake links enables cross-layer optimization via the TE. A proposal [128] for a newTE for such dynamic link capacities is discussed in the nextSection IV-C3. Singh et al. [127] also discuss consistent updatemethods [129] for dynamic link capacities, which Tseng [130]formalizes into a rate adaption planning problem, providingintractability results and an LP-based heuristic.

    OptFlow [131] proposes a cross-layer abstraction for pro-grammable topologies as well, but focuses on shifting wave-lengths between neighboring fibers. Here, the abstractionconcept is extended by not only creating fake links but alsoaugmenting the traffic matrix with additional flows. As bothlinks and flows are part of the input for TEs, OptFlow enablesthe compilation of optical components into the IP layer for var-ious traffic engineering objectives and constraints. Concerningconsistent updates, classic flow-based techniques [129] carryover, enabling consistent cross-layer network updates too.

    Higher-layer applications, such as VNF Network Em-bedding (VNF-NE) have also been studied by variousgroups [132], [133]. Network embedding is a physical layerabstraction for creating end-to-end paths for network ap-plications or network function virtualization (NFV) servicechains. Paths have requirements for both bandwidth and CPUresources along the service chain. Wang et al. [132] provesthis problem to be NP-Complete for elastic optical networks.Soto et al. [133] provides an integer linear program (ILP) tosolve the VNF-NE problem. The ILP solution is intractablefor large networks. Thus they provide a heuristic that uses aranking-system for optical paths. Their heuristic ranks opticalpaths by considering a set of end-to-end connection requests.Paths with higher rank satisfy a more significant proportion ofthe demand for bandwidth and CPU among all of the requests.

    Optical layer routing with traffic and application constraintsis a difficult problem. The running theme has been thatlinear programming solutions can find provably optimal solu-tions [134], but take too long to converge for most use cases.

  • 14

    However, network traffic is not entirely random and thereforehas an underlying structure that may be exploited by offlinelinear program solvers, as shown by Kokkinos et al. [135].They use a two-stage approach for routing optical paths in anonline manner. Their technique finds periodic patterns over anepoch (e.g., daily, weekly, or monthly) and solves the demandcharacterized within the epoch with an offline linear program.Then, their online heuristic makes changes to the topology toaccommodate random changes in demand within the epoch.

    3) Systems Implementations: The integration of reconfig-urable optics with WAN systems has been impracticable dueto its cost and a lack of convergence on cross-layer APIsfor managing the WAN optical layer with popular SDN con-trollers. However, some exciting work has demonstrated thepromise for reconfigurable optics is closed settings. Notably,RADWAN [136] and CORONET [109] for bandwidth-variableWAN systems and systems with dynamic optical paths, re-spectively. In this section, we explore reconfigurable opticalWAN systems more deeply in these two contexts. Table IIIsummarizes these systems.

    Bandwidth Variable Transceivers. A team of researchersat Microsoft evaluates bandwidth variable transponders’ ap-plicability for increases throughput in Azure’s backbone inNorth America [46]. They find that throughput for the WANcan increase if they replace the fixed-rate transponders intheir backbone network with three-way sliceable transponders.They also show that for higher-order slices, bandwidth granincreases at diminishing returns.

    Traffic Engineering with rate-adaptive transceivers was re-cently proposed by Singh et al. [128]. The authors are mo-tivated by a data-set of Microsoft’s WAN backbone Signal-to-Noise ratio from all transceivers in the North-Americanbackbone, over two and a half years. They note that over60% of links in the network could operate at 0.75× highercapacity and that 25% of observed outages due to SNR dropscould be mitigated by reducing the modulation of the affectedtransceivers. They evaluate the reconfigurability of Bandwidth-Variable Transponders, showing that reconfiguration time forthe transceivers could be reduced from minutes to millisecondsby not turning the transceivers off. Then, they propose a TEobjective function via linear-programming, to minimize churn,or impact due to SNR fluctuations, in a WAN. Finally, theyevaluate their TE controller on a testbed WAN and show thatthey improve network throughput by 40% over a competitivesoftware-defined networking controller, SWAN [138].

    Dynamic Optical Paths. In the early aughts, researchersexplored the benefit of dynamic optical paths for networks inthe context of grid-computing. Early efforts by Figueira etal. [139] addressed how a system might manage dynamicoptical paths in networks. In this work, the authors propose aweb-based interface for submitting optical reconfiguration re-quests and a controller for optimizing the requests’ fulfillment.They evaluate their system on OMNInet [140], a metropolitanarea network with 10 Gbps interconnects between 4 nodes andWavelength Selective Switches between them. They claim thatthey can construct optical circuits between the OMNInet nodesin 48 seconds. Further, they show that amortized setup timeand transfer is faster than packet-switching for files 2.5 Gb

    or larger (assuming 1 Gbps or greater optical interconnectand 300 Mbps packet switching throughput). They go on toevaluate file transfer speeds using the optical interconnectand show that they can archive average transfer speeds of680 Gbps. Iovanna et al. [141] address practical aspects ofmanaging multilayer packet-optical systems. They present aset of useful abstractions for operating reconfigurable opticalpaths in traffic engineering using an existing managementprotocol, GMPLS.

    Stability is an important feature of any network. An interest-ing question about reconfigurable optical networked systemsarises regarding the stability of optically switched paths. Thatis, if the topology can continuously change to accommodaterandom requests, what service guarantees can the networkmake? Can the fluctuation of the optical layer be detrimentalto IP layer services? Chamania et al. [142] explore this issue indetail, providing an optimal solution for keep quality of serviceguarantees for IP traffic while also improving performancebeyond static optical layer systems.

    Bandwidth-on-demand (BoD) is an exciting application ofreconfigurable networks. Von Lehmen et al. [109] describetheir experience in deploying BoD services on CORONET,DARPA’s WAN backbone. They implement protocols foradd/dropping wavelengths in their WAN with a novel 3-way-handshake protocol. They demonstrate how their system isable to utilize SWAN [138] Traffic Engineering Controller asone such application that benefits from the BoD service.

    More recently, there has been a resurgence academic workhighlighting the potential benefit of dynamic optical paths inthe WAN. One such system, called OWAN (Optical Wide-Area Network) [143], proposes how to use dynamic opticalpaths to improve the delivery time for bulk transfers betweendata centers. They build a testbed network with home-builtROADMs and implement a TE controller to orchestrate bulktransfers between hosts in a mesh optical network of ninenodes. They compare their results with other state-of-the-artTE systems, emphasizing that OWAN delivers more transferson time than any other competing methods.

    Dynamic optical paths increase the complexity of networksand capacity planning tasks because any optical fiber may needto accommodate diverse and variable channels. However, thiscomplexity is rewarded with robustness or tolerance to fiberlink outages. Gossels et al. [144] propose dynamic opticalpaths to make long-haul networks more robust and resilient tonode and link failures by presenting algorithms for allocatingbandwidth on optical paths dynamically in a mesh network.Their objective is to protect networks from any single node orlink failure event. To this end, they present an optimizationframework for network planners, which determines whereto deploy transponders to minimize costs while running anetwork over dynamic optical paths.

    Another effort in reducing the complexity of dynamicoptical path WAN systems was presented by Dukic et al. [11].Their system, Iris, exploits a unique property of regionalconnectivity, i.e., the vast abundance of optical fiber in densemetropolitan areas [145]. They find that the complexity ofmanaging dynamic optical paths is greatly reduced whenswitching at the fiber-strand level versus the (sub-fiber) wave-

  • 15

    BVT Network De-sign

    Amps. Algorithms

    CORONET [109] × × × 3WHSOWAN [122] × × × Simulated Annealing

    FACcBR [104] × × X Case Based ReasoningRADWAN [136] X × × Linear Program

    DDN [137] × X X Time-slotted packet schedulingIris [11] × X X Shortest path for any failure scenario

    TABLE III: Summary of systems implementations of reconfigurable wide area networks

    length level. To this end, they detail their design trade-offspace for inter-data center connectivity across metropolitanareas. They deploy their system in a hardware testbed toemulate connectivity between three data centers, verifying thatoptical switching can be done in 50 to 70 ms over threeamplifiers. They obviate amplifier reconfiguration delays byconducting fiber-level switching rather than wavelength-level.Thus, the amplifiers on a fiber path are configured once forthe channel that traverses it. When a circuit changes its path,away from one data center and towards another, it uses a seriesof amplifiers that have been pre-configured to accommodatethe loss of that given circuit.

    Inter-data center network connectivity over a regional opti-cal backbone was also investigated by Benzaoui et al. [137].Their system, Deterministic Dynamic Network (DDN), im-poses strict constraints for application layer latency and jitter.They show that they can reconfigure optical links in under2 ms, and guarantee consistent latency and jitter through theirtime-slotted scheduling approach.

    4) Open Challenges: Wide-area optical networks are richwith open challenges. The works presented in this sectionhighlight significant developments that have been made to-wards reconfigurable WAN systems and illuminate great ben-efits for such systems. However, programmability, cross-layerinformation sharing, and physical impairment problems mustbe solved. On the programmability front, efforts such asOpenConfig [108], OpenROADM [146], and ONOS [147] areworking to provide white-box system stacks for optical layerequipment. If these are widely adopted and standardized, thiswill open the door for agile and efficient use of wide-areanetworks for a variety of applications (e.g., new tools to com-bat DDoS [148]). Other challenges include wrangling withthe physical constraints of efficient and rapidly reconfigurableWANs, for example, coordination of power adjustments acrossamplifiers for long-haul circuits in conjunction with trafficengineering workloads and constraints.

    V. FUTURE WORK AND APPLICATIONSHaving looked into the theory and practice, my thesis to

    demonstrate the benefits of cross-layer programmability alongfour dimensions: network simulation, network measurement,traffic engineering, and cyber security. The following sub-sections each highlight a potential future application in eachcategory.

    A. Vertically Programmable Network SimulatorRecall the challenge from section III, that there is no

    clear simulation framework for evaluating and constructing

    reconfigurable optical networks. The lack of such a frame-work makes it difficult to compare systems with and withoutdynamic optical components. My future work is to address thischallenge by developing a vertically programmable networksimulator. The purpose of this simulator is to provide ameans to compare optical layer reconfiguration strategies inconjunction with higher layer packet-switching and routingstrategies. I hope to publish this in a simulation venue (e.g.,MASCOTS or SummerSim).

    B. Network Measurement

    In section IV-A, one of the main challenges was thetransparency of the optical-layer network. This transparencyhas two adverse effects. First, it makes active-measurementbased mapping of the physical network impossible (the WSSesand ROADMs are invisible to network measurement). Second,our fiber optical network’s control and management functionsare rendered more complex by lack of a complete mappingbetween the physical and logical (router-level) topology. Myfuture work is to address this challenge by proposing a setof secure enclaves for monitoring optical networks from thenetwork layer. This hypothetical system is an optical-layertraceroute or OLT. OLT’s main idea is to make the opticallayer visible to higher-layer applications (e.g., traffic engi-neering or TE). The critical requirements for secure enclavesare to make the optical layer visible while also preservingthe network’s privacy. We can achieve this through a setof software hooks that expose information about optical lineequipment. Types of information that these hooks could safelyinclude are unique hashes of object IDs and optical layerperformance information (e.g., bit error rate or BER). I hopeto publish this work at a network measurement venue (e.g.,IMC or PAM).

    C. Traffic Engineering (TE)

    Recall the challenges from section IV-C. First, we mustovercome the significant performance penalties for reconfigu-ration imposed by optical equipment (such as amplifiers andtransponders) deployed in today’s WANs. Second, our abilityto dynamically reconfigure optical devices must be utilizedrapidly by optical path protection schemes (e.g., amplifier gaincontrol or AGC, transponder power adjustments) and by TEalgorithms that operate above the physical layer. Third, there isno unified formulation to assess reconfigurable optical networkpaths vs. static allocation or quantify how existing TE schemescan benefit from static backup paths for optimally routingflows through the network.

  • 16

    I’m addressing these challenges in my ongoing and futurework. Regarding the significant performance penalties forlong-haul link changes, I am conducting a study on amplifierreconfiguration times to determine bottlenecks in the processand find methods for removing those bottlenecks. I’m goingto publish this study at the Optical Fiber Conference (OFC).Regarding the timely activation of optical paths for applica-tions such as path-protection schemes, we require a verticallyprogrammable network control system to view and manageseveral aspects of the network (e.g., active wavelength con-nections, traffic demand on the wavelengths, and the physicaltopology components and resources). I’m building a simulatorto accurately model the system’s inputs, the dynamics foroptical layer reconfiguration, and its effect on traffic andconcrete objective functions. This work overlaps with the goalsfor a vertically programmable network simulator from V-A.I hope to publish this at a systems venue (e.g., NSDI orSIGCOMM).

    D. Cybersecurity

    We are now in the era of terabit DDoS attacks. Theimmense attack volumes, attack diversity, sophisticated attackstrategies, and the low cost of launching them make DDoSattacks a critical cybersecurity issue in Internet infrastructurenow and for the foreseeable future. DDoS is not a newproblem, and prior work has made significant progress indevising mitigation strategies to tackle DDoS attacks.