A Tutorial for Key Problems in the Design of Hybrid Hierarchical NoC Architectures with Wireless/RF


    Smart Computing Review, vol. 3, no. 6, December 2013

    This research was supported by the Beijing Municipal Natural Science Foundation (No.4122010, 2012.1 - 2014.12).

    DOI: 10.6029/smartcr.2013.06.004


    A Tutorial for Key Problems in the Design of Hybrid

    Hierarchical NoC

    Architectures with Wireless/RF

    Chunhua Xiao, Zhangqin Huang, and Da Li

    Embedded Software and System Institution, Beijing University of Technology / 100022, Beijing, CHINA /[email protected]

    *Corresponding Author: Chunhua Xiao

    Received August 15, 2013; Revised October 31, 2013; Accepted November 8, 2013; Published December 19, 2013

    Abstract: As processing nodes scale up, it is difficult for traditional electronic networks to supply

    on-chip communication efficiently due to unacceptable latency, plus power and area consumption.

    Alternative interconnects, such as radio frequency interconnect (RF-I) and optical interconnect,

    have been explored as interconnection backbones. Hybrid hierarchical architectures with both

    traditional interconnects and emerging interconnects have been widely adopted to achieve an excellent

    trade-off between latency and power. The hybrid hierarchical architecture with a wireless/RF-I

    backbone is more cost-efficient and feasible than other alternative interconnects because of its

    complementary metal oxide semiconductor compatibility, and has become one of

    the mainstreams of chip multi-processor systems. However, how to efficiently utilize the

    wireless/RF-I backbone is a new challenge for designers. Based on analysis of existing typical

    hybrid hierarchical wireless/RF-I architectures (HHWAs), the key problems in the design of

    HHWAs are proposed here, and related potential solutions are provided. In particular, strategies for

    resource management of wireless/RF-I are explored in detail, and different solutions are discussed.

    This work is expected to serve as a basis for future HHWA designs.

    Keywords: Network-on-chip, radio frequency interconnect, wireless interconnect


    Introduction

    As we enter the era of multiple cores and beyond, the number of cores, coprocessors, and on-chip accelerators grows

    rapidly. The dramatic increase of these processing elements (PEs) imposes a tremendous challenge for on-chip

    communication, which must deliver not only high performance, including lower latency and higher bandwidth, but also

    high performance per unit of energy and area. According to the International Technology Roadmap for Semiconductors (ITRS) [1],

    improving characteristics of metal wires will no longer satisfy performance requirements, and new interconnect paradigms

    are needed. Different revolutionary approaches, such as optical interconnect [2][3], radio frequency interconnect (RF-I)

    [4][5][6], and wireless interconnect with complementary metal oxide semiconductor (CMOS) ultra wide band (UWB)

    technology [7][8], have been explored. But these emerging interconnects have associated antenna and transceiver area,

    extra integrated components and power overheads, and thus need to be placed and used optimally to achieve the best

    performance without undue overhead [9][10]. Although the traditional planar metal interconnects suffer from limitations

    arising from multi-hop communication, which result in high latency and power consumption, they are still highly effective

    and suitable for short distances. The vast improvements in CMOS technology have led to wires with only 0.18 pJ/bit of

    energy consumption at 1 mm for a 32 nm technology design [11]. For these reasons, many

    researchers have adopted hybrid hierarchical wireless/RF-I architectures (HHWAs) to achieve excellent trade-offs between latency

    and power with limited extra cost [12][13][14][15][16]. HHWA is characterized by local traditional wired interconnection

    and global wireless/RF-I interconnection, and provides some unique benefits, including the following: (1) Instead of the multiple

    hops of traditional interconnection, wireless/RF-I needs only one hop for long-distance communication, which alleviates

    power consumption while providing high bandwidth and low latency without excessive overhead. (2) Taking full advantage

    of traditional networks on a chip (NoCs) and emerging interconnects, HHWA employs their respective merits. (3)

    Compared with optical interconnects in hybrid architectures, using wireless/RF-I as a global communication backbone

    attains better feasibility and cost-efficiency due to an advantage in CMOS compatibility.

    Because the architecture combines emerging technologies with traditional interconnects, new design challenges arise that

    might become bottlenecks to performance improvement. This work explores the key problems in HHWA design and provides

    related potential solutions, and is expected to serve as a basis for future HHWA designs. The rest

    of the paper is organized as follows. In Section 2, we provide a brief overview of the new alternative interconnect

    technologies (wireless and RF-I) and how they can be leveraged for on-chip communication. Based on the availability of

    these two interconnect technologies, we discuss the topology of HHWAs and explore the existing typical HHWAs in

    Section 3. Because wireless/RF-I resource management is so important in HHWAs, we survey and analyze

    the resource arbitration mechanisms in depth in Section 4. In Section 5, we summarize the key problems in HHWA design and

    provide related feasible solutions. Finally, we conclude our work in Section 6.

    RF-I/Wireless

    RF-I

    Radio frequency interconnect has been proposed as a high-aggregate bandwidth, low-latency alternative to traditional

    interconnect [4][5][19]. Its benefits have been demonstrated for off-chip, on-board communication, as well as for on-chip

    interconnection networks [20][21][22].

    Unlike conventional metallic wires that require charging and discharging the whole wire to signify either 0 or 1,

    RF-I modulates information on an electromagnetic carrier wave that is continuously sent along the transmission line

    (Figure 1). RF-I has been projected to scale better than traditional RC wires in terms of delay and power consumption; it

    can allow signal transmission across a 400 mm² die in 0.3 ns via propagation at the effective speed of light [5], as opposed

    to up to 4 ns on a repeated bus.

    Instead of trying to aggressively expand baseband bandwidth (which often involves power-hungry compensation

    techniques to achieve a flat channel frequency response), RF-I divides bandwidth into frequency domains, each becoming a

    narrow-band signal, which saves power. By doing this, RF-I also improves bandwidth efficiency by sending many

    simultaneous streams of data over a single transmission line. This particular technique is referred to as multi-band RF-I [6].

    As shown in Figure 2, there are N mixers on the transmitting (Tx) side in multi-band RF-I, where N is the number of

    senders sharing the transmission line. Each mixer up-converts an individual data stream into a specific channel (or frequency

    band). On the receiving (Rx) side, N additional mixers are employed to down-convert each signal back to the original data,

    and N low-pass filters (LPFs) are used to isolate the data from residual high-frequency components. Based on shortcut

    selection, each transmitter or receiver in the topology is tuned to a particular frequency (or disabled entirely) to

    implement the shortcuts [5][6].
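
    The mixer and low-pass filter chain described above can be illustrated with a minimal numerical sketch. It is not the circuit of [5][6]: it assumes an ideal, noiseless transmission line, binary phase modulation (+1/-1), carrier frequencies expressed as whole cycles per bit period, and an averaging step over one bit period standing in for the LPF. The names transmit, receive, carriers, and samples_per_bit are hypothetical.

```python
import math

def transmit(bitstreams, carriers, samples_per_bit):
    """Tx side: one mixer per sender up-converts its bits; the line carries the sum."""
    n_bits = len(bitstreams[0])
    line = []
    for i in range(n_bits * samples_per_bit):
        t = i / samples_per_bit            # time, in units of one bit period
        bit_index = i // samples_per_bit
        sample = 0.0
        for bits, f in zip(bitstreams, carriers):
            symbol = 1.0 if bits[bit_index] else -1.0   # assumed mapping: 1 -> +1, 0 -> -1
            sample += symbol * math.cos(2 * math.pi * f * t)
        line.append(sample)
    return line

def receive(line, carrier, samples_per_bit):
    """Rx side: mix with the chosen carrier, then integrate over each bit (crude LPF)."""
    bits = []
    for start in range(0, len(line), samples_per_bit):
        acc = 0.0
        for i in range(start, start + samples_per_bit):
            t = i / samples_per_bit
            acc += line[i] * math.cos(2 * math.pi * carrier * t)
        bits.append(1 if acc > 0 else 0)
    return bits

# Three senders share one line; carriers are 4, 8, and 12 cycles per bit (illustrative).
streams = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 1]]
carriers = [4, 8, 12]
line = transmit(streams, carriers, samples_per_bit=64)
recovered = [receive(line, f, 64) for f in carriers]
```

    Because the assumed carriers complete a whole number of cycles per bit period, the other bands average to zero at each receiver, so every stream is recovered from the single shared line.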


    [Figure 1 legend: C = core; $ = L2 cache bank; M = off-chip memory controller; routers, RF-I nodes, and the RF-I transmission line are also marked.]

    Figure 1. RF-I transmission line in a chip multiprocessor system

    Figure 2. A ten-carrier RF-I and corresponding waveform at the transmission line

    Wireless

    Unlike RF-I, wireless interconnection does not need the transmission channel to be physically laid out; the

    communication medium is free space [23]. Wireless communication can use different frequency ranges, from several

    gigahertz to thousands of gigahertz [24].

    An on-chip antenna is one of the most important, yet most difficult, components to integrate on-chip

    for HHWAs, because passive devices such as inductors consume the dominant portion of the transceiver area. Fortunately,

    as CMOS technology improves, not only the size but also the cost of the antenna and required circuits will decrease

    dramatically, which provides the feasibility for integrating multiple on-chip antennas [12]. An example of the necessary

    components of wireless transceivers for millimeter wave (mm-wave) links in a chip multiprocessor system is shown in

    Figure 3. A metal zigzag antenna was demonstrated to support wireless network-on-a-chip (WiNoC) [25] and was used to

    design an mm-wave wireless NoC by Deb et al. [26]. As the transmission frequency increased to the terahertz range, carbon

    nanotubes (CNTs) were explored for the on-chip antenna [27], and the feasibility of designing a WiNoC was demonstrated

    by Ganguly et al. [15]. Compared with RF-I, which needs the transmission line to span the entire chip area, communication

    routing is not limited by the physical channel for wireless interconnection. However, wireless interconnection faces

    interference challenges and cost problems, which are proportional to the communication distance.


    [Figure 3 labels: clusters C0 to C3; antenna, switch, driver, amplifier, LNA, modulator, demodulator, carrier frequency, serializer, deserializer; transmitter side (data to be transmitted) and receiver side (data received).]

    Figure 3. An example of mm-wave links in a chip multiprocessor system

    Hybrid Hierarchical Wireless/RF-I Architectures

    Topology

    Topology defines how channels and routers are connected in an interconnection network and determines the performance

    bounds, including zero-load latency and network throughput [17]. As shown in Figure 4, a hybrid hierarchical

    wireless/RF-I network consists of two types of network: a local network, which uses traditional wire interconnects, and a

    global/express network, which uses wireless/RF-I. As in a conventional NoC, there can be various topologies for a local

    network, such as mesh, centralized mesh, ring, star, etc. Each local network forms a subnet and is equipped with a

    wireless/RF-I access point (WAP). As long as the antennas are placed within communication range (or the RF-I is enabled

    between them), only a single hop is needed for inter-subnet communication. All WAPs from all subnets are connected as a

    second-level network forming the global/express network. This upper level of the hierarchy can have various designs with

    different characteristics to achieve the full benefit of on-chip express networks.

    An important problem when creating an efficient global/express network is the placement of WAPs, which greatly

    influences the trade-off between system performance and cost. If each PE is equipped with a WAP (each local network

    consists of only one node) and can communicate with any other node through the express wireless/RF-I, we get the best

    system performance, with low latency and high throughput, but the area cost of the equipment

    (antennas, transceivers, etc.) may be unacceptable. If too many PEs share a WAP, or if the WAP is placed improperly, the performance improvement

    would be offset by the induced overhead. Ganguly et al. introduced small-world theory to create an HHWA, and inserted wireless

    links through a simulated annealing-based algorithm to minimize the average distance (measured by the number of hops)

    between all source and destination hubs [15]. Chang et al. used RF-I as express shortcuts between nodes that communicate

    intensively, as identified by communication profiling of the application, to accelerate and optimize region-to-region

    communication. They placed the RF-enabled routers in a staggered fashion to minimize the distance any given component

    would need to travel to reach the RF-I [16]. Unlike these works, Lee [12] and Di Tomaso et al. [13] placed the

    WAPs at the center of concentrated mesh-based clusters to provide distributed wireless express pathways for inter-cluster,

    long-haul communication to support hundreds of PEs.

    Existing Typical Architectures

    Chang et al. [16] exploited dynamic RF-I bandwidth allocation to realize a reconfigurable hierarchical network-on-a-chip

    architecture. As shown in Figure 5, this architecture uses a mesh topology as the baseline and places adaptive shortcuts as

    an RF-overlaid topology to match the different communication demands of applications. This approach selects shortcuts

    by optimizing a cost equation built from application communication statistics. The selected shortcuts are

    implemented through RF-I-enabled routers (standard routers with an extra port that serves as an RF-I interface). Each transmitter or

    receiver in the topology is tuned to a particular frequency (or disabled entirely) to offer a shortcut. To enable the newly

    available paths (RF-I shortcuts) while limiting the reconfiguration cost, the routing tables in all network routers are

    updated before the application executes. Packets are then transmitted with a shortest-path routing strategy that takes the RF-I shortcuts into account.

    This dynamic allocation approach enables reconfiguring the topology via frequency band reassignment, thereby providing

    the benefits of adaptive routing without having to pay the cost of traversing extra channels [23].
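
    The effect of an RF-I shortcut on shortest-path routing can be sketched as a breadth-first search over the mesh plus the shortcut links. This is an illustration rather than the routing-table generation of [16]: it assumes a 2-D mesh addressed by (x, y), treats every link (wired or RF-I) as one hop, and models each shortcut as a unidirectional (transmitter, receiver) pair. The names mesh_neighbors and hop_count are hypothetical.

```python
from collections import deque

def mesh_neighbors(node, width, height):
    """Wired neighbors of a router in a width x height mesh."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)

def hop_count(src, dst, width, height, shortcuts=()):
    """Minimum hops from src to dst; each shortcut (tx, rx) is assumed to cost one hop."""
    extra = {}
    for tx, rx in shortcuts:
        extra.setdefault(tx, []).append(rx)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in list(mesh_neighbors(node, width, height)) + extra.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # unreachable (cannot happen in a connected mesh)
```

    In an 8 x 8 mesh, for example, hop_count((0, 0), (7, 7), 8, 8) is 14 over wires alone, while adding the single assumed shortcut ((1, 0), (6, 7)) reduces it to 3: one wired hop to the transmitter, the RF-I hop, and one wired hop from the receiver.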


    [Figure 4 labels: a global express network connecting three local networks.]

    Figure 4. Hybrid hierarchical wireless/RF-I network

    [Figure 5 legend: base router, traditional wired link, shortcuts.]

    Figure 5. An example of adaptive RF-I shortcuts in a chip multiprocessor system

    Modern complex network theory provides powerful methods to analyze network topologies. The small-world theory [28]

    is incorporated in HHWA to simultaneously address the latency, power consumption, and interconnect routing problems by

    minimizing the hop counts in inter-core communication; we refer to these architectures as small-world-based

    architectures, as shown in Figure 6 [15][18]. In a small-world-based architecture, the whole system is divided into

    multiple small clusters called subnets, and all PEs within each subnet are connected to a centrally located hub through

    direct links. These hubs are connected to form a second-level hierarchical structure, or global network. Given the number of

    wireless interfaces (WIs), the assignment of WIs to these hubs is optimized through a simulated annealing-based algorithm.

    The routing strategy adopted is a combination of dimension order routing for the hubs without WIs and a south-east routing

    algorithm for the hubs with WIs. For inter-subnet communication, the routing path involving the wireless medium is chosen

    if it reduces the total path length, compared to the wired path [18]. A token flow control strategy is adopted to alleviate the

    potential hotspot problem at WIs, which arises from multiple simultaneous access requests for the wireless links,

    while a separate token-passing protocol is used to avoid interference and contention for the wireless medium by granting it to

    one particular hub at a given instant.

    An example of a two-level WCube structure is shown in Figure 7. WCube is a multi-level, two-dimensional structure that

    interconnects hundreds to thousands of cores in chip multiprocessors [12]. Two types of routers are included in this network:

    base routers that make up the baseline concentrated mesh, and wireless routers with wireless interfaces that form a wireless

    backbone. Each wireless router is responsible for a cluster of n base routers, while each base router serves k PEs because

    a k-way concentrated mesh is adopted. The wireless routers, base routers and PEs are assigned exclusive addresses in

    WCube to identify their exact positions in the network, and the whole architecture can be recursively described. Every

    wireless router is assigned a single, different frequency band and is equipped with one wireless transmitter and multiple

    receivers to allow parallel transmission. WCube uses wormhole-based delivery and latency-oriented routing to minimize

    communication latency. The wireless link is chosen if it clearly reduces latency, compared with using only the

    baseline. WCube offers scalable performance in terms of latency and connectivity, compared with other HHWAs, and the

    architecture has proven cost-efficient with 1024 nodes.

    [Figure 6 legend: wireless links, traditional wired links, hub, processing node, switch.]

    Figure 6. Small-world-based hybrid hierarchical wireless architecture

    [Figure 7 labels: WCube 0, WCube 1, WCube 2; legend: wireless router, base router, core, L2 cache.]

    Figure 7. A two-level WCube structure with a cluster of 16 base routers (i.e., 64 nodes)

    Unlike WCube, which uses a centralized wireless hub for each group of 64 nodes, in the iWISE architecture

    every router has its own transmitter and receiver for each group of routers. As shown in Figure 8, the iWISE architecture

    reduces the hop count by distributing these transceivers at each router, as opposed to the centralized hub found in WCube

    [13]. A token scheme is adopted for the wireless routers to share the limited bandwidth, while frequency division

    multiplexing (FDM) and time division multiplexing (TDM) are used to avoid transmission interference.

    Wireless/RF-I Resource Management

    The wireless/RF access points act as bridges in the hybrid hierarchical wireless/RF-I architecture, connecting

    the local networks and the global network. If multiple packets try to access the same wireless/RF node at

    once, the wireless/RF access points might become overloaded bottlenecks, resulting in higher

    latency, so a reasonable control strategy is needed to alleviate the potential congestion among the multiple wireless/RF

    requests at the access points. Similarly, another arbitration scheme is needed to decide which access point gets access to a

    particular wireless medium (or RF-I channel) in a given period, because all wireless/RF-I access points can tune to the

    same channel and can send data to or receive data from any other wireless/RF-I access point in the network. Therefore, how to

    allocate the wireless/RF resource of a specific wireless/RF access point among multiple transmission requests from

    the PEs (or the base routers in the local network), and how to allocate a specific wireless medium or RF-I channel among

    multiple wireless/RF-I access points in a given period, are two of the important problems in wireless/RF-I resource management.

    The solutions to these two problems explored so far by different research groups can be broadly classified into three classes,

    depending on the specific implementation of the HHWA.

    [Figure 8 labels: sets of routers (Set 0, Set 2, and Set 3 are legible); legend: traditional wired link, wireless link, router, core.]

    Figure 8. An iWISE architecture showing wireless communication between four sets

    One is a fixed static allocation strategy with a coarse-grained arbitration mechanism, which assigns the wireless/RF-I to

    predetermined communication pairs for the entire duration of an application's execution [6][16][12][29]. The chosen pairs

    are allocated a specific wireless link (or RF channel), and each transmitter or receiver in the topology will be tuned to a

    particular frequency; thus the specific bandwidth is exclusive to the transmitter, and contention is avoided [16]. Another

    frequency band is extended to act as a multicast channel, with multiple receivers tuned to that frequency band to receive

    multicast. A certain processing node is chosen as the only transmitter of the multicast channel, and other PEs that want to

    send a multicast should first implicitly send the multicast message via conventional mesh links to the designated transmitter.

    The destination bit vector (DBV) is used to distinguish multicast transmissions from other network communication. To

    improve scalability and connectivity, Lee et al. [12] adopted wireless links instead of RF-I to support thousands of cores. A

    single, distinct frequency band is assigned to every wireless router and is used exclusively for its transmissions. Every

    micro wireless router is equipped with a single transmit antenna and multiple receive antennas, and the receivers are statically

    tuned to the frequency bands of their logical neighbors (whose addresses differ from that router in only one bit) to

    implement parallel transmission without frequency interference. However, this approach does not provide a congestion

    control mechanism to alleviate the potential bottleneck if too many packets try to use the wireless backbone at once.
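
    The static tuning rule described above, where receivers listen on the bands of logical neighbors whose addresses differ in exactly one bit, can be sketched as follows. The sketch assumes that wireless routers are numbered with plain binary addresses and that a band is identified by the address of the router transmitting on it; both are illustrative conventions, not details taken from [12]. The names logical_neighbors and receiver_tuning are hypothetical.

```python
def logical_neighbors(address, address_bits):
    """Routers whose address differs from `address` in exactly one bit."""
    return [address ^ (1 << bit) for bit in range(address_bits)]

def receiver_tuning(address, address_bits):
    """Map each receive antenna index to the band it listens on.

    Assumption: a band is named after the address of the router that
    transmits on it, so receiver i is tuned to logical neighbor i.
    """
    return dict(enumerate(logical_neighbors(address, address_bits)))

# Example: router 0b0101 in a 16-router (4-bit) wireless backbone.
tuning = receiver_tuning(0b0101, 4)
```

    Under these assumptions each router needs as many receivers as it has address bits, and the relation is symmetric: if router a listens to router b, then b also listens to a.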

    Another class adopts a token-based arbitration mechanism [30] to solve access contention for the wireless/RF-I resource

    [13][15][18]. To address contention from multiple wireless requirements to transmit packets through the express pathway, a

    token flow control along with a distributed routing strategy is adopted to alleviate congestion [18]. If taking the wireless

    link for communication reduces the total hop count, and if the token of this wireless link to the destination is available, the

    transmission is allowed. To address contention between wireless routers for a specific wireless medium, a different

    wireless token-passing protocol can be used [18]. The particular wireless router possessing the wireless token can broadcast

    flits into the wireless medium, and the wireless token will be forwarded to the next wireless router after all flits belonging

    to a packet at the current wireless token-holding router are transmitted. Different from other HHWAs that centralize the

    wireless routers, iWISE distributes the transceivers at each router to avoid hotspots and reduce the hop count. In the iWISE

    architecture, a token-based sharing scheme is used to share the limited bandwidth, along with FDM and TDM mechanisms

    to avoid interference. In this token-based arbitration scheme, possession of a token represents the right to transmit on a

    certain frequency to a set [13]. Two different sharing schemes, token-partial and token-full, are explored with different

    workloads, demonstrating how the design of token-based arbitration influences arbitration cost (latency)

    and channel utilization for different traffic patterns, and thus the communication performance.
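
    The wireless token-passing protocol described above, in which the token holder sends all flits of one packet and then forwards the token, can be sketched as a simple schedule generator. The sketch assumes a fixed ring order among the wireless routers and assumes that a router with nothing to send forwards the token immediately; neither detail is specified here for [18]. The name token_passing_schedule is hypothetical.

```python
from collections import deque

def token_passing_schedule(queues):
    """Return the order in which flits appear on the shared wireless medium.

    queues: one list per wireless router; each entry is a packet (a list of flits).
    """
    pending = [deque(q) for q in queues]
    order = []
    holder = 0                                  # router currently holding the token
    while any(pending):
        if pending[holder]:
            packet = pending[holder].popleft()
            for flit in packet:                 # whole packet is sent before release
                order.append((holder, flit))
        holder = (holder + 1) % len(pending)    # forward the token to the next router
    return order

# Router 0 has two packets, router 1 is idle, router 2 has one packet.
schedule = token_passing_schedule([[["a0", "a1"], ["b0"]], [], [["c0", "c1", "c2"]]])
```

    In this example router 0 must wait for a full token rotation before sending its second packet, which illustrates how arbitration latency grows when traffic is concentrated on a few routers.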

    Although the fixed static allocation strategy can choose different shortcuts for different

    applications, the shortcuts cannot be adjusted according to real-time workload requirements. Token-based dynamic

    arbitration allocates the channels in real time to communicating pairs on demand with low arbitration latency, power,

    and hardware cost when traffic is uniform, but it suffers from poor channel utilization and long arbitration latency under non-uniform communication.

    Modern and future CMPs, however, tend not to exhibit this uniformity because of spatial communication heterogeneity.

    Stream arbitration was therefore proposed by Xiao et al. [31] as an efficient dynamic bandwidth utilization scheme that can deal with

    both spatial and temporal communication heterogeneity. Unlike token arbitration, where channels are coupled to receivers,

    a channel in stream arbitration can be used to send packets from any sender to any receiver, which efficiently addresses the

    problem of spatial communication heterogeneity. Since stream arbitration is inherently a dynamic arbitration scheme, it

    also efficiently handles temporal communication heterogeneity. Stream arbitration partitions the aggregate bandwidth into

    arbitration channels and data channels. Active sources (nodes that want to send flits through wireless/RF-I) compete for the

    data channels in the arbitration channel in order to talk to their desired destination nodes. Stream arbitration is a distributed

    mechanism without a centralized arbitrator and is implemented independently and simply. Stream arbitration proved to be

    an efficient scheme for resource arbitration for emerging network technologies, with a case study consisting of a modeled

    RF-I network.

    Key Problems in HHWA Design

    Wireless or RF-I?

    Both wireless and RF-I offer better CMOS compatibility than other technologies, such as optical

    interconnects, and perform well as an expressway for long and critical communication in an HHWA, compared to

    traditional NoCs with only wired interconnects; but each has its own merits and characteristics. When we design an HHWA,

    which emerging interconnect should we choose: wireless, RF-I, or both? As discussed in Section 2, the biggest

    difference between RF-I and wireless is the transmission medium: no channel needs to be physically laid out with

    wireless interconnects, whereas a transmission line (TL) is needed for electromagnetic carrier wave transmission in RF-I.

    So the area cost of RF-I will be a challenge for the design of very large scale integrated circuits since the long TL needs to

    span the whole chip for remote transmission, and the crosstalk (or inter-channel interference) between adjacent TLs may

    also pose problems for long TLs with very high frequencies [12]. Because no physical channel is needed, wireless

    interconnects provide better scalability and connectivity than RF-I. But the on-chip antenna is one of the

    most difficult components to integrate for large CMPs [12][15]. In addition, because of the induced cost, wireless is not as

    efficient for very short distance communication. A comparative analysis of the energy dissipation per bit between wireless

    and wired communication channels was carried out by Chang et al. [18], which showed that mm-wave wireless shortcuts are

    energy-efficient when the link length is at least 7 mm, but inefficient below 7 mm, compared to traditional wired links [24].

    Why not employ their (wireless and RF-I) respective merits and complementary strengths? For mid-sized networks

    within the range of tens to the low hundreds of PEs, we can adopt RF-I, which is more feasible for reducing latency and

    energy consumption. For very large scale networks with thousands of cores, wireless interconnect can be adopted to

    provide better scalability. An alternative approach is a combination of wireless links and RF-I, which uses RF-I to bridge

    the gap between the baseline mesh and wireless interconnect for midrange messages, using wireless interconnects only for

    long-range communication [12]. This hierarchical architecture with three levels provides a better trade-off between cost and

    performance, but the design of relay nodes for inter-level transmission might be a problem, which should be explored in

    depth to minimize the extra cost and potential bottlenecks.

    Placement of the wireless/RF-I access points

    The placement of wireless/RF-I access points is crucial for optimum performance gain because it establishes high-speed,

    low-energy interconnects on the network. The aim is to minimize the number of cycles between distant or critical endpoints

    so as to obtain an architecture design with minimal average latency or hop count. Existing optimization

    techniques, such as evolutionary algorithms (EAs) [32], coevolutionary algorithms [33] and the simulated annealing (SA)

    algorithm [34], provide powerful methods to help with architecture construction. The choice of optimization algorithm is a

    trade-off between solution quality and search speed over a large search space. EAs are generally believed to give better results

    but take longer to run. SA reaches comparably good solutions in acceptable search time [34][18]. No matter which heuristic

    is adopted, a cost metric is needed to evaluate candidate solutions, one that includes the distance (in hops) and the probability of

    communication between sources and destinations. It is a good approach to introduce application communication statistics

    into the cost metric to find the optimum position for the placement of wireless/RF-I access points, so as to accelerate

    communication on paths that are most frequently used by the application [16].
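
    A placement search of this kind can be sketched with simulated annealing over a traffic-weighted hop-count metric. The sketch is a simplified illustration, not the algorithm of [15], [16], or [18]: it assumes hubs on a 2-D grid with Manhattan wired distance, assumes any two access points reach each other in one wireless hop, and uses a geometric cooling schedule with arbitrary parameters. All function and parameter names are hypothetical.

```python
import math
import random

def wired_hops(a, b):
    """Manhattan distance between two hubs on the wired grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def hops(src, dst, waps):
    """Best of the wired path and any path using one wireless hop between two WAPs."""
    best = wired_hops(src, dst)
    for tx in waps:
        for rx in waps:
            if tx != rx:
                best = min(best, wired_hops(src, tx) + 1 + wired_hops(rx, dst))
    return best

def cost(waps, traffic):
    """Cost metric: hop distance weighted by communication probability."""
    return sum(p * hops(s, d, waps) for (s, d), p in traffic.items())

def anneal(hubs, traffic, num_waps, steps=500, t0=1.0, alpha=0.99, seed=0):
    """Move one WAP at a time; accept worse placements with Metropolis probability."""
    rng = random.Random(seed)
    current = rng.sample(hubs, num_waps)
    current_cost = cost(current, traffic)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        candidate = list(current)
        free = [h for h in hubs if h not in candidate]
        candidate[rng.randrange(num_waps)] = rng.choice(free)
        cand_cost = cost(candidate, traffic)
        delta = cand_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= alpha                       # geometric cooling (assumed schedule)
    return best, best_cost
```

    The traffic dictionary maps (source, destination) hub pairs to communication probabilities, so profiled application statistics can be plugged in directly; with uniform probabilities the metric reduces to the average hop count.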

    Routing


    The routing strategy determines the path a packet takes from its source to its destination. Due to the different transmission

    characteristics of RF-I/wireless compared with traditional wired interconnects, and the stringent requirements of the on-chip

    design of a hierarchical architecture, the routing mechanism in an HHWA should be simple and reliable, without incurring

    too much power, area and latency overhead. We divide the routing mechanism into local routing and global routing, according to whether

    wireless/RF-I is used. Local routing depends on the topology of the subnets. For example, if the PEs within a subnet are

    connected in a mesh, then data routing within the subnet follows dimension order routing. Global routing concerns whether

    and how to use the RF/wireless interconnects. Flow control, deadlock avoidance and the RF-I/wireless resource management

    strategy are key problems in global routing design. Kim et al. [23] and Deb et al. [24] analyzed the different strategies

    adopted by existing HHWAs, and provide very good references and guidance for future HHWA designs. A comprehensive

    study quantifying merits and limitations for different strategies and their implementation challenges needs to be carried out,

    with an informative comparative analysis [24].
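    The split between local and global routing can be illustrated with a short sketch. It assumes square mesh subnets
    with one wireless/RF-I hub per subnet; the names xy_route, route and hub_of are hypothetical, and flow control and
    deadlock avoidance are deliberately left out.

```python
def xy_route(src, dst):
    """Dimension-order (XY) routing on a mesh: travel along X first, then along Y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def route(src, dst, subnet_size, hub_of):
    """Local routing inside a subnet, global routing through the hubs otherwise.

    hub_of maps a subnet index (i, j) to the router that hosts its access point.
    The step from the source hub to the destination hub is the wireless/RF-I hop.
    """
    def subnet(node):
        return (node[0] // subnet_size, node[1] // subnet_size)

    if subnet(src) == subnet(dst):
        return xy_route(src, dst)  # local: wired mesh only
    src_hub, dst_hub = hub_of[subnet(src)], hub_of[subnet(dst)]
    return xy_route(src, src_hub) + xy_route(dst_hub, dst)


if __name__ == "__main__":
    # Hypothetical 8x8 mesh split into four 4x4 subnets, hub in each subnet corner.
    hubs = {(i, j): (4 * i, 4 * j) for i in range(2) for j in range(2)}
    print(route((1, 1), (3, 2), 4, hubs))
    print(route((1, 0), (5, 4), 4, hubs))
```

    A real design would also decide whether the shortcut is worth taking for a given packet; this sketch always uses it
    for inter-subnet traffic.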

    Wireless/RF-I resource allocation

    According to the ITRS [36], the unity current gain frequency fT and the unity power gain frequency fmax will be
    600 GHz and 1 THz, respectively, in 16 nm CMOS technology. With the advances in CMOS circuits, tens to hundreds of
    gigahertz of bandwidth will be available in the near future [26][12][15][24]. How to efficiently utilize the available
    bandwidth is one of the important problems in HHWA design. The arbitration mechanisms for wireless/RF-I resource
    contention were discussed in Section 3, which showed that, under non-uniform traffic, sharing the bandwidth among all
    the wireless/RF-I access points with stream arbitration (referred to as a bandwidth sharing scheme) performs better
    than token arbitration with a specific exclusive allocation for every wireless/RF-I access point (referred to as a
    bandwidth distributed scheme). If we partition the aggregate bandwidth into a set of communication channels (aggregate
    bandwidth is calculated as the number of channels multiplied by the bandwidth of each channel), each wireless/RF-I
    access point can only obtain a small proportion of the total bandwidth under the distributed allocation strategy.
    Because every access point occupies a specific channel, this mechanism is very efficient for uniform traffic patterns
    with high access contention. Under a sharing mechanism, all the available bandwidth is a public resource, and only the
    winners of arbitration occupy the channels for a fixed period, so the resource is allocated dynamically on demand with
    better bandwidth utilization.
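    The difference between the two schemes can be made concrete with a small calculation. The sketch below is an idealized
    model, not the arbitration logic of any cited design: it ignores arbitration overhead, assumes one exclusive channel
    per access point in the distributed scheme, and uses hypothetical function names and demand figures.

```python
def aggregate_bandwidth(num_channels, channel_bandwidth):
    """Aggregate bandwidth = number of channels x bandwidth of each channel."""
    return num_channels * channel_bandwidth


def distributed_throughput(aggregate, demands):
    """Each access point owns a fixed, exclusive share, whether it needs it or not."""
    share = aggregate / len(demands)
    return sum(min(d, share) for d in demands)


def shared_throughput(aggregate, demands):
    """All bandwidth is a common pool handed to whichever access points request it."""
    return min(sum(demands), aggregate)


if __name__ == "__main__":
    total = aggregate_bandwidth(8, 32)          # hypothetical: 8 channels of 32 Gbps
    skewed = [128, 64, 0, 0, 0, 0, 0, 0]        # non-uniform demand per access point
    uniform = [32] * 8                          # uniform demand per access point
    for name, demands in (("skewed", skewed), ("uniform", uniform)):
        print(name, distributed_throughput(total, demands), shared_throughput(total, demands))
```

    With the skewed demand, the distributed scheme serves only the fixed share of the two busy access points, while the
    pooled scheme serves all of the requested traffic; with uniform demand the two schemes serve the same amount in this
    model.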

    To further explore the influence of bandwidth allocation, Xiao et al. [31] ran an experiment with a fixed aggregate
    bandwidth, using stream arbitration and a bandwidth sharing scheme. This work varied the number of channels and the
    channel bandwidth while keeping the aggregate bandwidth constant. The simulation results showed that a compromise
    needs to be found between fewer high-bandwidth channels and a larger number of channels. Dynamic bandwidth
    partitioning offers potential for further optimization of bandwidth allocation [31].

    Transmission reliability

    Although wireless/RF-I performs well for long distance transmission with high bandwidth, low latency and low energy
    consumption, the bit-error problem is a challenge to ensuring reliable message transmission. Within the maximum
    communication distance of future CMPs, 1.5 cm, the bit-error rate (BER) of the on-chip wireless channel is less than
    10^-9, which is far higher than that of RC wires. (Current RC wires have an extremely low BER of approximately
    10^-14 [12].) Error control coding (ECC) was explored by Ganguly et al. [37], who showed that the overall reliability
    of a hierarchical wireless NoC with CNT antennas can be improved manifold by implementing joint crosstalk avoidance
    triple error correction and simultaneous quadruple error detection codes [38] in the wireline links and Hamming
    code-based product codes (H-PCs) in the wireless links. However, applying ECC introduces timing and area overhead and
    also incurs a fixed overhead on every packet [12][15]. The WCube work devised a novel and simple loss management
    solution with zero signaling overhead, overhearing-and-retransmission (OAR), which relies on overhearing at
    intermediate hops and uses an on-demand, checksum-based error-detection and retransmission scheme at the last hop [12].
    OAR detects and recovers from packet losses without extra signaling overhead by means of a buffer-based mechanism. The
    packet is verified against its checksum at the destination and is retransmitted if the checksum does not match. This
    solution is simple and incurs less extra cost than ECC, but the forwarding sequence of packets must be preserved to
    ensure correct transmission.
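    To show what these BER figures mean per packet, and how a last-hop checksum check behaves, a minimal sketch follows.
    It is not the OAR implementation from [12]: the overhearing on intermediate hops is omitted, CRC-32 is an assumed
    checksum, the 512-bit packet length is hypothetical, and all function names are illustrative.

```python
import math
import zlib


def packet_error_probability(ber, bits):
    """P(at least one bit error) = 1 - (1 - ber)^bits, evaluated stably for tiny ber."""
    return -math.expm1(bits * math.log1p(-ber))


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload (checksum choice is an assumption)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(packet: bytes) -> bool:
    """Recompute the checksum at the destination and compare it with the received one."""
    payload, tag = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == tag


def send_with_retransmit(payload, channel, max_tries=4):
    """Last-hop check only: resend the packet until the checksum matches."""
    for attempt in range(1, max_tries + 1):
        received = channel(make_packet(payload))
        if verify(received):
            return received[:-4], attempt
    raise RuntimeError("packet not delivered within max_tries")


if __name__ == "__main__":
    bits = 512  # hypothetical packet length
    print("wireless:", packet_error_probability(1e-9, bits))
    print("RC wire: ", packet_error_probability(1e-14, bits))
```

    The channel argument is any callable that takes the transmitted bytes and returns the received bytes, so a test
    harness can inject bit flips to exercise the retransmission path.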

    Scalability

    To target future large-scale CMPs, scalability is one of the most important problems in the design of an on-chip
    hybrid hierarchical architecture. Lee et al. [12] proposed the WCube recursive wireless interconnect structure, which
    offers connectivity to thousands of cores in CMPs. In a case study with a network of 1024 PEs, WCube proved efficient
    and reduced the observed latency by 20% to 45% compared to current 2-D wired mesh designs. Since future communication
    patterns tend towards the non-uniform and heterogeneous, Xiao et al. [31] proposed a cluster-based hierarchical
    architecture that uses a local transmission line (TL) for each core cluster, and a global TL to connect the local TLs.
    A network with 16x16 RF nodes for a 32x32 router NoC (each 2x2 group of routers shares one RF node) proved efficient
    in average network latency and energy consumption with a hierarchical TL architecture and hierarchical stream
    arbitration, compared to an architecture with a single TL spanning the whole chip [31]. A three-level architecture
    with traditional RC interconnects, RF-I and wireless links is also a potential solution for architectural scalability;
    its detailed implementation remains to be proposed in future designs.
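    The address arithmetic behind such a cluster-based hierarchy can be sketched as below. Only the 32x32 router grid
    and the 2x2 sharing of RF nodes come from the text above; the cluster size of 4x4 RF nodes per local TL and all
    function names are assumptions made for illustration.

```python
def rf_node_of(router, group=2):
    """Each group x group block of routers shares one RF node."""
    return (router[0] // group, router[1] // group)


def local_tl_of(rf_node, cluster=4):
    """Hypothetical clustering: each cluster x cluster block of RF nodes shares a local TL."""
    return (rf_node[0] // cluster, rf_node[1] // cluster)


def tl_path(src_router, dst_router):
    """List the transmission lines a message crosses between two routers."""
    s, d = rf_node_of(src_router), rf_node_of(dst_router)
    if s == d:
        return []  # same RF node: the message stays on the wired mesh
    ls, ld = local_tl_of(s), local_tl_of(d)
    if ls == ld:
        return [("local", ls)]  # same cluster: one local TL is enough
    return [("local", ls), ("global", None), ("local", ld)]


if __name__ == "__main__":
    routers = [(x, y) for x in range(32) for y in range(32)]
    print(len({rf_node_of(r) for r in routers}))   # number of RF nodes
    print(tl_path((0, 0), (7, 7)))
    print(tl_path((0, 0), (31, 31)))
```

    Only inter-cluster traffic reaches the global TL in this mapping, which is why a hierarchy can relieve contention
    compared with a single TL shared by every RF node.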

    Conclusion

    Hybrid hierarchical wireless/RF-I architectures are new architectures built from emerging interconnects, and they
    raise new design challenges. Based on an analysis of typical existing HHWAs, we explored strategies for wireless/RF-I
    resource management for the first time and discussed the strengths and weaknesses of the different solutions. We
    examined the key problems in hybrid hierarchical wireless/RF-I architecture design and provided related potential
    solutions, which we expect to serve as a basis for future HHWA designs. The performance benefits of different HHWAs
    still need to be benchmarked quantitatively, and physical implementations need detailed investigation in future work.

    References

    [1] International Technology Roadmap for Semiconductors (ITRS), 2012.

    [2] A. Shacham, K. Bergman, L. P. Carloni, "Photonic networks-on-chip for future generations of chip multiprocessors," IEEE Transactions on Computers, vol. 57, no. 9, pp. 1246-1260, 2008.

    [3] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, J. H. Ahn, "Corona: System Implications of Emerging Nanophotonic Technology," in Proc. of the 35th Annual International Symposium on Computer Architecture (ISCA '08), Washington, DC, USA, pp. 153-164, 2008.

    [4] M. F. Chang, I. Verbauwhede, C. Chien, Z. Xu, J. Kim, J. Ko, Q. Gu, B. Lai, "Advanced RF/baseband interconnect schemes for inter- and intra-ULSI communications," IEEE Transactions on Electron Devices, vol. 52, no. 7, pp. 1271-1285, 2005.

    [5] M. F. Chang, E. Socher, R. Tam, J. Cong, G. Reinman, "RF interconnects for communications on-chip," in Proc. of the 2008 International Symposium on Physical Design (ISPD '08), ACM, New York, NY, pp. 78-83, 2008.

    [6] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher, S.-W. Tam, "CMP Network-on-Chip Overlaid with Multi-Band RF-Interconnect," in Proc. of the IEEE Int'l Symposium on High-Performance Computer Architecture (HPCA), Salt Lake City, UT, pp. 191-202, Feb. 2008.

    [7] D. Zhao, Y. Wang, "SD-MAC: Design and Synthesis of a Hardware-Efficient Collision-Free QoS-Aware MAC Protocol for Wireless Network-on-Chip," IEEE Transactions on Computers, vol. 57, no. 9, pp. 1230-1245, Sep. 2008.

    [8] Y. Wang, D. Zhao, "The Design and Synthesis of a Synchronous and Distributed MAC Protocol for Wireless Network-on-Chip," in Proc. IEEE Int'l Conf. Computer-Aided Design, Nov. 2007.

    [9] S. Deb, K. Chang, et al., "Design of an Efficient NoC Architecture using Millimeter-Wave Wireless Links," in Proc. of 13th Int'l Symposium on Quality Electronic Design, pp. 165-172, Mar. 2012.

    [10] L. P. Carloni, P. Pande, Y. Xie, "Networks-on-chip in emerging interconnect paradigms: Advantages and challenges," in Proc. of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, pp. 93-102, 2009.

    [11] H. S. Wang, X. Zhu, L. S. Peh, S. Malik, "Orion: A power-performance simulator for interconnection networks," in Proc. of the 35th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 294-305, Nov. 2002.

    [12] S. B. Lee et al., "A scalable micro wireless interconnect structure for CMPs," in Proc. ACM Annu. Int. Conf. Mobile Comput. Network. (MobiCom), pp. 20-25, 2009.

    [13] D. D. Tomaso et al., "iWise: Inter-router wireless scalable express channels for Network-on-Chips (NoCs) architecture," in Proc. Annu. Symp. High Performance Interconnects, pp. 11-18, 2011.

    [14] W. J. Dally, "Express cubes: Improving the performance of k-ary n-cube interconnection networks," IEEE Trans. Computers, vol. 40, no. 9, pp. 1016-1023, Sep. 1991.

    [15] A. Ganguly, K. Chang, S. Deb, P. Pande, B. Belzer, C. Teuscher, "Scalable hybrid wireless network-on-chip architectures for multicore systems," IEEE Trans. Computers, vol. 60, no. 10, pp. 1485-1502, Oct. 2011.

    [16] M. F. Chang, J. Cong, A. Kaplan, C. Liu, M. Naik, J. Premkumar, G. Reinman, E. Socher, S.-W. Tam, "Power reduction of CMP communication networks via RF-interconnects," in Proc. of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 41), Washington, DC, USA, pp. 376-387, 2008.

    [17] W. J. Dally, B. Towles, Principles and Practices of Interconnection Networks. Waltham, MA: Morgan Kaufmann, 2004.

    [18] K. Chang, S. Deb, et al., "Performance Evaluation and Design Trade-offs for Wireless Network-on-Chip Architecture," ACM Journal on Emerging Technologies in Computing Systems, vol. 8, no. 8, 2012.

    [19] M. F. Chang, V. P. Roychowdhury, L. Zhang, H. Shin, Y. Qian, "RF/wireless interconnect for inter- and intra-chip communications," Proceedings of the IEEE, vol. 89, no. 4, Apr. 2001.

    [20] J. Ko, J. Kim, Z. Xu, Q. Gu, C. Chien, M. Chang, "An RF/baseband FDMA-interconnect transceiver for reconfigurable multiple access chip-to-chip communication," in Proc. of Dig. Tech. Papers Int. Solid-State Circuits Conf., vol. 1, pp. 338-602, Feb. 2005.

    [21] H. Wu, L. Nan, S.-W. Tam, et al., "A 60GHz on-chip RF-Interconnect with λ/4 coupler for 5Gbps bi-directional communication and multi-drop arbitration," in Proc. of Custom Integrated Circuits Conference (CICC), pp. 1-4, 2012.

    [22] Y. Kim, G.-S. Byun, A. Tang, C.-P. Jou, H.-H. Hsien, G. Reinman, J. Cong, M. F. Chang, "An 8Gb/s/pin 4pJ/b/pin single-t-line dual (Base+RF) band simultaneous bidirectional mobile memory I/O interface," in Proc. of the IEEE International Solid-State Circuits Conference (ISSCC), pp. 50-51, 2012.

    [23] J. Kim, K. Choi, et al., "Exploiting New Interconnect Technologies in On-Chip Communication," IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 2, no. 2, pp. 124-136, June 2012.

    [24] S. Deb, A. Ganguly, P. Pande, D. Heo, B. Belzer, "Wireless NoC as interconnection backbone for multicore chips: Promises and challenges," IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 2, no. 2, pp. 228-239, June 2012.

    [25] J. Lin et al., "Communication using antennas fabricated in silicon integrated circuits," IEEE J. Solid-State Circuits, vol. 42, no. 8, pp. 1678-1687, Aug. 2007.

    [26] S. Deb et al., "Enhancing performance of Network-on-Chip architectures with millimeter-wave wireless interconnects," in Proc. IEEE Int. Conf. ASAP, pp. 73-80, 2010.

    [27] K. Kempa et al., "Carbon nanotubes as optical antennae," Adv. Mater., vol. 19, pp. 421-426, 2007.

    [28] D. J. Watts, S. H. Strogatz, "Collective dynamics of small-world networks," Nature, vol. 393, pp. 440-442, 1998.

    [29] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher, S.-W. Tam, "CMP Network-on-Chip Overlaid with Multi-Band RF-Interconnect," UCLA Computer Science Department Technical Report UCLA/CSD-TR-07-0032, Dec. 2007.

    [30] A. Kumar, L.-S. Peh, N. K. Jha, "Token flow control," in Proc. of the 41st IEEE/ACM International Symposium on Microarchitecture (MICRO '08), pp. 342-353, 2008.

    [31] C. Xiao, M.-C. Frank Chang, J. Cong, M. Gill, Z. Huang, C. Liu, G. Reinman, H. Wu, "Stream Arbitration: Towards Efficient Bandwidth Utilization for Emerging On-Chip Interconnects," ACM Transactions on Architecture and Code Optimization, vol. 9, no. 4, Jan. 2013.

    [32] A. E. Eiben, J. E. Smith, Introduction to Evolutionary Computing, Springer Berlin, 2003.

    [33] M. Sipper, Evolution of Parallel Cellular Machines: The Cellular Programming Approach, Springer Berlin, 1997.

    [34] S. Kirkpatrick, C. D. Gelatt Jr., M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.

    [35] T. Jansen, I. Wegener, "A comparison of simulated annealing with a simple evolutionary algorithm on pseudo-boolean functions of unitation," Theor. Comput. Sci., vol. 386, pp. 73-93, 2007.

    [36] International Technology Roadmap for Semiconductors (ITRS), 2007 edition.

    [37] A. Ganguly et al., "A unified error control coding scheme to enhance the reliability of a hybrid wireless Network-on-Chip," in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst., pp. 277-285, 2011.

    [38] A. Ganguly et al., "Crosstalk-aware channel coding schemes for energy efficient and reliable NoC interconnects," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 11, pp. 1626-1639, Nov. 2009.

    [39] N. Hardavellas, M. Ferdman, B. Falsafi, A. Ailamaki, "Reactive NUCA: near-optimal block placement and replication in distributed caches," in Proc. of the 36th Annual International Symposium on Computer Architecture (ISCA '09), ACM, New York, NY, USA, pp. 184-195, 2009.

    [40] H. Lee, S. Cho, B. R. Childers, "StimulusCache: Boosting Performance of Chip Multiprocessors with Excess Cache," in Proc. of the IEEE Int'l Symposium on High-Performance Computer Architecture (HPCA), Bangalore, India, Jan. 2010.


    Chunhua Xiao received her B.S. in Electronic Information Engineering from Shijiazhuang Tiedao University, Hebei
    Province, China, in 2007, and her M.S. in Computer Science from Beijing University of Technology, Beijing, China, in
    2010. She is currently a PhD student in the Department of Computer Science and Technology, Beijing University of
    Technology. Her research interests include embedded system co-design, multi-processor system-on-chip, and
    Network-on-Chip.

    Zhangqin Huang received his B.S., M.S., and PhD in Computer Science from Xi'an Jiaotong University, China, in 1986,
    1989 and 2000, respectively. He is currently the Deputy Director of the Embedded Software and Systems Institute (ESSI),
    Beijing University of Technology (BJUT), China. His current research interests include co-design for embedded software
    and hardware, human-computer interaction based on the internet, multi-processor system-on-chip, mass data storage, and
    network information security.

    Da Li received his B.S., M.S., and PhD in Computer Science from Xi'an Jiaotong University, China, in 2002, 2006 and
    2012, respectively. He is currently an instructor at the Embedded Software and Systems Institute (ESSI), Beijing
    University of Technology (BJUT). His research interests include embedded FPGA system design and multi-core processors.

    Copyright 2013 KAIS