Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture...

16
Fast Beam Training Architecture for Hybrid mmWave Transceivers Yang, J., Jin, S., Wen, C-K., Yang, X., & Matthaiou, M. (2020). Fast Beam Training Architecture for Hybrid mmWave Transceivers. IEEE Transactions on Vehicular Technology. https://doi.org/10.1109/TVT.2020.2963847 Published in: IEEE Transactions on Vehicular Technology Document Version: Peer reviewed version Queen's University Belfast - Research Portal: Link to publication record in Queen's University Belfast Research Portal Publisher rights © 2019 IEEE. This work is made available online in accordance with the publisher’s policies. Please refer to any applicable terms of use of the publisher. General rights Copyright for the publications made accessible via the Queen's University Belfast Research Portal is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The Research Portal is Queen's institutional repository that provides access to Queen's research output. Every effort has been made to ensure that content in the Research Portal does not infringe any person's rights, or applicable UK laws. If you discover content in the Research Portal that you believe breaches copyright or violates any law, please contact [email protected]. Download date:11. Jun. 2020

Transcript of Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture...

Page 1: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

Fast Beam Training Architecture for Hybrid mmWave Transceivers

Yang, J., Jin, S., Wen, C-K., Yang, X., & Matthaiou, M. (2020). Fast Beam Training Architecture for HybridmmWave Transceivers. IEEE Transactions on Vehicular Technology. https://doi.org/10.1109/TVT.2020.2963847

Published in:IEEE Transactions on Vehicular Technology

Document Version:Peer reviewed version

Queen's University Belfast - Research Portal:Link to publication record in Queen's University Belfast Research Portal

Publisher rights© 2019 IEEE.This work is made available online in accordance with the publisher’s policies. Please refer to any applicable terms of use of the publisher.

General rightsCopyright for the publications made accessible via the Queen's University Belfast Research Portal is retained by the author(s) and / or othercopyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associatedwith these rights.

Take down policyThe Research Portal is Queen's institutional repository that provides access to Queen's research output. Every effort has been made toensure that content in the Research Portal does not infringe any person's rights, or applicable UK laws. If you discover content in theResearch Portal that you believe breaches copyright or violates any law, please contact [email protected].

Download date:11. Jun. 2020

Page 2: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

1

Fast Beam Training Architecture for HybridmmWave Transceivers

Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member, IEEE, Chao-Kai Wen, Member, IEEE,Xi Yang, and Michail Matthaiou, Senoir Member, IEEE

Abstract—Millimeter-wave (mmWave) communications attractconsiderable interest due to the massive available spectrum.However, to establish communication links, a beam trainingprocedure is indispensable. How to accelerate the beam trainingprocess is one of the key challenges towards realizing mmWavecommunications in practice. In this study, we first propose anovel low-cost digital beamforming (DBF) module assisted hybrid(DA-hybrid) architecture, by exploiting both the capabilities ofanalog and digital modules. To make this topology practical, wedeploy coarse radio frequency (RF) chains and low-resolutionanalog-to-digital converters in the low-cost DBF module to reducecost and power consumption. Second, we design a fast beamtraining method (named DAH-BT) by utilizing the proposed DA-hybrid architecture and leveraging the sparse nature of mmWavechannels, in which an internal calibration method is adapted toobtain the parameters of the RF impairments and the orthogonalmatching pursuit algorithm is utilized to estimate beams. Wealso prove that the developed measurement matrices satisfy therestricted isometry property. Extensive simulation results showthat the DA-hybrid architecture can not only provide close to100% beam matching accuracy, but also dramatically reduce thesystem power consumption and cost. In addition, the proposedDAH-BT scheme consumes the shortest time for beam trainingover the state-of-art methods with comparable spectral efficiency.

Index Terms—Beam training, digital beamforming architecture,hybrid architecture, low-resolution ADC, millimeter-wave system,RF impairments.

I. INTRODUCTION

Millimeter-wave (mmWave) frequencies have been identifiedas a catalyst for the next generation wireless communicationsystems [1], since the mmWave band offers extremely largebandwidth and can therefore boost peak data rates. However, thepath loss of mmWave signals is inherently large, and mmWavesignals are sensitive to blockage, hence, large antenna arraysand highly directional transmission should be combined to com-pensate for the severe penetration path losses [2]. Unfortunately,directional communications complicate the link establishment,and the indispensable beam training is time-consuming [3].To support high mobility in mmWave systems, which enableswidespread applications, such as vehicular communications andwireless virtual reality [4], [5], there is an urgent need toovercome a critical challenge in mmWave communications:

Manuscript received August 06, 2019; revised November 14, 2019; acceptedDecember 19, 2019. This work was supported in part by the xxx. The associateeditor coordinating the review of this paper and approving it for publicationwas C. Yuen. (Corresponding author: Shi Jin)

Jie Yang, Shi Jin, and Xi Yang are with the National Mobile CommunicationsResearch Laboratory, Southeast University, Nanjing, 210096, P. R. China(e-mail: yangjie, jinshi, [email protected]). Chao-Kai Wen is with theInstitute of Communications Engineering, National Sun Yat-sen University,Kaohsiung, 804, Taiwan (e-mail: [email protected]). MichailMatthaiou is with the Institute of Electronics, Communications and InformationTechnology (ECIT), Queen’s University Belfast, Belfast, BT3 9DT, U.K.(email: [email protected]).

how to improve the reliability and reduce the latency of beamtraining. In this study, we design a novel system architecture bythe cross-design of analog and digital modules and propose acompatible fast beam training scheme via compressed sensing(CS) tools to precisely address the above mentioned challenge.

We now recall that the practical implementation of mmWavesystems faces several hardware challenges compared to sub-6 GHz communication systems [6]. The fully digital beam-forming (DBF) architecture [7], [8], compared with otherarchitectures, has the highest precoding freedom, flexible multi-beam ability, fast beam steering speed, and high beamformingprecision. However, the comparison result in [8] showed that thetotal cost and power consumption of the fully DBF architectureis more than twice of that of hybrid architectures. The hybridarchitecture utilizes a lens topology [9] or is connected to abank of phase shifters to reduce the number of RF chains [10],so as to maintain the hardware cost and power consumptionat reasonable levels. However, the signal processing becomescomplicated, since it is split between the analog and digitaldomains in the hybrid architecture. In [11], different hybridarchitectures were compared in terms of power consumptionand performance.

Recently, to further reduce the power consumption and com-putational complexity, quantized multiple-input multiple-output(MIMO) systems were proposed in [12] that use low-cost low-resolution analog-to-digital converters (ADCs), e.g., 1-4 bits.Several aspects of quantized MIMO systems have been studied,such as capacity analysis [13], energy efficiency analysis [14],data detection [15], and channel estimation [16]. However, onlyvery few works in the literature have investigated hardwareimperfections in the DBF or hybrid architectures [17], [18],including phase noise, power amplifier nonlinearities, and In-phase and Quadrature-phase (IQ) imbalance [19], [20], whichcan seriously undermine the system performance especiallyin mmWave frequencies. The work in [17] considered theaggregate impact of hardware impairments and modeled it asadditive Gaussian noise. The study in [18] characterized thesuperposition of different RF impairments by a more accurateextended error vector magnitude (EEVM) model. Capitalizingon [18], we will take low-resolution phase shifters, low-bitADCs, and radio frequency (RF) impairments into consider-ation.

mmWave beam training has become a popular research topicin recent years [21]–[30]. The most straightforward methodis the exhaustive search [21], which is prohibitively time-consuming, where the base station (BS) and the mobile station(MS) scan all the angle-of-arrival (AoA) and angle-of-departure(AoD) beam pairs until they find the strongest one. IEEE802.11ad standardized a beam sector based scheme for beam

Page 3: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

2

training [22], whose main idea is to start with sectors of widebeams to do a coarse beam estimation and then shrink thebeamwidth adaptively and successively to obtain a more refinedbeam. Most works considered sequential and single-directionalscanning during the beam training [23]–[26], while, work in[27] leveraged the multi-directional scanning abilities of thehybrid transceivers to accelerate the beam training process. Thedrawback of such methods, however, is the need of successivefeedback between the BS and the MS, which is difficult toachieve at the initial channel acquisition stage [3]. The sparsenature of mmWave channels in the angular domain motivatesthe application of CS-based techniques to expedite the beamtraining process [28]–[30]. However, these solutions mainlyfocus on the channel estimation performance, which inducehigh computational complexity and rely on unrealistic hardwareassumptions. In our study, to accelerate the beam training,eliminate the demand of feedback, and reduce the algorithmiccomplexity, we develop a more flexible beam training method,by embedding the idea of CS into the sector-based beamtraining scheme to change the fixed hierarchical sectors intoadaptive refined beams.

In this study, we investigate the problem of beam trainingacceleration by proposing a novel low-cost DBF module as-sisted hybrid (DA-hybrid) architecture and a compatible beamtraining method (DAH-BT). According to the standards 3GPPTS 38.211 [31] and 3GPP TS 38.213 [32], the beam training isconducted within the periodic half frames, which can occupyhigh proportion of the total transmission process in 5G NR,especially for fast-moving communication scenarios. Hence, itis fundamentally important to greatly reduce the time of beamtraining by adding a low-cost auxiliary module. In contrastto almost all of the approaches in the literature [23]–[27],our design exploits both the abilities of the hybrid module togenerate desired multiple beams and the low-cost DBF moduleto accurately capture the angular information. To make theproposed architecture more practically viable, we take intoaccount the hardware imperfections, including RF impairments,low-bit ADCs, and low-resolution phase shifters, and developrobust algorithms and reduce the hardware cost and powerconsumption. The main contributions of our study are asfollows:

• Hardware Architecture (DA-hybrid): We propose anovel low-cost DBF module assisted hybrid architecture,named DA-hybrid, where we deploy coarse RF chains andlow-resolution ADCs in the low-cost DBF module. Wealso model the RF chain impairments and low-bit quanti-zation by the EEVM model and the additive quantizationnoise model, respectively. We provide a comprehensivecost and power consumption comparison of different archi-tectures with respect to the price and power consumptioninformation of commercial key devices. Through this nu-merical comparison, the DA-hybrid architecture composedof sub-connected hybrid module and low-cost DBF moduleis proven to be a promising solution for a good tradeoffbetween beam training performance and hardware cost.

• Accelerate Beam Training (DAH-BT): We design afast beam training method based on the proposed DA-hybrid architecture, named DAH-BT. We avail of thesparsity of mmWave channels in the angular domain toquickly estimate the AoAs and AoDs of beams via the

adapted orthogonal matching pursuit (OMP) algorithm.Then, we develop an internal calibration method to obtainthe parameters of RF impairments in the EEVM modelbefore beam training, which simplifies the CS-based beamtraining algorithms by avoiding the relatively complex bi-linear techniques. To evaluate the adapted OMP algorithm,we also prove that the designed measurement matricessatisfy the restricted isometry property (RIP).

Extensive simulation results show that the DA-hybrid archi-tecture composed of the sub-connected hybrid module and thelow-cost DBF module can not only achieve close to 100% beammatching accuracy, but also reduce the system power consump-tion and cost. In addition, the proposed DAH-BT shows greatpotential in saving the time resources over traditional methods[25], [27] with comparable spectral efficiency.

The rest of this paper is organized as follows: The systemmodel is presented in Section II. In Section III, we proposethe DA-hybrid architecture, and analyze the cost and powerconsumption. In Section IV, we design a fast beam trainingmethod DAH-BT by utilizing the proposed system architecture.Our simulation results are presented in Section V.We concludethe study in Section VI.

Notations—Throughout this paper, uppercase boldface Aand lowercase boldface a denote matrices and vectors, re-spectively. For any matrix A, the superscripts A∗, AT andAH stand for the conjugate, transpose and conjugate-transpose,respectively. A diagonal matrix is denoted by diag{·} withdiagonal entries given in {·}, and blkdiag{A1,A2, · · · ,Ak}denotes a block-diagonal matrix constructed by A1, A2, andAk. For any vector a, a∗ represents the conjugate, and the 2-norm is denoted by ‖a‖2. The quantitative function is denotedby Q(·) and the vector operator is denoted by vec(·). Inaddition, the Kronecker product is represented by ⊗, whilez ∼ CN (0, σ2) denotes a complex-valued Gaussian randomvariable z with zero mean and variance σ2. For any realnumbers a and b, bac denotes the largest integer no greaterthan a, dae denotes the smallest integer no smaller than a, andmod

{ab

}means the remainder of a being divided by b. For

any complex number c, |c| represents the modulus of c. For anyset S, |S| denotes the number of elements in set S.

II. SYSTEM MODEL

A. Signal Model

We consider a mmWave system in Fig. 1, where the BS andthe MS are equipped with a uniform planar array (UPA) withNt = Naz

t ×Nelt and Nr = Naz

r ×Nelr antennas,1 respectively.

The UPA is placed on the y−z plane. We assume a narrowbandpoint-to-point flat fading channel, and the received signal at theBS antenna array can be expressed as

yUL = HULxUL + nUL, (1)

where HUL ∈ CNt×Nr denotes the uplink channel matrixbetween the MS and the BS. The received signal at the MSantenna array is given by

yDL = HDLxDL + nDL, (2)

1Throughout the paper, the superscripts “az” and “el” will be used to denotethe azimuth and elevation domains, correspondingly. Likewise, the superscripts“UL” and “DL” will denote the uplink and downlink, respectively.

Page 4: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

3

Fig. 1: Block diagram of the proposed DA-hybrid architecture. The BS and MS consist of an antenna array, a low-cost DBF module, a hybrid module, and anelectronic switch.

where HDL ∈ CNr×Nt represents the downlink channel matrix;xUL ∈ CNr×1 and xDL ∈ CNt×1 are the transmitted signal atthe antenna array. Moreover, nUL and nDL are the additive com-plex Gaussian noise vectors. With the assumption of channelreciprocity, we have HUL = (HDL)T .

B. Physical Channel Model

Before proceeding, we first specify the commonly usedmmWave channel model which characterizes the geometricalstructure and limited scattering nature of mmWave channels[2]. The channel model is expressed as

HDL =

√NrNtL

L∑l=1

αlaRx(θazl , θ

ell )aHTx(φ

azl , φ

ell ), (3)

where L is the number of paths, αl is the complex path gain ofthe l-th path, θazl represents the angle between the incident waveof the l-th path and the y axis; θell represents the angle betweenthe incident wave of the l-th path and the z axis; φazl representsthe angle between the transmitted wave of the l-th path andthe y axis; φell represents the angle between the transmittedwave of the l-th path and the z axis; θazl , θell , φazl and φell aremodeled as uniformly distributed variables in [0, π). Moreover,aRx(θ

azl , θ

ell ) ∈ CNr×1 and aTx(φ

azl , φ

ell ) ∈ CNt×1 denote the

receiving and transmitting UPA steering vectors, respectively.The definition of aRx(θ

azl , θ

ell ) is given by

aRx(θazl , θ

ell ) = aRx(θ

azl )⊗ aRx(θ

ell ), (4)

where

aRx(θazl )=

1√Nazr

[1, ej

2πdλ cos(θazl ), . . . , ej(N

azr −1) 2πd

λ cos(θazl )]T,

(5)and

aRx(θell )=

1√Nelr

[1, ej

2πdλ cos(θell ), . . . , ej(N

elr −1) 2πd

λ cos(θell )]T,

(6)where λ is the wavelength of the carrier and d is the distancebetween two neighboring antennas [33]–[35]. Likewise, byreplacing θazl , θell , Naz

r , and Nelr in (4), (5), and (6) with

φazl , φell , Nazt , and Nel

t , we can obtain the expression ofaTx(φ

azl , φ

ell ). Note that, generally, we consider half-wavelength

spaced UPAs in the remainder of the study.

C. Virtual Channel Representation

Unfortunately, it is difficult to analyze and estimate thephysical model HDL in (3) due to the nonlinear relationship

of parameters {αl, θazl , θell , φazl , φell }. However, the physicalchannel model can be well approximated by the virtual channelmodel, which is a linear representation in the angular domain[36]. To harness the tractable virtual channel representation, weassume that the AoAs and AoDs are taken from angular gridsof Gazr , Gelr , Gazt and Gelt points in [0, π), respectively. Thediscrete angles are given by

θazi = arccos(2(i− 1)/Gazr − 1), i = 1, 2, . . . , Gazr , (7)

θeli = arccos(2(i− 1)/Gelr − 1), i = 1, 2, . . . , Gelr , (8)

φazj = arccos(2(j − 1)/Gazt − 1), j = 1, 2, . . . , Gazt , (9)

φelj = arccos(2(j − 1)/Gelt − 1), j = 1, 2, . . . , Gelt . (10)

Let Gr = Gazr × Gelr and Gt = Gazt × Gelt . Then, we definethe receiving and transmitting dictionary matrices Ar and At,with Ar ∈ CNr×Gr being given by

Ar = Aazr ⊗ Ael

r , (11)where

Aazr =

[aRx(θ

az1 ),aRx(θ

az2 ), . . . ,aRx(θ

azGazr

)], (12)

andAelr =

[aRx(θ

el1 ),aRx(θ

el2 ), . . . ,aRx(θ

elGelr

)]. (13)

Similarly, by replacing aRx(θazi ) and aRx(θ

eli ) with aTx(φ

azj ) and

aTx(φelj ) in (11), (12), and (13), we can obtain the expression

of At ∈ CNt×Gt . Therefore, the virtual representation of HDL

is given byHDL

v = ArHaAHt , (14)

where Ha is the virtual channel coefficients matrix (whichwill be defined in Section IV-B). The virtual channel HDL

v

approximates the physical channel HDL, and the approximationerror will decrease as Gr and Gt increase. For a particularsystem architecture, there is a resolvable phase resolution. Inthis study, we assume that the resolution of the phase is π/N ,hence, Gazr , Gelr , Gazt , and Gelt should be no greater than N .The virtual channel effectively captures the underlying sparsemultipath environment comprising L physical paths throughGrGt resolvable paths.

III. HARDWARE ARCHITECTURE OF THE PROPOSEDTRANSCEIVER SYSTEM

In this section, we propose a low-cost DBF module assistedhybrid architecture. In addition, we compare the cost and powerconsumption of the proposed DA-hybrid architecture with thetraditional DBF and hybrid architectures.

Page 5: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

4

(a) Fully-connected hybrid module: each RF chainis connected to all the antennas.

(b) Sub-connected hybrid module: each RF chainis connected to a subset of antennas.

(c) Low-cost DBF module.

Fig. 2: Illustration of different architectures.

A. System Topology and Frame Structure

As was previously mentioned, the fully DBF architectureshave a number of performance advantages, however, the hard-ware cost and power consumption are viewed as their majorconstrains [4]. The hybrid architectures keep the cost and com-plexity to affordable levels by using fewer RF chains comparedto the number of antennas and allow multi-terminal multi-stream precoding [2], while the signal processing becomescomplicated. To leverage the advantages of both the DBF andhybrid architectures, we propose the DA-hybrid architecture, asillustrated in Fig. 1. The DA-hybrid architecture is composedof the following modules: an antenna array, a low-cost DBFmodule, a hybrid module, and an electronic switch.

•The antenna array is a UPA with Nt and Nr antennas inthe BS and MS, respectively. Each antenna array is connectedto both the low-cost DBF module and the hybrid module.

•The hybrid module consists of a digital precoding (com-bining) matrix FBB ∈ CNtRF×Ns (WBB ∈ CNtRF×Ns ) andan analog precoding (combining) matrix FRF ∈ CNt×NtRF(WRF ∈ CNt×NtRF ) at the BS, or a digital precoding (com-bining) matrix FBB ∈ CNrRF×Ns (WBB ∈ CNrRF×Ns ) andan analog precoding (combining) matrix FRF ∈ CNr×NrRF(WRF ∈ CNr×NrRF ) at the MS, where N t

RF and NrRF repre-

sents the number of RF chains at the BS and MS, respectively,and Ns denotes the number of baseband data streams, withNs ≤ N t

RF ≤ Nt and Ns ≤ NrRF ≤ Nr. Each RF chain carries

a pair of amplitude-domain IQ high-resolution ADCs. Fig. 2(a)and Fig. 2(b) show the fully-connected and the sub-connectedhybrid architectures, respectively. The signal transmitted by thehybrid architecture in a time slot is given by

x = FRFFBBs, (15)where s ∈ CNs×1 is the baseband data sequence. Throughingenious design of FRF, FBB and s, the beam can point at aparticular direction, or cover the whole space [27]. Note thatFRF is implemented by analog phase shifters, such that itsentries are of constant modulus. Commercially used RF phaseshifters generally have 6-bit resolution, hence, in this study, weassume that RF phase shifters in the hybrid modules have 6-bitphase resolution [37], without amplitude adjustment. Therefore,the phase resolution parameter N is 26 = 64. According to (7)-(10) in Section II-C, the angular space of beams is meshed,with Gazr , Gelr , Gazt and Gelt no greater than 64, such that

Gr (Gt) beam directions are stored in the beam dictionarymatrix Ar (At). We can design the analog precoding matrixFRF based on beam dictionary matrix (by combining differentcolumn vectors of the beam dictionary matrix). The hybridarchitecture can reduce the number of RF chains, maintain thehardware cost and power consumption at reasonable levels, andguarantee satisfactory transmission performance. However, itsplits the signal processing into the analog domains (FRF andWRF) and digital domains (FBB and WBB), which results inhigh computational complexity. Due to the complicated signalprocessing and the lack of RF chains, beam training has torely on multi-level scanning and feedback, which are timeconsuming exercises.

•The low-cost DBF module consists of Nt and Nr low-cost coarse RF chains in the BS and MS, respectively. EachRF chain carries a pair of amplitude-domain IQ low-resolutionADCs. The block diagram of the low-cost DBF module isdepicted in Fig. 2(c). Note that only the first two antennas havetransmission capability. There is no analog combining matrix inthe low-cost DBF module, therefore, data from all the antennasis retained directly but with a certain degree of resolution loss.The low-cost DBF module and the hybrid module will notwork at the same time slot, hence, the power consumption willnot be doubled. We adopt the EEVM model to capture thecombined impact of all kinds of RF impairments [18], [19].For illustration simplicity, let k denote the k-th RF chain of thereceiver, χ(k) = κ(k)ejϕ(k) denotes the attenuation and phaseshift effects on the k-th RF chain with |χ(k)| 6 1, and nRF(k)is the additive Gaussian noise. Take downlink as an example,the downlink received data impaired by imperfect RF chainscan be expressed as

yDLRF = χ(HDLxDL + nDL) + nRF, (16)

where χ = diag{χ(1), . . . , χ(Nr)}, and nRF = [nRF(1), . . . ,nRF(Nr)]

T . Then, we use the additive quantization noise modelto formulate the coarsely quantized outputs of low-resolutionADCs with ideal automatic gain control [13]. The quantizedsignal is expressed as

rDLADC = Q(yDLRF) = ηyDL

RF + nq, (17)

where Q(·) represents the non-linear quantitative function,which can be modeled as a linear function [38]. Let b denotethe quantization bits of the ADC, then the value of η for b 6 5

Page 6: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

5

Fig. 3: Illustration of the frame structure, which consists of internal calibrationphase, beam training phase, channel estimation phase, data transmission phase,and guard intervals between different phases. The beam training phase consistsof two sub-phases.

can be found in [13]: when b = 1, η = 0.6366; when b = 2,η = 0.8825; when b = 4, η = 0.9905. Note that nq denotes theadditive Gaussian quantization noise vector. Although we applycoarse RF chains and low-bit ADCs to reduce the cost andpower consumption, the low-cost DBF module still managesto retain some advantages of the ideal DBF module, e.g., ahigh enough number of RF chains can make the receivedmeasurements sparser, hence, by using compressed sensingalgorithms, beam directions can be estimated even at lowoperating SNRs.

•The electronic switch is strictly controlled by the framestructure to make the system switch between the hybrid moduleand the low-cost DBF module.

The proposed frame structure is illustrated in Fig. 3: Thefirst phase is the internal calibration phase (introduced in detailin Section IV-A), during which both of the BS and the MScompute the values of the RF impairments’ parameters of theirown low-cost DBF module. The next is the beam training phase(introduced in detail in Section IV-B), which is divided into twophases: in phase 1, the BS switches to the hybrid module toomni-directionally transmit the beam training sequence, and theMS switches to the low-cost DBF module to obtain AoAs bythe standard CS algorithm2. In phase 2, by taking advantage ofchannel reciprocity, the MS switches to the hybrid module tosingle-directionally transmit beam training sequences along theestimated beam directions successively, and the BS switches tothe low-cost DBF module to obtain AoDs by the standard CSalgorithm. Finally, during the channel estimation phase and thedata transmission phase, both of the BS and MS switch to thehybrid module.

B. Cost and Power Consumption Analysis

In this section, we analyze the cost and power consump-tion of the following architectures: the fully-connected hybridarchitecture (A1), the sub-connected hybrid architecture (A2),the fully DBF architecture (A3) [2], and the low-cost DBFarchitecture (A4). The previous work in [11] summarizes thepower consumption of the ADC and the phase shifter, however,these values are relatively low since the considered devices’designs are not commercial products. Here, we assume that thetransceiver operates at the 28-GHz band with a 500-MHz signalbandwidth, and we refer to commercial products in AppliedDynamics International (ADI) company website that meet theabove conditions. With the development of chip technology, theprice and power consumption of products will change, but the

2The low-cost DBF module and the hybrid module share the same antennaarray and beam codebook, hence, the angular information obtained by the low-cost DBF module can be directly used for the hybrid module.

trend we revealed in this study is valuable for reference. Wepursue the analysis on the receiver side as an example, andthe required devices are summarized in Table I. The choiceof the reference value is difficult, since the cost and powerconsumption for ADI devices present high variability, e.g., theprice of one ADC device with around 1Gs/s sampling rateranges from $292.19 to $1326.00, and the power consumptionranges from 0.71W to 5.1W per ADC device. For A1, A2,and A3, which we assume that good device performance isrequired, we choose the average price and power consumptionas the reference value, e.g., $809 and 3W per ADC device.For A4, with coarse RF chains and low-bit ADCs, we choosea relatively cheaper device with lower power consumption asreference, e.g., the No. HMCAD1511 8-bit ADC device costs$300 with 1W power consumption.3 Similarly, we can obtainthe reference cost and power consumption of the low noiseamplifier (LNA), mixer, local oscillator (LO), low pass filter(LPF),4 and base band amplifier (BBA). Since there is nophase shifter in A4, we only choose the average cost andpower consumption as the reference values for the phase shifter,by also noting that the previous work [8] considers similarreference values. According to [8], [11], the electronic switchhas negligible power consumption and cost.

To compare the cost and power consumption of the consid-ered architectures, we calculate the total cost and power con-sumption of each architecture based on Table I with Nr = 64and Nr

RF as a variable. Since we assume that the basebandprocessor consumes almost the same power in different modules[11], we can ignore its impact in our analysis without severelysacrificing the modeling accuracy. The simulation result isshown in Fig. 4: (1) A1 and A3 are much more expensivethan A2 and A4, and with Nr

RF increasing, the cost of A4becomes lower than A2; (2) the power consumption of A3 is thehighest and that of A2 is the lowest among four architectures,nevertheless, the power consumption of A4 will be lowerthan A2 with Nr

RF increasing. (3) the low-cost DBF moduleassisted sub-connected hybrid architecture (A2+A4) has lowercost and power consumption than A1 and A3. Given the aboveanalysis, as an additional module, A4 has relative low costand power consumption. Moreover, compared to the low-costDBF module assisted fully-connected hybrid module, the low-cost DBF module assisted sub-connected hybrid architecturecan further reduce the total cost and power consumption withcomparable beam training performance shown in Section V-A.

IV. ACCELERATING BEAM TRAINING BY THE PROPOSEDTRANSCEIVER SYSTEM

The proposed DA-hybrid transceiver architecture can inprinciple accelerate the beam training process. In this section,we propose a fast beam training method, which only requiresLest + 1 time slots, where Lest is the number of the estimatedpaths. As briefly introduced in Section III-A, there are internalcalibration phase and beam training phase in the frame struc-ture. In the following, we will elaborate on these phases.

3Actually, 4 or less bits ADC devices will have lower cost and powerconsumption [2].

4The power consumption of the LPF is relatively low, which is around0.02W, we directly set the reference value to 0.02W.

Page 7: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

6

TABLE I: Comparison of cost and power consumption for different devices on the receiver sideADI device Reference ADI device Reference Number Number Number Number

Device cost cost power power in in in in($) low/avg ($) (W) low/avg (W) A1 A2 A3 A4

LNA 25.85-223.13 40/125 0.09-0.64 0.09/0.4 Nr Nr Nr Nr

Mixer 25.09-102.91 26/64 0.595-1.442 0.6/2 NrRF Nr

RF Nr Nr

LO 7.63-200.01 8/104 0.03-1.75 0.1/0.9 NrRF Nr

RF Nr Nr

LPF 2.10-17.50 2.5/10 0.02 0.02 NrRF Nr

RF Nr Nr

BBA 1.31-5.65 1.5/3.5 0.02-0.16 0.02/0.09 NrRF Nr

RF Nr Nr

ADC 292.19-1326.00 300/809 0.71-5.1 1/3 NrRF Nr

RF Nr Nr

Phase shifter 170 170 0.4 0.4 NrRFNr Nr - -

1 2 3 4 5 6 7 8 9 10N

RFr

0

5

10

Co

st

(do

llar)

104

A1

A2

A3

A4

A2+A4

1 2 3 4 5 6 7 8 9 10N

RFr

100

200

300

400

Po

we

r co

nsu

mp

tio

n (

W)

A1

A2

A3

A4

A2+A4

Fig. 4: Cost and power consumption of different architectures versus the numberof RF chains Nr

RF .

A. Internal Calibration of RF Impairments’ Parameters

The RF impairments include phase noise, IQ imbalance andnonlinearities. The EEVM model in (16) generally describesthe aggregation of multiple impairments [39]–[41]. In thesubsequent beam training phase, we are going to use the OMPalgorithm, however, χ as a part of measurement matrices isunknown. To fulfill the requirements of the OMP algorithm[42], we need to know the values of χ. Since all RF chainsin the same low-cost DBF module share one clock, the valuesof χ are in fact stable over a long coherence time. Thus, it ispossible to internally calibrate the RF impairments’ parametersby a similar calibration method in [43]. For calibration, thelow-cost DBF module should have transmission capability inat least the first two antennas. Therefore, as shown in Fig. 2(c),the first two antennas carry digital-to-analog converters (DAC).The adapted calibration is summarized in Algorithm 1.

Algorithm 1 Internal CalibrationStep 1: The antenna connected to RF chain 1 sends a trainingsignal s = 1.Step 2: The antennas connected to RF chains numbered(2, . . . , Nr) receive the signal; after the low-resolution ADC,the quantized received signal is b1→i, where 2 6 i 6 Nr.Step 3: The antenna connected to RF chain 2 sends trainingsignal s = 1.Step 4: The antennas connected to RF chains numbered(1, 3, . . . , Nr) receive the signal; after the low-resolution ADC,the quantized received signal is b2→i, where 1 6 i 6 Nr andi 6= 2.Output:χcal = Q(χ(2))diag

{b2→1b1→3

b1→2b2→3, 1, b1→3

b1→2, b1→4

b1→2, . . . ,

b1→Nr

b1→2

}.

To be specific, firstly, the antenna connected to RF chain 1sends a training signal s = 1. Then, the antennas connected

to RF chains numbered (2, . . . , Nr) receive the signal; afterthe low-resolution ADC, the quantized received signal b1→i isgiven by

b1→i = Q (ht1χ(i)) ≈ ht1Q(χ(i)) 2 6 i 6 Nr, (18)where t1 and χ(i) denote the unknown RF chain impairments’coefficient of the transmit antenna 1 and the receive antennai, respectively; h represents the channel between the trans-mission and receiver antennas. Since the distance between thetransmission and receiver antennas is quite close, the effectivesignal-to-noise ratio (SNR) can be high enough. Therefore, itis reasonable to assume that the Gaussian thermal noise isnegligible and the channel between different antennas is thesame, denoted as h in (18).

Next, the antenna connected to RF chain 2 sends a trainingsignal s = 1. Then, the antennas connected to RF chainsnumbered (1, 3, . . . , Nr) receive the signal, similarly, after thelow-resolution ADC, the quantized received signal b2→i is givenby

b2→i ≈ ht2Q(χ(i)) 1 ≤ i ≤ Nr, i 6= 2, (19)where t2 is the unknown RF chain impairments’ coefficient ofthe transmit antenna 2. According to (18) and (19), we canobtain that

Q(χ(i))

Q(χ(2))≈

{b2→1b1→3

b1→2b2→3i = 1,

b1→i

b1→22 < i ≤ Nr.

(20)

Then, we obtain the relative quantized estimation of RF chainimpairments’ parameters as

χcal=Q(χ(2))diag

{b2→1b1→3

b1→2b2→3, 1,

b1→3

b1→2,b1→4

b1→2, . . . ,

b1→Nrb1→2

}.

(21)We can assume that Q(χ(2)) = 1 in this study, since theperformance of AoA/AoD estimation based on CS algorithmswill not deteriorate when each RF chain deviates from the realRF chain impairments by the same multiplicative factor. If moreaccurate results are needed, it is not difficult to measure thevalue of Q(χ(2)) by external measuring facilities.

B. Fast Beam Training Method

In this section, we propose a fast beam training method byutilizing the proposed DA-hybrid architecture, therefore, wename the proposed beam training method DAH-BT. Since theMS can be devices which are equipped with large antennaarrays, such as vehicles, in this study we assume that Nt � Land Nr � L.5 Our design leverages a simple observation: theangular equivalent channel matrix Ha is sparse and there is aone-to-one relationship between AoAs/AoDs and the indexesof entries in Ha.

Definition 1. Let Sd denote the set of dominant entries in Ha,where Sd = {(i, k) : |Ha(i, k)| > ε}, and ε is an appropriately

5To be more practical, in simulations, we consider less RF chains in the MSthan BS.

Page 8: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

7

Ph

ase

1P

ha

se

2

11

L

... ...

Omnidirectional Transmit Omnidirectional Receive Obtain Beams of Arrival

Compressed

Sensing

Obtain Beams of Departure Omnidirectional Receive Single-directional Transmit

Single-directional TransmitOmnidirectional ReceiveObtain Beams of Departure

Compressed

Sensing

Compressed

Sensing

Fig. 5: Illustration of the proposed DAH-BT method, where Lest is the numberof the estimated paths.

chosen threshold. If d = |Sd| � GrGt, we say that Ha iseffectively d-sparse.

The work in [28] deduced the expression of Ha with uniformlinear array (ULA) deployed at both BS and MS. We can inferthat the (i, k)-th entry of Ha for UPA equipped at both BS andMS is given by

Ha(i, k) ≈L∑l=1

αlfNazr

(m(i)

Nazr

−θazl)fNelr

(n(i)

Nelr

−θell)

× f∗Nazr

(p(k)

Nazt

−φazl)f∗Nazr

(q(k)

Nelt

−φell),

(22)

where m(i)∆=⌊

iGelr

⌋+1, n(i)

∆=mod

{iGelr

}, p(k)

∆=⌊kGelt

⌋+1,

q(k)∆= mod

{kGelt

}; fNazr (·), fNelr (·), fNazt (·) and fNelt (·)

are the Dirichlet kernels, where fN (θ) = 1N

∑Ni=1 e

−j2πiθ.According to the summation formula of geometric sequences,fN (θ) = 1

N e−jπ(N+1)θ sin(Nπθ)

sin(πθ) . The property of the sincfunction infers that the indexes of dominant entries in Ha

are corresponding to AoA/AoD pairs, which can be calculatedaccording to (7)-(10), and (22). Then, it is obvious that Ha iseffectively L-sparse due to L paths. Therefore, we can find thematching beams by utilizing standard CS techniques to coarselyreconstruct Ha.

The schematic diagram of the DAH-BT procedure is depictedin Fig. 5. Simply put, the DAH-BT is divided into two phases6:during phase 1, the BS omni-directionally transmits the trainingsignal, and the MS omni-directionally receives the signal, then,via a standard compressed sensing algorithm, the MS estimatesthe AoAs of the beams; during phase 2, by exploiting chan-nel reciprocity, the MS single-directionally transmits trainingsignals along the estimated beam directions successively, andthe BS omni-directionally receives the signal and estimates theAoDs of the beams by a method similar to phase 1. Finally, thebeams are matched. A more detailed introduction of the beamtraining is given as follows.

6Considering that the maximum uplink transmit power is less than that ofdownlink, it is reasonable to design omnidirectional transmission at the BS anddirectional transmission at the MS.

1) DAH-BT Phase 1: The BS omni-directionally sends thetraining signal, as shown in Fig. 5. The transmitted data is givenby

xDL = Pt[1, 0, . . . , 0]T , (23)

where xDL ∈ CNt×1 and Pt is the transmit power, which is setto 1 in this study. The signal received at the MS by the low-costDBF module is given by

yDLRF = χms(H

DLxDL + nDL) + nRF = χmsHDLxDL + nDL

eff, (24)where χms contains the RF chain impairments’ parameters atthe MS, and nDL

eff∆= χmsn

DL +nRF is still the additive Gaussiannoise. Then, the received signal quantized by the low-bit ADCcan be written as

rDLADC=Q(χmsH

DLxDL + nDLeff

)= ηχmsH

DLxDL + nDLe , (25)

where nDLe

∆= ηnDL

eff + nq, and we assume that nDLe ∼

CN (0, σ2dl,eI). Note that a Gaussian distribution is the worst-

case assumption. Here, we define the downlink receive SNR as

SNRDL = 10 log10

(‖ηχmsH

DLxDL‖22/(Nrσ2dl,e)

). (26)

According to (14), by approximately replacing HDL with HDLv =

ArHaAHt , we obtain

rDLADC = ηχmsArHaAHt xDL + nDL

e . (27)

Since√NtGt

At1 = [1, 0, 0, . . . , 0]T , where 1 represents a vectorin which all elements are one, we make xDL =

√NtGt

At1. Then,we have

rDLADC = η

√NtGt

χmsArhDLv + nDL

e , (28)

where hDLv

∆= HaA

Ht At1. Since AH

t At ≈ I, we have hDLv ≈

Ha1. Hence, the row index of hv corresponds to the AoAs ingrid. Denote Φ

∆= η

√NtGt

χmsAr, then, (28) can be simplifiedinto

rDLADC = ΦhDLv + nDL

e . (29)According to the Definition 1 and the property of Ha, hDL

v alsohas a sparse nature. Hence, the equation (29) fits with the CSmodel. Nevertheless, we also need to verify that the derivedmatrix Φ in (29) meets the requirements of the measurementmatrix in CS. To analyze the property of Φ, we should reviewsome concepts of CS. In particular, one fundamental propertyof the measurement matrix that has been very useful in provingthe optimality of CS reconstruction procedures is the RIP [44],[45].Definition 2. Let A be an p×q matrix with p < q. If there existsδk ∈ (0, 1), for all x with ‖x‖0 6 k, the following inequalityholds, we say that the matrix A satisfies the k-order RIP:

(1− δk)‖x‖22 6 ‖Ax‖22 6 (1 + δk)‖x‖22, (30)where ‖ · ‖0 counts the number of nonzero entries of a vector.

Note that RIP can be explained as approximately preservingthe distance of the k-sparse vector pairs. If a matrix meetsthe RIP condition, many algorithms will have great chance torecover a sparse signal from measurement with noise success-fully [28]. A. C. Gilbert et al. have proven that partial Fouriermatrices satisfy the RIP [46]. The partial Fourier matrices areformed by randomly choosing P rows from N × N Fouriermatrices, and normalizing each column. It is obvious that At

and Ar are partial Fourier matrices. Then, we will prove thatthe measurement matrix Φ satisfies the RIP.

Lemma 1. Let Φ = η√NtGt

χmsAr, where the coefficient

Page 9: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

8

η√NtGt

6 1, the entries in χms satisfy 0 < |χms(i)| < 1, andAr is the matrix defined in (11). Then, for an effectively d-sparse vector hv , there exists δd ∈ (0, 1) such that

(1− δd)‖hDLv ‖22 6 ‖ΦhDL

v ‖22 6 (1 + δd)‖hDLv ‖22. (31)

Namely, Φ satisfies the d-order RIP.

Proof. Please refer to Appendix A.

Hence, with the knowledge of Φ and rDLADC, we can estimatehDLv by the standard OMP algorithm. We denote the estimation

of hDLv as hDL

v . There are Lest entries in hDLv with non-vanishing

power, where Lest is the number of estimated paths. To preventsome paths from being missed, we take Lest slightly larger thanL in simulations. Based on the indexes of these Lest entries,we find the azimuth and elevation AoAs of the correspondingpaths from the discrete angles, according to (7), (8), and (22).The set of estimated AoAs is denoted asSAoA=

[(θazAoA1

, θelAoA1),(θazAoA2

, θelAoA2),. . .,(θazAoALest , θ

elAoALest

)].

The process of beam training phase 1 is summarized in Algo-rithm 2.

Algorithm 2 OMP-based DAH-BT Phase 1Step 1: The BS omni-directionally transmits xDL by the hybridmodule.Step 2: The MS omni-directionally receives signal rDLADC =

η√NtGt

χmsArhDLv + nDL

e by the low-cost DBF module.Step 3: OMP-based AoAs estimation.

Input: rDLADC, Φ = η√NtGt

χmsAr and stopping criterionInitialization: r0 = rDLADC, x0 = 0, Λ0 = ∅, t = 0While not converged do

Match: vt = ΦT rt

Identify: Λt+1 = Λt ∪ {argmaxj|vt(j)|}

(where vt(j) is the j-th entry of vt)Update: xt+1= argmin

z:supp(z)⊆Λt‖r−Φz‖2

rt+1 = r−Φxt+1

t = t+ 1end WhileOutput: hDL

v = xt

Step 4: Find the largest Lest entries in hDLv , look for the

corresponding angles according to (7), (8), and (22), and recordthem in set SAoA.

2) DAH-BT Phase 2: After obtaining the set of receivingbeams SAoA, we avail of channel reciprocity, and the MStransmits the training signal across the directions in set SAoAsuccessively in Lest time slots by the hybrid module, as shownin Fig. 5. The reasons for this successive transmission are (1)match the beams one by one without feedback; (2) better focusthe transmission energy of the MS. The transmitted data is givenby

XUL=Pt

[a∗Rx(θ

azAoA1

, θelAoA1),. . .,a∗Rx(θ

azAoALest

, θelAoALest )],

(32)where XUL ∈ CNr×Lest , and Pt is the transmit power, whichis set to 1 in this study. Then, the signal received at the BS bythe low-cost DBF module is given by

YULRF =χbs(H

ULXUL+NUL)+NULRF=χbsH

ULXUL+NULeff, (33)

where χbs contains the RF chain impairments’ parameters atthe BS, and NUL

eff∆= χbsN

UL+NULRF is still the additive Gaussian

noise. Then, the received signal quantized by the low-bit ADCcan be written as

RULADC=Q

(χbsH

ULXUL+NULeff

)=ηχbsH

ULXUL+NULe , (34)

where NULe

∆= ηNUL

eff + Nq, and we assume that the elementsin NUL

e follow Gaussian distribution with zero mean and σ2ul,e

variance. Here, we define the uplink receive SNR asSNRUL

l = 10 log10

(‖ηχbsH

TxULl ‖22/(Ntσ2

ul,e)), (35)

where xULl is the l-th column of XUL with 1 ≤ l ≤ Lest.

According to (14), by replacing HUL approximately to HULv =

A∗tHTa AT

r , we obtainRUL

ADC = ηχbsA∗tH

Ta AT

r XUL + NULe . (36)

According to (32), ATr XUL is a Gr × Lest matrix, and each

column of ATr XUL has approximately a single 1 at the index of

the corresponding AoA. Denote Ha∆= HT

a ATr XUL, and Ha is

a Gt × Lest matrix. Each column of Ha corresponds to AoAsin set SAoA, and the row indexes of Ha correspond to AoDsin grid. Then, we have

RULADC = ηχbsA

∗t Ha + NUL

e . (37)Vectorizing RUL

ADC, we haverULADC = vec(RUL

ADC) = η(I⊗ χbsA∗t )h

ULv + nUL

e , (38)

where hULv

∆= vec(Ha) and nUL

e∆= vec(NUL

e ), Then, denoteΦ

∆= η(I⊗ χbsA

∗t ), and (38) can be simplified to

rULADC = ΦhULv + nUL

e . (39)Similarly, the following lemma shows that the measurementmatrix Φ satisfies the RIP.

Lemma 2. Let Φ = η(I⊗χbsA∗t ), where the coefficient η 6 1,

and the entries in χbs satisfy 0 < |χbs(i)| < 1, and At is thematrix defined in Section II-C. Then for an effectively d-sparsevector hUL

v , there exists δd ∈ (0, 1) such that

(1− δd)‖hULv ‖22 6 ‖ΦhUL

v ‖22 6 (1 + δd)‖hULv ‖22. (40)

Namely, Φ satisfies the d-order RIP.

Proof. Please refer to Appendix B.

Therefore, hULv can be estimated by the OMP algorithm,

and the estimate of hULv is denoted as hUL

v . Then, we reshapehULv ∈ CGtLest×1 into Ha ∈ CGt×Lest . The index of the largest

entry in the l-th column of Ha represents the AoD of the l-thpath. According to (9), (10), and (22), we can obtain the set ofestimated AoDs denoted asSAoD=

[(φazAoD1

,φelAoD1),(φazAoD2

,φelAoD2),. . .,(φazAoDLest,φ

elAoDLest

)].

To recapitulate, the process of beam training phase 2 is summa-rized in Algorithm 3. After phase 1 and phase 2 of the DAH-BT,we finally identify the pairs of transmitting and receiving beamsrecorded in SAoD and SAoA, respectively.

V. NUMERICAL RESULTS

In this section, we carry out simulations to investigate thefeasibility of the proposed DA-hybrid architecture and theperformance of the proposed DAH-BT method. In all of thesimulations, Nt, Nr, Gazt , Gelt , Gazr and Gelr are set to 64.7

Other parameters will be defined with each simulation. Finally,

7The phase resolution parameter N is 26 = 64, due to the 6-bit resolutionof the RF phase shifters. Hence, Gaz

r , Gelr , Gaz

t , and Gelt should be no greater

than 64.

Page 10: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

9

Algorithm 3 OMP-based DAH-BT Phase 2Step 1: The MS transmits xUL

l = a∗Rx(θazAoAl

, θelAoAl), wherel = 1, . . . , Lest, successively by the hybrid module.Step 2: The BS omni-directionally receives signal rULADC = η(I⊗χbsA

∗t )h

ULv + nUL

e by the low-cost DBF module.Step 3: OMP-based AoDs estimation is similar to Step 3 inAlgorithm 2, where Input is rULADC and Φ = η(I⊗χbsA

∗t ), and

Output is hULv .

Step 4: Reshape hULv ∈ CGtLest×1 into Ha ∈ CGt×Lest , the

index of the largest entry in the l-th column of Ha representsthe AoD of the l-th path, then, record AoDs in set SAoD.

-70 -60 -50 -40 -30 -20 -10 0SNR (dB)

0

0.5

1

Be

am

Ma

tch

ing

Accu

racy

in P

ha

se

1

A3: ideal hardware

A4: 1-bit ADC

A2

A1

-70 -60 -50 -40 -30 -20 -10 0SNR (dB)

0

0.5

1

Be

am

Ma

tch

ing

Accu

racy

in P

ha

se

2 T: A2; R: A3 ideal hardware

T: A2; R: A4 1-bit ADC

T: A2; R: A2

T: A2; R: A1

T: A1; R: A3 ideal hardware

T: A1; R: A4 1-bit ADC

T: A1; R: A2

T: A1; R: A1

Fig. 6: Comparison of the beam matching accuracy of the proposed beamtraining method in different architectures with L = 1. The receive SNR ofthe upper figure is defined in (26), and the receive SNR of the lower figure isdefined in (35).

all the numerical results provided in this section are obtainedfrom Monte Carlo simulations with 1000 independent channelrealizations for each BS-MS configuration.

-70 -60 -50 -40 -30 -20 -10 0SNR (dB)

0.4

0.6

0.8

1

Be

am

Ma

tch

ing

Accu

racy

in P

ha

se

1

L=1; A4: 1-bit ADC

L=1; A4: 2-bit ADC

L=1; A4: 4-bit ADC

L=1; A4: infinite-bit ADC

L=1; A3: ideal hardware

L=3; A4: 1-bit ADC

L=3; A4: 2-bit ADC

L=3; A4: 4-bit ADC

L=3; A4: infinite-bit ADC

L=3; A3: ideal hardware

-70 -60 -50 -40 -30 -20 -10 0SNR (dB)

0.4

0.6

0.8

1

Be

am

Ma

tch

ing

Accu

racy

in P

ha

se

2

T: A2; R: A4 1-bit ADC

T: A2; R: A4 2-bit ADC

T: A2; R: A4 4-bit ADC

T: A2; R: A4 infinite-bit ADC

T: A2; R: A3 ideal hardware

L=3

L=1

Fig. 7: Beam matching accuracy of the proposed beam training method withdifferent ADC bits and L. The receive SNR of the upper figure is defined in(26), and the receive SNR of the lower figure is defined in (35).

A. Beam Matching Accuracy

In the first set of simulations, we aim at comparing the beamtraining performance of the proposed DA-hybrid architecturewith the traditional DBF and hybrid architectures [2], andanalyzing the impact of replacing the fully-connected hybridmodule with sub-connected hybrid module in the proposed DA-hybrid architecture. In addition, we further study the effect ofincreasing the number of dominant paths L on the proposedDAH-BT method. Since the angles are meshed, according to(7)-(10), the index of the grid point corresponds to the angleone by one. In our algorithm, we estimate the index of the gridpoint that is closest to the real angle. The indexes of the realangle can also be obtained, which may not be integers, by theinverse operation of (7)-(10). When the difference between theestimated indexes and the true indexes is smaller than a giventhreshold, we determine that the beam is correctly matched.In this section, the threshold is set to 5. We define the beammatching accuracy as Ta/T , which signifies that the beams arecorrectly matched Ta times out of T Monte Carlo simulations.

Fig. 6 shows the beam matching accuracy for different archi-tectures. During the simulation, the path number L is set to 1.For beam training phase 1, since the signal is omni-directionallytransmitted, for different transmitter architectures, the trans-mitted signals have the same expression xDL = [1, 0, . . . , 0]T .Hence, we do not have to consider different transmitter archi-tectures during phase 1. We compare the performance of fourreceiver architectures by setting Nr

RF for A1 and A2 to 8 andLr for A2 to 8. For beam training phase 2, beamforming isconsidered and different transmitter architectures have differentbeamforming performance. Obviously, the fully DBF architec-ture A3 has the highest precoding freedom and beamformingprecision among four architectures mentioned in this study,while the low-cost DBF architecture A4 with imperfect RFchains and low-bit ADCs cannot be used as transmitter, sincethe RF impairments’ parameters of the transmitter are unknownto the receiver. Therefore, we only compare two transmitterarchitectures, the fully-connected hybrid architecture A1 andthe sub-connected hybrid architecture A2. Considering that thenumber of RF chains at the MS is smaller than at the BS,we make the following settings: at the transmitters of the MS,the number of RF chains for A1 and A2 is set to 4, henceeach subarray in A2 has 16 antennas; while at the receiversof the BS, the number of RF chains for A1 and A2 is setto 8, and therefore each subarray in A2 has 8 antennas. Atthe receivers, to fit in the CS framework, the WRF in A1 isa random matrix with elements consisting of 1/

√N tRF and

−1/√N tRF , while WRF in A2 is a block diagonal matrix,

where WRF = blkdiag{w1, . . . ,wi, . . . ,wNrRF} and wi is

a random vector with elements consisting of 1/√Lt and

−1/√Lt. Actually, more suitable measurement matrices and

more efficient CS algorithms can be selected to further improvethe performance of A1 and A2. However, this goes outsidethe scope of this study. In this section, we aim at qualitativelycomparing the performance of different architectures.TABLE II: Beam matching accuracy comparison of different receiver architec-tures in phase 1.

Receiver BMA in Phase 1A1 0.771A2 0.829A3 0.995

A4: 1-bit ADC 0.994

Page 11: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

10TABLE III: Beam matching accuracy comparison of different architectures inphase 2.

Transmitter Receiver BMA in Phase 2A1 A1 0.777A1 A2 0.819A1 A3 0.995A1 A4: 1-bit ADC 0.993A2 A1 0.762A2 A2 0.808A2 A3 0.995A2 A4: 1-bit ADC 0.993

There are several interesting findings shown in Fig. 6: both inphase 1 and 2, as the SNR increases, A4 as a receiver graduallyreaches the performance of A3, which achieves close to 100%beam matching accuracy, while A1 and A2 are far from suchprecision, because A4 can get more information despite of theRF impairments and low ADC bits, making the measurementsmore sparse than A1 and A2, in which the measurements arecompressed by the combining matrix; in phase 2, althoughA2 is less flexible than A1, A2 as a transmitter can achieveperformance close to A1, because in phase 2, only single beamtransmission is needed. More specifically, we list the beammatching accuracy for different architectures in Table II andIII, the values of which are obtained in Fig. 6 with receiveSNRDL = −40 dB for phase 1 and SNRUL = −40 dB for phase2, respectively. In terms of beam matching accuracy, A4 as areceiver has performance close to A3 from Table II, and A2 as atransmitter has performance close to A1 from Table III. Hence,we can conclude that the DA-hybrid architecture composedof the sub-connected hybrid (A2) module and the low-costDBF (A4) module can not only achieve high beam matchingaccuracy, but also reduce the system power consumption andcost, which confirms the feasibility of the proposed DA-hybridarchitecture.

Fig. 7 further displays the beam matching accuracy for A4with different ADC bits and L. For beam training phase 1, as thenumber of L increases, the beam matching accuracy decreasesaccordingly, due to the reduced sparsity. By comparing the reddotted line “A3”, which represents the fully DBF architecturewith ideal hardware, with solid lines, which represent imperfectRF chains and low-bit ADCs, we find that the performanceloss caused by the aggregate impact of residual hardwareimpairments increases when L = 3. However, with the module“A4: 4-bit ADC”, our proposed algorithm can gradually getclose to the performance of ideal hardware module A3, withina certain angle error of beams set by the threshold, as the receiveSNR increases. During beam training phase 2, since beams aresent and received one by one, the beam matching accuracy forbeam training phase 2 will not be affected by the increase ofL, but the beam training time grows linearly with L.

As mentioned before, the beam matching accuracy is cal-culated by setting a particular threshold. For a more intuitiveunderstanding of the beam matching accuracy, we compare theestimated azimuth and elevation AoA indexes with the trueindexes in Fig. 8, where we take DAH-BT phase 1 for exampleby setting L to 3 and SNRDL in (26) to −30 dB. As shownin Fig. 8, for A4 with 1-bit ADC, the average deviation ofthe estimated index in both azimuth and elevation is around 2;for A4 with 4-bit ADC, the average deviation of the estimatedindex is further reduced to less than 1; when the ADC in A4 isinfinite-bit, the accuracy reaches the level of A3. Particularly,from Fig. 8 for A4 with 4-bit ADC, we can see that Lest ≥ L,and the estimated paths cluster around the true values. Actually,

0 8 16 24 32 40 48 56 64

Azimuth angle of arrival index

0

8

16

24

32

40

48

56

64

Ele

va

tio

n a

ng

le o

f a

rriv

al in

de

x

(A4: 1-bit ADC)

0 8 16 24 32 40 48 56 64

Azimuth angle of arrival index

0

8

16

24

32

40

48

56

64

Ele

va

tio

n a

ng

le o

f a

rriv

al in

de

x (A4: 2-bit ADC)

0 8 16 24 32 40 48 56 64

Azimuth angle of arrival index

0

8

16

24

32

40

48

56

64

Ele

va

tio

n a

ng

le o

f a

rriv

al in

de

x (A4: 4-bit ADC)

0 8 16 24 32 40 48 56 64

Azimuth angle of arrival index

0

8

16

24

32

40

48

56

64

Ele

va

tio

n a

ng

le o

f a

rriv

al in

de

x (A4: infinite-bit ADC)

0 8 16 24 32 40 48 56 64

Azimuth angle of arrival index

0

8

16

24

32

40

48

56

64

Ele

va

tio

n a

ng

le o

f a

rriv

al in

de

x

(A3)

Truth

Estimate

Fig. 8: Comparison of the estimated azimuth and elevation AoA indexes withtrue values in phase 1 with L = 3 and SNRDL = −30 dB.

10 20 30 40 50 60 70

d (m)

0.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

Be

am

Ma

tch

ing

Accu

racy in

Ph

ase

1

A4: 1-bit ADC

A4: 2-bit ADC

A4: 4-bit ADC

A4: infinite-bit ADC

A3: ideal hardware

Fig. 9: Beam matching accuracy of the proposed beam training method withdifferent BS-MS distance with L = 1.

in the case where the beam matching accuracy requirement isslightly lower, A4 with 1-bit ADC can be used.

Finally, we analyze the impact of BS-MS distance on beammatching accuracy. In the simulations, we model the path gainof the l-th path as αl ∼ CN (0, γ10−0.1PL) [27]. According to[47], for omnidirectional path loss, PL = α+10β log10(d)+ξdB, where ξ ∼ N (0, σ2) denotes lognormal shadowing, γ =Urτ−1100.1Z is the cluster power fraction with Z ∼ N (0, ς2)and U ∼ U [0, 1], and d represents the BS-MS distance inmeters. The system is assumed to operate at 28 GHz carrierfrequency with 500 MHz bandwidth, hence, the referencevalues of α, β, σ, rτ , and ς can be found in [47, Table I], whererτ = 2.8 and ς = 4.0; for the LoS path, α = 61.4, β = 2, andσ = 5.8 dB; for the NLoS path, α = 72.0, β = 2.92, andσ = 8.7 dB. We assume that the transmit power at the BS isset to 33 dBm, the noise power at the MS is set to -83 dBm, thethreshold is set to 5. The simulation results are shown in Fig. 9and Fig. 10, where the beam matching accuracy decreases whenthe BS-MS distance d increases. Since the BS transmit poweris fixed, the received energy of the signal decreases with dincreases, which causes this performance degradation. However,

Page 12: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

11

10 20 30 40 50 60 70

d (m)

0.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

Be

am

Ma

tch

ing

Accu

racy in

Ph

ase

1

A4: 1-bit ADC

A4: 2-bit ADC

A4: 4-bit ADC

A4: infinite-bit ADC

A3: ideal hardware

Fig. 10: Beam matching accuracy of the proposed beam training method withdifferent BS-MS distance with L = 3.

under the given simulation parameters, with the module “A4:4-bit ADC”, the beam matching accuracy at beam trainingphase 1 (omnidirectional transmission) can reach 80% or evenmore within 50 (m) when L = 1 and within 45 (m) whenL = 3. Since mmWave frequency band is used for short-distance transmission, the simulation results have confirmed thefeasibility of the proposed beam training algorithm.

B. Outage Probability

After beam training, the BS and MS use the estimated beamdirections to build their precoder PDATA ∈ CNt×Lest (at the BS)and combiner CDATA ∈ CNr×Lest (at the MS). To further an-alyze the performance of the proposed DA-hybrid architectureand DAH-BT method, we quantify the quality of the precoderand combiner by considering the spectral efficiency [23]:

R = log2

∣∣∣∣ILest +P

LestR−1n CH

DATAHDLPDATAP

HDATAH

DLHCDATA

∣∣∣∣ ,(41)

where Lest is the number of estimated main paths via beamtraining, P is the transmit power, Rn is the post-processingnoise covariance matrix, i.e., Rn = σ2

dlCHDATACDATA, and σ2

dl

is the average downlink noise power. Theoretically, R in (41)reaches the maximum value when PDATA and CDATA equal tothe right and left sigular vectors of the channel matrix HDL,respectively. Here, we define the downlink transmit SNR =10 log10(P/σ2

dl). Eq. (41) can be used as a valid indicator ofhow good the estimated beam steering directions are comparedto the optimal ones.

To analyze the spectral efficiency of the proposed method,we deploy A1 as transceivers in both BS and MS duringthe data transmission phase. Note that when L = 1 (onlyone beam is sent at a time), the beamforming and combin-ing vector are obtained by PDATA = aTx(φ

azAoD, φ

elAoD) and

CDATA = aRx(θazAoA, θ

elAoA), respectively, where (θazAoA, θ

elAoA)

and (φazAoD, φelAoD) are defined in sets SAoA and SAoD with

Lest = 1, respectively. When L > 1, multiple beams will betransmitted at the same time, the precoder and combiner are

-20 -15 -10 -5 0 5

SNR (dB)

0

5

10

15

20

25

30

35

Ra

te (

bp

s/H

z)

L=1 DAH-BT (T: A1; R: A4 4-bit ADC)

L=1 Perfect (SVD)

L=3 DAH-BT (T: A1; R: A4 4-bit ADC)

L=3 Perfect (SVD)

Fig. 11: Spectral efficiency comparison between SVD-based precoding/com-bining with perfect channel knowledge and the proposed method when L = 1and L = 3.

given by

PDATA=blkdiag{aTx(φ

azAoD1

,φelAoD1),. . .,aTx(φ

azAoDLest

,φelAoDLest)},

(42)and

CDATA=blkdiag{aRx(θ

azAoA1

,θelAoA1),. . .,aRx(θ

azAoALest

, θelAoALest)},

(43)respectively. Fig. 11 shows that the spectral efficiency ofproposed method is slightly worse than the optimal one whenL = 1 and L = 3., since the estimated beam directions aretaken from limited angular grids.

For further analysis, we define the outage as the event thatthe spectral efficiency delivered to the MS is below a targetvalue Rth. In Fig. 12, we plot the measured outage probabilityfor Rth = 0.1 bps/Hz as a function of the downlink transmitSNR. Obviously, when L = 1 (only one beam is sent at a time)A2 has the same beamforming and combining performance asA1. Therefore, the outage probability of the proposed methodis 0, which can be inferred from Fig. 11. As expected, whenL = 3, the outage probability increases, since the beamformingperformance of the sub-connected hybrid module deteriorateswhen multiple beams are transmitted. The sequential adaptivebeam training (SA-BT) was proposed in [23], [25] and paralleladaptive beam training (PA-BT) was proposed in [27]. Whenthe downlink transmit SNR exceeds −10 dB, the performanceof the proposed DAH-BT method exceeds that of SA-BT andPA-BT.

C. Time Resources

To provide further insights into the time resources required bythe proposed DAH-BT, let us briefly review the procedure of theSA-BT and PA-BT shown in Fig. 13 and Fig. 14, respectively.In SA-BT, the BS and MS can respectively transmit and receivetraining signals through only one sector at each time slot.At each stage, the BS (MS) divides its angular domain intoKBS(KMS) sectors, hence KBSKMS time slots are needed totrain all the possible transmit/receive sector pairs, besides oneadditional slot is required for feedback. Then, the number of

Page 13: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

12

-30 -25 -20 -15 -10 -5 0 5

SNR (dB)

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Ou

tag

e p

rob

ab

ility

L=1 SA-BT (D=1) [25]

L=1 PA-BT (D=2) [27]

L=3 DAH-BT (T: A2; R: A4 4-bit ADC)

L=1 DAH-BT (T: A2; R: A4 4-bit ADC)

Fig. 12: Outage probability referred to a target value Rth = 0.1 bps/Hz versustransmit SNR for different beam training methods.beam training stages required to estimate one AoA/AoD pairwith angular resolution 2π/Na is

S = max[logKBS Na, logKMS Na

]. (44)

Therefore, the total beam training time for SA-BT isLestS(KBSKMS + 1)Tslot, where Tslot denotes one timeslot duration. In PA-BT, the beam training time is furthershortened, since the BS (MS) can transmit the training signalover DBS(DMS) sectors simultaneously, then the total beamtraining time for PA-BT is LestS

(⌈KBSDBS

⌉ ⌈KMSDMS

⌉+ 1)Tslot.

However, when Na is large, the time overhead of the SA-BT and PA-BT will become prohibitive. The time overheadof our proposed method DAH-BT is (Lest + 1)Tslot, which isobviously small in mmWave communication since Lest is small.It is easy to observe from Fig. 15 that the proposed DAH-BThas great potential in saving time resources.

1 2 3 4 5L

est

0

20

40

60

80

100

120

140

Nu

mb

er

of

Tslo

t

SA-BT [25]

PA-BT [27]

Proposed

Fig. 15: Beam training time versus Lest for different methods with Na = 32and KBS = KMS = DBS = DMS = 2.

VI. CONCLUSION

In this study, we researched the problem of beam trainingacceleration by proposing a novel DA-hybrid architecture anda compatible DAH-BT method. Specifically, we first designed a

low-cost DBF module assisted hybrid architecture, where coarseRF chains and low-resolution ADCs were considered in thelow-cost DBF module. We also modeled the RF impairmentsand low-bit quantization by the EEVM model and the addi-tive quantization noise model, respectively. Through numericalcomparison of the cost and power consumption, the DA-hybrid architecture composed of sub-connected hybrid moduleand low-cost DBF module was proven to be an alternativesolution for a good tradeoff between beam training performanceand hardware cost. Additionally, by leveraging the sparsity ofmmWave channels in the angular domain, we developed a fastDAH-BT method based on the proposed DA-hybrid architec-ture, which included the internal calibration phase to acquirethe parameters of RF impairments and the beam training phaseto obtain AoAs and AoDs of beams via the OMP algorithm.We also evaluated the algorithm by proving that the designedmeasurement matrices satisfy the RIP. In contrast to almost allof the approaches in the literature, our design exploited boththe abilities of the hybrid module to generate desired multiplebeams and the low-cost DBF module to accurately capture theangular information. Finally, simulation results revealed that theDA-hybrid architecture composed of the sub-connected hybridmodule and the low-cost DBF module can not only approachclose to 100% beam matching accuracy, but also reduce thesystem power consumption and cost. Moreover, the proposedDAH-BT (which requires only Lest+1 time slots) showed greatadvantage in saving time resources over traditional methodswith comparable spectral efficiency.

APPENDIX APROOF OF LEMMA 1

Denote a = η√NtGt

and |a| < 1, then Φ = aχmsAr, andΦhDL

v = aχmsArhDLv . By denoting Arh

DLv = [g1, g2, . . . , gNr ]

T ,we have

‖ΦhDLv ‖22 =

Nr∑i=1

|aχms(i)gi|2 = a2Nr∑i=1

|χms(i)|2|gi|2, (45)

since |χms(i)|2 < 1, then

a2Nr∑i=1

|χms(i)|2|gi|2 6 a2Nr∑i=1

|gi|2 = a2‖ArhDLv ‖22. (46)

As known that Ar satisfies the d-th order RIP, there existsδ+ ∈ (0, 1) such that

a2‖ArhDLv ‖226 a2(1 + δ+)‖hDL

v ‖22. (47)

Since a2 < 1, we have

a2(1 + δ+)‖hDLv ‖22 6 (1 + δ+)‖hDL

v ‖22. (48)

Hence, we obtain ‖ΦhDLv ‖22 6 (1+δ+)‖hDL

v ‖22. Denote χmin =

min16i6Nr

|χms(i)|, then

a2Nr∑i=1

|χms(i)|2|gi|2 > a2χ2min

Nr∑i=1

|gi|2 = a2χ2min‖ArhDL

v ‖22. (49)

Since Ar satisfies d-th RIP, there exists δ ∈ (0, 1) such that

a2χ2min‖ArhDL

v ‖22>a2χ2min(1−δ)‖hDL

v ‖22=[1−

(1−a2χ2

min(1−δ))]‖hDL

v ‖22. (50)

Page 14: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

13

Fig. 13: Illustration of the SA-BT protocol with KBS = KMS = 2.Fig. 14: Illustration of the PA-BT protocol with KBS =KMS = DBS = DMS = 2.

Let δ− = 1− a2χ2min(1− δ), we have

a2χ2min‖Arh

DLv ‖22 > (1− δ−)‖hDL

v ‖22. (51)

Hence, we obtain ‖ΦhDLv ‖22 > (1 − δ−)‖hDL

v ‖22. Let δd =

max (δ−, δ+), according to (48) and (51), it is obviously that(1− δd)‖hDL

v ‖22 6 ‖ΦhDLv ‖22 6 (1 + δd)‖hDL

v ‖22.

APPENDIX BPROOF OF LEMMA 2

Let Ha(i) represent the i-th column of matrix Ha, where1 6 i 6 L. Then

‖ΦhULv ‖22=‖η(I⊗ χbsA

∗t )h

ULv ‖22

(a)=η2

L∑i=1

‖χbsA∗t Ha(i)‖22, (52)

where (a) holds because hULv is the vectorized version of Ha.

As we know that A∗t satisfies the d-th order RIP. Similar to(45), (46) and (47), there exists δ+ ∈ (0, 1) such that

η2L∑i=1

‖χbsA∗t Ha(i)‖22

(a)

6L∑i=1

‖A∗t Ha(i)‖22

6 (1 + δ+)L∑i=1

‖Ha(i)‖22 = (1 + δ+)‖hULv ‖22,

(53)

where (a) holds because η < 1 and |χbs(i)|2 < 1. Hence, weget ‖ΦhUL

v ‖22 6 (1+ δ+)‖hULv ‖22. For the same reason as in (49)

and (50) , we have

η2L∑i=1

‖χbsA∗t Ha(i)‖22 > η2χ2

min

L∑i=1

‖A∗t Ha(i)‖22

> η2χ2min(1− δ)

L∑i=1

‖Ha(i)‖22,(54)

and we have

η2χ2min(1− δ)

L∑i=1

‖Ha(i)‖22 = η2χ2min(1− δ)‖hUL

v ‖22= [1− (1− η2χ2

min(1− δ))]‖hULv ‖22.

(55)Let δ− = 1− η2χ2

min(1− δ), we obtain

η2L∑i=1

‖χbsA∗t Ha(i)‖22 > (1− δ−)‖hUL

v ‖22, (56)

hence, we have ‖ΦhULv ‖22 > (1− δ−)‖hUL

v ‖22. Finally we obtain

(1− δd)‖hULv ‖22 6 ‖ΦhUL

v ‖22 6 (1 + δd)‖hULv ‖22, (57)

where δd = max (δ−, δ+).

REFERENCES

[1] T. R. Rappaport, et al., “Millimeter wave mobile communications for 5Gcellular: It will work!” IEEE Access, vol. 1, pp. 335-349, May 2013.

[2] R. W. Heath, Jr., N. G. Prelcic, S. Rangan, W. Roh, and A. M. Sayeed,“An overview of signal processing techniques for millimeter wave MIMOsystems,” IEEE J. Sel. Top. Signal Process., vol. 10, no. 3, pp. 436-453,Apr. 2016.

[3] C. N. Barati, et al., “Initial access in milllimeter wave cellular systems,”IEEE Trans. Wireless Commun., vol. 15, no. 12, pp. 7926-7940, Dec. 2016.

[4] M. Xiao, et al., “Millimeter wave communications for future mobilenetworks,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1909-1935,Sept. 2017.

[5] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, “Deeplearning coordinated beamforming for highly-mobile millimeter wavesystems,” IEEE Access, vol. 6, pp. 37328-37348, Jun. 2018.

[6] X. Yang, M. Matthaiou, J. Yang, C. K. Wen, F. Gao, and S. Jin, “Hardware-constrained millimeter wave systems for 5G: Challenges, opportunities, andsolutions,” IEEE Commun. Mag., vol. 57, no. 1, pp. 44-50, Jan. 2019.

[7] N. Tawa, T. Kuwabara, Y. Maruta, M. Tanio, and T. Kaneko, “28 GHzdownlink multi-user MIMO experimental verification using 360 elementdigital AAS for 5G massive MIMO,” in Proc. IEEE European MicrowaveConference (EuMC), Sept. 2018.

[8] B. Yang, Z. Yu, J. Lan, R. Zhang, J. Zhou, and W. Hong, “Digitalbeamforming-based massive MIMO transceiver for 5G millimeter-wavecommunications,” IEEE Trans. Microw. Theory Techn., vol. 66, no. 7, pp.3403-3418, Jul. 2018.

[9] M. A. B. Abbasi, V. F. Fusco, H. Tataria, and M. Matthaiou, “Constant-εr lens beamformer for low-complexity millimeter-wave hybrid MIMO,”IEEE Trans. Microw. Theory Techn., vol. 67, no. 7, pp. 2894-2903, Jul.2019.

[10] A. Alkhateeb, J. Mo, N. Gonzalez-Prelcic, and R. W. Heath, Jr., “MIMOprecoding and combining solutions for millimeter-wave systems,” IEEECommun. Mag., vol. 52, no. 12, pp. 122-131, Dec. 2014.

[11] R. Mendez-Rial, C. Rusu, N. Gonzalez-Prelcic, A. Alkhateeb, and R. W.Heath, Jr., “Hybrid MIMO architectures for millimeter wave communica-tions: Phase shifters or switches?” IEEE Access, vol. 4, no. 1, pp. 247-267,Mar. 2016.

[12] C. Mollen, J. Choi, E. G. Larsson, and R. W. Heath, Jr., “Uplinkperformance of wideband massive MIMO with one-bit ADCs,” IEEE Trans.Wireless Commun., vol. 16, no. 1, pp. 87-100, Oct. 2016.

[13] L. Fan, S. Jin, C. K. Wen, and H. Zhang, “Uplink achievable rate formassive MIMO systems with low-resolution ADC,” IEEE Commun. Lett.,vol. 19, no. 12, pp. 2186-2189, Dec. 2015.

[14] Q. Bai and J. Nossek, “Energy efficiency maximization for 5G multi-antenna receivers,” Trans. Emerg. Telecommun. Techn., vol. 26, no. 1, pp.3-14, Jan. 2015.

[15] S. Wang, Y. Li, and J. Wang, “Multiuser detection in massive spatialmodulation MIMO with low-resolution ADCs,” IEEE Trans. WirelessCommun., vol. 14, no. 4, pp. 2156-2168, Apr. 2015.

[16] C. K. Wen, C. J. Wang, S. Jin, K. -K. Wong, and P. Ting, “Bayes-optimaljoint channel-and-data estimation for massive MIMO with low-precisionADCs,” IEEE Trans. Signal Process., vol. 64, no. 10, pp. 2541-2556, May2016.

[17] E. Bjornson, J. Hoydis, M. Kountouris, and M. Debbah, “MassiveMIMO systems with non-ideal hardware: Energy efficiency, estimation,and capacity limits,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7112-7139, Nov. 2014.

[18] L. Xu, X. Lu, S. Jin, F. Gao, and Y. Zhu, “On the uplink achievable rateof massive MIMO system with low-resolution ADC and RF impairments,”IEEE Commun. Lett., vol. 23, no. 3, pp. 502-505, Jan. 2019.

[19] T. C. W. Schenk, RF Imperfections in High-rate Wireless Systems: Impactand Digital Compensation. Springer Netherlands, 2008.

Page 15: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

14

[20] N. Kolomvakis, M. Matthaiou, and M. Coldrey, “IQ imbalance inmultiuser systems: Channel estimation and compensation,” IEEE Trans.Commun., vol. 64, no. 7, pp. 3039-3051, Jul. 2016.

[21] T. Nitsche, C. Cordeiro, A. B. Flores, E. W. Knightly, E. Perahia, andJ. C. Widmer, “IEEE 802.11ad: Directional 60 GHz communication formulti-gigabit-per-second Wi-Fi,” IEEE Commun. Mag., vol. 52, no. 12,pp. 132-141, Dec. 2014.

[22] “IEEE 802.11ad specification. part 11: wireless LAN medium accesscontrol (MAC) and physical layer (PHY) specifications amendment 3:enhancements for very high throughput in the 60 GHz band,” Dec. 2012.

[23] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, Jr., “Channelestimation and hybrid precoding for millimeter wave cellular systems,”IEEE J. Sel. Topics Signal Process., vol. 8, pp. 831-846, Oct. 2014.

[24] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, Jr.,“Spatially sparse precoding in millimeter wave MIMO systems,” IEEETrans. Wireless Commun., vol. 13, no. 3, pp. 1499-1513, Mar. 2014.

[25] D. Donno, J. Palacios, D. Giustiniano, and J. Widmer, “Hybrid analog-digital beam training for mmWave systems with low-resolution RF phaseshifters,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2016, pp. 700-705.

[26] S. Noh, M. D. Zoltowski, and D. J. Love, “Multi-resolution codebookand adaptive beamforming sequence design for millimeter wave beamalignment,” IEEE Trans. Wireless Commun., vol. 16, no. 9, pp. 5689-5701,Sept. 2017.

[27] D. Donno, J. Palacios, and J. Widmer, “Millimeter-wave beam trainingacceleration through low-complexity hybrid transceivers,” IEEE Trans.Wireless Commun., vol. 16, no. 6, pp. 3646-3660, Jun. 2017.

[28] W. U. Bajwa, J. Haupt, A. M. Sayeed, and R. Nowak, “Compressedchannel sensing: A new approach to estimating sparse multipath channels,”Proc. IEEE, vol. 98, no. 6, pp. 1058-1076, Apr. 2010.

[29] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath, Jr.,“Channel estimation for hybrid architecture-based wideband millimeterwave systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1996-2009, Sept. 2017.

[30] J. Yang, C. K. Wen, S. Jin, and F. Gao, “Beamspace channel estimationin mmWave systems via cosparse image reconstruction technique,” IEEETrans. Commun., vol. 66, no. 10, pp. 4767-4782, Oct. 2018.

[31] Technical Specification Group Radio Access Network; Study on NewRadio (NR) Physical channels and modulation (Release 15), document3GPP TS 38.211 V15.7.0, Sep. 2019.

[32] Technical Specification Group Radio Access Network; Study on NewRadio (NR) Physical layer procedures for control (Release 15), document3GPP TS 38.213 V15.7.0, Sep. 2019.

[33] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEETrans. Signal Process., vol. 50, no. 10, pp. 2563-2579, Oct. 2002.

[34] J. Brady and A. M. Sayeed, “Beamspace MU-MIMO high density gigabitsmall-cell access at millimeter-wave frequencies,” in Proc. IEEE Int.Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2014,pp. 80-84.

[35] J. Song, J. Choi, and D. J. Love, “Common codebook millimeter wavebeam design: Designing beams for both sounding and communication withuniform planar arrays,” IEEE Trans. Commun., vol. 65, no. 4, pp. 1859-1872, Apr. 2017.

[36] A. M. Sayeed, “A virtual representation for time- and frequency-selectivecorrelated MIMO channels,” in Proc. IEEE Int. Conf. Acoustics, Speech,Signal Processing (ICASSP), May 2003, vol. 4, pp. 648-651.

[37] X. Quan, et al., “A 52-57 GHz 6-bit phase shifter with hybrid of passiveand active structures,” IEEE Microw. Wireless Compon. Lett., vol. 28, no.3, pp. 236-238, Mar. 2018.

[38] J. J. Bussgang, “Crosscorrelation functions of amplitude-distorted Gaus-sian signals,” MIT Res. Lab. Electron., Cambridge, MA, USA, Tech. Rep.216, Mar. 1952.

[39] X. Zhang, M. Matthaiou, E. Bjornson, and M. Coldrey, “Impact of residualtransmit RF impairments on training-based MIMO systems,” IEEE Trans.Commun., vol. 63, no. 8, pp. 2899-2911, Aug. 2015.

[40] E. Bjornson, M. Matthaiou, and M. Debbah, “Massive MIMO with non-ideal arbitrary arrays: Hardware scaling laws and circuit-aware design,”IEEE Trans. Wireless Commun., vol. 14, no. 8, pp. 4353-4368, Aug. 2015.

[41] E. Bjornson, M. Matthaiou, and M. Debbah, “A new look at dual-hoprelaying: Performance limits with hardware impairments,” IEEE Trans.Commun., vol. 61, no. 11, pp. 4512-4525, Nov. 2013.

[42] T. T. Cai and L. Wang, “Orthogonal matching pursuit for sparse signalrecovery with noise,” IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4680-4688, Jun. 2011.

[43] C. Shepard, et al., “Argos: Practical many-antenna base stations,” in Proc.ACM Int’l. Conf. Mobile Computing and Networking, Aug. 2012, pp. 53-64.

[44] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52,no. 4, pp. 1289-1306, Apr. 2006.

[45] E. J. Candes and T. Tao, “Decoding by linear programming,” IEEE Trans.Inf. Theory, vol. 51, no. 12, pp. 4203-4215, Dec. 2005.

[46] A. C. Gilbert, S. Guha, P. Indyk, S. Muthukrishman, and M. J. Strauss,“Near-optimal sparse Fourier representations via sampling,” in Proc. 34thACM Symp. Theory Comput., May 2002, pp. 152-161.

[47] M. R. Akdeniz, et al., “Millimeter wave channel modeling and cellularcapacity evaluation,” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1164-1179, Jun. 2014.

Jie Yang (S’18) received the B.S. degree in com-munication engineering from Nanjing University ofScience and Technology, Nanjing, China, in 2015,the M.S. degree in information and communica-tions engineering from Southeast University, Nanjing,China, in 2018. She is currently working towards thePh.D. degree in information and communications en-gineering with Southeast University, Nanjing, China.Her current research interests include millimeter-wavewireless communication, massive multi-in multi-out,and compressed sensing.

Shi Jin (S’06-M’07-SM’17) received the B.S. degreein communications engineering from Guilin Univer-sity of Electronic Technology, Guilin, China, in 1996,the M.S. degree from Nanjing University of Posts andTelecommunications, Nanjing, China, in 2003, andthe Ph.D. degree in information and communicationsengineering from the Southeast University, Nanjing,in 2007. From June 2007 to October 2009, he wasa Research Fellow with the Adastral Park ResearchCampus, University College London, London, U.K.He is currently with the faculty of the National Mobile

Communications Research Laboratory, Southeast University. His research in-terests include space time wireless communications, random matrix theory, andinformation theory. He serves as an Associate Editor for the IEEE Transactionson Wireless Communications, and IEEE Communications Letters, and IETCommunications. Dr. Jin and his co-authors have been awarded the 2011 IEEECommunications Society Stephen O. Rice Prize Paper Award in the field ofcommunication theory and a 2010 Young Author Best Paper Award by theIEEE Signal Processing Society.

Chao-Kai Wen (S’00–M’04) received the Ph.D. de-gree from the Institute of Communications Engineer-ing, National Tsing Hua University, Taiwan, in 2004.He was with Industrial Technology Research Insti-tute, Hsinchu, Taiwan and MediaTek Inc., Hsinchu,Taiwan, from 2004 to 2009. Since 2009, he has beenwith National Sun Yat-sen University, Taiwan, wherehe is Professor of the Institute of CommunicationsEngineering. His research interests center around theoptimization in wireless multimedia networks.

Xi Yang received the the B.S. degree and the M.S.degree from Southeast University, Nanjing, China in2013 and 2016 respectively. She is currently workingtoward the Ph.D. degree with the School of Informa-tion Science and Engineering, Southeast University.Her main research interests include wireless com-munication system prototyping, massive MIMO, andmillimeter wave communications.

Page 16: Fast Beam Training Architecture for Hybrid mmWave Transceivers · Fast Beam Training Architecture for Hybrid mmWave Transceivers Jie Yang, Student Member, IEEE, Shi Jin, Senoir Member,

15

Matthaiou_photo-eps-converted-to.pdf

Michail Matthaiou (S’05–M’08–SM’13) was bornin Thessaloniki, Greece in 1981. He obtained theDiploma degree (5 years) in Electrical and ComputerEngineering from the Aristotle University of Thessa-loniki, Greece in 2004. He then received the M.Sc.(with distinction) in Communication Systems andSignal Processing from the University of Bristol, U.K.and Ph.D. degrees from the University of Edinburgh,U.K. in 2005 and 2008, respectively. From September2008 through May 2010, he was with the Institutefor Circuit Theory and Signal Processing, Munich

University of Technology (TUM), Germany working as a Postdoctoral ResearchAssociate. He is currently a Reader (equivalent to Associate Professor) inMultiple-Antenna Systems at Queen’s University Belfast, U.K. after holding anAssistant Professor position at Chalmers University of Technology, Sweden. Hisresearch interests span signal processing for wireless communications, massiveMIMO systems, hardware-constrained communications, mm-wave systems anddeep learning for communications.

Dr. Matthaiou and his coauthors received the IEEE Communications So-ciety (ComSoc) Leonard G. Abraham Prize in 2017. He was awarded theprestigious 2018/2019 Royal Academy of Engineering/The Leverhulme TrustSenior Research Fellowship and recently received the 2019 EURASIP EarlyCareer Award. His team was also the Grand Winner of the 2019 Mobile WorldCongress Challenge. He was the recipient of the 2011 IEEE ComSoc BestYoung Researcher Award for the Europe, Middle East and Africa Region anda co-recipient of the 2006 IEEE Communications Chapter Project Prize forthe best M.Sc. dissertation in the area of communications. He has co-authoredpapers that received best paper awards at the 2018 IEEE WCSP and 2014IEEE ICC and was an Exemplary Reviewer for IEEE COMMUNICATIONSLETTERS for 2010. In 2014, he received the Research Fund for InternationalYoung Scientists from the National Natural Science Foundation of China. He iscurrently the Editor-in-Chief of Elsevier Physical Communication and a SeniorEditor for IEEE WIRELESS COMMUNICATIONS LETTERS. In the past, he wasan Associate Editor for the IEEE TRANSACTIONS ON COMMUNICATIONS,Associate Editor/Senior Editor for IEEE COMMUNICATIONS LETTERS andwas the Lead Guest Editor of the special issue on “Large-scale multipleantenna wireless systems” of the IEEE JOURNAL ON SELECTED AREAS INCOMMUNICATIONS.