1524 IEEE JOURNAL ON SELECTED AREAS IN...

15
1524 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017 Low-Rank Tensor Decomposition-Aided Channel Estimation for Millimeter Wave MIMO-OFDM Systems Zhou Zhou, Jun Fang, Member, IEEE, Linxiao Yang, Hongbin Li, Senior Member, IEEE, Zhi Chen, and Rick S. Blum, Fellow, IEEE Abstract— We consider the problem of downlink chan- nel estimation for millimeter wave (mmWave) MIMO-OFDM systems, where both the base station (BS) and the mobile station (MS) employ large antenna arrays for directional pre- coding/beamforming. Hybrid analog and digital beamforming structures are employed in order to offer a compromise between hardware complexity and system performance. Different from most existing studies that are concerned with narrowband chan- nels, we consider estimation of wideband mmWave channels with frequency selectivity, which is more appropriate for mmWave MIMO-OFDM systems. By exploiting the sparse scattering nature of mmWave channels, we propose a CANDECOMP/ PARAFAC (CP) decomposition-based method for channel para- meter estimation (including angles of arrival/departure, time delays, and fading coefficients). In our proposed method, the received signal at the MS is expressed as a third-order tensor. We show that the tensor has the form of a low-rank CP, and the channel parameters can be estimated from the associated factor matrices. Our analysis reveals that the uniqueness of the CP decomposition can be guaranteed even when the size of the tensor is small. Hence the proposed method has the potential to achieve substantial training overhead reduction. We also develop Cramér-Rao bound (CRB) results for channel parameters and compare our proposed method with a compressed sensing-based method. Simulation results show that the proposed method attains mean square errors that are very close to their associated CRBs and present a clear advantage over the compressed sensing- based method. Index Terms—MmWave MIMO-OFDM systems, channel esti- mation, CANDECOMP/PARAFAC (CP) decomposition, Cramér- Rao bound (CRB). Manuscript received November 1, 2016; revised February 18, 2017; accepted March 6, 2017. Date of publication April 28, 2017; date of current version June 19, 2017. This work was supported in part by the National Science Foundation of China under Grant 61522104, Grant 61428103, and Grant 91438118, in part by the Key Project of Fundamental and Frontier Research of Chongqing under Grant cstc2015jcyjBX0085, in part by the National Science Foundation under Grant ECCS-1408182, Grant ECCS- 1609393, and Grant ECCS-1405579, and in part by the Air Force Office of Scientific Research under Grant FA9550-16-1-0243. (Corresponding author: Jun Fang.) Z. Zhou, J. Fang, L. Yang, and Z. Chen are with the National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]). H. Li is with the Department of Electrical and Computer Engineer- ing, Stevens Institute of Technology, Hoboken, NJ 07030 USA (e-mail: [email protected]). R. S. Blum is with the Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA 18015 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSAC.2017.2699338 I. I NTRODUCTION M ILLIMETER-WAVE (mmWave) communication is a promising technology for future cellular net- works [1]–[4]. The large bandwidth available in mmWave bands can offer gigabit-per-second communication data rates. However, high signal attenuation at such high frequency presents a major challenge for system design [5]. To com- pensate for the significant path loss, large antenna arrays should be used at both the base station (BS) and the mobile station (MS) to provide sufficient beamforming gain for mmWave communications [6], [7]. This requires accurate channel estimation which is essential for the proper operation of directional precoding/beamforming in mmWave systems. Note that the high directivity makes mmWave communications vulnerable to blockage events, which can be frequent in indoor environments. To address this issue, multiple relays can be employed and a path selection technique was proposed in [8] to select the optimal path that maximizes the throughput. Channel estimation in mmWave systems is challenging due to hybrid precoding structures and the large number of antennas. A primary challenge is that hybrid precoding structures [9]–[13] employed in mmWave systems prevent the digital baseband from directly accessing the entire channel dimension. This is also referred to as the channel subspace sampling limitation [14], [15], which makes it difficult to acquire useful channel state information (CSI) during a prac- tical channel coherence time. To address this issue, fast beam scanning and searching techniques have been extensively stud- ied, e.g. [14], [16], [17]. The objective of beam scanning is to search for the best beamformer-combiner pair by letting the transmitter and receiver scan the adaptive sounding beams or coded beams chosen from pre-determined sounding beam codebooks. Nevertheless, as the number of antennas increases, the size of the codebook should be enlarged accordingly, which in turn results in an increase in the sounding/training overhead. Unlike beam scanning techniques whose objective is to find the best beam pair, another approach is to directly estimate the mmWave channel or its associated parameters, e.g. [15], [18]–[21]. In particular, by exploiting the sparse scattering nature of the mmWave channels, mmWave channel estima- tion can be formulated as a sparse signal recovery problem, and it has been shown [18], [19] that substantial reduc- tion in training overhead can be achieved via compressed 0733-8716 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Transcript of 1524 IEEE JOURNAL ON SELECTED AREAS IN...

Page 1: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1524 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

Low-Rank Tensor Decomposition-AidedChannel Estimation for Millimeter

Wave MIMO-OFDM SystemsZhou Zhou, Jun Fang, Member, IEEE, Linxiao Yang, Hongbin Li, Senior Member, IEEE,

Zhi Chen, and Rick S. Blum, Fellow, IEEE

Abstract— We consider the problem of downlink chan-nel estimation for millimeter wave (mmWave) MIMO-OFDMsystems, where both the base station (BS) and the mobilestation (MS) employ large antenna arrays for directional pre-coding/beamforming. Hybrid analog and digital beamformingstructures are employed in order to offer a compromise betweenhardware complexity and system performance. Different frommost existing studies that are concerned with narrowband chan-nels, we consider estimation of wideband mmWave channels withfrequency selectivity, which is more appropriate for mmWaveMIMO-OFDM systems. By exploiting the sparse scatteringnature of mmWave channels, we propose a CANDECOMP/PARAFAC (CP) decomposition-based method for channel para-meter estimation (including angles of arrival/departure, timedelays, and fading coefficients). In our proposed method, thereceived signal at the MS is expressed as a third-order tensor.We show that the tensor has the form of a low-rank CP, andthe channel parameters can be estimated from the associatedfactor matrices. Our analysis reveals that the uniqueness of theCP decomposition can be guaranteed even when the size of thetensor is small. Hence the proposed method has the potential toachieve substantial training overhead reduction. We also developCramér-Rao bound (CRB) results for channel parameters andcompare our proposed method with a compressed sensing-basedmethod. Simulation results show that the proposed methodattains mean square errors that are very close to their associatedCRBs and present a clear advantage over the compressed sensing-based method.

Index Terms— MmWave MIMO-OFDM systems, channel esti-mation, CANDECOMP/PARAFAC (CP) decomposition, Cramér-Rao bound (CRB).

Manuscript received November 1, 2016; revised February 18, 2017;accepted March 6, 2017. Date of publication April 28, 2017; date of currentversion June 19, 2017. This work was supported in part by the NationalScience Foundation of China under Grant 61522104, Grant 61428103, andGrant 91438118, in part by the Key Project of Fundamental and FrontierResearch of Chongqing under Grant cstc2015jcyjBX0085, in part by theNational Science Foundation under Grant ECCS-1408182, Grant ECCS-1609393, and Grant ECCS-1405579, and in part by the Air Force Office ofScientific Research under Grant FA9550-16-1-0243. (Corresponding author:Jun Fang.)

Z. Zhou, J. Fang, L. Yang, and Z. Chen are with the National KeyLaboratory of Science and Technology on Communications, Universityof Electronic Science and Technology of China, Chengdu 611731, China(e-mail: [email protected]).

H. Li is with the Department of Electrical and Computer Engineer-ing, Stevens Institute of Technology, Hoboken, NJ 07030 USA (e-mail:[email protected]).

R. S. Blum is with the Department of Electrical and Computer Engineering,Lehigh University, Bethlehem, PA 18015 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are availableonline at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSAC.2017.2699338

I. INTRODUCTION

M ILLIMETER-WAVE (mmWave) communication isa promising technology for future cellular net-

works [1]–[4]. The large bandwidth available in mmWavebands can offer gigabit-per-second communication data rates.However, high signal attenuation at such high frequencypresents a major challenge for system design [5]. To com-pensate for the significant path loss, large antenna arraysshould be used at both the base station (BS) and the mobilestation (MS) to provide sufficient beamforming gain formmWave communications [6], [7]. This requires accuratechannel estimation which is essential for the proper operationof directional precoding/beamforming in mmWave systems.Note that the high directivity makes mmWave communicationsvulnerable to blockage events, which can be frequent in indoorenvironments. To address this issue, multiple relays can beemployed and a path selection technique was proposed in [8]to select the optimal path that maximizes the throughput.

Channel estimation in mmWave systems is challengingdue to hybrid precoding structures and the large numberof antennas. A primary challenge is that hybrid precodingstructures [9]–[13] employed in mmWave systems prevent thedigital baseband from directly accessing the entire channeldimension. This is also referred to as the channel subspacesampling limitation [14], [15], which makes it difficult toacquire useful channel state information (CSI) during a prac-tical channel coherence time. To address this issue, fast beamscanning and searching techniques have been extensively stud-ied, e.g. [14], [16], [17]. The objective of beam scanning isto search for the best beamformer-combiner pair by lettingthe transmitter and receiver scan the adaptive sounding beamsor coded beams chosen from pre-determined sounding beamcodebooks. Nevertheless, as the number of antennas increases,the size of the codebook should be enlarged accordingly, whichin turn results in an increase in the sounding/training overhead.

Unlike beam scanning techniques whose objective is to findthe best beam pair, another approach is to directly estimatethe mmWave channel or its associated parameters, e.g. [15],[18]–[21]. In particular, by exploiting the sparse scatteringnature of the mmWave channels, mmWave channel estima-tion can be formulated as a sparse signal recovery problem,and it has been shown [18], [19] that substantial reduc-tion in training overhead can be achieved via compressed

0733-8716 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Page 2: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1525

sensing methods. In [19], an adaptive compressed sensingmethod was developed for mmWave channel estimation basedon a hierarchical multi-resolution beamforming codebook.Compared to the standard compressed sensing method, theadaptive method is more efficient as the training precoding isadaptively adjusted according to the outputs of earlier stages.Nevertheless, this improved efficiency comes at the expenseof requiring feedback from the MS to the BS. Other com-pressed sensing-based mmWave channel estimation methodsinclude [22]–[28]. Most of the above existing methods areconcerned with estimation of narrowband channels. MmWavesystems, however, are very likely to operate on widebandchannels with frequency selectivity [29]. In [30], the authorsconsidered the problem of multi-user uplink channel esti-mation in mmWave MIMO-OFDM systems and proposed adistributed compressed sensing-based scheme by exploitingthe angular domain structured sparsity of mmWave widebandfrequency-selective fading channels. Precoding design, withlimited feedback for frequency selective wideband mmWavechannels, was studied in [29].

In this paper, we study the problem of downlink chan-nel estimation for mmWave MIMO-OFDM systems, wherewideband frequency-selective fading channels are considered.We propose a CANDECOMP/PARAFAC (CP) decomposition-based method for downlink channel estimation. The proposedmethod is based on the following three key observations. First,by adopting a simple setup at the transmitter, the receivedsignal can be organized into a third-order tensor which admitsa CP decomposition. Second, due to the sparse scatteringnature of mmWave channels, the tensor has an intrinsic lowCP rank that guarantees the uniqueness of the CP decom-position. Third, the channel parameters, including angles ofarrival/departure, time delays, and fading coefficients, can beeasily extracted based on the decomposed factor matrices.We conduct a rigorous analysis on the uniqueness of theCP decomposition. Analyses show that the uniqueness of theCP decomposition can be guaranteed even when the size ofthe tensor is small. This result implies that our proposedmethod can achieve a substantial training overhead reduction.The Cramér-Rao bound (CRB) results for channel parametersare also developed, which provides a benchmark for theperformance of our proposed method, and also describes thebest asymptotically achievable performance. Our experimentsshow that the mean square errors attained by the proposedmethod are close to their corresponding CRBs.

Our proposed CP decomposition-based method enjoysthe following advantages as compared with the compressedsensing-based method. Firstly, unlike compressed sensingtechniques which require to discretize the continuous parame-ter space into a finite set of grid points, our proposed method isessentially a gridless approach and therefore is free of the griddiscretization errors. Secondly, the proposed method capturesthe intrinsic multi-dimensional structure of the multiway data,which helps achieve a performance improvement. Thirdly,the use of tensors for data representation and processingleads to a very low computational complexity, whereas mostcompressed sensing methods are usually plagued by highcomputational complexity. Our simulation results show that

our proposed method has a computational complexity as lowas the simplest compressed sensing method, i.e. the orthogonalmatching pursuit (OMP) method [31], [32], while achievinga much higher estimation accuracy than the OMP. Lastly,the conditions for the uniqueness of the CP decompositionare easy to analyze, and can be employed to determinethe exact amount of training overhead required for uniquedecomposition. In contrast, it is usually difficult to analyze andcheck the exact recovery condition for generic dictionaries forcompressed sensing techniques.

We would like to emphasize the difference between thecurrent work and our previous work [21]. Although bothworks employ the CP decomposition for mmWave channelestimation, their formulations are quite different. Specifically,our previous work [21] considered the estimation of multi-user uplink narrowband mmWave channels, which requires adelicate layered transmission scheme such that the receivedsignal can be expressed as a tensor that admits a CP decom-position. In contrast, our current work considers the estimationof downlink wideband frequency-selective mmWave channels.Subcarriers in the OFDM systems provide an additional modethat naturally leads to a tensor representation of the data.In the current paper, we also provide CRB results for channelparameters, which is not available in our previous work.

The rest of the paper is organized as follows. In Section II,we provide notations and basics on the CP decomposi-tion. The system model and the channel estimation problemare discussed in Section III. In Section IV, we propose aCP decomposition-based method for mmWave channel estima-tion. The uniqueness of the CP decomposition is also analyzed.Section V develops CRB results for the estimation of channelparameters. A compressed sensing-based channel estimationmethod is discussed in Section VI. Computational complexityof the proposed method and the compressed sensing-basedmethod is analyzed in Section VII. Simulation results areprovided in Section VIII, followed by concluding remarks inSection IX.

II. PRELIMINARIES

To make the paper self-contained, we provide a briefreview on tensors and the CP decomposition. More detailsregarding the notations and basics on tensors can be foundin [33]–[36]. Simply speaking, a tensor is a generalization ofa matrix to higher-order dimensions, also known as ways ormodes. Vectors and matrices can be viewed as special cases oftensors with one and two modes, respectively. Throughout thispaper, we use symbols ⊗ , ◦ , and � to denote the Kronecker,outer, and Khatri-Rao product, respectively.

Let XXX ∈ CI1×I2×···×IN denote an N th-order tensor withits (i1, . . . , iN )th entry denoted by Xi1 ···iN . Here the orderN of a tensor is the number of dimensions. Fibers are ahigher-order analogue of matrix rows and columns. The mode-n fibers of XXX are In-dimensional vectors obtained by fixingevery index but in . Slices are two-dimensional sections of atensor, defined by fixing all but two indices. Unfolding ormatricization is an operation that turns a tensor into a matrix.The mode-n unfolding of a tensor XXX , denoted as X(n), arrangesthe mode-n fibers to be the columns of the resulting matrix.

Page 3: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1526 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

Fig. 1. Schematic of CP decomposition.

The CP decomposition decomposes a tensor into a sum ofrank-one component tensors (see Fig. 1), i.e.

XXX =R∑

r=1

λr a(1)r ◦ a(2)

r ◦ · · · ◦ a(N)r (1)

where a(n)r ∈ CIn , the minimum achievable R is referred to as

the rank of the tensor, and A(n) � [a(n)1 . . . a(n)

R ] ∈ CIn×R

denotes the factor matrix along the n-th mode. Elementwise,we have

Xi1 i2 ···iN =R∑

r=1

λr a(1)i1r a(2)

i2r · · · a(N)iN r (2)

The mode-n unfolding of XXX can be expressed as

X (n) = A(n)�(

A(N) � · · · A(n+1) � A(n−1) � · · · A(1))T

(3)

where � � diag(λ1, . . . , λR).

III. SYSTEM MODEL

Consider a mmWave massive MIMO-OFDM system con-sisting of a base station (BS) and multiple mobile sta-tions (MSs). To facilitate the hardware implementation, hybridanalog and digital beamforming structures are employedby both the BS and the MS (see Fig. 2). We assumethat the BS is equipped with NBS antennas and MBSRF chains, and the MS is equipped with NMS anten-nas and MMS RF chains. The number of RF chains isless than the number of antennas, i.e. MBS < NBS andMMS < NMS. In particular, we assume MMS = 1,i.e. each MS has only one RF chain. The total numberof OFDM tones (subcarriers) is assumed to be K , amongwhich K subcarriers are selected for training. For simplicity,here we assume subcarriers {1, 2, . . . , K } are assigned fortraining. Nevertheless, our formulation and method can beeasily extended to other subset choices. In the downlinkscenario, we only need to consider a single user systembecause the channel estimation is conducted by each userindividually [37]. To simplify our problem, we ignore theinter-cell interference resulting from frequency reuse fromneighboring cells. To mitigate inter-cell interference, manyuseful techniques such as the coordinated transmission schemewith large-scale CSI at the transmitter [38] and the cognitivetransmission scheme [39] were recently developed.

We adopt a downlink training scheme similar to [18], [19].For each subcarrier, the BS employs T different beamformingvectors at T successive time frames. Each time frame isdivided into M sub-frames, and at each sub-frame, the MS usesan individual combining vector to detect the transmitted signal.

The beamforming vector associated with the kth subcarrier atthe t th time frame can be expressed as

xk(t) = FRF(t)Fk(t)sk(t) ∀k = 1, . . . , K (4)

where sk(t) ∈ Cr denotes the pilot symbol vector, Fk(t) ∈CMBS×r denotes the digital precoding matrix for the kthsubcarrier, and FRF(t) ∈ CNBS×MBS is a common RF precoderfor all subcarriers. The procedure to generate the beamformingvector (4) is elaborated as follows. The pilot symbol vectorsk(t) at each subcarrier is first precoded using a digitalprecoding matrix Fk(t). The symbol blocks are transformedto the time-domain using K -point inverse discrete Fouriertransform (IDFT). A cyclic prefix is then added to the symbolblocks, finally a common RF precoder FRF(t) is applied toall subcarriers.

At each time frame, the MS successively employs MRF combining vectors {qm} to detect the transmitted signal.Note that these combining vectors are common to all subcarri-ers. At each sub-frame, the received signal is first combined inthe RF domain. Then, the cyclic prefix is removed and symbolsare converted back to the frequency domain by performinga discrete Fourier transform (DFT). After processing, thereceived signal associated with the kth subcarrier at the mthsub-frame can be expressed as [29]

yk,m(t) = qTm Hk xk(t) + wk,m(t) (5)

where qm ∈ CNMS denotes the combining vector used at

the mth sub-frame, Hk ∈ CNMS×NBS is the channel matrixassociated with the kth subcarrier, and wk,m(t) denotes theadditive Gaussian noise. Collecting the M received signals{yk,m(t)}M

m=1 at each time frame, we have

yk(t) = QT Hk xk(t) + wk(t)

= QT Hk FRF(t)Fk(t)sk(t) + wk(t) (6)

where

yk(t) � [yk,1(t) . . . yk,M (t)]T

wk(t) � [wk,1(t) . . . wk,M (t)]T

Q � [q1 . . . qM ] (7)

Measurement campaigns in dense-urban NLOS environ-ments reveal that mmWave channels typically exhibit limitedscattering characteristics [40]. Also, considering the widebandnature of mmWave channels, we adopt a geometric widebandmmWave channel model with L scatterers between the MS andthe BS. Each scatterer is characterized by a time delay τl ,angles of arrival and departure (AoA/AoD), θl , φl ∈ [0, 2π].With these parameters, the channel matrix in the delay domaincan be written as [29], [30]

H(τ ) =L∑

l=1

αl aMS(θl)aTBS(φl)δ(τ − τl) (8)

where αl is the complex path gain associated with the lth path,aMS(θl) and aBS(φl) are the antenna array response vectorsof the MS and BS, respectively, and δ(·) represents the deltafunction. Throughout this paper, we assume

Page 4: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1527

Fig. 2. A block diagram of the MIMO-OFDM transceiver that employs hybrid analog/digital precoding.

A1 Different scatterers have different angles of arrival, anglesof departure as well as time delays, i.e. θi �= θ j , φi �= φ j ,τi �= τ j for i �= j .

Since scatterers are randomly distributed in space, this assump-tion is usually valid in practice.

Given the delay-domain channel model, the frequency-domain channel matrix Hk associated with the kth subcarriercan be obtained as

Hk =L∑

l=1

αl exp(− j2πτl fsk/K )aMS(θl)aTBS(φl) (9)

where fs denotes the sampling rate. Our objective is toestimate the channel matrices {Hk}K

k=1 from the receivedsignals yk(t),∀k = 1, . . . , K ,∀t = 1, . . . , T . In particular,we wish to provide a reliable channel estimate by usingas few measurements as possible because the number ofmeasurements is linearly proportional to the number of timeframes and the number of sub-frames, both of which areexpected to be minimized. To facilitate our algorithm develop-ment, we assume that the digital precoding matrices and thepilot symbols remain the same for different subcarriers, i.e.Fk(t) = F(t), sk(t) = s(t),∀k = 1, . . . , K . As will be shownlater, this simplification enables us to develop an efficienttensor factorization-based method to extract the channel stateinformation from very few number of measurements.

IV. PROPOSED CP DECOMPOSITION-BASED METHOD

Suppose Fk(t) = F(t) and sk(t) = s(t),∀k = 1, . . . , K .Let S � [s(1) . . . s(T )]. The received signal at the kthsubcarrier can be written as

Y k = QT Hk P + Wk k = 1, . . . , K (10)

where

Y k � [ yk(1) . . . yk(T )]Wk � [wk(1) . . . wk(T )]

P � [ p(1) . . . p(T )] (11)

in which p(t) � FRF(t)F(t)s(t).Since signals from multiple subcarriers are available at the

MS, the received signal can be expressed by a third-ordertensor YYY ∈ CM×T ×K whose three modes respectively standfor the sub-frame, the time frame, and the subcarrier, and its

(m, t, k)th entry is given by yk,m(t). Substituting (9) into (10),we obtain

Y k =L∑

l=1

αl,k QT aMS(θl)aTBS(φl)P + Wk

=L∑

l=1

αl,k aMS(θl)aTBS(φl) + Wk (12)

where αl,k � αl exp(− j2πτl fsk/K ), aMS(θl) � QT aMS(θl),and aBS(φl) � PT aBS(φl). We see that each slice of the tensorYYY , Y k , is a weighted sum of a common set of rank-one outerproducts. The tensor YYY thus admits the following CANDE-COMP/PARAFAC (CP) decomposition which decomposes atensor into a sum of rank-one component tensors, i.e.

YYY =L∑

l=1

aMS(θl) ◦ aBS(φl) ◦ (αl g(τl)) + WWW (13)

in which

g(τl) � [exp(− j2πτl fs(1/K )) . . . exp(− j2πτl fs(K/K ))]T

(14)

Due to the sparse scattering nature of the mmWave channel,the number of paths, L, is usually small relative to thedimensions of the tensor. Hence the tensor YYY has an intrinsiclow-rank structure. As will be discussed later, this low-rankstructure ensures that the CP decomposition of YYY is unique upto scaling and permutation ambiguities. Therefore an estimateof the parameters {αl , φl , θl , τl} can be obtained by performinga CP decomposition of the received signal YYY . Define

A � [aMS(θ1) . . . aMS(θL)] (15)

B � [aBS(φ1) . . . aBS(φL)] (16)

C � [α1 g(τ1) . . . αL g(τL)] (17)

These three matrices {A, B, C} are factor matrices associatedwith a noiseless version of YYY .

A. CP Decomposition

If the number of paths, L, is known or estimated a priori,the CP decomposition of YYY can be accomplished by solving

minA,B,C

‖YYY −L∑

l=1

al ◦ bl ◦ cl‖2F (18)

Page 5: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1528 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

where we let A = [a1 . . . aL], B = [b1 . . . bL ], andC = [c1 . . . cL ]. The above optimization can be efficientlysolved by an alternating least squares (ALS) procedure whichalternatively minimizes the data fitting error with respect toone of the factor matrices, with the other two factor matricesfixed

A(t+1) = arg min

A

∥∥∥Y T(1) − (C

(t) � B(t)

) AT∥∥∥

2

F(19)

B(t+1) = arg min

B

∥∥∥Y T(2) − (C

(t) � A(t+1)

)BT∥∥∥

2

F(20)

C(t+1) = arg min

C

∥∥∥Y T(3) − (B

(t+1) � A(t+1)

)CT∥∥∥

2

F(21)

Note that (19)–(21) are least squares problems whose solutionscan be easily obtained.

If the knowledge of the number of paths L is unavail-able, more sophisticated CP decomposition techniques(e.g. [41]–[43]) can be employed to jointly estimate themodel order and the factor matrices. The basic idea of theseCP decomposition techniques is to use sparsity-promotingpriors or functions to find a low-rank representation of theobserved tensor. Specifically, let L L denote an overesti-mated CP rank, the following optimization can be employedfor CP decomposition [41]

minA,B,C

‖YYY − XXX‖2F + μ

(tr( A A

H) + tr(B B

H) + tr(C C

H))

s.t. XXX =L∑

l=1

al ◦ bl ◦ cl (22)

where μ is a regularization parameter to control thetradeoff between low-rankness and the data fitting error,A = [a1 . . . a L], B = [b1 . . . bL ], and C = [c1 . . . cL ].The above optimization (22) can still be solved by an ALS pro-cedure as follows

A(t+1) = arg min

A

∥∥∥∥∥

[Y T

(1)

0

]−

[C

(t) � B(t)

√μI

]A

T∥∥∥∥∥

2

F

B(t+1) = arg min

B

∥∥∥∥∥

[Y T

(2)

0

]−

[C

(t) � A(t+1)

√μI

]B

T

∥∥∥∥∥

2

F

C(t+1) = arg min

C

∥∥∥∥∥

[Y T

(3)

0

]−

[B

(t+1) � A(t+1)

√μI

]C

T∥∥∥∥∥

2

F

The true CP rank of the tensor, L, can be estimated byremoving those negligible rank-one tensor components afterconvergence.

B. Channel Estimation

We discuss how to estimate the mmWave channels basedon the estimated factor matrices { A, B, C}. As shown in thenext subsection, the CP decomposition is unique up to scalingand permutation ambiguities under a mild condition. Moreprecisely, the estimated factor matrices and the true factor

matrices are related as

A = A�1� + E1 (23)

B = B�2� + E2 (24)

C = C�3� + E3 (25)

where {�1,�2,�3} are unknown nonsingular diagonal matri-ces which satisfy �1�2�3 = I ; � is an unknown permutationmatrix; and E1, E2, and E3 denote the estimation errorsassociated with the three estimated factor matrices, respec-tively. The permutation matrix � can be ignored because it iscommon to all three factor matrices. Note that each columnof A is characterized by the associated angle of arrival θl .Hence the angle of arrival θl can be estimated via a simplecorrelation-based scheme

θl = arg maxθl

|aHl aMS(θl)|

‖al‖2‖aMS(θl)‖2(26)

where al denotes the lth column of A. It can be shown inAppendix A that this simple correlation-based scheme is amaximum likelihood (ML) estimator, provided that entries inthe estimation error matrix, E1, follow an i.i.d. circularlysymmetric Gaussian distribution. The angle of departure φl

can be obtained similarly as

φl = arg maxφl

|bHl aBS(φl)|

‖bl‖2‖aBS(φl)‖2

(27)

where bl denotes the lth column of B. We now discuss how toestimate the time delay τl from the estimated factor matrix C .Note that the lth column of C is given by αl g(τl). Thereforethe time delay τl can be estimated via

τl = arg minτl

|cHl g(τl)|

‖cl‖2‖g(τl)‖2(28)

where cl denotes the lth column of C . The maximizationproblems (26)–(28) involve one-dimensional search which canbe efficiently performed by first employing a coarse grid andthen gradually refining the search in the vicinity of possiblegrid points. Substituting the estimated {θl} and {φl} back into(23) and (24), an estimate of the nonsingular diagonal matrices�1 and �2 can be obtained. An estimate of �3 can then becalculated from the equality �1�2�3 = I . Finally, the fadingcoefficients {αl} can be estimated from (25). The channelmatrices {Hk} can now be recovered from the estimatedparameters {θl, φl , τl , αl }.

C. Uniqueness

We discuss the uniqueness of the CP decomposition. It iswell known that the essential uniqueness of CP decompositioncan be guaranteed by Kruskal’s condition [44]. Let kX denotethe k-rank of a matrix X , which is defined as the largest valueof kX such that every subset of kX columns of the matrix Xis linearly independent. We have the following theorem.

Theorem 1: Let (X, Y , Z) be a CP solution which decom-poses a third-order tensor XXX ∈ CM×N×K into R rank-one

Page 6: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1529

arrays, where X ∈ CM×R , Y ∈ CN×R , and Z ∈ CK×R .Suppose the following Kruskal’s condition

kX + kY + kZ ≥ 2R + 2 (29)

holds and there is an alternative CP solution (X, Y , Z) whichalso decomposes XXX into R rank-one arrays. Then we haveX = X��a , Y = Y��b, and Z = Z��c, where � is aunique permutation matrix and �a , �b, and �c are uniquediagonal matrices such that �a�b�c = I .

Proof: A rigorous proof can be found in [45].Note that Kruskal’s condition cannot hold when R = 1.

However, in that case the uniqueness has been proven byHarshman [46]. Kruskal’s sufficient condition is also necessaryfor R = 2 and R = 3, but not for R > 3 [45].

From the above theorem, we know that if

kA + kB + kC ≥ 2L + 2 (30)

then the CP decomposition of YYY is essentially unique.We first examine the k-rank of A. Note that

A = QT [aMS(θ1) . . . aMS(θL)] � QT AMS (31)

where AMS ∈ CNMS×L is a Vandermonte matrix when auniform linear array is employed. Suppose assumption A1holds valid. For a randomly generated Q whose entries arechosen uniformly from a unit circle, we can show that thek-rank of A is equal to (details can be found in Appendix B)

k A = min(M, L) (32)

with probability one. Consider the k-rank of B. We have

B = PT [aBS(φ1) . . . aBS(φL)] � PT ABS (33)

Similarly, for a randomly generated P whose entries areuniformly chosen from a unit circle, we can deduce that thek-rank of B is equal to

kB = min(T, L) (34)

with probability one. Now let us examine the k-rank of C .Recall that C can be expressed as

C = [g(τ1) . . . g(τL)]Dα (35)

where Dα � diag(α1, . . . , αL), and g(τl) is defined in (14).We see that C is a columnwise-scaled Vandermonte matrix.Therefore the k-rank of C is

kC = min(K , L) (36)

Since L is usually small, it is reasonable to assume thatthe number of subcarriers used for training is greater than L,i.e. K ≥ L. Hence we have kC = L. To meet Kruskal’scondition (30), we only need k A + kB ≥ L + 2. Recalling(32)–(34), we can either choose {T = L, M = 2} or{M = L, T = 2} to satisfy Kruskal’s condition. In summary,for randomly generated beamforming matrix P and combiningmatrix Q whose entries are chosen uniformly from a unitcircle, our proposed method only needs T = L (or T = 2) timeframes and M = 2 (or M = L) sub-frames to enable reliableestimation of channel parameters, thus achieving a substantialtraining overhead reduction. In practice, due to the observation

noise and estimation errors, we may need a slightly largerT and M to yield an accurate channel estimate.

Note that although we assume entries of P are chosen froma unit circle, the hybrid precoder P can be otherwise devised,as long as Kruskal’s condition is satisfied. Moreover, besidesrandom coding, coded beams [17] which steer the antennaarray towards multiple beam directions simultaneously canalso be used to serve as the beamforming and combiningvectors { pt } and {qm}. The k-ranks of A and B may stillobey (32) and (34) if P and Q are carefully designed. Thisis an important issue and will be explored in our future work.

V. CRB

In this section, we develop Cramér-Rao bound (CRB) resultsfor the channel parameter (i.e. {θl, φl , τl , αl }) estimation prob-lem considered in (13). Details of the derivation can be foundin Appendix C. Throughout our analysis, the observationnoise in (13) is assumed to be complex circularly symmetrici.i.d. Gaussian noise. As is well known, the CRB is a lowerbound on the variance of any unbiased estimator [47]. It pro-vides a benchmark for evaluating the performance of ourproposed method. In addition, the CRB results illustrate thebehavior of the resulting bounds, which helps understand theeffect of different system parameters, including the beamform-ing and combining matrices, on the estimation performance.

Note that our proposed method involves two steps: the firststep employs an ALS algorithm to perform the CP decom-position, and based on the decomposed factor matrices, thesecond step uses a simple correlation-based method to estimatethe channel parameters. For zero-mean i.i.d. Gaussian noise,the ALS yields maximum likelihood estimates [48], providedthat the global minimum is reached. Also, it can be provedthat the correlation-based method used in the second stepis a maximum likelihood estimator if the estimation errorsassociated with the factor matrices are i.i.d. Gaussian randomvariables. Therefore our proposed method can be deemedas a quasi-maximum likelihood estimator for the channelparameters. Under mild regularity conditions, the maximumlikelihood estimator is asymptotically (in terms of the samplesize) unbiased and asymptotically achieves the CRB. It there-fore makes sense to compare our proposed CP-decomposition-based method with the CRB results.

VI. COMPRESSED SENSING-BASED

CHANNEL ESTIMATION

By exploiting the sparse scattering nature, the downlinkchannel estimation problem considered in this paper can alsobe formulated as a sparse signal recovery problem. In thefollowing, we briefly discuss the compressed sensing-basedchannel estimation method.

Taking the mode-3 unfolding of YYY (c.f. (13)), we have

Y (3) = C(B � A)T + W (3)

= G Dα�T (PT ⊗ QT )T + W (3) (37)

where

G � [g(τ1) . . . g(τL)], Dα � diag(α1, . . . , αL),

� � [aBS(φ1) ⊗ aMS(θ1) · · · aBS(φL) ⊗ aMS(θL)]

Page 7: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1530 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

Taking the transpose of Y (3), we have

Y T(3) = (PT ⊗ QT )� Dα GT + W T

(3) (38)

We see that both � and G are characterized by unknownparameters which need to be estimated. To convert the esti-mation problem into a sparse signal recovery problem, wediscretize the AoA-AoD space into an N1 × N2 grid, in whicheach grid point is given by {θi , φ j } for i = 1, ..., N1 andj = 1, ..., N2, where N1 L, and N2 L. The trueangles of arrival/departure are assumed to lie on the grid.Also, we discretize the time-delay domain into a finite setof grid points {τl}N3

l=1 (N3 L), and assume that the truetime-delays {τl} lie on the discretized grid. Thus (38) can bere-expressed as

Y T(3) = (PT ⊗ QT )� Dα G

T + W T(3) (39)

where � ∈ CNMS NBS×N1 N2 is an overcomplete dictionaryconsisting of N1 N2 columns, with its (i + ( j − 1)N1)thcolumn given by aBS(φ j ) ⊗ aMS(θi), and G ∈ CK×N3 is anovercomplete dictionary, with its nth column given by g(τn).Dα is a sparse matrix obtained by augmenting Dα with zerorows and columns. Let y � vec(Y T

(3)), (39) can be formulatedas a conventional sparse signal recovery problem

y = G ⊗ ((PT ⊗ QT )�)d + w (40)

where d � vec( Dα) is an unknown sparse vector. Manyefficient algorithms such as the orthogonal matching pur-suit (OMP) [31] or the fast iterative shrinkage-thresholdingalgorithm (FISTA) [49] can be employed to solve the abovesparse signal recovery problem.

Although all channel parameters can be jointly estimatedvia (40), this formulation involves a very large dictionarywhich in turn results in a high computational complexity.In fact, a more efficient compressed sensing scheme can bedeveloped by dealing with the AoAs, AoDs and time delaysseparately. For example, to estimate the time delay parameters,we rewrite (37) as

Y (3) = G Dτ + W (3) (41)

where Dτ � Dα�T (PT ⊗ QT )T , Dα is a row-sparse matrix,i.e. it has only a few nonzero rows, and the time delays canbe estimated from the indices of its nonzero rows. Note thatDα and Dτ share the same row sparsity pattern. Thereforethe problem reduces to estimating Dτ , which is a multiplemeasurement vector (MMV) compressed sensing problem thatcan be solve by the simultaneous-OMP (S-OMP) method [32],an extension of the OMP to the MMV framework. Similarly,we can estimate the AoAs and the AoDs from Y (2) and Y (1),respectively.

VII. COMPUTATIONAL COMPLEXITY ANALYSIS

We analyze the computational complexity of the proposedCP decomposition-based method and the compressed sensingmethod discussed in the previous section. The major computa-tional task of our proposed method involves solving the threeleast squares problems (19)–(21) at each iteration. Consideringthe calculation of A, we have

A(t+1) = Y (1)V ∗(V T V∗)−1 (42)

where V � C(t) � B

(t) ∈ CT K×L is a tall matrix sincewe usually have T K > L. Noting that Y (1) ∈ M × T K ,

the number of flops required to compute A(t+1)

is of orderO(MT K L + M K L2 + L3). When L is small, the dominantterm has a computational complexity of order O(MT K ),which scales linearly with the size of the observed tensor YYY .It can also be shown that solving the other two least squaresproblems requires flops of order O(MT K ) as well.

Now consider the computational complexity of the com-pressed sensing method. The joint compressed sensing methodinvolves finding a sparse solution to the linear equation (40).It can be easily verified that to solve (40), the computationalcomplexity of the OMP is of order O(MT K + N1 N2 N3). Forthe FISTA [49], the main computational task at each iterationis to evaluate the proximal operator whose computationalcomplexity is of the order O(n2), where n denotes the numberof columns of the overcomplete dictionary. For the problem(40), we have n = N1 N2 N3. Thus the required number offlops at each iteration is of order O(N2

1 N22 N2

3 ), which scalesquadratically with N1 N2 N3. In order to achieve a substantialoverhead reduction, the parameters {M, T, K } are usuallychosen such that the number of measurements is far less thanthe dimension of the sparse signal, i.e., MT K N1 N2 N3.Therefore this joint compressed sensing method has a highercomputational complexity than our proposed method.

For the separate compressed sensing scheme, it involvessolving (41) to obtain an estimate of time delays. SinceY (3) ∈ CK×MT and G ∈ CK×N3 , the number of flopsrequired to solve (41) using S-OMP is of order O(MT K N3).Similarly, we can arrive at the numbers of flops requiredto estimate AoAs and AoDs are of orders O(MT K N1) andO(MT K N2), respectively. Thus the total computational com-plexity for S-OMP is of order O(MT K (N1 + N2 + N3)).Therefore the separate compressed sensing scheme which usesS-OMP has a computational complexity similar to our pro-posed method.

VIII. SIMULATION RESULTS

We present simulation results to illustrate the performanceof our proposed CP decomposition-based method (referredto as CP). We consider a scenario where the BS employsa uniform linear array with NBS = 64 antennas and theMS employs a uniform linear array with NMS = 32 anten-nas. The distance between neighboring antenna elements isassumed to be half the wavelength of the signal. In oursimulations, the mmWave channel is generated according tothe wideband geometric channel model, in which the AoAsand AoDs are randomly distributed in [0, 2π], the number ofpaths is set equal to L = 4, the delay spread τl for each path isuniformly distributed between 0 and 100 nanoseconds, and thecomplex gain αl is a random variable following a circularly-symmetric Gaussian distribution αu,l ∼ CN (0, 1/ρ). Here ρ isgiven by ρ = (4π D fc/c)2, where c represents the speed oflight, D denotes the distance between the MS and the BS, andfc is the carrier frequency. We set D = 30m, fc = 28GHz.The total number of subcarriers is set to K = 128, out ofwhich K subcarriers are selected for training. The samplingrate is set to fs = 0.32GHz. Also, in our experiments, the

Page 8: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1531

Fig. 3. MSEs and CRBs associated with different sets of parameters vs. thenumber of subcarriers, K .

beamforming matrix P and the combining matrix Q arerandomly generated with their entries uniformly chosen froma unit circle. We assume that the number of paths, L, is knowna priori in our simulations because the OMP algorithm usedfor comparison requires the knowledge of L to decide whento terminate the greedy process. As discussed in Section IV,our proposed method can also deal with the case where L isunknown. The signal-to-noise ratio (SNR) is defined as theratio of the signal component to the noise component, i.e.

SNR �‖YYY − WWW ‖2

F

‖WWW ‖2F

(43)

where YYY and WWW represent the received signal and the additivenoise in (13), respectively. In our simulations, we assumea reasonably large SNR (0dB-30dB) at the receiver. Due tothe significant path loss, the SNR could drop below 0dB,particularly if the distance between the MS and the BS, D,is large. To cope with this issue, the beamforming matrix Pand the combining matrix Q can be devised to form directionalbeams to compensate for the path loss.

We first examine the estimation accuracy of the channelparameters {θl , φl , τl , αl}. Mean square errors (MSEs) arecalculated separately for each set of parameters, i.e.

MSE(θ) = ‖θ − θ‖22 MSE(φ) = ‖φ − φ‖2

2

MSE(τ ) = ‖τ − τ‖22 MSE(α) = ‖α − α‖2

2

where θ � [θ1 . . . θL]T , φ � [φ1 . . . φL ]T , τ �[τ1 . . . τL ]T , and α � [α1 . . . αL ]T . Fig. 3 plots the MSEsof our proposed method as a function of the number of sub-carriers used for training, K , where we set M = 6, T = 6, andSNR = 10dB. The CRB results for different sets of parametersare also included for comparison. We see that our proposedmethod yields accurate estimates of the channel parameterseven for small values of M , T , and K . This result indicates thatour proposed method is able to achieve a substantial trainingoverhead reduction. We also notice that the MSEs attained

Fig. 4. MSEs and CRBs associated with different sets of parameters vs. SNR.

by our proposed method are very close to their correspondingCRBs, particularly for the AoA, AoD, and the time delay para-meters. This result corroborates the optimality of the proposedmethod. As indicated earlier, the optimality of the proposedmethod comes from the fact that the ASL and the correlation-based scheme used in our proposed method are all maximumlikelihood estimators under mild conditions. Specifically, it hasbeen shown in [48] that the ALS yields maximum likelihoodestimates and achieves its associated CRB in the presenceof zero-mean i.i.d. Gaussian noise. We also proved that thesimple correlation-based scheme employed in the second stageof our proposed method is a maximum likelihood estimator,provided that the estimator errors of the factor matrices arei.i.d. Gaussian random variables. Although the estimator errorsmay not strictly follow an i.i.d. Gaussian distribution, thecorrelation-based scheme is still an effective estimator thatachieves near-optimality. Lastly, we observe that our proposedmethod fails when the number of subcarriers K ≤ 2. This isbecause, for the case where T = 6 > L and M = 6 > L,Kruskal’s condition is satisfied only if K ≥ 2. Thus our resultroughly coincides with our previous analysis regarding theuniqueness of the CP decomposition. Fig. 4 depicts the MSEsand CRBs vs. SNR, where we set T = 6, M = 6, and K = 6.From Fig. 4, we see that the CRBs decrease exponentiallywith increasing SNR, and the estimation accuracy achievedby our proposed method has similar tendency as the CRBs.The MSEs attained by our proposed method, again, are closeto their corresponding CRBs, except in the low SNR regime.

We now examine the channel estimation performance ofour proposed method and its comparison with the compressedsensing method discussed in Section VI. The joint compressedsensing scheme is considered and an orthogonal matchingpursuit (OMP) algorithm is employed to solve the sparse signalrecovery problem (40). The separate estimation scheme (41)yields performance worse than (40), and thus is not included.Note that the dimension of the signal to be recovered in (40)is equal to N1 N2 N3, where N1, N2, and N3 denote the number

Page 9: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1532 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

Fig. 5. NMSEs of respective algorithms vs. SNR, M = 6, T = 6, K = 6.

Fig. 6. NMSEs of respective algorithms vs. the number of sub-frames M,T = 6, K = 6.

of grid points used to discretize the AoA, AoD, and timedelay domain, respectively. For a typical choice of N1 = 32,N2 = 64 and N3 = 32, the dimension of the signal is oforder O(104). In this case, more sophisticated sparse recoveryalgorithms such as the fast iterative shrinkage-thresholdingalgorithm (FISTA) have a prohibitive computational complex-ity and thus are not included. Also, for the OMP, we employtwo different grids to discretize the continuous parameterspace: the first grid (referred to as Grid-I) discretizes theAoA-AoD-time delay space into 64 × 128 × 256 grid points,and the second grid (referred to as Grid-II) discretizes theAoA-AoD-time delay space into 128 × 256 × 512 grid points.

In Fig. 5, we show the NMSE results for our proposedmethod and the OMP algorithm as a function of SNR, wherewe set M = 6, T = 6, and K = 6. Here the NMSE iscalculated as

NMSE =∑K

k=1 ‖Hk − Hk‖2F∑K

k=1 ‖Hk‖2F

(44)

where Hk denotes the frequency-domain channel matrixassociated with the kth subcarrier, and Hk is its estimate.We see that our proposed method achieves a substantialperformance improvement over the compressed sensing algo-rithm. The performance gain is primarily due to the followingtwo reasons. First, unlike compressed sensing techniques,

Fig. 7. NMSEs of respective algorithms vs. the number of frames T , M = 6,K = 6.

Fig. 8. NMSEs of respective algorithms vs. the number of subcarriers fortraining K , M = 6, T = 6.

our proposed CP decomposition-based method is essentiallya gridless approach which is free from grid discretizationerrors. Second, the CP decomposition-based method capturesthe intrinsic multi-dimensional structure of the multiway data,which helps lead to a performance improvement. From Fig. 6to Fig. 8, we plot the NMSEs of respective methods vs. M ,T , and K , respectively, where the SNR is set to 20dB. Theseresults, again, demonstrate the superiority of the proposedmethod over the compressed sensing method. We also observethat these results corroborate our theoretical analysis concern-ing the uniqueness of the CP decomposition. For example,in Fig. 6, since we have T = 6 > L and K = 6 > L, we onlyneed M ≥ 2 to satisfy Kruskal’s condition. We see that ourproposed method achieves an accurate channel estimate onlywhen M > 2, which roughly coincides with our analysis.

Table I shows the average run times of our proposedmethod and the OMP method. To provide a glimpse of othermore sophisticated compressed sensing method’s computa-tional complexity, the average run times of the FISTA are alsoincluded, from which we can see that sophisticated compressedsensing methods have a prohibitive computational complexity,and thus are not suitable for our channel estimation problem.We also see that our proposed method has a computationalcomplexity as low as the OMP method. It takes similar run

Page 10: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1533

TABLE I

AVERAGE RUN TIMES OF RESPECTIVE ALGORITHMS:T = 6, M = 6, K = 6, AND SNR = 20dB

times as the OMP method which employs the coarser grid ofthe two choices, meanwhile achieving a much higher estima-tion accuracy than the OMP method that uses the finer grid.

IX. CONCLUSIONS

We proposed a CP decomposition-based method for down-link channel estimation in mm-Wave MIMO-OFDM systems,where wideband mmWave channels with frequency selectivitywere considered. The proposed method exploited the intrinsicmulti-dimensional structure of the multiway data receivedat the BS. Specifically, the received signal at the BS wasexpressed as a third-order tensor. We showed that the tensorhas a form of a low-rank CP decomposition, and the channelparameters can be easily extracted from the decomposedfactor matrices. The uniqueness of the CP decompositionwas investigated, which revealed that the uniqueness of theCP decomposition can be guaranteed even with a small numberof measurements. Thus the proposed method is able to achievea substantial training overhead reduction. CRB results forchannel parameters were also developed. We compared ourproposed method with a compressed sensing-based channelestimation method. Simulation results showed that our pro-posed method presents a clear performance advantage overthe compressed sensing method in terms of both estimationaccuracy and computational complexity.

APPENDIX A

In (23), for the lth column, we have

al = λl aMS(θl) + el (45)

where θl and λl are unknown parameters. We assume el satis-fies circularly symmetric complex Gaussian distribution withzero mean and covariance matrix ε2 I . Thus the log-likelihoodfunction is given by

L(θl, λl ) = −M ln(πε2) − 1

ε2

∥∥al − λl aMS(θl)∥∥2

F

∝ − ∥∥al − λl aMS(θl)∥∥2

2

Given a fixed θl , the optimal λl can be obtained by taking thepartial derivative of the log-likelihood function with respectto λl and setting the partial derivative equal to zero, i.e.

∂L(θl , λl)

∂λ∗l

= (al − λl aMS(θl))T a∗

MS(θl) = 0

which leads to

λ�l = al

T a∗MS(θl)∥∥aMS(θl)

∥∥2

Note that (45) can be rewritten as∥∥al

∥∥2 = λl aHl aMS(θl) + aH

l el

Then the log-likelihood function becomes

L(θl , λl ) ∝ −∥∥∥∥∥al

∥∥2 − λl aHl aMS(θl)

∥∥∥2

(46)

Substituting λ�l into the above log-likelihood function, we

arrive at

L(θl , λ�l ) ∝ −

∥∥∥∥∥∥∥

∥∥al∥∥2 −

∣∣∣aHl aMS(θl)

∣∣∣2

∥∥aMS(θl)∥∥2

∥∥∥∥∥∥∥

2

(47)

∝ −

∥∥∥∥∥∥∥1 −

∣∣∣aHl aMS(θl)

∣∣∣2

∥∥aMS(θl)∥∥2∥∥al

∥∥2

∥∥∥∥∥∥∥

2

(48)

Due to the Cauchy-Schwarz inequality, we have

0 ≤∣∣∣aH

l aMS(θl)∣∣∣2

∥∥aMS(θl)∥∥2∥∥al

∥∥2 ≤ 1

Therefore maximizing the log-likelihood with respect to θl isequivalent to

θ�l = arg max

θl

L(θl , λ�l ) = arg max

θl

∣∣∣aHl aMS(θl)

∣∣∣2

∥∥al∥∥2∥∥aMS(θl)

∥∥2 (49)

The proof is completed here.

APPENDIX B

For a uniform linear array, the steering vector aMS(θi ) canbe written as

aMS(θi ) � [1 e j (2π/λ)dsin(θi ) . . . e j (NMS−1)(2π/λ)dsin(θi )]T

where λ is the signal wavelength, and d denotes the distancebetween neighboring antenna elements. We assume each entryof Q ∈ C

NMS×M is chosen uniformly from a unit circlescaled by a constant 1/NMS, i.e. qm,n = (1/NMS)e jϑm,n ,where ϑm,n ∈ [−π, π] follows a uniform distribution. Letam,i � qT

m aMS(θi ) denote the (m, i)th entry of A, in whichqm denotes the mth column of Q. It can be readily verifiedthat E[am,i ] = 0,∀m, i and

E[am,i a∗n, j ] =

⎧⎨

0 m �= n1

N2MS

aHMS(θ j )aMS(θi ) m = n

(50)

When the number of antennas at the MS is sufficientlylarge, the steering vectors {aMS(θi )} become mutually quasi-orthogonal, i.e. (1/NMS)aH

MS(θ j )aMS(θ j ) → δ(θi −θ j ), whichimplies that the entries of A are uncorrelated with eachother. On the other hand, according to the central limittheorem, we know that each entry am,i approximately follows

Page 11: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1534 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

a Gaussian distribution. Therefore entries of A can be consid-ered as i.i.d. Gaussian variables with zero mean and variance1/NMS. Thus we can reach that the k-rank of A is equivalentto the number of columns or the number of rows, whicheveris smaller, with probability one.

APPENDIX CDERIVATION OF CRAMÉR RAO LOWER BOUND

Consider the M × T × K observation tensor YYY in (13)

YYY =L∑

l=1

αl aMS(θl) ◦ aBS(φl) ◦ g(τl) + WWW (51)

where {αl , θl , φl , τl} are the unknown channel parameters tobe estimated. We assume that entries of WWW are i.i.d zeromean, circular symmetric Gaussian random variables withvariance σ 2. For ease of exposition, let θ � [θ1 · · · θL ]T ,φ � [φ1 · · · φL ]T , τ � [τ1 · · · τL]T , α � [α1 · · · αL ]T ,and p � [θT φT τ T αT ]. Thus, the log-likelihood functionof p can be expressed as

L( p) = f (YYY ; A, B, C) (52)

where A, B and C , defined in (15), (16) and (17) respectively,are functions of the parameter vector p, and f (YYY ; A, B, C)is given by

f (YYY ; A, B, C)

= −MT K ln(πσ 2) − 1

σ 2

∥∥∥Y T(1) − (C � B)AT

∥∥∥2

F

= −MT K ln(πσ 2) − 1

σ 2

∥∥∥Y T(2) − (C � A)BT

∥∥∥2

F

= −MT K ln(πσ 2) − 1

σ 2

∥∥∥Y T(3) − (B � A)CT

∥∥∥2

F

The complex Fisher information matrix (FIM) for p is givenby [47], [48]

( p) = E

{(∂L( p)

∂ p

)H (∂L( p)

∂ p

)}. (53)

In the next, to calculate ( p), we first compute the partialderivative of L( p) with respect to p and then calculate theexpectation with respect to p(YYY ; p).

A. Partial Derivative of L( p) W.R.T p

The partial derivative of L( p) with respect to θl can becomputed as

∂L( p)

∂θl= tr

{(∂L( p)

∂ A

)T ∂ A∂θl

+(

∂L( p)

∂ A∗)T ∂ A∗

∂θl

}(54)

where

∂L( p)

∂ A= 1

σ 2 (Y T(1) − (C � B)AT )H (C � B)

∂L( p)

∂ A∗ =(

∂L( p)

∂ A

)∗

∂ A∂θl

= [0 · · · al · · · 0

](55)

For a uniform linear array with the element spacing equal tohalf of the signal wavelength, we have al � j QT Da aMS(θl),and

Da � π cos(θl)diag(0, 1, · · · , NMS − 1) (56)

Therefore, we have

∂L( p)

∂θl= eT

l1

σ 2 (C � B)T (Y T(1) − (C � B)AT )∗ al

+ eTl

1

σ 2 (C � B)H (Y T(1) − (C � B)AT )a∗

l

= 2Re{eTl

1

σ 2 (C � B)T (Y T(1) − (C � B)AT )∗ al}

= 2Re{eTl

1

σ 2 (C � B)T (Y T(1) − (C � B)AT )∗ Ael}

(57)

where Re{·} is an operator which takes the real part of acomplex number, el ∈ C

L×1 is the canonical vector whosenon-zero entry is indexed as l, and

A �[

a1 a2 · · · aL]

(58)

Similarly, we can obtain the partial derivatives with respect toother parameters as follows

∂L( p)

∂φl= 2Re{eT

l1

σ 2 (C � A)T (Y T(2) − (C � A)BT )∗ Bel}

∂L( p)

∂τl= 2Re{eT

l1

σ 2 (B � A)T (Y T(3) − (B � A)CT )∗Cel}

∂L( p)

∂αl= eT

l1

σ 2 (B � A)T (Y T(3) − (B � A)CT )∗Gel

where

B �[

b1 b2 · · · bL

](59)

C �[

c1 c2 · · · cL]

(60)

G �[

g1 g2 · · · gL]

(61)

in which bl � j PT DbaBS(φl), cl � j Dccl , and

Db � π cos(φl)diag(0, 1, · · · , NBS − 1) (62)

Dc � −2πdiag(0, fs/K , · · · , (K − 1) fs/K ) (63)

B. Calculation of Fisher Information Matrix

We first calculate the entries in the principal minorsof ( p). For instance, the (l1, l2)th entry of

E

{(∂L( p)

∂θ

)H (∂L( p)

∂θ

)}.

is given by

E

{(∂L( p)

∂θl1

)∗ (∂L( p)

∂θl2

)}

= 4E

[Re{eT

l1 Na el1}Re{eTl2 Na el2}

]

= E[(

Na(l1, l1) + Na(l1, l1)∗) (

Na(l2, l2) + Na(l2, l2)∗)]

Page 12: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1535

where Na(l1, l1) stands for the (l1, l1)th entry of Na ∈ CL×L

and

Na � 1

σ 2 (C � B)T (Y T(1) − (C � B)AT )∗ A

= 1

σ 2 (C � B)T (W T(1))

∗ A.

Letting na � vec(Na), we have

na =(

AT ⊗ (C � B)T

)vec(W H

(1)). (64)

where W (1) is the mode-1 unfolding of WWW , thus vec(W H(1))

is a zero mean circularly symmetric complex Gaussian vectorwhose covariance matrix is given by σ 2 I . Since na is thelinear transformation of vec(W H

(1)), na also follows a circu-larly symmetric complex Gaussian distribution. Its covariancematrix Cna ∈ CL2×L2

and second-order moments Mna ∈CL2×L2

are respectively given by

Cna = E

[(na)(na)H

]

= 1

σ 2

(A

T ⊗ (C � B)T) (

A∗ ⊗ (C � B)∗

)

= 1

σ 2

(A

TA

∗) ⊗((C � B)T (C � B)∗

)(65)

and

Mna = E

[(na)(na)T

]= 0 (66)

Therefore, we have

E

{(∂L( p)

∂θl1

)∗ (∂L( p)

∂θl2

)}= 2Re{Cna (m, n)} (67)

where m � L(l1 − 1) + l1 and n � L(l2 − 1) + l2. Similarly,we can arrive at

E

{(∂L( p)

∂φl1

)∗ (∂L( p)

∂φl2

)}= 2Re{Cnb(m, n)}

E

{(∂L( p)

∂τl1

)∗ (∂L( p)

∂τl2

)}= 2Re{Cnc(m, n)}

E

{(∂L( p)

∂αl1

)∗ (∂L( p)

∂αl2

)}= Cnc(m, n)∗,

in which

Cnb � 1

σ 2

(B

TB

∗) ⊗((C � A)T (C � A)∗

)(68)

Cnc � 1

σ 2

(C

TC

∗) ⊗((B � A)T (B � A)∗

)(69)

Cnc � 1

σ 2

(GT G∗) ⊗

((B � A)T (B � A)∗

)(70)

For the elements in the off-principal minors of ( p), such asthe (l1, l2)th entry of

E

{(∂L( p)

∂θ

)H (∂L( p)

∂φ

)}.

is given by

E

{(∂L( p)

∂θl1

)∗ (∂L( p)

∂φl2

)}

= 4E

[Re

{eT

l1 Nael1

}Re{eT

l2 Nbel2}]

= E

[(Na(l1, l1) + Na(l1, l1)

∗)(Nb(l2, l2) + Nb(l2, l2)∗)

]

= 2Re{Cna,nb(m, n)

}

where

Cna ,nb � E

[(na)(nb)H

]

= 1

σ 4 ( A ⊗ (C � B))T Cw1,w2(B∗ ⊗ (C � A)∗) (71)

in which

Cw1,w2 � E{vec(W H(1))vec(W T

(2))T } (72)

Similarly, we can obtain

E

{(∂ f ( p)

∂θl1

)∗ (∂ f ( p)

∂τl2

)}= 2Re{Cna,nc(m, n)}

E

{(∂ f ( p)

∂θl1

)∗ (∂ f ( p)

∂αl2

)}= Cna ,nc (m, n)∗

E

{(∂ f ( p)

∂φl1

)∗ (∂ f ( p)

∂τl2

)}= 2Re{Cnb,nc(m, n)}

E

{(∂ f ( p)

∂φl1

)∗ (∂ f ( p)

∂αl2

)}= Cnb,nc (m, n)∗

E

{(∂ f ( p)

∂τl1

)∗ (∂ f ( p)

∂αl2

)}= Cnc,nc (m, n)∗

where

Cna,nc � 1

σ 4 ( A ⊗ (C � B))T Cw1,w3(C∗ ⊗ (B � A)∗)

Cna,nc � 1

σ 4 ( A ⊗ (C � B))T Cw1,w3(G∗ ⊗ (B � A)∗)

Cnb,nc � 1

σ 4 (B ⊗ (C � A))T Cw2,w3(C∗ ⊗ (B � A)∗)

Cnb,nc � 1

σ 4 (B ⊗ (C � A))T Cw2,w3(G∗ ⊗ (B � A)∗)

Cnc,nc � 1

σ 2 (C ⊗ (B � A))T (G∗ ⊗ (B � A)∗)

in which

Cw1,w3 � E{vec(W H(1))vec(W T

(3))T }

Cw2,w3 � E{vec(W H(2))vec(W T

(3))T }.

The computation of Cw1,w2 is elaborated as follows. Notethat the (m, t, k)th entry in WWW ∈ CM×T ×K corresponds tothe (m, t + (k − 1)T )th entry of W (1) and also correspondsto the (t, m + (k − 1)M)th entry of W (2). Furthermore,the (m, t + (k − 1)T )th entry of W (1) corresponds to the(t + (k − 1)T + (m − 1)T K )th entry of vec(W H

(1)) andthe (t, m + (k − 1)M)th entry of W (2) corresponds to the

Page 13: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1536 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

(m + (k − 1)M + (t − 1)M K )th entry of vec(W T(2)). Since

entries in WWW are i.i.d. random variables, i.e.,

E{wm1,t1,k1 w∗m2,t2,k2

} ={

σ 2; m1 = m2, t1 = t2, k1 = k2

0; otherwise

where wm,t,k represents the (m, t, k)th entry of WWW .Therefore, in Cw1,w2 ∈ CT K M×M K T , the number ofnonzero entries is MT K and the corresponding indexes,{(n1, n2)|Cw1,w2(n1, n2) �= 0}, is equal to

{(n1, n2)|n1 = t + (k − 1)T + (m − 1)T K ,

n2 = m + (k − 1)M + (t − 1)M K ,∀m, t, k}Similarly, the index of the nonzero elements in Cw1,w3 ∈

CT K M×MT K and Cw2,w3 ∈ CM K T ×MT K are respectivelybelongs to

{(n1, n2)|n1 = t + (k − 1)T + (m − 1)T K ,

n2 = m + (t − 1)M + (k − 1)MT,∀m, t, k}and

{(n1, n2)|n1 = m + (k − 1)M + (t − 1)M K ,

n2 = m + (t − 1)M + (k − 1)MT,∀m, t, k}

C. Cramér-Rao Bound

After obtaining the fisher information matrix, the CRB forthe parameters p can be calculated as [47]

CRB( p) = −1( p). (73)

REFERENCES

[1] T. S. Rappaport, J. N. Murdock, and F. Gutierrez, Jr., “State of the art in60-GHz integrated circuits and systems for wireless communications,”Proc. IEEE, vol. 99, no. 8, pp. 1390–1436, Aug. 2011.

[2] S. Rangan, T. S. Rappaport, and E. Erkip, “Millimeter-wave cellularwireless networks: Potentials and challenges,” Proc. IEEE, vol. 102,no. 3, pp. 366–385, Mar. 2014.

[3] A. Ghosh et al., “Millimeter-wave enhanced local area systems: A high-data-rate approach for future wireless networks,” IEEE J. Sel. AreasCommun., vol. 32, no. 6, pp. 1152–1163, Jun. 2014.

[4] G. Yang, M. Xiao, J. Gross, H. Al-Zubaidy, and Y. Huang. (Jul. 2016).“Delay and backlog analysis for 60 GHz wireless networks.” [Online].Available: https://arxiv.org/abs/1608.00120

[5] A. L. Swindlehurst, E. Ayanoglu, P. Heydari, and F. Capolino,“Millimeter-wave massive MIMO: The next wireless revolution?” IEEECommun. Mag., vol. 52, no. 9, pp. 56–62, Sep. 2014.

[6] A. Alkhateeb, J. Mo, N. Gonzalez-Prelcic, and R. W. Heath, Jr., “MIMOprecoding and combining solutions for millimeter-wave systems,” IEEECommun. Mag., vol. 52, no. 12, pp. 122–131, Dec. 2014.

[7] Q. Xue, X. Fang, M. Xiao, and L. Yan, “Multi-user millimeter wavecommunications with nonorthogonal beams,” IEEE Trans. Veh. Technol.,to be published, doi: 10.1109/TVT.2016.2617083.

[8] G. Yang, J. Du, and M. Xiao, “Maximum throughput path selectionwith random blockage for indoor 60 GHz relay networks,” IEEE Trans.Commun., vol. 63, no. 10, pp. 3511–3524, Oct. 2015.

[9] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, Jr.,“Spatially sparse precoding in millimeter wave MIMO systems,” IEEETrans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.

[10] A. Alkhateeb, G. Leus, and R. Heath, “Limited feedback hybrid pre-coding for multi-user millimeter wave systems,” IEEE Trans. WirelessCommun., vol. 14, no. 11, pp. 6481–6494, Nov. 2015.

[11] X. Gao, L. Dai, S. Han, C.-L. I, and R. W. Heath, “Energy-efficienthybrid analog and digital precoding for mmWave MIMO systems withlarge antenna arrays,” IEEE J. Sel. Areas Commun., vol. 34, no. 4,pp. 998–1009, Apr. 2016.

[12] M. N. Kulkarni, A. Ghosh, and J. G. Andrews, “A comparison of MIMOtechniques in downlink millimeter wave cellular networks with hybridbeamforming,” IEEE Trans. Commun., vol. 64, no. 5, pp. 1952–1967,May 2016.

[13] J. Zhang, L. Dai, X. Zhang, E. Björnson, and Z. Wang, “Achiev-able rate of Rician large-scale MIMO channels with transceiver hard-ware impairments,” IEEE Trans. Veh. Technol., vol. 65, no. 10,pp. 8800–8806, Oct. 2016.

[14] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. A. Thomas, andA. Ghosh, “Millimeter wave beamforming for wireless backhaul andaccess in small cell networks,” IEEE Trans. Commun., vol. 61, no. 10,pp. 4391–4403, Oct. 2013.

[15] T. Kim and D. J. Love, “Virtual AoA and AoD estimation for sparsemillimeter wave MIMO channels,” in Proc. 16th IEEE Int. WorkshopSignal Process. Adv. Wireless Commun. (SPAWC), Stockholm, Sweden,Jun./Jul. 2015, pp. 146–150.

[16] J. Wang et al., “Beam codebook based beamforming protocol for multi-Gbps millimeter-wave WPAN systems,” IEEE J. Sel. Areas Commun.,vol. 27, no. 8, pp. 1390–1399, Oct. 2009.

[17] Y. M. Tsang, A. S. Y. Poon, and S. Addepalli, “Coding the beams:Improving beamforming training in mmWave communication system,”in Proc. IEEE Globel Commun. Conf. (Globecom), Houston, TX, USA,Dec. 2011, pp. 1–6.

[18] A. Alkhateeb, G. Leus, and R. Heath, “Compressed sensing based multi-user millimeter wave systems: How many measurements are needed?”in Proc. 40th IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP),Brisbane, Australia, Apr. 2015, pp. 2909–2913.

[19] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, Jr., “Channelestimation and hybrid precoding for millimeter wave cellular systems,”IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 831–846,Oct. 2014.

[20] P. Schniter and A. Sayeed, “Channel estimation and precoder design formillimeter-wave communications: The sparse way,” in Proc. 48th Asilo-mar Conf. Signals, Syst. Comput., Pacific Grove, CA, USA, Nov. 2014,pp. 273–277.

[21] Z. Zhou, J. Fang, L. Yang, H. Li, Z. Chen, and S. Li, “Channel estimationfor millimeter-wave multiuser MIMO systems via PARAFAC decompo-sition,” IEEE Trans. Wireless Commun., vol. 15, no. 11, pp. 7501–7516,Nov. 2016.

[22] D. Ramasamy, S. Venkateswaran, and U. Madhow, “Compressive adap-tation of large steerable arrays,” in Proc. Inf. Theory Appl. Workshop(ITA), San Diego, CA, USA, Feb. 2012, pp. 234–239.

[23] D. Ramasamy, S. Venkateswaran, and U. Madhow, “Compressive track-ing with 1000-element arrays: A framework for multi-Gbps mm wavecellular downlinks,” in Proc. 50th Annu. Allerton Conf. Commun.,Control, Comput. (Allerton), Oct. 2012, pp. 690–697.

[24] Z. Marzi, D. Ramasamy, and U. Madhow, “Compressive channel estima-tion and tracking for large arrays in mm-Wave picocells,” IEEE J. Sel.Topics Signal Process., vol. 10, no. 3, pp. 514–527, Apr. 2016.

[25] H. Xie, F. Gao, S. Zhang, and S. Jin, “A unified transmission strategy forTDD/FDD massive MIMO systems with spatial basis expansion model,”IEEE Trans. Veh. Technol., vol. 66, no. 4, pp. 3170–3184, Apr. 2017.

[26] Z. Gao, L. Dai, Z. Wang, and S. Chen, “Spatially common sparsity basedadaptive channel estimation and feedback for FDD massive MIMO,”IEEE Trans. Signal Process., vol. 63, no. 23, pp. 6169–6183, Dec. 2015.

[27] X. Gao, L. Dai, and A. M. Sayeed. (Jul 2016). “Low RF-complexitytechnologies for 5G millimeter-wave MIMO systems with large antennaarrays.” [Online]. Available: https://arxiv.org/abs/1607.04559

[28] L. Dai, X. Gao, X. Su, S. Han, C.-L. I, and Z. Wang, “Low-complexitysoft-output signal detection based on Gauss–Seidel method for uplinkmultiuser large-scale MIMO systems,” IEEE Trans. Veh. Technol.,vol. 64, no. 10, pp. 4839–4845, Oct. 2015.

[29] A. Alkhateeb and R. W. Heath, “Frequency selective hybrid precodingfor limited feedback millimeter wave systems,” IEEE Trans. Commun.,vol. 64, no. 5, pp. 1801–1818, May 2016.

[30] Z. Gao, C. Hu, L. Dai, and Z. Wang, “Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selectivefading channels,” IEEE Commun. Lett., vol. 20, no. 6, pp. 1259–1262,Jun. 2016.

[31] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matchingpursuit: Recursive function approximation with applications to waveletdecomposition,” in Proc. Conf. Rec. 27th Asilomar Conf. Signals, Syst.Comput., vol. 1. Pacific Grove, CA, USA, Nov. 1993, pp. 40–44.

[32] J. A. Tropp, A. C. Gilbert, and M. J. Strauss, “Algorithms for simul-taneous sparse approximation. Part I: Greedy pursuit,” Signal Process.,vol. 86, no. 3, pp. 572–588, 2006.

Page 14: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

ZHOU et al.: LOW-RANK TENSOR DECOMPOSITION-AIDED CHANNEL ESTIMATION FOR MILLIMETER WAVE MIMO-OFDM SYSTEMS 1537

[33] T. G. Kolda and B. W. Bader, “Tensor decompositions and applications,”SIAM Rev., vol. 51, no. 3, pp. 455–500, 2009.

[34] T. G. Kolda, Multilinear Operators for Higher-Order Decomposi-tions, United States Department of Energy, Washington, DC, USA,2006.

[35] A. Cichocki et al., “Tensor decompositions for signal processing appli-cations: From two-way to multiway component analysis,” IEEE SignalProcess. Mag., vol. 32, no. 2, pp. 145–163, Mar. 2015.

[36] L. D. Lathauwer, “Decompositions of a higher-order tensor in blockterms—Part II: Definitions and uniqueness,” SIAM. J. Matrix Anal.Appl., vol. 30, no. 3, pp. 1033–1066, Sep. 2008.

[37] L. Dai, Z. Wang, and Z. Yang, “Spectrally efficient time-frequencytraining OFDM for mobile large-scale MIMO systems,” IEEE J. Sel.Areas Commun., vol. 31, no. 2, pp. 251–263, Feb. 2013.

[38] W. Feng, Y. Wang, N. Ge, J. Lu, and J. Zhang, “Virtual MIMOin multi-cell distributed antenna systems: Coordinated transmissionswith large-scale CSIT,” IEEE J. Sel. Areas Commun., vol. 31, no. 10,pp. 2067–2081, Oct. 2013.

[39] L. Zhang, M. Xiao, G. Wu, S. Li, and Y.-C. Liang, “Energy-efficientcognitive transmission with imperfect spectrum sensing,” IEEE J. Sel.Areas Commun., vol. 34, no. 5, pp. 1320–1335, May 2016.

[40] M. R. Akdeniz et al., “Millimeter wave channel modeling and cellularcapacity evaluation,” IEEE J. Sel. Areas Commun., vol. 32, no. 6,pp. 1164–1179, Jun. 2014.

[41] J. A. Bazerque, G. Mateos, and G. B. Giannakis, “Rank regular-ization and Bayesian inference for tensor completion and extrapola-tion,” IEEE Trans. Signal Process., vol. 61, no. 22, pp. 5689–5703,Nov. 2013.

[42] P. Rai, Y. Wang, S. Guo, G. Chen, D. Dunson, and L. Carin, “ScalableBayesian low-rank decomposition of incomplete multiway tensors,” inProc. 31st Int. Conf. Mach. Learn. (ICML-14), Beijing, China, 2014,pp. 1800–1808.

[43] Q. Zhao, L. Zhang, and A. Cichocki, “Bayesian CP factorizationof incomplete tensors with automatic rank determination,” IEEETrans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1751–1763,Sep. 2015.

[44] J. B. Kruskal, “Three-way arrays: Rank and uniqueness of trilineardecompositions, with application to arithmetic complexity and statistics,”Linear Algebra Appl., vol. 18, no. 2, pp. 95–138, 1977.

[45] A. Stegeman and N. D. Sidiropoulos, “On Kruskal’s uniqueness condi-tion for the candecomp/parafac decomposition,” Linear Algebra Appl.,vol. 420, nos. 2–3, pp. 540–552, Jan. 2007.

[46] R. A. Harshman, “Determination and proof of minimum uniqueness con-ditions for PARAFAC1,” UCLA Working Papers in Phonetics, vol. 22,pp. 111–117, (University Microfilms, Ann Arbor, No. 10,085), 1972.

[47] S. M. Kay, Fundamentals of Statistical Signal Processing: Esti-mation Theory. Upper Saddle River, NJ, USA: Prentice-Hall,1993.

[48] X. Liu and N. D. Sidiropoulos, “Cramer–Rao lower bounds for low-rank decomposition of multidimensional arrays,” IEEE Trans. SignalProcess., vol. 49, no. 9, pp. 2074–2086, Sep. 2001.

[49] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholdingalgorithm for linear inverse problems,” SIAM J. Imag. Sci., vol. 2, no. 1,pp. 183–202, 2009.

Zhou Zhou received the B.Sc. degree from theUniversity of Electronic Science and Technology ofChina in 2011, where he is currently pursuing thePh.D. degree. His current research interests includecompressed sensing, sparse theory, and tensor signalprocessing.

Jun Fang (M’08) received the B.S. and M.S. degreesfrom Xidian University, Xi’an, China, in 1998 and2001, respectively, and the Ph.D. degree from theNational University of Singapore, Singapore, in2006, all in electrical engineering.

In 2006, he was a Post-Doctoral Research Asso-ciate with the Department of Electrical and Com-puter Engineering, Duke University. From 2007to 2010, he was a Research Associate with theDepartment of Electrical and Computer Engineering,Stevens Institute of Technology. Since 2011, he has

been with the University of Electronic Science and Technology of China.His research interests include compressed sensing and sparse theory, massiveMIMO/mmWave communications, and statistical inference.

Dr. Fang received the IEEE Jack Neubauer Memorial Award in 2013for the Best Systems Paper published in the IEEE TRANSACTIONS ON

VEHICULAR TECHNOLOGY. He is an Associate Technical Editor of the IEEECommunications Magazine and an Associate Editor of the IEEE SIGNAL

PROCESSING LETTERS.

Linxiao Yang received the B.S. degree fromSouthwest Jiaotong University, Chengdu, China, in2013. He is currently pursuing the Ph.D. degree withthe University of Electronic Science and Technologyof China. His current research interests include com-pressed sensing, sparse theory, tensor analysis, andmachine learning.

Hongbin Li (M’99–SM’08) received the B.S. andM.S. degrees from the University of Electronic Sci-ence and Technology of China in 1991 and 1994,respectively, and the Ph.D. degree from the Univer-sity of Florida, Gainesville, FL, USA, in 1999, allin electrical engineering.

From 1996 to 1999, he was a Research Assis-tant with the Department of Electrical and Com-puter Engineering, University of Florida. He wasa summer Visiting Faculty Member with the AirForce Research Laboratory in 2003, 2004, and 2009.

Since 1999, he has been with the Department of Electrical and ComputerEngineering, Stevens Institute of Technology, Hoboken, NJ, USA, where heis currently a Professor. His general research interests include statistical signalprocessing, wireless communications, and radars.

Dr. Li was a member of the IEEE SPS Sensor Array and MultichannelTechnical Committee (TC) from 2006 to 2012. He has been a memberof the IEEE SPS Signal Processing Theory and Methods TC since 2011.He is a member of Tau Beta Pi and Phi Kappa Phi. He received theIEEE Jack Neubauer Memorial Award in 2013 for the Best Systems Paperpublished in the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, theOutstanding Paper Award from the IEEE AFICON Conference in 2011, theHarvey N. Davis Teaching Award in 2003, the Jess H. Davis Memorial Awardfor excellence in research in 2001 from the Stevens Institute of Technology,and the Sigma Xi Graduate Research Award from the University of Floridain 1999. He has been involved in various conference organization activities,including serving as a General Co-Chair of the 7th IEEE Sensor Array andMultichannel Signal Processing Workshop, Hoboken, NJ, in 2012. He was anAssociate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING from2006 to 2009, the IEEE SIGNAL PROCESSING LETTERS from 2005 to 2006,and the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS from 2003to 2006. He has been an Associate Editor of Signal Processing (Elsevier) since2013 and the IEEE TRANSACTIONS ON SIGNAL PROCESSING since 2014,and a Guest Editor of the IEEE JOURNAL OF SELECTED TOPICS IN SIGNALPROCESSING and EURASIP Journal on Applied Signal Processing.

Page 15: 1524 IEEE JOURNAL ON SELECTED AREAS IN …static.tongtianta.site/paper_pdf/f93e6104-554f-11e9-8339-00163e08… · review on tensors and the CP decomposition. More details regarding

1538 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 35, NO. 7, JULY 2017

Zhi Chen received the B.Eng., M.Eng., and Ph.D.degrees in electrical engineering from the Universityof Electronic Science and Technology of China(UESTC) in 1997, 2000, and 2006, respectively.In 2006, he joined the National Key Laboratoryon Communications, UESTC, where he is currentlya Professor. He was a Visiting Scholar with theUniversity of California at Riverside, Riverside,USA, from 2010 to 2011. His current researchinterests include relay and cooperative communica-tions, multiuser beamforming in cellular networks,

interference coordination and cancellation, and terahertz communication. Hehas served as a Reviewer for various international journals and conferences,including the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY and theIEEE TRANSACTIONS ON SIGNAL PROCESSING.

Rick S. Blum (F’05) received the B.S. degree fromPennsylvania State University in 1984 and the M.S.and Ph.D. degrees from the University of Pennsylva-nia in 1987 and 1991, respectively, all in electricalengineering. From 1984 to 1991, he was a mem-ber of Technical Staff, General Electric Aerospace,Valley Forge, PA, USA, and he graduated fromGE’s Advanced Course in Engineering. Since 1991,he has been with the Electrical and Computer Engi-neering Department, Lehigh University, Bethlehem,PA, USA, where he is currently a Professor and

holds the Robert W. Wieseman Chaired Research Professorship in electricalengineering. His research interests include signal processing for smart grid,communications, sensor networking, radar, and sensor processing. He was amember of the Signal Processing for Communications Technical Committee(TC) of the IEEE Signal Processing Society. He is a member of the SAMTC of the IEEE Signal Processing Society and the Communications TheoryTC of the IEEE Communication Society. He was on the Awards Committeeof the IEEE Communication Society. He was an Associate Editor of theIEEE Transactions on Signal Processing and the IEEE COMMUNICATIONS

LETTERS. He has edited special issues for the IEEE TRANSACTIONS ONSIGNAL PROCESSING, the IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL

PROCESSING, and the IEEE JOURNAL ON SELECTED AREAS IN COMMU-NICATIONS. He is on the Editorial Board of the Journal of Advances inInformation Fusion of the International Society of Information Fusion.

Dr. Blum holds several patents. He was an IEEE Signal Processing SocietyDistinguished Lecturer. He is a member of Eta Kappa Nu and Sigma Xi.He received the ONR Young Investigator Award and the NSF ResearchInitiation Award. He received the IEEE Third Millennium Medal. His IEEEFellow Citation for scientific contributions to detection, data fusion and signalprocessing with multiple sensors acknowledges contributions to the field ofsensor networking.