A SDMA–OFDMA Greedy Scheduling Algorithm for WiMAX Networks

download A SDMA–OFDMA Greedy Scheduling Algorithm for WiMAX Networks

of 20

description

IEEE paper on SDMA in WImax

Transcript of A SDMA–OFDMA Greedy Scheduling Algorithm for WiMAX Networks

  • dul

    l Ca

    Heide

    Accepted 21 July 2012Available online 27 July 2012

    Keywords:SDMAOFDMA802.16

    Future wireless networks need to address the predicted growth in mobile trafc volume,

    Access (OFDMA) combined with Space Division Multiple Access (SDMA) techniques are key

    technologies to increase current spectral efciencies. Theperformance of MU-MIMO relies on the precoding capabil-ity of the transmitter and, in order to take full advantage of

    Therefore, in order to reduce the complexity of such a sys-tem the overall problem can be divided into several sub-problems, each conquering the different dimensionsseparately [1]. However, these subproblems still hold ahigh level of complexity and thus, suboptimal strategiesare often proposed in the literature [2].

    In contrast to related work approaches, in this paperwe take a comprehensive view at the complete SDMA

    1389-1286/$ - see front matter 2012 Elsevier B.V. All rights reserved.

    Corresponding author. Tel.: +49 3020933181.E-mail addresses: [email protected] (A. Zubow),marotzke@

    informatik.hu-berlin.de (J. Marotzke), [email protected](D. Camps-Mur), [email protected] (X.P. Costa).

    Computer Networks 56 (2012) 35113530

    Contents lists available at SciVerse ScienceDirect

    Computer N

    journal homepage: www.elshttp://dx.doi.org/10.1016/j.comnet.2012.07.009Mobile wireless networks are faced with an everincreasing demand for higher data rates, expected to growexponentially in the next years due to the smartphonemarket success and its corresponding set of bandwidthhungry applications, specially video and web. Transmis-sion schemes based on Orthogonal Frequency DivisionMultiple Access (OFDMA) and Multi-User Multiple InputMultiple Output (MU-MIMO) techniques are promising

    State Information (CSI) on the properties of the communi-cation link to each Mobile Station (MS). A well-knownMU-MIMO mode is Space-Division Multiple Access (SDMA)which can be used in the downlink direction to allow agroup of spatially separable MSs to share the same time/frequency resources.

    SDMAOFDMA systems have to allocate resources intime, frequency and space dimensions to different MSsresulting in a highly complex resource allocation problem.QoSScheduler

    1. Introductionpromising technologies to increase current spectral efciencies. A Joint SDMAOFDMA sys-tem has to allocate resources in time, frequency and space dimensions to different mobilestations, resulting in a highly complex resource allocation problem. In contrast to relatedwork approaches, in this paper we take a comprehensive view at the complete SDMAOFDMA scheduling challenge and propose a SDMAOFDMA Greedy Scheduling Algorithm(sGSA) for WiMAX systems. The proposed solution considers feasibility constraints in orderto allocate resources for multiple mobile stations on a per packet basis by using (i) a lowcomplexity SINR prediction algorithm, (ii) a cluster-based SDMA grouping algorithm and(iii) a computationally efcient frame layout scheme which allocates multiple SDMAgroups per frame according to their packet QoS utility. A performance evaluation of theproposed sGSA solution as compared to state of the art solutions is provided, based on acomprehensive WiMAX simulation tool.

    2012 Elsevier B.V. All rights reserved.

    such a technique, a Base Station (BS) needs full ChannelReceived 1 February 2012Received in revised form 4 June 2012

    expected to have an explosive growth in the next 5 years mainly driven by video andweb applications. Transmission schemes based on Orthogonal Frequency Division MultiplesGSA: A SDMAOFDMA greedy sche

    Anatolij Zubow a,, Johannes Marotzke a, DanieaHumboldt Universitt zu Berlin, Unter den Linden 6, Berlin, GermanybNEC Laboratories Europe, Network Research Division, Kurfrsten-Anlage 36,

    a r t i c l e i n f o

    Article history:

    a b s t r a c ting algorithm for WiMAX networks

    mps-Mur b, Xavier Perez Costa b

    lberg, Germany

    etworks

    evier .com/ locate/comnet

  • for each PUSC cluster (i.e. 14 contiguous subcarriers). How-ever, BSs are allowed to use different beamformingweights for different PUSC clusters.

    Fig. 2 illustrates how SDMA can be implemented overan 802.16e TDD OFDMA DL subframe. Note that the initialMAP messages used to signal the data allocations to eachMS in the DL subframe are transmitted omnidirectionally.However, successive time frequency allocations withinthe OFDMA frame can be shared among different MSs, ifthese are spatially separable. For instance, we can see inFig. 2 how the BS chooses to allocate the same time fre-quency resources for MS1 and MS2 because they can beseparated in the space domain. The lower part of Fig. 2illustrates how the individual PUSC clusters randomlymap to each logical subchannel, and how the BS, by apply-ing a different pre-coding matrix to each cluster, is able toconcurrently steer different spatial beams in different timefrequency regions within the OFDMA frame. Notice that in

    3512 A. Zubow et al. / Computer Networks 56 (2012) 35113530OFDMA scheduling challenge and propose a SDMAOFDMA Greedy Scheduling Algorithm (sGSA) for WiMAXsystems. Specically, we focus on the WiMAX IEEE802.16-2009 [3] standard, which provides scalable OFDMAcapabilities as well as additional MIMO PHY features.According to our proposed sGSA approach the SDMAOFD-MA scheduling solution is divided into three main steps:

    1. SINR prediction estimation of the SINR experienced byeach MS in an SDMA group.

    2. SDMA grouping partitioning of the MSs into a set ofSDMA groups, where spatially compatible MSs arearranged in the same SDMA group.

    3. Frame partitioning partitioning of an OFDMA frameinto multiple containers each representing a singleSDMA group.

    This paper is an extension of the work from the authorsin [4] which is currently under submission. The paper athands extends [4] by (i) designing a low complexity SINRprediction algorithm to reduce the computational load ofthe overall SDMAOFDMA scheduling solution, (ii) evalu-ating the computational load reduction and correspondingSINR prediction error impact, (iii) recalculating all resultsin the performance evaluation section accordingly and(iv) considering additional factors in the system perfor-mance evaluation as number of operations, signaling over-head and buffer size limitations.

    The rest of the paper is structured as follows. Sections 2and 3 provide the IEEE 802.16-2009 basic elements consid-ered in our solution and summarize the most relevantwork in the area. Next, Section 4 formalizes the problemdescription and formulates the SDMAOFDMA schedulingas an optimization problem. Section 5 describes our pro-posed sGSA solution along with the designed algorithmsrequired. In Section 6 a performance evaluation is con-ducted through simulations. Finally, Section 7 concludesthe paper.

    2. SDMAOFDMA support in IEEE 802.16e

    The focus of this paper is an IEEE 802.16e TDD OFDMAsystem where Base Stations (BSs) access the channel bymeans of Downlink (DL) and Uplink (UL) subframes. Eachsubframe is split into a set of OFDMA symbols in timeand logical subchannels in frequency, where the mappingbetween physical subcarriers and logical subchannels isdetermined by a permutation scheme. 802.16e supports dif-ferent permutation schemes being Partial Usage of SubCar-riers (PUSC) the most widely used. PUSC is a distributedpermutation scheme that achieves a frequency diversitygain by mapping a set of subcarriers randomly spreadedacross the available bandwidth into each logical subchan-nel. In the rest of the paper we will consider PUSC as thepermutation scheme used by our system.

    PUSC works in the following way. First, the availablephysical subcarriers are divided into clusters of 14 contigu-ous subcarriers, containing each cluster 2 pilot subcarriersand 12 data subcarriers (as depicted in Fig. 1). The pilotsubcarriers in each cluster are used by a receiver to esti-mate the channel between transmitter and receiver andthus demodulate the received data subcarriers. Then, inorder to achieve the intended diversity gain, physical clus-ters are randomly renumbered, using a random seed localto each BS, and sequentially mapped to each logical sub-channel. Each logical subchannel is hence composed oftwo PUSC clusters.

    Leveraging PUSC, a BS can implement beamforming orSDMA in a transparent way for the Mobile Stations (MSs)by simply using dedicated pilots. When beamforming orSDMA is utilized a BS uses a set of beamforming weightsor a precoding matrix,W, in order to shape its antenna pat-tern towards its intended receivers. Thus, if the samebeamforming weights are used for the pilot subcarriersand the data subcarriers within each cluster, a receiver willsimply perceive the beamformed transmission as a modi-ed channel, and will transparently demodulate the beam-formed data. In order to do so, a BS signals the use ofdedicated pilots to a MS, such that the MS only uses the pi-lot subcarriers belonging to its own data allocation todemodulate the received data. Given the use of dedicatedpilots and the cluster structure in PUSC, 802.16e BSs mustuse the same beamforming weights or pre-coding matrix

    Fig. 1. DL PUSC cluster structure.

  • A. Zubow et al. / Computer Networks 56 (2012) 35113530 3513order to allocate the same time frequency resource to dif-ferent MSs, the BS simply adds for each MS an entry in theMAP pointing to the intended region.

    In order to derive the pre-coding matrix, W(f), to be ap-plied to each cluster the BS needs Channel State Information(CSI) from each MS. Indeed, in a wideband channel like theone considered for IEEE 802.16e systems the spatial channelis typically dependent on frequency, i.e. H(u, f) where uindicates the MS and f the subcarrier. Several options areavailable in 802.16e for the MSs to provide the requiredfeedback to the BS. In this paperwewill assumeUL soundingas the employed feedback mechanism and we describe itnext.

    In an 802.16e TDD system, BSs and MSs can run a cali-bration protocol that ensures that the downlink and uplinkchannels are symmetric. Therefore, the BS can estimate thedownlink channel towards each MS by observing the dis-tortion experienced by sounding signals sent by the MSsin the uplink. For this purpose, the BS allocates an ULsounding region in the uplink subframe, as depicted inFig. 2, and indicates in the UL MAP which MS should trans-mit sounding signals on which frequency band within thisregion. If a MS moves, the BS can command a periodictransmission of sounding signals to be able to track thevarying channel. Finally, in order to reduce signalingoverhead, several MSs can be multiplexed within thesame UL Sounding region, e.g., making use of decimationwhereby each MS only sends a sounding signal every Dthsubcarrier.

    Fig. 2. SDMA over 8023. Related work

    In this section we discuss related work relevant to thecontributions presented in this paper. This section isdivided in four subsections: (i) SDMA Transmittertechniques, (ii) SDMA Grouping Algorithms, (iii) SDMAGrouping Metrics, and (iv) Full SDMAOFDMA Solutions.

    3.1. SDMA Transmitter techniques

    In SDMA multiple antennas are used to send signals toseveral spatially separated MSs using the same timefre-quency resources. For this purpose knowledge of the spa-tial signatures of each MS is required, that can be utilizedin different ways. Next, we describe two major SDMATransmitter techniques: (i) Dirty Paper Coding (DPC), and(ii) Beamforming.

    First, DPC is a technique that using knowledge about theinterference at the transmitter, pre-codes the transmittedsignal in order to cancel the effect of such interference. In-stances of DPC techniques include Costa precoding [5],TomlinsonHarashima precoding [6] and the vector per-turbation technique [7].

    Second, beamforming allows the separation of multi-ple MSs in space by means of allocating a beamformingdirection to a given MS while treating other MSs asnoise. Common beamforming techniques are the Zero-Forcing technique [1] that introduces nulls in the direc-

    .16e using PUSC.

  • of the spatial channel cannot longer be described by asingle channel transfer matrix. Due to frequency selectiv-

    are grouped according to their channel conditions,

    3514 A. Zubow et al. / Computer Networks 56 (2012) 35113530tions of the interferers, and techniques that maximizethe receiver SIR like Minimum Variance DistortionlessResponse, Maximum SIR or Minimum Mean Square Error[8].

    Compared to DPC, beamforming techniques are subop-timal [9]. However, beamforming has a lower computa-tional cost than DPC, and therefore is a valuablealternative in practical systems. In addition, under somecircumstances beamforming can achieve the same capacityas DPC [10].

    3.2. SDMA grouping algorithm

    Randomly selecting the MSs that are multiplexed inspace using SDMA may result in a high interference be-tween MSs. Therefore, in practice a grouping algorithm isused that groups MSs into suitable subsets that can beeffectively separated in space.

    The task of a grouping algorithm is thus to determine anefcient SDMA group with a complexity as low as possible.Next, we describe some of the SDMA grouping algorithmscommonly used in the state of the art. The Best Fit Algo-rithm (BFA) [2] is a greedy algorithm that constructs anSDMA group starting with the MS that experiences thelowest SNR/SINR, and sequentially extending this groupby admitting the MS providing the highest increase of a gi-ven grouping metric (refer to Section 3.3). Once the groupreaches its target size, or no more MSs can increase thegrouping metric, the SDMA group is considered complete[1]. Another closely related algorithm with even lowercomplexity is the First Fit Algorithm (FFA). Its only differ-ence compared to the BFA, is that instead of admittingthe MS providing the highest increase of a given groupingmetric, FFA adds to the SDMA group the rst MS that pro-vides any increase in the grouping metric [2].

    3.3. SDMA grouping metric

    As mentioned before, grouping algorithms require agrouping metric in order to compare candidate SDMAgroups with each other. In general, a grouping metricmakes use of Channel State Information (CSI) in order tomap the characteristics of the spatial channels of the MSsto a scalar value that quanties how efciently these MSscan be separated in space [1]. The most commonly usedgroup metrics are group capacity and group minimum SINR.The former considers the Shannon-Hartley capacity of anSDMA group as the spatial compatibility metric, and thelatter returns the lowest SINR of an MS in a given SDMAgroup. In order to compute the previous metrics though,the actual beamforming weights or the power allocationhave to be considered which involve complex vector/ma-trix operations. Therefore, the performance achieved bysuch metrics comes at the expense of an increased com-plexity [11]. In order to decrease complexity, groupingmetrics that are based only on the spatial correlation andon the channel gains of the MSs, involving much simplervector/matrix operations, have been proposed [12,2,1].

    Finally, the previous grouping metrics usually considera narrow-band channel, however in a wideband or fre-quency selective channel like 802.16e the characteristicswhereas, within a group, MSs are prioritized accordingto packet deadlines. Depending on the intra-beam inter-ference MSs are assigned to a beam or moved to the reg-ular non-AAS zone. The improvement achievable whenusing AAS is evaluated for FTP and VoIP services.

    The main difference between these approaches and thework presented in this paper is the assumption of an ideal-ized LOS channel allowing beamsteering based on the esti-mation of the angle of direction of an MS. Instead, the workpresented in this paper explicitly models the wirelesschannel coefcients, taking advantage of the LOS compo-nent as well as possible scattering and multi-path effects.

    1 IEEE EESM Reference, www.ieee802.org/16/tge/contrib/C80216e-05_141r3.pdf.ity the channel transfer matrix is different for differentsubcarriers; hence the spatial compatibility among MSsin an SDMA group depends on the used subcarriers. Inorder to address this fact, in this paper we considerPHY abstraction models like the Exponential EffectiveSIR Mapping (EESM1) that allow to collapse a set of per-subcarrier SINR measurements into an effective SINR mea-surement, that can then be used to calculate a groupingmetric like group capacity. The idea of using EESM as amulti-carrier grouping metric is not far-fetched, as theauthors of [13] concurrently to the work presented in thispaper chose to do the same thing.

    3.4. Full SDMAOFDMA solutions

    To the best of the authors knowledge only a few pro-posals in the state of the art evaluate the performance ofSDMA in OFDMA-based systems like 802.16e. Next, we dis-cuss these proposals.

    Nascimento and Rodriguez [13] proposed a joint util-ity packet scheduler and SDMA-based resource allocationscheme for 802.16e. Similar to the work presented inthis paper, [13] uses the EESM compression method tocapture frequency selective channels and build SDMAgroups consisting of spatially uncorrelated MSs. The pro-posed scheduling solution therein assigns MSs to beams,using an FFA-like grouping algorithm and spatial correla-tion (computed based on the EESM SINR) as a groupingmetric. In addition, QoS is achieved through a prioritizedassignment based on a utility framework. In order to re-duce complexity, the authors in [13] reduce the SDMAcapabilities to one third of the OFDMA frame leavingthe rest for non-SDMA transmissions. Performance isevaluated under the assumptions of a full queue trafcmodel.

    Similar to the previous work, Yao et al. [14] evaluatedthe MAC performance of an OFDMA system based on thesuperseded 802.16a-2003 standard, which suggests aframe layout separated in Adaptive Antenna System(AAS) and non-AAS capable zones. The scheduling solu-tion presented therein prioritizes MSs based on theirchannel conditions and packet delays. Specically, MSs

  • h2,u, . . . , hM,u]T, where M is the number of available anten-

    s = 1, . . . , S, and the corresponding channel coefcients,the SINR of MS u can be computed as [1]:

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3515nas at the BS and Hu holds all spatial correlations and mul-ti-path effects of the MS u channel. The BS uses precodingweights to adjust the transmitted signal in order to miti-gate the propagation effects of the channel and to controlthe interference among MSs in a SDMA group. Therefore,every channel coefcient hi,j is associated with a beam-

    T4. Problem statement

    This section describes the physical and QoS modelsused throughout this paper and formulates our suggestedSDMAOFDMA scheduling solution as an optimizationproblem.

    4.1. PHY model

    We consider the DL of a single BS with a maximumtransmit power that is being equally distributed amongthe served MSs. The BS is equipped with an Uniform LinearArray (ULA) and there are K single-antenna MSs associatedwith the BS (Fig. 3). We consider OFDMA where the chan-nel bandwidth is divided into S orthogonal subcarriers. Inaddition, physical subcarriers are mapped to logical sub-channels using PUSC. Regarding CSI knowledge in the BS,we assume that for each MS u the BS knows the channeltransfer matrix (Hu,s) of every Dth subcarrier, hence result-ing in SD Hu;s coefcients per MS.

    2 In addition, we assumethat the BS has full CSI on each Hu,s, i.e. there are no estima-tion errors. The required channel coefcients are generatedwith the WIM simulator [15], where only low to medium-speed mobility is considered.

    Next, we introduce the format used for the channeltransfer matrix, the beamforming weights, and the SINRcalculation. For an MS u (u = 1, . . . , K) the channel coef-cient hi,u denotes the sampled frequency response of thechannel between the BS antenna element i and the receiverantenna of the u-th MS as depicted in Fig. 3. Thus, thechannel coefcients belonging to a given MS u can begrouped into a column vector M 1, i.e. Hu = [h1,u,

    Fig. 3. System model: DL of a single BS equipped with Antenna Array andmultiple single-antenna MSs.forming weight wi,j, and Wu = [w1,u, w2,u, . . . , wM,u] is thenormalized weight vector for a given MS u.

    The effective channel gain at the MS can then be com-

    puted as WHu Hu 2

    2, where ()H denotes the conjugate

    transpose of a vector/matrix. Notice, that multiple MSsbeing served during the same timefrequency resourcecause interference upon each other (intra-cell interfer-ence). Therefore, given a certain frequency resource

    2 In simulations D was set to 24, i.e. the number of Hu,s per MS was 30.cu;s Ru;s WHu;s Hu;s

    22

    r2 PMv1;uvRu;s WHv;s Hu;s 22

    1

    where Ru;s represents the average received power3 at MS uon subcarrier s. We assume that the BS allocates the samepower to all subcarriers.

    In this paper, we assume that the BS uses the MinimumMean Square Error technique (MinMSE [8]) in order to com-pute beamforming weights. MinMSE works in the follow-ing way. For a given frequency resource unit s the matrixHs contains in each u-th column the channel coefcients(Hu,s) of the u-th MS. Let Rss Hs HHs be the autocorrela-tion matrix of all MSs, and Rnn the noise correlation matrix.Then, the weight matrix Ws containing the beamformingweights for every u-th MS in an SDMA group, can be com-puted as:

    Ws Rss Rnn1 Hs; 2where the u-th column in Ws contains the weights for theu-th MS for the given frequency resource unit s.

    For comparison reasons, we will also consider in thispaper the possibility of beamforming to a single MS in-stead of using SDMA. In this case, we assume that the BSuses Maximum Ratio Combining (MRC [16]) to calculatethe required beamforming weights:

    WHu;s HHu;s

    kHu;sk223

    In case of MRC beamforming, the achieved SNR of a singleMS u is also given by Eq. (1), using the MRC beamformingweights and ignoring the intra-cell interference term. Thedenominator of Eq. (3) normalizes the weights to unityso that the average total transmit energy remains thesame. The same normalization is performed for theweights given by the MinMSE technique.

    Finally, the set of per-subcarrier SINR values can be col-lapsed into a single equivalent SINR value for each MSmaking use of the EESM mapping technique.

    4.2. Base station architecture and QoS model

    We consider an ofine BS architecture as formally de-ned in [17]. In an ofine architecture there are two mainbuilding blocks that cooperate in order to maximize theutility of the scarce radio resources; these blocks are theQoS Scheduler and the SDMAOFDMA scheduler. The taskof the QoS Scheduler is to select from the higher layerpackets available in each ows buffer a candidate list ofpackets to be transmitted in the next DL subframe. In addi-tion, the QoS scheduler tags each individual packet Pi witha utility value ui. Typical QoS schedulers that can be accom-modated in this architecture are Proportional Fair or De-cit Round Robin [18]. Thus, after receiving a set ofcandidate packets from the QoS Scheduler, the task of the

    3 Path loss and Shadowing are included in Ru;s and not in Hu.

  • ( )3516 A. Zubow et al. / Computer Networks 56 (2012) 35113530SDMAOFDMA scheduler is designing the SDMAOFDMAframe layout, i.e. assign the time, frequency and space re-sources, in order to maximize the amount of utility carriedin the SDMAOFDMA frame. The considered ofine BSarchitecture is illustrated in Fig. 4.

    4.3. Problem formulation

    We now introduce the DL SDMAOFDMA schedulingproblem considered in this paper. As previously stated,the main objective of a DL SDMAOFDMA scheduler is tomaximize the total utility carried in the DL subframe.However, optimally assigning time, frequency and spaceresources is a very complex task to be efciently imple-mented in practice. Therefore, in order to reduce complex-ity we divide the overall optimization problem into twoindependent major tasks:

    1. SDMA group formation: we look at the problem of creat-ing spatially compatible SDMA groups, where eachgroup consists of a set of stations which can be sepa-rated in the space domain.

    2. OFDMA frame construction: given a set of SDMA groups,

    Fig. 4. QoS architecture considered in the BS.we consider the problem of scheduling the subset ofthese groups within the OFDMA DL subframe that max-imizes the overall utility carried in the OFDMA frame.

    4.3.1. SDMA group formationThe objective of this problem can be formalized in the

    following way:Instance: A set U of MSs having buffered packets as

    delivered by the QoS scheduler (Fig. 4).Objective: Find the optimal partition of U into a set

    G fG1;G2; . . . ;Gmg of non-overlapping and non-emptysets of MSs that cover all of U, i.e.

    Smi1Gi U and Gi \ Gj = ;,

    "i j, such that the average group capacity is maximized.Here Gi = {ui,1, ui,2, . . . } represents a single SDMA groupwhere ui,j represents the j-th MS in group i. Moreover,cu;Gi is the SINR of MS u in group Gi.

    This optimization problem can be formulated asfollows:

    tainer

    uled w

    This optimization problem can be formalized as follows:

    fSg arg maxfSgXjGji1

    Xu2Gi

    Xpu 2 P0uP0u# Pu

    upu

    8>>>>>>>>>:

    9>>>>>=>>>>>;

    7ithin the frame.container such that the total utility, u(), carried by theOFDMA frame is maximized, i.e. S 2 s0; s1; . . . ; sjGj, whereDLslots is the total number of time slots in the DL frame.Here, the rst container of width s0 represents the spacereserved for the DL-MAP to signal the data bursts sched-. Find the optimal width si 2 [0 . . . DLslots] for eachG arg maxG 1jGjXjGji1

    Xu2Gi

    log1 cu;Gi 4

    subject to:

    (I) A maximum group size limited to M, i.e. the numberof available antennas at the BS:

    jGij 6 M;8Gi 2 G 5(II) A minimum SINR constraint, that makes sure that

    each MS in an SDMA group can be served at leaston the lowest Modulation and Coding Scheme(MCS):

    cu;Gi P cMIN;8Gi 2 G;8u 2 Gi 6Notice in Eq. (4) that the target metric is divided by thenumber of SDMA groups, jGj, in order to favor solutionswith a fewer number of groups. The reason is that solu-tions with fewer groups tend to be more efcient becausedifferent groups can only be scheduled in different timefrequency resources.

    Complexity: The optimal solution for this group forma-tion problem was shown to be NP-complete in [2].

    4.3.2. OFDMA frame constructionBefore formalizing this problem, we introduce an addi-

    tional design decision that consists of forcing an allocatedSDMA group to span always an integer number of columnsin the OFDMA frame. The main reason for this decision isthat it enables the previous SDMA group formation prob-lem to be agnostic to the actual frequency region wherethe data will be transmitted, i.e. the required SINR valuesin Eq. (4) can be derived assuming that an allocation spansthe whole frequency range. In addition, this design deci-sion turns the packing of SDMA groups within an OFDMAframe into a one-dimensional packing problem instead ofa two-dimensional packing problem. In Section 6 we willshow that this simplication allows to design well per-forming practical solutions. Thus, our OFDMA frame con-struction problem is formalized in the following way:

    Instance: A partition of the set U of MSs into a set G ofnon-overlapping and non-empty sets of MSs that cover allof U.

    Objective: The packing area for each SDMA group Gi 2 Gin the OFDMA frame is controlled by an associated con-

  • Table 1Simulation parameters.

    MS nCellWIMNum

    scFadinPack

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3517subject to:

    No. of placement seeds 256(I)

    whereresentthe nuto thecalculand SI

    (II)

    whereheads

    CoEqs. (Knaps

    4 Pachemeg outage margin 3 dBet buffer size K 12.96 KiB(M) by half wavelengthMSs placement UniformMSs speed 2 km/hOFDMA frame duration 5 msTx Precoding algorithm MinMSENo. of Hu,s per MS 30 (D = 24)DL/UL ratio 35/12MAP QPSK 1/2F0 xed_dl_map_overhead (72 bits)F1 dl_map_ie_xed_overhead (44 bits)F2 dl_map_ie_cid_size (16 bits)SC num_subcarriers_slot (48)MRep 1 MAP repetitionWiMAX permutation PUSCoise density (dBm/Hz) 167 dBm/Hzradius 288 mscenario C2 (urban macro-cell, LOS)ber of antennas at BS 15 Omni elements separatedParameter Value

    System bandwidth 10 MHzSubcarrier bandwidth 10.9375 kHzFFT size 1024Center frequency 2.5 GHzFrequency reuse pattern 3 1 1Transmit power 46 dBmA container layer space constraint that makes surethat the total size of packed data bursts P0u in eachcontainer layer does not exceed the available space:Xpu2P0u

    p2burst pu; cu;Gi 6 Si B 8

    cu;Gi represents the SINR of MS u in group Gi; Si rep-s the width of the i-th container, and B representsmber of frequency resources in one time slot (equalnumber of subchannels). The function p2burst ()

    ates the burst size in slots for a given packet sizeNR value.A DL-MAP space constraint that makes sure thatthere is sufcient space for the DL-MAP, so that allpacked data bursts P0u in each layer of each containercan be properly signaled. For this purpose, the con-tainer at position zero is reserved for the DL-MAP.4

    F0 P

    u2UF1 PjP0u j

    p1F2SC

    & MRep 6 S0 B 9

    F0, F1, F2, SC and MRep represent various MAP over-(Section 6, Table 1).

    mplexity: The optimization problem described by7)(9) is a variant of the Multiple-Choice Nestedack Problem [19], because we have to obtain the sub-

    king data bursts below DL-MAP is not allowed in our model.compatible SDMA groups, sGSA contains three differentmodules:

    1. The SINR Predictor Algorithm which allows to obtain anset of MPDUs for each layer in each SDMA container thatmaximizes the utility carried in the OFDMA frame. Thisproblem has been proved to be NP-complete [20].

    5. SDMA Greedy Scheduling Algorithm (sGSA)

    In this section we present the design of our DL SDMAOFDMA Greedy Scheduling Solution (sGSA). As describedin the previous section, in order to implement the assign-ment of time, frequency and space resources in an efcientway, sGSA addresses sequentially the following two sub-problems: (i) rst, the problem of creating spatially com-patible SDMA groups, and (ii) second the problem of sched-uling a given set of SDMA groups within an OFDMA frame.

    In order to address the problem of creating spatially

    Fig. 5. Overview of the proposed sGSA scheduling solution.estimate of the SINR experienced by each MS in a can-didate SDMA group at a lower complexity than state-of-the-art approaches.

    2. The Cluster-based Grouping Algorithm (CBA), that is aheuristic method to solve the SDMA group formationproblem described in Eq. (4).

    3. The Adaptive Modulation and Coding (AMC) module, thatselects the appropriate MCS to be assigned to each MSwithin an SDMA group.

    Finally, given a set of SDMA groups, sGSA solves theOFDMA frame construction problem described in Eq. (7)making use of an SDMAOFDMA packing algorithm.

    The interactions between the different modules thatcompose sGSA are depicted in Fig. 5. Next, we provide a de-tailed description of each module.

    5.1. SINR prediction algorithm

    As described in Section 3.3 we can nd in the state ofthe art grouping metrics like group capacity which

  • resource block b that is obtained assuming a non-SDMAtransmission (MRC beamforming, refer to Eq. (3)). Then,this initial SINR estimation is corrected by the expected in-tra-SDMA group interference caused by the other MSs inthe group. In particular, the applied correction factor iseither vbu if vbu 6 1, or a constant factor b, which in our casewe set to 13.5 dB (0.0443 in linear), if v > 1. The reason forthe constant factor is that our simulations showed that forv > 1, the MSs SINR was on average 13.5 dB below itsSNRMRC value. This applied constant factor though couldbe tuned in order for the SINR Predictor method to be ap-plied in other scenarios.

    In addition, in our system a minimum threshold forpacket reception of 5 dB is required to transmit using thelowest MCS. Therefore, a MS u having a SNRMRC(u) < 18.5and v > 1 will be considered to be in outage by the SINRPredictor.

    In addition, for a given frequency resource block b, the

    3518 A. Zubow et al. / Computer Networks 56 (2012) 35113530provide an accurate representation of the interferencethat each MS in an SDMA group is subject to but involvecomplex vector/matrix operations required to computethe actual pre-coding matrixes (beamforming weights).In addition, low complexity grouping metrics are alsoavailable that, however, do not accurately measure inter-ference within an SDMA group. In this section we pres-ent our SINR Predictor which is a heuristic method toresolve this trade-off.

    In particular, the SINR Predictor is a low-complexityalgorithm for predicting the SINR value of each MS in agiven SDMA group without computing the precoding ma-trix. Notice that as explained in Section 3.3, if per-subcar-rier SINRs are known, then the EESM mapping techniquecan be used to derive a single equivalent SINR to computemetrics like group capacity or group minimum SINR. Thekey question to resolve thus is: how to obtain an accurateestimation of the per-subcarrier SINR without performingexpensive pre-coding matrix computations.

    Fig. 6 illustrates the basic difference between calculat-ing the SINR using a precoding matrix and our proposedSINR Prediction algorithm. When calculating the SINR ofa MS (red) within a SDMA group the proposed SINR Pre-diction method considers only the spatial correlation be-tween the MS of interest and the other MSs within thatgroup. Hence, unlike the case of using a precoding matrix,in the SINR Predictor case the spatial correlations be-tween the other MSs within that group are not consid-

    Fig. 6. Illustration of two methods for calculating the SINR of a user: (i)from the actual precoding matrix derived by any arbitrary MIMOtechnique, (ii) using the SINR prediction algorithm.ered. Therefore, the SINR Predictor is a lower complexitysolution but introduces an estimation error in the per-subcarrier SINR.

    In particular, the SINR Predictor estimates the SINR forany SDMA group G and MS i 2 G on frequency resourceblock b in the following way:

    ~cbu;G SNRMRCu 1 vbu; if vbu 6 1

    b; else

    (10

    where vbu P

    u02G;u0usbu;u0 , i.e. the sum of the squared spa-tial correlations between MS u and the other MSs in groupG on frequency resource block b.

    The idea of the SINR Predictor is to consider an initialSINR for a MS u in an SDMA group G and frequencyterm ssu;u0 in Eq. (10) is computed as:

    su;u0 pu;u0 2 11

    where given the channels hu and hu0 of MSs u and u0,respectively, p represents the spatial correlation betweenthese two MSs on a given frequency resource block andis given by the maximum normalized scalar product[21,11]:

    pu;u0 jhHu hu0 j

    khuk2khu0 k212

    Complexity: An arbitrary MIMO beamformer techniquethat optimizes beamforming weights under a certaincriteria does so by solving a system of M linear equa-tions. Using Gaussian elimination for this purpose re-sults in an arithmetic complexity of OM3 [22], whereM is the number of antennas in the BS. Next, we willshow that the complexity of our proposed SINR Predic-tor is OM.

    Algorithm 1. Cluster-based SDMA grouping algorithm.

  • our CBA grouper) the complexity could be reduced evenfurther. The reason is that every time a new MS is addedto a group, every MS within the group only needs to addits correlation to the added MS, whereas the newly addedMS sums the correlation to every other MS, hence resultingin (g 1 + g 1) lookups and a complexity of OM, i.e.ON in common complexity terminology where N corre-sponds in this case to the number of MSs.

    5.2. Cluster-based grouping algorithm

    The SDMA group formation problem described in Eq. (4)has a high computational complexity. Therefore, in thissection we present a heuristic method to reduce it. Ideally,the proposed heuristic solution should have the followingproperties:

    Create SDMA groups that maximize a given groupcapacity metric.

    Keep the number of SDMA groups low, because differ-

    functions (clusterSwap and clusterJump) (lines 67 inAlgorithm 1), used to exchange the membership ofMSs between groups. ClusterSwap exchanges two MSsfrom different clusters with each other, whereas cluster-Jump allows a MS to leave one cluster and join anotherone. In order to bound execution time, the number ofswap and jump operations is bounded by K, i.e. thenumber MSs (line 5 in Algorithm 1). In addition, if atthe end of one iteration, some MSs are found to bein outage (meaning that the number of clusters is toosmall to handle all MSs) new clusters are added andthe in-outage MSs are distributed among them (lines1113 in Algorithm 1). Notice that in each iterationthe number of clusters is increased by at least one. Fi-nally, CBA terminates when no more MSs are inoutage.

    The clusterSwap operation is described in Algorithm 2.Its main task is to nd the swap between two MSs fromdifferent clusters, where the improvement in terms of agiven group metric c() is largest. Notice, that when com-puting the group metric, only MSs having a SINR abovethe minimum threshold are considered, where SINR iscalculated using our SINR Prediction algorithm (refer toSection 5.1). In addition, the swapping operation pre-serves valid groups; i.e. if before the swap none of the

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3519timefrequency resources.

    ent groups can only be scheduled in non-overlappingThe complexity of computing the term s itself is linearon the number of antennas in the BS, i.e. OM. In addition,within the context of an SDMA group evaluation, smust becalculated for each MS to every other interfering MS. Thismeans that for a given group size g there will be a totalof g(g 1) evaluations of the term su;u0 , where g 6M, henceresulting in an overall complexity of OM3. However, theterm su;u0 can be pre-computed and stored. Pre-generatingthe values for every MS pair reduces the complexity ofevaluating an SDMA group to g(g 1) simple lookups,and, given that g 6M, the overall complexity is OM2. Fi-nally, for any SDMA grouper that evaluates SDMA groupsiteratively through sequentially adding MSs (e.g. BFA orAlgorithm 2. Cluster swap operation in CBA. Ensure that MSs in an SDMA group are not in outage, i.e.their SINR is above a minimum threshold.

    Our solution to the SDMA group formation problemis the Cluster-Based grouping Algorithm (CBA) describedin Algorithm 1. The core idea behind CBA is the follow-ing. Initially, CBA creates dK/Me clusters, assuming thatwith M antennas we can spatially separate M MSs, andrandomly distributes MSs among these clusters (line 3in Algorithm 1). Afterwards, the initial random groupsare iteratively improved by means of two elementary

  • timefrequency resources (line 11 in Algorithm 3).

    AMC module is to correct any errors introduced by theSINR predictor.

    In order to derive the MCS to be assigned to each MS,

    grouping.

    3520 A. Zubow et al. / Computer Networks 56 (2012) 35113530Complexity: Given that clusterSwap and clusterJumphave a complexity of OK2, a maximum of K swap andjump operations are allowed (line 5), and in theworst case there is always at least one MS in outageafter each iteration, the worst case computationalcomplexity of CBA regarding its execution time isOK4, i.e. ON4 in common complexity terminologywhere N corresponds in this case to the number ofMSs. For comparison, BFA has an execution time com-plexity of ON2.

    5.3. Adaptive Modulation and Coding (AMC) module

    The Adaptive Modulation and Coding (AMC) modulereceives a set of proposed SDMA groups from CBA andis in charge of deciding the Modulation and CodingScheme (MCS) to be assigned to each MS in each spatialgroup. Notice though that the received SDMA groupswere created based on the SINR estimation providedby the SINR Predictor which, as we will later show,tends to be optimistic especially as the number ofantennas grows. Therefore, one of the goals of theFinally, a jump is only allowed if after the jump all MSin the new cluster are not in outage (line 1214 in Algo-rithm 3).

    Algorithm 3. Cluster jump operation in CBA.MSs within a cluster is in outage then this property ispreserved after performing the swap (lines 1012 inAlgorithm 2).

    The clusterJump operation is similar to clusterSwapexcept that a MS leaves a cluster and joins another clus-ter so that the improvement in terms of group metric islargest. ClusterJump is described in Algorithm 3. Inaddition, in clusterJump, the computed group metric, C0,is divided by the number of clusters in order to accountfor the fact that having MSs grouped in less clusters ismore efcient, because they can share the sameNotice that being the SINR Predictor algorithm optimis-tic, the AMC module might detect some of the MSs in theproposed SDMA groups to be in outage. This situation isdealt with in Algorithm 4, where for each SDMA group Ghaving MSs in outage, the worst MS u (largestSNRMRC(u) c(u, G)), is removed from the group. Then, anew precoding matrix is calculated for the MSs left in thegroup. The removed MSs are collected and handed backto CBA that proposes additional SDMA groups containingonly the in-outage MSs (Fig. 5). In order to decrease thelikelihood of these MSs being in outage again, the maxi-mum allowed group size is reduced by one.5 In the worstcase the algorithm terminates when a maximum group sizeof two is reached.

    5 In Section 6 we will show that the estimation accuracy of the SINRPredictor is better for small group sizes.the AMC module computes the actual pre-coding ma-trixes per frequency block (e.g. PUSC cluster) that willbe used to steer separated beams for each MS in an SDMAgroup. Once the pre-coding matrixes are computed, theAMC module can obtain a reliable SINR estimate for eachMS and frequency block by means of Eq. (1). Then, theAMC module collapses the set of per-frequency blockSINR values, {cu,b}, into a single equivalent SINR valuefor each MS making use of the EESM mapping technique.Finally, a single scalar SINR value is available for each MSand given a target block error rate (1% in our case), theAMC module selects the proper MCS by doing a simplelook up operation on a pre-computed Block Error Rate-SNR table.

    Algorithm 4. Required preprocessing step in the AMCmodule when SINR prediction algorithm is used for

  • Algorithm 5. The SDMAOFDMA packing algorithm par-titions a frame into containers representing the packingarea for SDMA groups.

    ers. Each container is associated with an SDMA group,

    AOFDMA frame layout.

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3521Fig. 7. WiMAX SDMComplexity: Given a set of SDMA groups, the number ofprecoding calculations performed by the AMC module cor-responds to the product between the number of groups, jGj,and the number of frequency blocks, B. However, addi-tional precoding calculations may be required if there areMSs in outage. In the worst case the SINR predictor con-structs jGj groups; each consisting of M highly correlatedMSs. In that case, in each iteration of Algorithm 4 eachgroup is reduced to a group consisting of only one MS,resulting in a total of OjGjM2B precoding calculations.Therefore, if jGj dK=Me the worst case time-complexityregarding the number of precoding calculations isOBMK, i.e. ON in common complexity terminologywhere N corresponds in this case to the number of MSs.In Section 6 we will show that in practical scenarios, thisalgorithm operates well below its worst case complexity.5.4. SDMAOFDMA packing algorithm

    Having addressed in the previous sections the SDMAgroup formation problem described in Eq. (4), we presentin this section our heuristic solution to the SDMAOFDMAframe construction problem described in Eq. (7).

    Given a set of SDMA groups we need to select a subsetof groups to be scheduled in the OFDMA frame and allocatefor each group a certain space within the frame, which wehereafter refer to as a container. In addition, we also needto allocate a certain amount of space for signaling (MAPmessages) that varies depending on the selected SDMAgroups. Fig. 7 illustrates an example frame layout to begenerated by sGSA, where we can recognize three contain-

  • be calculated for each group due to a changing constraintfor the maximum allowed signaling (MAPmax). Thus the to-tal number of times the container metric is executed is Ktimes for the initial K containers and DLslots K times untilthe algorithm terminates. The container metric makes useof the packing estimator (Algorithm 7) which itself hascomputational complexity of OP, where P is the number

    3522 A. Zubow et al. / Computer Networks 56 (2012) 35113530whereas the container at position zero is reserved for thetransmission of signaling data (Preamble, FCH, MAPs). Eachcontainer has a variable width (time slots) but alwaysspans all subchannels (subcarriers) of a given time slot.Moreover, each container consists of multiple spatial lay-ers carrying data bursts belonging to the MSs of the asso-ciated SDMA group. In the example in Fig. 7 the numberof spatial layers is two and three for the two data contain-ers respectively.

    Thus, our SDMAOFDMA packing algorithm fullls thetasks of selecting and allocating SDMA groups within theOFDMA frame, in order to maximize the overall carriedutility. Algorithm 5 presents our SDMAOFDMA packingalgorithm that consists of two phases: (i) the preparationphase (lines 34), and (ii) the extension phase (lines 516).

    Preparation phase A container can be either in statusscheduled or unscheduledwhere at the beginning all contain-ers are set to unscheduled. The initial width of a container isthe maximum between one time slot, and the minimumnumber of time slots required to carry at least one packet.

    Extension phase In this phase, upon each iteration, acontainer metric representing the value of each candidatecontainer is used to evaluate different containers (lines612). The way to compute this metric is described inAlgorithm 6, where the basic idea is to greedily select thecontainer that brings the highest increase in terms of util-ity per slot. In addition, a container is considered a suitablecandidate only if there is enough space in the frame to in-clude its required signaling (MAP message, line 9). Thus, ineach iteration of Algorithm 5 (lines 1316) either anunscheduled container is added to the frame (set to statusscheduled) or an already scheduled container is increasedin size by one time slot. This phase is repeated until alltime slots in the frame have been assigned.

    Algorithm 6. The Container metric is the ratio betweenthe additional utility resulting from the enlargement of thecontainer and the required additional frame space for theMAP and the container itself, i.e. the additional utility perslot gain.

    Table 2Parameter used for the calculation of the DL-MAP size.

    Variable Value

    F0 xed_dl_map_overhead (72 bits)F1 dl_map_ie_xed_overhead (44 bits)F2 dl_map_ie_cid_size (16 bits)SC num_subcarriers_slot (48)after DLslots iterations, where DLslots is the number of timeFinally, the container metric procedure depicted inAlgorithm 6 needs a way to obtain the list of the packetsthat can be packed in the spatial layers of a given containerwhile respecting a maximum MAP size (lines 45, Algo-rithm 6). For this purpose, Algorithm 7 greedily iteratesover the list of packets addressed to the MSs of the corre-sponding SDMA group and adds them to their respectivespatial layers assuming burst concatenation6 until theavailable capacity is reached. The considered packet list ispre-sorted according to utility per slot in descending orderto ensure that the most valuable packets are packed rst.In case the required MAP size is larger than the maximumallowed MAP size the excess number of packets is calculatedand the packets with the lowest utility are removed (lines1314 in Algorithm 7).

    Complexity In the worst case all MS are fully correlatedand we have K containers, i.e. one MS per SDMA group.According to Algorithm 5 in each iteration the width ofan already scheduled container is increased by one or anunscheduled container with its initial width is added tothe frame (lines 1316). Therefore, Algorithm 5 terminates

    slots in the downlink OFDMA frame. In the worst case ineach iteration the container metric (Algorithm 6) has to

    Fig. 8. Frequency reuse patterns 3 1 1 used in simulations. Theroman numerals depict the used frequencies.of packets addressed to a given SDMA group (container).Therefore the overall complexity of our SDMAOFDMAframe construction algorithm is OK DLslots K OP.Notice though that DLslots is bounded by the frame size,and if required the number of packets P can be limited

    6 Concatenated packets are packets addressed to the same MS that arecoded and modulated together. In this way a single entry in the MAP isrequired that reduces overhead. In sGSA the MCS of the resultingconcatenated burst is set to the lowest MCS among the concatenatedpackets, as given by the AMC module.

  • which results in an 802.16 DL subframe composed of 30logical subchannels (60 PUSC clusters). An equal transmit

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 35236. Performance evaluation

    The performance of our proposed sGSA algorithm isanalyzed in this section by means of simulations. First,we describe our simulation framework providing informa-tion about the level of detail considered in the modeling aswell as the conguration parameters and settings consid-ered. Second, a thorough performance evaluation of thedifferent sGSA algorithm components described in the pre-vious section is provided followed by an overall systemstudy.

    6.1. Simulation frameworkby the QoS scheduler. Thus, resulting in a complexity ofOK, i.e. ON in common complexity terminology whereN corresponds in this case to the number of MSs in thesystem.

    Algorithm 7. Algorithm estimates the set of packets whichcan be packed in a given SDMA container where themaximum map size is restricted.

    Table 3MPDU size distribution

    Percentage (%) MPDU Size (bytes)

    18.89 4012.09 15006.14 624.67 14204.60 52

    53.61 Uniform (40, 1500)6.1.1. Simulation parametersA BS with a Uniform Linear Array (ULA) of M elements

    separated by 0.5k is assumed. In each simulation an802.16 OFDMA frame is considered having a duration5 ms, containing a single PUSC zone and a 35/12 DL/UL ra-tio. We explicitly simulate PUSC using the permutationscheme dened in the standard [3]. For robustness, oneMAP repetition and QPSK 1/2 was used for the DL-MAP,which is transmitted omnidirectionally. The size of theDL-MAP was calculated according to the parameters givenin Table 2. A system bandwidth of 10 MHz is considered,power and the same average noise power is assumed forall subcarriers. Since we assume low mobility, the channeltransfer matrix is considered stable for the duration of oneOFDMA frame, which is supported by the fact that the typ-ical coherence time in 802.16 for a carrier frequency of2.5 GHz and a mobility speed of 2 km/h is 200 ms [23],and is a common assumption in the SDMA literature [1].

    We consider that the BS acquires channel feedback fromeach MS using UL Sounding with decimation, as describedin Section 2. In particular, we assume that for each MS theBS knows the channel transfer matrix (hu,f) of every 28thsubcarrier, hence resulting in 30hu,f coefcients per MS,one per logical subchannel. We assume that the BS has fullCSI on each hu,f, i.e. there are no estimation errors.

    MSs are placed uniformly within the cell around the BS.The transmit power of the BS is adjusted so that the meanSNR of MSs placed at the cell edge is equal to the minimumSNR needed for transmission plus an additional outagemargin of 5 dB. The objective is to keep outage due to fad-ing effects7 low. The spatial and temporal characteristics ofeach signal path between MSs and BS are modeled accordingto the LoS C1 Suburban macro-cell scenario from the WIN-NER Phase II Project [24]. The most important parametersused in our simulations are summarized in Table 1.

    6.1.2. Error modelThe performance of an SDMAOFDMA Scheduler has to

    be evaluated in terms of the data that can be successfullydecoded by the spatially multiplexed MSs. For this pur-pose, we implement the following error model. First, be-sides the intra-group interference, we explicitly simulatethe effect of inter-cell interference using a typical3 1 1 deployment. Our evaluation scenario is depictedin Fig. 8, where we analyze the performance achieved inthe target cell while explicitly considering the tier-oneinterfering cells. The interfering cells also implement oursGSA scheme, hence creating random interference on thetarget cell.

    Specically, our error model is implemented as follows.For each burst packed by sGSA in the OFDMA frame, thereal SINR for all used subcarriers considering the actual in-tra-cell as well as inter-cell interference is computed as:

    ~cu;b Pu;b kWHu;b Hu;bk22

    du;b P#BSIntf

    c1 Puc ;b kHuc ;bk22; 13

    where du,b represents the intra-cell interference plus noisegiven by the denominator term of Eq. (1) and #BSIntf is thenumber of tier-one interfering BSs. Then, the set of ob-tained SINR values is collapsed into an equivalent SNR overan AWGN channel using the EESM mapping. Given theequivalent SNR and the used MCS, a block error rate is ob-tained from a pre-computed table look up (BuER). Finally,after sGSA completes the construction of an SDMAOFD-MA frame, the utility/Bytes carried in each burst is multi-plied by (1-BuER) to account for retransmissions.

    7 Such a high outage margin was necessary due to fairness comparisonreasons when comparing with the baseline setup (single antenna BS).

  • 6.1.3. Trafc modelThe following trafc model is used in our evaluation.

    The BS maintain per-MS packet buffers, and packets aregenerated based on a distribution derived from a data col-

    Fig. 9. Performance of the proposed SINR prediction algorithm. The SINRPredictor tends to be optimistic when the number of antennas grows.However, these estimation errors have a negligible impact on theperformance of an SDMA grouping algorithm.

    3524 A. Zubow et al. / Computer Networks 56 (2012) 35113530lection campaign by SPRINT.8 The obtained packet size dis-tribution, given in Table 3, is dominated by TCP ows. We donot considered the offered load to be equally distributedamong all active MSs, but instead consider a more unbal-anced situation where 50% of the MSs generate 80% of thetrafc at the MAC layer.9 The effect of user density will bestudied in the performance evaluation.

    Thus, in order to collect the performance metrics thatwill be shown in the next section, we run 256 independentsimulations (i.e. MSs placement drops) of duration 1 s,explicitly simulating the OFDMA frame construction pro-cess for all OFDMA frames transmitted within 1 s.

    6.1.4. QoS Scheduler modelThe QoS scheduler is in charge of selecting the packets

    available in the BS queues, and assigning to each packet autility value, ui. Throughout our evaluation two differentutility functions will be considered that capture possiblepolicies considered by a service provider: Proportional FairUtility (PF) [18], upf() The widely used Proportional Fairpolicy assigns to a given packet a utility of Upacket = B/T,where T is an Exponentially Weighted Moving Average(EWMA) of the Bytes that have been scheduled for the owthis packet belongs to in previous frames, i.e.: T(i) =(1 a) T(i 1) + a N(i 1), where N(i 1) is the actualnumber of Bytes that were scheduled for this ow in framei 1, being N(i 1) = 0 if there is no data to send in framei 1.10 In addition, B is the packet size in Bytes to be sched-uled for this ow in frame i. Random Utility, urnd() With thisutility function we model an arbitrary policy set by a serviceprovider. Each packet is assigned a random utility betweenzero and B, i.e. Upacket = unif(0, B), where B is the packet sizein Bytes.

    6.2. Simulation results

    In this section we present a thorough evaluation of theperformance of sGSA that is organized as follows. First, inSection 6.2.1, we study the performance and justify theneed of our novel SINR Predictor module. Second, in Sec-tion 6.2.2, we evaluate the performance of our proposedCBA grouping algorithm as compared to other groupingalgorithms in the state of the art. Third, in Section 6.2.3we validate the assumptions behind our SDMAOFDMApacking algorithm. Finally, in Section 6.2.4, we evaluatethe overall performance of sGSA against that of other algo-rithms in the state of the art, under a variety of systemparameters.

    6.2.1. SINR predictionIn order to illustrate the accuracy of the SINR Predictor

    module introduced in Section 5.1, Fig. 9a depicts the mean

    8 https://research.sprintlabs.com/packstat/.9 We also evaluated the homogeneous case where all MS generate the

    same amount of trafc and the results were similar.10 The value for a in the EWMA lter is set to 0.2.

  • error between the predicted and the real per-subcarrierSINR (after precoding) achieved by the SINR Predictor fordifferent number of BS antennas and SDMA group sizes.

    Fig. 10. Performance of the proposed CBA grouping algorithm comparedwith FFA and BFA when using the SINR prediction algorithm. CBA is themost complex algorithm and achieves the highest throughput.

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3525As depicted in the gure, the SINR Predictor tends to over-estimate the per-subcarrier SINR, especially as the numberof antennas in the BS grows, i.e. the SINR Predictor tends tobe optimistic (up to 3 dB for M = 5 antennas). The reasonfor this error is the fact that our proposed SINR Predictordoes not consider the spatial correlations between allMSs in the SDMA group, but only the spatial correlation be-tween the new MS to be added to the group and the rest ofMSs that are already part of the SDMA group, as depicted inFig. 6. However, as we illustrate next, this estimation errordoes not have a signicant impact on performance.

    Fig. 9b illustrates, for a scenario with M = 5 antennas inthe BS and K = 20 MSs in the cell, the achieved servicethroughput for different SDMA grouping algorithms,namely CBA, BFA and FFA. These SDMA grouping algorithmsuse group capacity as its grouping metric, and in order toobtain the per-MS SINRs required to derive the groupcapacity metric, we compare two different methods: (i)SINR Precoding, where a full precoding matrix is computedevery time a new SDMA group has to be evaluated, and (ii)our SINR Predictor. As observed in Fig. 9b, for all consideredgrouping algorithms, the SINR Predictor has a very smallimpact on performance.11 Note that when using the SINRPredictor FFA behaves better than with precoding. The rea-son for this is that FFA does not consider some groups tobe in outage (because it is optimistic; Ref. Section 5.1). How-ever, this mistake is corrected by the AMC module after-wards. Therefore, we conclude that the estimation errorsintroduced by the SINR Predictor module have a negligibleimpact on the performance of an SDMA grouping algorithm.

    In order to evaluate the reduction in complexityachieved by the SINR Predictor with respect to the tradi-tional SINR Precoding method, we dene as our measureof complexity the total number of executed oating pointoperations (FLOPs) when using each of our algorithms un-der study. The number of FLOPs required in typical matrixoperations are obtained from [25].

    Fig. 9c shows the corresponding total number of FLOPsexecuted with the CBA, BFA and FFA SDMA grouping algo-rithms, when using the SINR Precodingmethod, i.e. full pre-coding matrix computation, and our SINR Predictor. Asclearly observed in the gure, the reduction in the numberof FLOPs achieved by the SINR Predictor is very signicantfor all grouping algorithms, resulting in a reduction of al-most two orders of magnitude in the case of CBA. The mainreasons for the smaller complexity required by the SINRPredictor method are: (i) the reduced number of precodingmatrix calculations, i.e. Eq. (2), that in this case are only re-quired in the AMC module, and (ii) the possibility of reus-ing the correlation coefcients between pairs of MSs whenevaluating different candidate SDMA groups.

    Hereafter, we consider that the per-MS SINR computa-tions required to derive the group capacity metric are al-ways computed using our proposed SINR Predictor.

    11 We have seen similar results across a wide range of system parameters.We do not report these results here for the sake of brevity.

  • may represent a real-life scenario where a service providerprioritizes some MSs over the rest.

    Fig. 11 depicts the results of an experiment whereK = 20 MSs are considered, and the number of BS antennasvaries from M = 2 to M = 5. As we can see in the gure theMulti Container approach results in signicant gains, up toa fourfold increase, compared to the Single Container ap-proach. Similarly, Fig. 11b depicts the same results for anexperiment where the number of antennas in the BS isxed to M = 5, and the number of MSs in the cell increasesfrom K = 10 to K = 30. As observed in the gure, the MultiContainer approach outperforms the Single Container ap-proach, especially as K grows. Notice that the main reasonbehind the gains achieved by our Multi Container approachis its higher exibility in being able to conveniently sche-dule high priority packets from different MSs, which maynot be spatially compatible, in different SDMA groups.

    Regarding computational complexity, the Single Con-tainer algorithm has of course lower complexity than the

    Fig. 11. Efciency comparison (utility per second) for different SDMAscheduling policies. The Multi Container approach results in signicantgains compared to the Single Container approach.

    3526 A. Zubow et al. / Computer Networks 56 (2012) 351135306.2.2. Cluster-based Grouping Algorithm (CBA)In this section we investigate further into the perfor-

    mance trade-offs of our CBA grouping algorithm, as com-pared to the traditional BFA and FFA algorithms. Inparticular, we are interested in studying the effect thatthe number of MSs in the cell, K, has on the performance/complexity trade-off of these SDMA grouping algorithms.Notice, that as described in Section 5.2, CBA has a worst-case complexity of OK4, while BFA has a worst-case per-formance of OK2. In addition, all the considered SDMAgrouping algorithms use group capacity as grouping metric.

    Fig. 10a illustrates the service throughput achieved bythe algorithms under study, when considering M = 5antennas in the BS and K = 5, 10, 20, 30 MSs in the targetcell. As we can see in the gure, CBA always achieves thehighest service throughput, followed respectively by BFAand FFA. In particular, the gain of CBA is higher as the num-ber of MSs in the cell, K, increases, and results in around a10% gain for K = 30 MSs. Notice that as K grows, the task ofnding optimal groups becomes harder.

    In order to evaluate the computational complexity ofthe different grouping algorithms, we depict in Fig. 10bthe number of FLOPs executed by each algorithm. As ex-pected, CBA is the most complex algorithm, followed byBFA and FFA, with its relative complexity increasing asthe number of MSs in the cell increases. To further illus-trate the reasons behind the performance-complexitytrade-off exhibited by the different grouping algorithms,Fig. 10c illustrates for each algorithm the number of candi-date SDMA groups evaluated. We can clearly see that CBA,by means of its clusterJump and clusterSwap operations, isthe algorithm evaluating the highest number of candidateSDMA groups, which hence translates into a higher perfor-mance but also a higher complexity.

    Therefore, we conclude that CBA and BFA provide acomplementary performance/complexity trade-off, bothbeing better than FFA, where a particular vendor could se-lect CBA if it is more interested in performance and BFA if itis more interested in reducing complexity. In the rest of thepaper we assume that CBA is the SDMA grouping algorithmbeing used.

    6.2.3. SDMAOFDMA packing algorithmIn this section we evaluate one of the main assumptions

    behind our SDMAOFDMA packing algorithm, described inSection 5.4. This assumption is our claim that packing sev-eral SDMA groups within a single OFDMA frame, instead ofpacking only the best group, delivers signicant perfor-mance gains.

    For this purpose, Fig. 11 compares the performance of ascheduling algorithm that only packs the best SDMA groupin each OFDMA frame, Single Container, and our proposedpacking algorithm, Multi Container, that selects from allthe available SDMA groups the ones that have to be packedin each OFDMA frame. For the purpose of this experiment arandom utility policy is used, and we vary the number ofantennas in the BS, M, and the number of MSs in the cell,K. In particular, the selected random utility policy consid-ers that 20% of the MS are premium MSs, and thus multi-plies their packet utility by a factor of ten in order toaccount for their importance. Notice that this utility policy

  • Multi Container algorithm. In particular, the Single Con-tainer algorithm has a worst-case complexity of

    Transmit Beamforming. Thus, instead of transmitting tomultiple MSs at the same time, sGSA + TxBf schedules

    Fig. 12. Comparison under the PF utility for different tx antennas at BSand precoding techniques in LOS scenario for K = 10 MSs. sGSA + SDMAclearly outperforms sGSA + TxBf and performs very close to the optimalalgorithm.

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3527a single MS in each timefrequency region but booststhe SINR perceived by this MS using beamforming.13

    Hence, with a higher SINR, MSs can use more efcientModulation and Coding Schemes that should increasecapacity.

    GSA: GSA [17] is an OFDMA scheduling solution thatuses two-dimensional packing. GSA though can onlybe used with M = 1 antennas in the BS.

    In order to analyze the previous algorithms we look atthe following aspects: i) how performance scales with thenumber of antennas available in the BS, and ii) how the per-formance of each algorithm depends on the offered loadavailable in the BS. For all the experiments presented in thissection we consider that the QoS Scheduler uses a Propor-tional Fair (PF) utility.

    12 Notice that P can be controlled by the QoS scheduler.13 In this case beamforming weights are obtained using Eq. (3).OK OP, and the Multi Container algorithm a complex-ity of OK 1 DLslots OP, where K is the number ofMSs in the cell, P the number of packets,12 and DLslots thenumber of OFDMA slots in the DL subframe. However, thesedifferences are not signicant when considering the overallcomplexity of an SDMAOFDMA scheduler, because the big-gest part of the complexity corresponds to the SDMA group-ing algorithm that in the case of sGSA is optimized using theSINR Predictor.

    Thus, given the observed results, we conclude that ourproposed approach of packing multiple SDMA groupswithin a single OFDMA frame may translate in signicantperformance gains in practical scenarios. In the rest ofthe paper we consider the SDMAOFDMA packing algo-rithm described in Section 5.4 as the packing solution em-ployed by sGSA.

    6.2.4. Overall performance of sGSAIn this section we evaluate the overall performance of

    the following algorithms:

    Opt + SDMA: This scheduler is an algorithm that solvesthe same problem than sGSA but using exhaustivemethods. In particular: (i) this scheduler assumes a per-fect knowledge of the per-MS SINR in order to computethe group capacity metric, (ii) it generates SDMA groupsby doing an exhaustive search among all possiblegroups, and (iii) performs an exhaustive search in orderto select the SDMA groups that are nally packed in theOFDMA frame. Notice that this algorithm is not imple-mentable in practice, but can be used to benchmarkthe performance of sGSA.

    sGSA + SDMA: This is our sGSA algorithm, composed byall the modules described in Section 5.

    sGSA + TxBf: If M > 1 antennas are available in the BS,instead of doing SDMA another possibility is to do

  • As we can see in Fig. 13a and b, the algorithms understudy are more sensitive to small buffer sizes than to a re-duced number of MSs in the cell. The reason for this is re-lated to the packing algorithm employed by sGSA thatallocates full columns to each individual SDMA layer, andhence can result in wasted resources if a scheduled MSdoes not have enough data to ll at least one column. No-tice that enhanced algorithms could be devised, thatdepending on the load offered by each MS would decideon whether this MS should be transmitted using sG-SA + SDMA or using another scheduler that allows a moreefcient packing of small amounts of data, like GSA. Weleave the study of these algorithms as future work. Lookingat Fig. 13b, we see that increasing the number of MSs in thecell also leads to capacity gains, especially as the number ofantennas in the BS grows, due to the ability of sGSA to comeup with better SDMA groups.

    It is also interesting to see how sGSA + SDMA is moreefcient than sGSA + TxBf even with a reduced number ofantennas in the BS, e.g. sGSA + SDMA with M = 2 antennasachieves a higher capacity than sGSA + TxBf with M = 5 for

    Fig. 13. Impact of buffer size and number of MSs. The algorithms aremore sensitive to small buffer sizes than to a reduced number of MSs inthe cell.

    3528 A. Zubow et al. / Computer Networks 56 (2012) 351135306.2.5. Impact of the number of antennas in the BSFig. 12a depicts the service throughput achieved by our

    algorithms under study in a scenario where we have K = 10active MSs in the target cell. The most remarkable result inthis gure is that sGSA + SDMA clearly outperforms sG-SA + TxBf, up to a threefold gain for M = 5 antennas in theBS, and performs very close to the Opt + SDMA algorithm.We analyze the previous two results separately.

    First, the performance of sGSA + TxBf is low in this sce-nario, because with K = 10 MSs in the cell and the em-ployed PF utility, the Multi-User Diversity gain is alreadyenough for the BS to transmit to MSs that experience goodchannel conditions. Hence, transmit beamforming resultsin small gains in a scenario where MSs already experiencerelatively good channel conditions. Instead, a dramaticgain is obtained when the BS uses SDMA and is able tosimultaneously transmit to more than one MS.

    Second, although the performance of sGSA + SDMA isvery close to the performance of the Opt + SDMA algorithmin Fig. 12a, we want to be cautious in generalizing this re-sult. The reason is that as the number of MSs in the cell, K,increases, we expect the gain of Opt + SDMA over sG-SA + SDMA to increase, because as K increases it becomesmore difcult for sGSA to generate SDMA groups that areclose to the optimal ones. Given the computationalrequirements of the Opt + SDMA algorithm though, weleave as future work the study of its performance for largeKs.

    In order to gain a deeper insight on the previous results,Fig. 12b and c depict respectively for each algorithm thespatial gain, i.e. gain due to having more antennas avail-able in the BS, and the incurred signaling (DL-MAP) over-head. As expected, sGSA + SDMA delivers a linear spatialgain, which is slightly below M. For example, the spatialgain is around 3 forM = 5 antennas, meaning that with veantennas sGSA + SDMA comes up on average with SDMAgroups of size three. Given the observed gain, sGSA + SD-MA seems to be a logical extension to GSA when multipleantennas are available at the BS. Finally, regarding the sig-naling overhead depicted in Fig. 12c, the algorithms usingSDMA logically result in a higher overhead because moreMSs need to be signaled in the MAP. However, sGSA + SD-MA performs signicantly close to Opt + SDMA. Finally, itis also interesting to notice that due to the heavy use ofconcatenation, a very small signaling overhead is intro-duced by sGSA + TxBf.

    6.2.6. Impact of the offered load in the BSWe now evaluate the performance of the different algo-

    rithms under study when varying the amount of offeredload available in the BS. In particular, we vary the offeredload in the BS by varying: (i) the amount of data availablein the per-MS buffers held in the BS when consideringK = 10 MSs in the cell, in Fig. 13a, and (ii) the number ofMSs in the cell when considering a xed per-MS buffer sizeof B = 12.66 KB, in Fig. 13b. In addition, we compare theperformance of GSA with M = 1 BS antennas, sGSA + TxBfwith M = 5 BS antennas, and of sGSA + SDMA for M = 2 toM = 5 BS antennas. For the sake of clarity, we have not in-cluded the performance of the Opt + SDMA algorithmwhich was slightly above the one of sGSA + SDMA.

  • a wide range of MSs in the cell and considered buffer sizes.

    where signicant additional capacity gains could beachieved, specially for a large number of antennas. More

    References

    [1] T.F. Maciel, Suboptimal Resource Allocation for Multi-User MIMO-

    A. Zubow et al. / Computer Networks 56 (2012) 35113530 3529efcient MAP signaling schemes could be feasible if pre-coding techniques would be used for transmitting MAPS(i.e. private MAPs). Second, the impact of noise and delayin the received CSI feedback on the performance of sGSAshould be studied. Finally, although in this paper we havefocused in 802.16 systems, we believe that our proposedalgorithms can be easily adapted to other technologiesusing SDMAOFDMA, like LTE.We believe that the reasons for this behavior are tightlybound to the used Proportional Fair (PF) utility, that re-duces the capacity gains achieved by sGSA + TxBf as thenumber of MSs increases, because in order to maintainfairness the PF scheduler eventually selects all MSs. In-stead, if enough antennas are available and SDMA is used,fairness can be maintained and still capacity gains are pos-sible due to scheduling MSs in the spatial domain.

    7. Conclusions and future work

    To address future wireless networks forecasted growthin mobile trafc volume, in this paper we took a compre-hensive view at SDMAOFDMA systems which areexpected to be a key technology building block to increasecurrent spectral efciencies. SDMAOFDMA systems haveto allocate resources in time, frequency and space dimen-sions to different mobile stations, resulting in a highlycomplex resource allocation problem. In our work, we de-signed and analyzed a SDMAOFDMA Greedy SchedulingAlgorithm (sGSA) for WiMAX networks. Our solution con-siders feasibility constraints in order to allocate resourcesfor multiple mobile stations on a per packet basis by using(i) a low complexity SINR prediction algorithm, (ii) a clus-ter-based SDMA grouping algorithm and (iii) a computa-tionally efcient frame layout scheme which allocatesmultiple SDMA groups per frame according to their packetQoS utility. The overall performance of the different sGSAalgorithm components was evaluated by system-level sim-ulations and compared to state of the art approaches.

    The main contributions of our work in this paper aresummarized as follows. First, a complete WiMAX SDMAOFDMA downlink scheduling solution was presented andits performance benets quantied comparing against re-lated work solutions. Second, a novel low complexity algo-rithm for predicting the SINR of mobile stations in SDMAgroups was designed and the corresponding complexityreduction evaluated, reaching up to two orders of magni-tude improvement in some cases. Third, a computationallyefcient frame layout scheme was proposed which allo-cates multiple SDMA groups per frame according to theirpacket QoS utility at low complexity costs. Finally, theSDMAOFDMA system signaling overhead was analyzedwith respect to the usage of the overall radio resources.

    As future work we consider the following steps. First,the MAP overhead reduction is a promising directionOFDMA Systems, Ph.D. thesis, Technical University of Darmstadt,2008

    [2] F. Shad, T.D. Todd, V. Kezys, J. Litva, Dynamic Slot Allocation (DSA) inIndoor SDMA/TDMA Using a Smart Antenna Basestation, IEEE/ACMTransactions on Networking 9.

    [3] IEEE Standard for Local and Metropolitan Area Networks. Part 16: AirInterface for Broadband Wireless Access Systems, IEEE 802.16-2009Standard, May 2009.

    [4] A. Zubow, J. Marotzke, D. Camps-Mur, X. Prez-Costa, sGSA: AnSDFMA-OFDMA Scheduling Solution, submitted to EuropeanWireless Conference, 2012.

    [5] M. Costa, Writing on dirty paper (Corresp.), Information Theory, IEEETransactions 29(3) (1983) 439441. doi:10.1109/TIT.1983.1056659..

    [6] R.D. Wesel, J.M. Ciof, Achievable rates for TomlinsonHarashimaprecoding, IEEE Transactions on, Information Theory 44 (2) (1998)824831, http://dx.doi.org/10.1109/18.661530. .

    [7] C.B. Peel, B.M. Hochwald, A.L. Swindlehurst, A vector-perturbationtechnique for near-capacity multiantenna multiuser communication Part I: channel inversion and regularization, IEEE Transactions onCommunications 53 (1) (2005) 195202, http://dx.doi.org/10.1109/TCOMM.2004.840638. .

    [8] F. Gross, Smart Antennas for Wireless Communications: WithMATLAB (Professional Engineering), 1st ed., McGraw-HillProfessional, 2005.

    [9] T. Yoo, A. Goldsmith, On the optimality of multiantenna broadcastscheduling using zero-forcing beamforming, Selected Areas inCommunications, IEEE Journal on 24 (3) (2006) 528541, http://dx.doi.org/10.1109/JSAC.2005.862421. .

    [10] M. Sharif, B. Hassibi, A comparison of time-sharing, DPC, andBeamforming for MIMO broadcast channels with many users,Communications, IEEE Transactions 55 (1) (2007) 1115, http://dx.doi.org/10.1109/TCOMM.2006.887480. .

    [11] I. Frigyes, J. Bito, P. Bakki (Eds.), Advances in Mobile and WirelessCommunications: Views of the 16th IST Mobile and WirelessCommunication Summit (Lecture Notes in Electrical Engineering),rst ed., Springer, 2008. .

    [12] C. Farsakh, J.A. Nossek, A Real Time Downlink Channel AllocationScheme for an SDMA Mobile Radio System, IEEE InternationalSymposium on Personal, Indoor and Mobile Radio Communications3 (1996) 12161220. doi:10.1109/PIMRC.1996.568477. .

    [13] A. Nascimento, J. Rodriguez, A Joint Utility Scheduler and SDMAResource Allocation for Mobile WiMAX Networks, in: Proceedingsof the 5th IEEE international conference on Wirelesspervasive computing, ISWPC10, IEEE Press, Piscataway, NJ, USA,2010. pp. 589594. .

    [14] C.Y. Huang, C.-Y. Chang, P.-H. Chen, M.-H. Wu, PerformanceEvaluation of SDMA Based Mobile WiMAX Systems, InternationalWireless Communications and Mobile Computing Conference(2008) 10421046. doi:10.1109/IWCMC.2008.181. .

    [15] L. Hentil, P. Kysti, M. Kske, M. Narandzic, M. Alatossava, MATLABimplementation of the WINNER Phase II Channel Model ver1.1..

    [16] C. Oestges, B. Clerckx, MIMO Wireless Communications: From Real-World Propagation to Space-Time Code Design, 1st ed., AcademicPress, 2007.

    [17] A. Zubow, D. Camps-Mur, X. Prez-Costa, P. Favaro, GreedyScheduling Algorithm (GSA) design and evaluation of an efcientand exible WiMAX OFDMA scheduling solution, Elsevier ComputerNetworks Journal 54 (10) (2010).

    [18] J. Lakkakorpi, A. Sayenko, J. Moilanen, Comparison of DifferentScheduling Algorithms for WiMAX Base Station: Decit Round-Robin vs. Proportional Fair vs. Weighted Decit Round-Robin, IEEEWireless Communications and Networking Conference (2008)

  • 19911996. doi:10.1109/WCNC.2008.354. .

    [19] R.D. Armstrong, P. Sinha, A.A. Zoltners, The multiple-choice nestedknapsack model, Management Science 28 (1) (1982) 3443. .

    [20] H. Kellerer, U. Pferschy, D. Pisinger, Knapsack Problems, rst ed.,Springer, 2004.

    [21] D.B. Calvo, Fairness Analysis of Wireless Beamforming Schedulers,Ph.D. thesis, Technical University of Catalonia, Spain, 2004.

    [22] R. Farebrother, Linear Least Squares Computations (Statistics: ASeries of Textbooks and Monographs), 1st ed., CRC Press, 1988.

    [23] J.G. Andrews, A. Ghosh, R. Muhamed, Fundamentals of WiMAX:Understanding Broadband Wireless Networking, 1st ed., PrenticeHall, 2007.

    [24] P. Kysti, J. Meinil, L. Hentil, WINNER II Channel Models, IST-4-027756 WINNER II D1.1.2 V1.1. .

    [25] R. Hunger, Floating Point Operations in Matrix-Vector Calculus,Tech. Rep., Technische Universitt Mnchen, Associate Institute forSignal Processing, 2007.

    Anatolij Zubow is a senior researcher in theWireless Networking Group at the depart-

    Daniel Camps Mur is a senior researcher inthe Mobile and Wireless Networks group atNEC Network Laboratories in Heidelberg,Germany. He holds since 2004 a Mastersdegree and since 2012 a Ph.D. from the Poly-technic University of Catalonia (UPC). Hiscurrent research interests include QoS andenergy efciency in wireless networks. Danielis also an active contributor to the IEEE 802.11group and to the Wi-Fi Alliance. In addition,his master thesis work received the MobileInternet and 3G Mobile Solutions award from

    the Spanish Association of Telecommunication Engineers (COIT), and inthe eld of network simulations he also received the OPNETs SignicantTechnical Challenge Solved Award.

    Xavier Prez Costa is Chief Researcher at NECLaboratories Europe in Heidelberg, Germany,where he has managed several projects rela-ted to wireless networks. In the WiFi area, heled a project contributing to 3G/WiFi mobilephones evolution and received NECs Rn& DAward for his work on N900iL, NECs rst 3G/WiFi phone. In the LTE area he managed ateam researching on future productenhancements and received NECs Rn& DAward for related contributions to WiMAXproducts. In the Future Wireless Internet area,

    he is responsible for a team contributing to the EU FP7 project FlexibleArchitecture for Virtualizable Wireless Future Internet Access (FLAVIA).He has participated in IEEE standardization working groups as well astheir corresponding certication bodies, and has been included in themajor contributor list of IEEE 802.11e and 802.11v. He received both hisM.Sc. and Ph.D. degrees in Telecommunications from the Polytechnic

    3530 A. Zubow et al. / Computer Networks 56 (2012) 35113530Humboldt University since 2010, where he iscurrently completing his Diploma thesis inComputer Science. His main area of interestslie within the eld of wireless MIMO com-munication systems and WiFi/WiMAX net-works.ment of Computer Science at Humboldt Uni-versity Berlin, Germany. He received his Ph.D.degree from Humboldt University in 2009. Hismain research interest is cooperative meshnetworking in wireless ad hoc networks. Inaddition he worked on several WiFi andWiMAX related projects.

    Johannes Marotzke is a research assistant atthe department of Computer Science atUniversity of Catalonia (UPC) and was the recipient of the national awardfor the best Ph.D. thesis on multimedia convergence in telecommunica-tions from COIT.

    sGSA: A SDMAOFDMA greedy scheduling algorithm for WiMAX networks1 Introduction2 SDMAOFDMA support in IEEE 802.16e3 Related work3.1 SDMA Transmitter techniques3.2 SDMA grouping algorithm3.3 SDMA grouping metric3.4 Full SDMAOFDMA solutions

    4 Problem statement4.1 PHY model4.2 Base station architecture and QoS model4.3 Problem formulation4.3.1 SDMA group formation4.3.2 OFDMA frame construction

    5 SDMA Greedy Scheduling Algorithm (sGSA)5.1 SINR prediction algorithm5.2 Cluster-based grouping algorithm5.3 Adaptive Modulation and Coding (AMC) module5.4 SDMAOFDMA packing algorithm

    6 Performance evaluation6.1 Simulation framework6.1.1 Simulation parameters6.1.2 Error model6.1.3 Traffic model6.1.4 QoS Scheduler model

    6.2 Simulation results6.2.1 SINR prediction6.2.2 Cluster-based Grouping Algorithm (CBA)6.2.3 SDMAOFDMA packing algorithm6.2.4 Overall performance of sGSA6.2.5 Impact of the number of antennas in the BS6.2.6 Impact of the offered load in the BS

    7 Conclusions and future workReferences