
EDGE INTELLIGENCE FOR BEYOND 5G NETWORKS

Deep Learning at the Edge for Channel Estimation in Beyond-5G Massive MIMO
Mauro Belgiovine, Kunal Sankhe, Carlos Bocanegra, Debashri Roy, and Kaushik R. Chowdhury

Abstract

Massive multiple-input multiple-output (mMIMO) is a critical component in upcoming 5G wireless deployment as an enabler for high data rate communications. mMIMO is effective when each corresponding antenna pair of the respective transmitter-receiver arrays experiences an independent channel. While increasing the number of antenna elements increases the achievable data rate, at the same time computing the channel state information (CSI) becomes prohibitively expensive. In this article, we propose to use deep learning via a multi-layer perceptron architecture that exceeds the performance of traditional CSI processing methods like least square (LS) and linear minimum mean square error (LMMSE) estimation, thus leading to a beyond fifth generation (B5G) networking paradigm wherein machine learning fully drives networking optimization. By computing the CSI of all pairwise channels simultaneously via our deep learning approach, our method scales with large antenna arrays as opposed to traditional estimation methods. The key insight here is to design the learning architecture such that it is implementable on massively parallel architectures, such as GPU or FPGA. We validate our approach by simulating a 32-element array base station and a user equipment with a 4-element array operating on a millimeter-wave frequency band. Results reveal an improvement of up to five and two orders of magnitude in BER with respect to the fastest LS estimation and optimal LMMSE, respectively, substantially improving the end-to-end system performance and providing higher spatial diversity for lower SNR regions, achieving up to 4 dB gain in received signal power compared to the performance obtained through LMMSE estimation.

Introduction

Large antenna arrays are revolutionizing wireless communications and sensing, with manifestations in programmable surfaces, gesture monitoring, and high rate data delivery through incorporation in the form of massive multiple-input multiple-output (mMIMO) systems. Already envisaged as a key component of 5G, mMIMO utilizes a number of antennas that can be one to two orders of magnitude higher than the classical MIMO WiFi access points and LTE base stations (BSs) available today. However, despite the significant advances in edge computing capabilities, there are practical challenges in processing needs associated with such large antenna arrays. This article is motivated by our desire to decouple the scale of deployment from the limits of classical processing, especially as it pertains to the task of understanding the channel between a given transmitter-receiver antenna-element pair for millimeter-wave (mmWave) communication. We accomplish this via training a deep learning (DL) architecture that offers the ability to produce a robust and high fidelity channel matrix between the mobile user and the mMIMO BS in a single forward pass. Since the overhead of the DL-based channel estimation becomes independent of the size of the antenna array, we believe this approach will enable a fundamental leap toward beyond 5G (B5G) standards where thousands of coordinated antennas will become the new norm.

Emerging B5G networks are envisioned to support edge computing, which will enable rapid optimization and reconfiguration of the network architecture. This is a critical first step toward supporting requirements of emerging high-bandwidth and low-latency applications. Machine learning (ML) and artificial intelligence (AI) algorithms running at the edge computing servers help to (i) scale the optimization problem without proportional increase in complexity and (ii) enable fast response close to the BS, thus meeting strict demands of a time-varying wireless channel. We believe our use case of DL-enabled mmWave mMIMO demonstrates the need for tightly integrating AI into emerging wireless standards, which remains a gap even in the ongoing 5G rollout today.

Challenge in Channel Estimation

Channel estimation is the first step in the larger processing chain associated with decoding the data packet. Its objective is to identify the complex signal transformation imposed on the emitted wireless signal by the channel, and this is inferred via special information bits embedded in the packet preamble. For a spatially multiplexed system, this complex transformation is captured via the so-called channel state information (CSI). Knowing the CSI allows the transmitter to perform additional precoding functions that maximize the signal energy in the direction of interest. Thus, delayed computation of CSI, or worse, an incorrect computation can quickly degrade the performance in systems like mMIMO, where the CSI computation needs to be repeated several dozen times.

In the context of the B5G use case we explore in this article, we consider time-division duplexing (TDD) for mMIMO and assume that the channel varies slowly (coherence time of 10–100 ms [1]).
The authors are with Northeastern University, Boston.
Digital Object Identifier: 10.1109/MWC.001.2000322

IEEE Wireless Communications • April 2021 1536-1284/21/$25.00 © 2021 IEEE

In this regime of operation, two phases involving the BS and user equipment (UE) precede downlink transmissions: Channel Sounding, in which the UE performs CSI estimation for the complete MIMO channel and sends it back to the BS, and Data Transfer, in which the BS uses the received CSI estimation to compute precoding weights for directional beams. Thus, the CSI estimation must be completed quickly in order to allow both the Channel Sounding and Data Transfer phases to be completed within the channel coherence time. Such a hard threshold on timeliness ensures that the BS can turn around its radio front-end and leverage channel reciprocity for the downlink transmission. Furthermore, by focusing on reducing the overhead associated with the CSI estimation step, it may be possible to shorten the Channel Sounding phase. This in turn will allow more data to be transferred in the given channel coherence time, ultimately increasing the overall throughput of the system.

Solution Overview

Our proposed approach of using DL aims to address the above issues by constructing a channel estimator that is able to obtain the complete MIMO channel matrix by processing the incoming preambles in a single forward pass, irrespective of the number of antenna elements involved in the system. For downlink, the BS sounds the channel by using a reference transmission, which allows the UE to estimate the channel using the proposed DL block. The UE transmits the channel estimation information back to the BS for calculation of the precoding needed for the subsequent data transmission. We generate the dataset in MATLAB, which we also release along with the simulation code to accelerate further research on this topic.

The Benefit of Deep Learning

Our goal is to leverage the massively parallel nature of a type of DL called deep neural networks (DNNs). Specifically, the key idea behind our proposed method is to estimate each of the sub-channels in the mMIMO channel matrix independently of the others. We do so by exploiting similarities in channel dynamics across the spatial dimension and using an efficiently tuned DNN model whose weights are trained in order to be shared across the entire antenna array. Thus, we aim to retrieve the complete three-dimensional CSI matrix, where each dimension corresponds to the number of receiver antennas, the number of transmitter antennas, and the number of usable sub-carriers, by grouping all the received preambles in a single batch and processing it in a single forward step, as shown in Fig. 1. We design a compact multi-layer perceptron (MLP) with only three hidden layers to jointly exploit the hierarchical representational power of DNNs while keeping the execution time associated with its forward step low. To further reduce the computational burden associated with channel estimation, we train our model by taking as input the received time-domain preamble sequence, avoiding completely the prior demodulation step in orthogonal frequency-division multiplexing (OFDM) systems. The model is trained in a regression fashion in order to predict for each mMIMO sub-channel the CSI in the frequency domain for the complete set of OFDM pilot and data sub-carriers. This allows learning directly a mapping from the time-domain signal to its corresponding CSI in the frequency domain. The proposed DNN model architecture is presented later.

FIGURE 1. Overview of deep-learning-based channel estimation for B5G massive MIMO. Panels: traditional channel estimation (one OFDM Rx and channel estimation block per receive chain) and deep learning channel estimation (channel sounding frame and orthogonal sequences Φ_1 ... Φ_{N_T} processed as one batch), both producing the estimated CSI. Legend: K: number of sub-carriers; N_T: number of transmit antennas; N_R: number of receive antennas; Φ_{n_T}: orthogonal sequence of length N_T for the n_T-th transmit antenna; T_P: number of time-domain samples in the channel sounding frame.

By training the model on true CSI values obtained at high signal-to-noise ratio (SNR) level, we observe that the proposed method generalizes well for low SNR scenarios and outperforms the practical least square (LS) estimation in terms of accuracy, while approaching or exceeding the performance of linear minimum mean square error (LMMSE) and improving the end-to-end system performance in low SNR regimes, critical for frequencies above the 6 GHz band such as mmWave or THz bands.

Moreover, to fully take advantage of this data-driven approach and increase robustness of the DL pipeline, we add a denoising training step, in which we apply controlled additive white Gaussian noise on the training samples.

Summary of Contributions

• We propose a deep-learning-based CSI estimation method for mMIMO that incurs a fixed computational cost, irrespective of the number of antenna elements, by exploiting the inherently parallel nature of DNNs.
• We discuss the limitations of traditional estimation techniques and compare the inference time complexity of the state of the art in DL-based channel estimation with the proposed approach, demonstrating its suitability for edge applications.
• We validate the performance of CSI estimation by simulating downlink transmissions between a BS with N_T = 32 uniform rectangular array (URA) antennas and a single UE equipped with N_R = 4 uniform linear array (ULA) antennas.
• By focusing on low SNR conditions, our denoising training approach allows better accuracy for CSI estimation, approaching or exceeding the end-to-end performance of an LMMSE estimator. Thus, our method matches one of the most accurate estimators for this problem, but eliminates the computational burden that limits the deployment of LMMSE.

Technology Limitations for B5G mMIMO

CSI characterizes how signals propagate through a wireless channel between the transmitter and receiver [2]. Thus, CSI matrices used in mMIMO capture the channel variations in the time and frequency domains. We consider an mMIMO-OFDM system intended for mmWave communications. The mMIMO channel is computed not only for each of the N_R × N_T pairs, but also for every sub-carrier, during the explicit Channel Sounding stage provisioned within the 5G standard. Incorrect computation of CSI matrices can degrade the beams formed between the mMIMO BS and UEs, resulting in increased bit error rate (BER) during data transmission [2]. Moreover, if CSI matrices are not computed in a timely manner (i.e., within the channel coherence time), it will adversely impact the following data transfer because the channel coefficients used for beamforming are already outdated.

Using hybrid mMIMO beamforming [3], the BS transmits channel sounding frames in parallel over all the N_T transmitter antennas. Each channel sounding frame, within the long-training field (LTF) sequence of the preamble, spans over L OFDM symbols with additional orthogonal mapping sequences employed to avoid interference. The receiver estimates the CSI matrix using the received signal, after OFDM demodulation and orthogonal demapping, using either LS estimation or LMMSE. LS estimation is a widely adopted channel estimator, as it requires only O(N_T N_R K) element-wise divisions for all antenna pairs, where K is the number of sub-carriers, and its computation is dominated by the OFDM demodulation step, which relies on fast Fourier transform (FFT) operation having complexity O(K log K). Unfortunately, LS estimation suffers from noise distortion and high mean squared error (MSE), particularly at low SNR. LS estimation can be refined by computing the LMMSE [4] estimation, although it requires prior knowledge of channel and noise statistics and solving a linear system whose complexity grows as much as O(N_T N_R K^3) for MIMO systems due to a matrix inversion step performed on the channel correlation matrix. Therefore, finding fast and accurate ways to perform CSI estimation is crucial in mMIMO, especially as the number of antennas may grow to the order of thousands in B5G networks.

Related Works for DL in mMIMO

While ML- and DL-based architectures have been traditionally deployed in the image, video, speech, natural language processing, and healthcare [5] domains, there have also been efforts in solving challenging tasks in the RF domain, such as modulation recognition, radio identification [6], and network resource allocation. In the area of channel estimation, [7] presents an end-to-end OFDM symbol decoding method using MLP by treating a single-input single-output (SISO) channel model as a black box. In the context of mMIMO, [8] proposes a compressive method for generating CSI feedback based on encoder-decoder DL architecture.

Applying DL-based approaches for CSI estimation in mMIMO is still at a nascent stage. Due to the high dimensionality in mMIMO, especially when involving OFDM techniques, the majority of existing solutions use complex and deep architectures to estimate large channel matrices. These solutions treat the multi-dimensional input signal as a single entity and often require additional prior or post-estimation steps. Although use of very deep architectures is a growing trend, their complexity usually limits use in edge devices that are typically constrained in power and processing capability. Reference [9] uses convolutional neural networks (CNNs) to improve the quality of a coarse initial estimate of the channel matrix in a method called Tentative Estimation. To exploit adjacent sub-carrier frequency correlations, the coarse estimate matrices are concatenated in large input tensors and processed by a neural network consisting of 10 convolutional layers. Reference [10] proposes a 10-layer LDAMP architecture, based on the unfolding of an iterative D-AMP algorithm. As the estimated channel is treated as a noisy 2D image, each layer relies on an additional denoising CNN, which is 20 layers deep and used to update the channel estimated in the previous layer. Although CNNs are efficient in terms of number of parameters, the resulting complexity poses a challenge for deep architectures when deployed on edge devices. Therefore, the large CNNs in both [9, 10] have limitations in real-time implementations. In the context of single-carrier systems, [11] devises an uplink (UL) transmission for single-antenna users and multiple-antenna BSs using a six-layer MLP to first estimate direction of arrival (DoA) and then determine the channel for each user, by expressing the channel estimate as a function of DoA and solving an additional linear system of equations. Recently, [12] described an online training method based on the Deep Image Prior scheme, using a 6-layer architecture based on 1 × 1 convolutions and upsampling, which performs denoising of the received signal before a traditional LS estimation. Although the number of parameters is low, this method requires training the network during every transmission for thousands of epochs, without any guarantee that this step completes within the channel coherence time. For single-carrier solutions, K separate models should be trained and deployed to be applied in OFDM systems. Table 1 summarizes the time complexity of existing methods and compares how our proposed approach results in a much simpler model that is suitable for edge architectures.

Deep Learning Solution for mmWave mMIMO

Model Architecture

We design a compact DNN model to keep computation time low and train it to learn a joint approximation of OFDM demodulation and LS/LMMSE channel estimation methods. Figure 2 shows how the DNN model will replace the processing blocks associated with demodulation and channel estimation in a typical mMIMO Channel Sounding process. Different from the state of the art presented earlier, we design our training process to use the received time-domain waveform corresponding to the LTF obtained after synchronization as input to the model. This allows us to avoid completely the OFDM demodulation necessary for CSI estimation, further reducing the computation burden associated with this step for systems with large bandwidth that require a high number of sub-carriers. We let the model perform inference from the spectral components of the received time-domain signal, without performing demodulation explicitly. Through this approach, we design a DNN that learns the mapping from the time-domain LTF waveform to the desired CSI estimation in the frequency domain.

Method | Type of DL model | L | Inference complexity | OFDM | Additional comments
DOA estimation [11] | MLP | 6 | O(Σ_{l=1}^{L} N_l I_l + G) | No | K models needed to operate on OFDM
Deep CNN [9] | CNN | 10 | O(KT + N_T N_R Σ_{l=1}^{L} F_l N_{l–1} N_l) | Yes† | -
Beamspace mmWave [10] | LDAMP + CNN | 10 | O(Σ_{l=1}^{L} Λ + L Σ_{c=1}^{20} W_c H_c F_c^2 N_{c–1} N_c) | No | K models needed to operate on OFDM
Untrained DNN [12] | CNN + upsampling | 6 | O(E(W_1 H_1 N_0 N_k + Σ_{l=2}^{L} 2W_{l–1} 2H_{l–1} N_{l–1} N_k)) | Yes† | E has no upper bound
Proposed | MLP | 3 | O(Σ_{l=1}^{L} N_l I_l) | Yes‡ | -

Notation: N_T = number of transmitter antennas, N_R = number of receiver antennas, K = number of sub-carriers, L = number of hidden layers, I_i = number of input features of layer i, N_i = number of neurons (or kernels, in the case of CNNs) in the ith layer, F_i = kernel size of the ith convolutional layer (assuming square kernels), W_i = width of input volume for the ith convolutional layer, H_i = height of input volume of the ith convolutional layer, E = number of epochs, Λ = complexity of the LDAMP layer (linear system) in [10], T = complexity of Tentative Estimation (linear system, including matrix multiplications and inversions) in [9], G = complexity of additional linear system needed to compute complex channel coefficients from DOA estimation (requires matrix inversion) in [11]. †: method requires OFDM demodulation; ‡: method does not require OFDM demodulation.

TABLE 1. A coarse computational complexity comparison between existing methods and proposed channel estimator.
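
To make the O(Σ N_l I_l) entry for the proposed MLP in Table 1 concrete, the following sketch counts the multiply-accumulate operations of one forward pass. The hidden-layer widths (two layers of 1024 neurons) and the output size (K = 234) follow the configuration described in this article, and the coding-sequence length T_C = N_T follows the legend of Fig. 1. The value used for T_P is a hypothetical placeholder, since the number of time-domain LTF samples is not stated here.

```python
def mlp_macs(input_size, hidden_sizes, output_size):
    """Multiply-accumulates of one forward pass: sum over layers of N_l * I_l."""
    sizes = [input_size, *hidden_sizes, output_size]
    return sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))


N_T, N_R, K = 32, 4, 234   # antenna counts and usable sub-carriers from the article
T_C = N_T                  # length of the orthogonal coding sequence (Fig. 1 legend)
T_P = 512                  # HYPOTHETICAL number of time-domain LTF samples

# Cost of estimating one sub-channel with one of the two (real/imaginary) models.
per_pair = mlp_macs(T_P + T_C, [1024, 1024], K)

# One batch holds every (n_R, n_T) pair. The samples are independent, so on a
# parallel accelerator the batch dimension adds work but no sequential steps.
batch_size = N_T * N_R
total = batch_size * per_pair
```

With these assumed sizes `per_pair` evaluates to 1,845,248 multiply-accumulates. The layer widths themselves do not grow with the array: a larger array only enlarges the batch, which is the property the fixed-cost claim relies on.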

In order to reduce the size of the input and capture the effect of channel on amplitude and phase components of the received signals, we choose to treat the real and imaginary component of the input independently; therefore, we create two identical and independent specialized models accepting the real and imaginary components, respectively, both in real valued format. The corresponding real valued output of each model is then recast back into a complex representation to produce the final channel estimation output. Since the two models are independent of each other, they could potentially run in parallel; hence, for the sake of simplicity, we refer to both models as incurring a common temporal processing overhead in the rest of the article. For our experiments, we rely on an MLP architecture that accepts an input X_{n_R,n_T} obtained by concatenating an LTF signal y_{n_R} arriving at a particular receiver antenna n_R, and the orthogonal coding sequence Φ_{n_T}, known to both transmitter and receiver, associated with a given transmit antenna n_T. The size of the input tensor is [(T_P + T_C) × 1], where T_P is the number of symbols belonging to the LTF sequence in the time domain and T_C is the length of the coding sequence. The reason we concatenate these additional features to the input is as follows: For an LTF signal received at a given receiver antenna, we must recover all the channel states relative to each transmitter antenna. Without the orthogonal coding sequence, it would be impossible for the model to produce the channel states for a given n_R receiver antenna, as well as all the other n_T transmitter antenna pairs. This is because the input signal y_{n_R} would be completely identical for all these cases. On the other hand, the output of the proposed model Ĥ^DNN_{n_R,n_T} is a tensor of size [K × 1], where K is the number of sub-carriers of the target system, and corresponds to the model prediction of the channel frequency response experienced between transmitter antenna n_T and receiver antenna n_R during propagation of the input LTF signal.

The configuration chosen for the MLP architecture has only 2 hidden layers, each with 1024 neurons and ReLU activation function, and an output layer using linear activation function for the regression task. We perform batch normalization after each hidden layer and add a Dropout layer with drop probability 15 percent between the first and second hidden layers to avoid overfitting. The proposed architecture is also depicted in Fig. 2. Therefore, to retrieve the complete MIMO channel matrix, it is possible to construct a single input batch of N_T × N_R inputs to process in parallel all the necessary X_{n_R,n_T} inputs and produce as output the K × N_T × N_R channel matrix.

The choice of MLP over other architectures, such as CNN, is due to the reduced complexity of its forward step, that is, it requires fewer operations. For instance, if we consider each channel prediction independently, the computational complexity of a single fully connected layer is dominated by matrix-vector multiplication, which has a serial computation complexity of O(N_i I_i), where N_i is the number of neurons in the ith layer and I_i is the number of features input to it. On the other hand, convolution complexity is O(W_i H_i F_i^2 N_{i–1} N_i) (for notation, see Table 1) that, depending on input and output features arrangement, would incur a much higher number of operations and output features to be processed by a fully connected regression layer, as in our case. Although MLP has a simpler forward step, it can be further accelerated by taking advantage of massively parallel computing architectures, such as graphical processing units (GPUs) or field programmable gate arrays (FPGAs), so it is crucial to minimize N_i and I_i, besides the number of layers, in order to provide a fast and compact model.

Training Procedure

The MLP models are trained via a regression approach, using gradient descent optimization and MSE loss function in order to minimize the error between each individual CSI estimation predicted by the neural network and the perfect channel estimation for a given input signal, which we consider to be the output of a classic deterministic channel estimator (either LS or LMMSE) under ideal noiseless conditions.

The neural network is trained using the Adam optimization method with a learning rate of 10^-4,

which is reduced by a factor of 10 when validation loss reaches a plateau for more than 15 epochs. Moreover, an early stopping criterion is adopted to terminate the training process if validation loss does not improve within the last 20 epochs.

FIGURE 2. Classical and proposed channel sounding architectures for B5G mMIMO. Panels: traditional channel sounding, proposed channel sounding, and the MLP Channel Estimator Block (two hidden layers of 1024 neurons with batch normalization, dropout between them, and a regression layer with K outputs; the input is the received time-domain signal plus the orthogonal sequence).

Denoising Approach

As explained previously, since we want our model to be robust to noise variations at lower SNR levels, we incorporate a denoising approach. Specifically, during each training epoch, for each random mini-batch of input signals generated from the dataset, we augment the input data by applying additive white Gaussian noise (AWGN) with an increasing noise variance, reflecting the intended SNR range of the deployment scenario. For our experiments, we choose SNR levels of [–20, –10, 0, 10, 20, 30] dB during training. A predefined noise power is associated with each of these noise levels, based on the average power of all the received signals in the training dataset.

In this way, by only collecting low noise input samples, we are able to augment the data during training to effectively make the model more robust to different noise levels. The benefit of this approach is that we force our model to produce an output channel estimation that is close to the ideal noiseless one. This reduces errors in end-to-end transmissions due to poorly estimated channels under low SNR conditions.

Dataset Creation for Training/Testing

We use Communications Toolbox and Phased Array System Toolbox within MATLAB to set up an mMIMO transmitter/receiver scenario. Specifically, we simulate a downlink end-to-end transmission from a BS equipped with N_T = 32 URA antennas and a UE with N_R = 4 ULA antennas, resulting in a 32 × 4 MIMO channel. Devices operate on a carrier frequency of 28 GHz, using 100 MHz bandwidth and FFT size of 256, resulting in 234 usable sub-carriers. We use a geometric scattering channel model without a line of sight (LoS) path with 100 scatterers that, for every transmission, are randomly placed on a spherical surface around the UE, which has a radius of 10 percent of the distance between UE and BS, while the positions of UE and BS are assumed to be fixed, with a distance of 500 m. We generate Channel Sounding preamble frames, as explained earlier, having length L = N_T OFDM symbols, and simulate transmission through a multi-path scattering channel model with N_s = 100 scatterers, as well as adding thermal noise. Since mmWave signals experience orders of magnitude more path loss than microwave signals, the CSI computed at the receiver is used to compute precoding weights with orthogonal matching pursuit (OMP) [13], an algorithm that approximates optimal unconstrained precoders and combiners for a geometric scattering channel model, such that it can be implemented in low-cost RF hardware and operate under very low SNR scenarios.

Quadrature phase shift keying (QPSK) modulation is used at data transmission time. For training, we simulate 9000 complete transmissions, that is, including both Channel Sounding and Data Transfer phases, which are divided in 85 percent and 15 percent ratios for training and validation. In order to generate enough variation in channel realizations, we uniformly sample random seeds in the range U[1, 10^7], used to generate unique channel states. For each transmission, we store the

y_{n_R} received LTF preambles, after the channel and noise application at each n_R receiver antenna and the relative channel estimation performed on the transmitter side, for all antennas and usable OFDM sub-carriers. In total, our training dataset consists of 1,152,000 LTF preambles.

To test our model under different noise conditions, we generate separate test datasets on a range of SNR levels, each composed of 500 transmissions. Due to the ability of the OMP precoding method to operate on extremely low SNR, we consider SNR ranging from –22 up to 10 dB. Since we want to assess the robustness of our model to noise variation, we add different levels of AWGN during data generation according to the desired SNR levels under which we want to test our model. The entire dataset and the software related to the data generation pipeline will be released for further use by the research community.

Performance Evaluation

FIGURE 3. NMSE between each channel estimation method and ideal channel estimation (curves: LS, LMMSE, Proposed; x-axis: SNR in dB, from –25 to 10; y-axis: NMSE on a logarithmic scale, from 10^-2 to 10^3). Note that as noise power increases, LMMSE output coefficients are close to 0 due to a large amount of noise corrupting the samples, so NMSE approaches 1 as SNR tends to –∞.
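
The metric plotted in Fig. 3 can be illustrated with a short sketch. It assumes the common definition of NMSE, the squared error normalized by the energy of the reference channel; the article does not spell out its exact normalization, so this is an assumption, and the function names are hypothetical.

```python
import math


def nmse(h_est, h_ref):
    """Normalized MSE: sum |h_est - h_ref|^2 / sum |h_ref|^2 (assumed definition)."""
    num = sum(abs(e - r) ** 2 for e, r in zip(h_est, h_ref))
    den = sum(abs(r) ** 2 for r in h_ref)
    return num / den


def nmse_db(h_est, h_ref):
    """NMSE in dB; undefined for a perfect estimate (NMSE of exactly 0)."""
    return 10.0 * math.log10(nmse(h_est, h_ref))
```

Under this definition an all-zero estimate gives an NMSE of exactly 1, which matches the remark in the caption that the LMMSE curve approaches 1 as its output coefficients shrink toward 0 at very low SNR.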
In this section, we evaluate the performance of the proposed DL-based CSI estimation technique. First, the normalized MSE (NMSE) is used to measure the accuracy of the channel estimation. Second, the impact of model predictions is verified by means of bit error rate (BER) and beamforming gains, and later compared against traditional estimation techniques. We also present a discussion on how the proposed approach decouples the CSI channel estimation overhead from the antenna array dimensions.

Prediction Accuracy
First, we wish to assess the quality of the proposed channel estimator model under unseen channel conditions, and compare it to the ones obtained through traditional methods. Figure 3 depicts the NMSE of channel estimations for each method with respect to perfect (i.e., noiseless) estimation. Our method not only generalizes well on unseen channel conditions, but provides a high-quality estimation when the signal is corrupted by a large amount of noise power, approaching or even exceeding LMMSE accuracy in the [–15, 10] dB SNR range, validating the robustness of our denoising training method against blocking and jamming effects.

The quality of CSI estimation, usually assumed perfect in the literature, impacts directly on the quality of the Data Transfer phase, as it forms the initial information from which the BS will compute precoding and combiner parameters. Hence, higher spatial diversity can be achieved when employing beamforming under accurate CSI estimation. Figure 4 shows how the proposed method, despite presenting presumably a higher NMSE in the lowest SNR regions, still outperforms optimal LMMSE estimation under all SNR levels considered, providing better quality channel estimation. Finally, Fig. 5 further proves how the channel estimation provided by the proposed approach is superior to optimal LMMSE in terms of end-to-end performance, providing zero BER starting from –19 dB, where LMMSE shows similar performance only starting from –17 dB.

Scalability
As stated before, with the proposed method every NT × NR channel estimation is performed independently of the others and using the same DNN model, by grouping the LTF incoming signal relative to each channel in a single input batch. This allows convenient scaling up to higher order mMIMO systems if massively parallel hardware accelerators are employed at inference time. Moreover, for those instances where a single forward pass cannot fit all the batch samples, our system allows the arrangement of smaller batches that could be processed in parallel using independent accelerators, that is, F accelerators provide an F× speedup compared to conventional streamlined systems. To give an idea of the effectiveness of the proposed method, our system estimates a full 32 × 4 mMIMO channel over 234 usable sub-carriers in 5.985 × 10^–4 s using an NVIDIA RTX 2080 Ti GPU, thus proving high accuracy and execution times below the envisioned channel coherence time for moderate mobility scenarios (i.e., 1–10 ms at 28 GHz frequency range).

Open Research Challenges
Implementation of edge intelligence for emerging communication networks is still at the nascent stage with many open challenges:
• AI/ML-enabled channel estimation needs publicly available, representative datasets, where different types of pilots, channel conditions, antenna configurations, and scenarios are considered holistically. Thus, new tools are needed to generate and disseminate such datasets. The Platforms for Advanced Wireless Research (PAWR) [14] program has mMIMO BS installations that can be used for this.
• Real-time execution of channel estimation schemes needs carefully designed edge computing architectures in the form of FPGAs. Thus, when limited training is also done on site using GPUs, there needs to be an automated pathway that takes the trained models to generate and test compatible FPGA code without human intervention.
• The design of deep architectures cannot be divorced from impact on inference time. Recently, several works on joint training and compression via pruning [15] have been demonstrated for RF applications. Furthermore, quantization of the weights speeds up FPGA processing.

24 IEEE Wireless Communications • April 2021

• The overall wireless environment changes with time and location. Hence, use of transfer learning and federated learning is needed to cope with scenarios not encountered at training time, including hardware updates like addition of more antennas, or software changes, such as new protocols that require different pilot arrangements.

FIGURE 4. Gain in dB of received signal observed during Data Transfer phase (i.e., after beamforming) compared to signal power observed during Channel Sounding phase. [Plot: beamforming gain (dB) versus SNR (dB) for LS + OMP, LMMSE + OMP, and Proposed + OMP.]

FIGURE 5. BER measured over different SNR levels for LS estimation and the proposed DNN-based method. Values not shown correspond to BER = 0. [Plot: bit error rate (BER, log scale) versus SNR (dB) for LS + OMP, LMMSE + OMP, and Proposed + OMP.]

Conclusion
We present a DL-based CSI estimation technique for massive MIMO antenna arrays, which will facilitate faster channel sounding for beyond 5G wireless networks. It will also achieve higher throughput for extremely low SNR scenarios, as is generally also applicable for mmWave and THz bands. The proposed DNN uses two hidden MLP layers and a linear output layer to jointly perform the task of OFDM demodulation and CSI matrix generation for mMIMO downlink transmission. We substantially improve the end-to-end system performance, achieving up to 5 and 2 orders of magnitude reduction in BER with respect to practical LS and optimal LMMSE, respectively, and higher spatial diversity for lower SNR regions, achieving up to 4 dB gain in received signal power compared to performance obtained through LMMSE estimation. Finally, we discuss the importance of model compression techniques to be applied on trained models in order to be easily deployed in edge devices, enabling higher data rates for edge computing over B5G mmWave communication.

Acknowledgments
This material is based on work supported by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00112090055 and the U.S. National Science Foundation under award no. CNS #1923789.

References
[1] S. Haghighatshoar and G. Caire, “Massive MIMO Channel Subspace Estimation from Low-Dimensional Projections,” IEEE Trans. Signal Processing, vol. 65, no. 2, 2017, pp. 303–18.
[2] Y. Ma, G. Zhou, and S. Wang, “WiFi Sensing with Channel State Information: A Survey,” ACM Comp. Surv., vol. 52, no. 3, 2019.
[3] A. F. Molisch et al., “Hybrid Beamforming for Massive MIMO: A Survey,” IEEE Commun. Mag., vol. 55, no. 9, Sept. 2017, pp. 134–41.
[4] V. Savaux and Y. Louët, “LMMSE Channel Estimation in OFDM Context: A Review,” IET Signal Processing, vol. 11, no. 2, 2017, pp. 123–34.
[5] W. Taylor et al., “An Intelligent Non-Invasive Real-Time Human Activity Recognition System for Next-Generation Healthcare,” Sensors, vol. 20, no. 9, 2020, p. 2653.
[6] K. Sankhe et al., “ORACLE: Optimized Radio clAssification through Convolutional NeuraL nEtworks,” IEEE INFOCOM 2019, 2019, pp. 370–78.
[7] H. Ye, G. Y. Li, and B. Juang, “Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems,” IEEE Wireless Commun. Letters, vol. 7, no. 1, 2018, pp. 114–17.
[8] C. K. Wen, W. T. Shih, and S. Jin, “Deep Learning for Massive MIMO CSI feedback,” IEEE Wireless Commun. Letters, vol. 7, no. 5, 2018, pp. 748–51.
[9] P. Dong et al., “Deep CNN-Based Channel Estimation for mmWave Massive MIMO Systems,” IEEE J. Selected Topics in Signal Processing, vol. 13, no. 5, 2019, pp. 989–1000.
[10] H. He et al., “Deep Learning-Based Channel Estimation for Beamspace mmWave Massive MIMO Systems,” IEEE Wireless Commun. Letters, vol. 7, no. 5, 2018, pp. 852–55.
[11] H. Huang et al., “Deep Learning for Super-Resolution DOA Estimation in Massive MIMO Systems,” IEEE VTE-Fall 2018, Aug. 2018, no. 9, pp. 8549–60.
[12] E. Balevi, A. Doshi, and J. G. Andrews, “Massive MIMO Channel Estimation with an Untrained Deep Neural Network,” IEEE Trans. Wireless Commun., vol. 19, no. 3, 2020, pp. 2079–90.
[13] O. E. Ayach et al., “Spatially Sparse Precoding in Millimeter Wave MIMO Systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, 2014, pp. 1499–1513.
[14] A. Gosain, “Platforms for Advanced Wireless Research: Helping Define a New Edge Computing Paradigm,” Proc. 2018 Technologies for the Wireless Edge Wksp., 2018, p. 33.
[15] Z. Wang et al., “Learn-Prune-Share for Lifelong Learning,” Proc. IEEE Int’l. Conf. Data Mining, Nov. 2020.

Biographies
Mauro Belgiovine is pursuing his Ph.D. at the Electrical and Computer Engineering Department at Northeastern University, Boston, Massachusetts, under the guidance of Prof. Kaushik Chowdhury. His current research interests involve deep learning, wireless communication, and heterogeneous computing.

Kunal Sankhe is currently pursuing a Ph.D. degree in computer engineering at Northeastern University under the supervision of Prof. K. Chowdhury. His current research efforts are focused on implementing deep learning in the wireless domain and developing a cross-layer communication framework for the Internet of Things.

Carlos Bocanegra is a Ph.D. candidate working under the guidance of Prof. Kaushik R. Chowdhury at Northeastern University. He has experience in the areas of multi-antenna frameworks for centralized or distributed systems, machine learning for wireless applications, mobile sensing and computing, and coexistence of heterogeneous wireless systems.

Debashri Roy received her Ph.D. degree in computer science from the University of Central Florida. She is currently a postdoctoral fellow at Northeastern University. Her research interests are in the areas of experiential AI and ML in wireless communication.

Kaushik Roy Chowdhury [M’09, SM’15] is a professor at Northeastern University. His current research interests involve systems aspects of networked robotics, machine learning for agile spectrum sensing/access, wireless energy transfer, and large-scale experimental deployment of emerging wireless technologies.
