Learning To Reflect and To Beamform For Intelligent Reflecting Surface With Implicit Channel Estimation
arXiv:2009.14404v4 [eess.SP] 9 Jun 2021

Abstract—Intelligent reflecting surface (IRS), which consists of a large number of tunable reflective elements, is capable of enhancing the wireless propagation environment in a cellular network by intelligently reflecting the electromagnetic waves from […] ordering of the users induces the same permutation on the BS beamformers but keeps the IRS reflective pattern fixed (known as permutation equivariant and permutation invariant properties, respectively). Such an architecture allows the possibility of generalizability to networks with an arbitrary number of users. Numerical results show that the proposed approach produces solutions that can be easily interpreted. Overall, this paper shows that by bypassing the explicit channel estimation phase altogether, a machine learning approach can achieve a significantly higher transmission rate than the conventional channel estimation based approach, especially in the limited pilot length regime.

[…] to directly design the beamforming and reflective patterns to optimize a system-wide objective.

The promise of IRS stems from its ability to manipulate incident electromagnetic waves toward the intended directions […]

A. Related Works

Many of the existing works on the optimization of IRS assume that perfect channel state information (CSI) is available at the BS. Based on the perfect CSI assumption, joint optimization of the IRS reflection and BS beamforming can be carried out for different network objectives, e.g., minimizing the energy consumption [7], maximizing the spectral efficiency [5], or maximizing the minimum rate [12]. In practice, CSI needs to be estimated. Due to the passive nature of the IRS elements, directly estimating the CSI for IRS is not feasible. Instead, CSI estimation needs to be carried out either at the BS or at the users based on the end-to-end reflected signals. In this direction, [13] proposes to solve the channel estimation problem based on a binary reflection method by turning on the IRS elements one at a time, while [14] proposes a channel estimation method based on parallel factor decomposition for estimating the BS-IRS channel and the individual IRS-user channels. However, as the number of elements in an IRS is typically quite large (in order to achieve higher beamforming gain [15]), these channel estimation methods typically require a large pilot overhead. To address this issue, [16] proposes to group the IRS elements into sub-surfaces, but at a cost of reduced beamforming capability. More recently, [17] proposes a compressed sensing based channel estimation method for the multiuser IRS-aided system, which reduces the training overhead significantly but requires the assumption of channel sparsity. Further, [18] proposes to reduce the training overhead by exploiting the common reflective channels among all the users. All these works fall into the paradigm of first estimating the channels from the received pilot signals, then solving the reflection optimization problems based on the estimated channels.

Recently, data-driven approaches have been introduced to address the challenges either in channel estimation or beamforming [19]. For the channel estimation problem, [20] proposes a deep denoising neural network to enhance the performance of the model-based compressive channel estimation for mmWave IRS systems. The authors of [21] propose a convolutional neural network to estimate both direct and cascaded channels from the received pilot signals through end-to-end training. Given perfect channels, the beamforming problem has been investigated from the perspective of the data-driven approach in [22]–[24]. In particular, [22] proposes a multi-layer fully connected neural network to learn the phase shifts for a single-user system to reduce the algorithm run-time complexity of optimizing the IRS. Furthermore, deep reinforcement learning is leveraged to optimize the phase shifts for the single-user system in [23], and for the multiuser case in [24].

This paper proposes to use a data-driven approach for optimizing the IRS. The proposed approach is motivated by the success of using deep learning to optimize wireless communication systems without explicitly estimating the channel [25]–[27]. In particular, [25] shows that the beamforming vectors learned from the received pilot signals can approach the achievable rate of the optimal beamforming vectors for highly-mobile mmWave systems. Further, [26] shows that based on the geographical locations of the users, the deep learning approach is able to learn the optimal scheduling without channel estimation. Location information is also utilized in [27] to configure the IRS for indoor signal focusing using a deep learning approach. In this paper, we primarily use the received pilots as the input to the neural network, because the received pilot signal contains rich information about both the large-scale and small-scale fading, but we also show that incorporating the location information can further reduce the required pilot length.

B. Main Contributions

This paper casts the problem of designing the beamforming and reflective patterns in an IRS system as a variational optimization problem whose optimization variables are functionals, i.e., mappings from the received pilots to the phase shifts at the IRS and the beamforming matrix at the BS. We propose to parameterize this mapping using a neural network and to train the neural network based on the training data to directly maximize a network utility function.

While a fully connected neural network has already been shown to be able to significantly reduce pilot length for sum-rate maximization in the conference version of this paper [1], a further contribution of this journal paper is that we propose a GNN architecture to better model the interference among the different users in the network. The proposed GNN is permutation invariant/equivariant across the users and therefore provides better scalability and generalization ability. For example, while a fully connected neural network would not have been able to generalize when the number of users in the network changes (except by re-training a new neural network), a GNN structure can easily have shared parameters across the components for different users, thereby achieving generalizability. It is worth noting that GNNs have been proposed to solve radio resource allocation problems in [28]–[30], but these prior works all require perfect CSI and are not designed for IRS systems.

A key benefit of the data-driven approach for communication system design is that it can easily incorporate different types of data as inputs to the neural network. In this paper, we propose to incorporate the locations of the users, so that the neural network can focus on learning the small-scale fading component of the wireless channels. The numerical results show that this significantly improves the utility maximization […]
pilot sequences x_k^H = [x_k(1), x_k(2), · · · , x_k(L_0)] of length L_0 to the BS, repeated over τ sub-frames. The pilot sequences of all users are designed to be orthogonal to each other so that they can be decorrelated at the BS, i.e., x_{k1}^H x_{k2} = 0 if k1 ≠ k2 and x_{ki}^H x_{ki} = L_0 P_u, where P_u is the uplink pilot transmission power. In the meanwhile, the IRS keeps the phase shifts fixed within each sub-frame, but uses different phase shifts in different sub-frames so that both the users-to-IRS and the IRS-to-BS channels can be measured.

The BS decorrelates the received pilots in each sub-frame by matching the pilot sequence for each user. Let Ȳ(t) = [y((t − 1)L_0 + 1), · · · , y(tL_0)] denote the received pilots in sub-frame t. Let v̄(t) be the phase shifts at the IRS in sub-frame t. Then, Ȳ(t) can be expressed as:

Ȳ(t) = Σ_{k=1}^{K} (h_{dk} + A_k v̄(t)) x_k^H + N̄(t),  t = 1, · · · , τ,   (7)

where N̄(t) is a noise matrix with each column independently distributed as CN(0, σ_1^2 I). By the orthogonality of the pilots, we can form ȳ_k(t) ∈ C^M, i.e., the contribution from user k at the t-th sub-frame, as given in [17]:

ȳ_k(t) = (1/L_0) Ȳ(t) x_k = h_{dk} + A_k v̄(t) + n̄(t)   (8)
       ≜ F_k q(t) + n̄(t),   (9)

where n̄(t) = (1/L_0) N̄(t) x_k, the combined channel matrix is defined as F_k ≜ [h_{dk}, A_k], and the combined phase shifts are defined as q(t) ≜ [1, v̄(t)^T]^T. Recall that we have τ sub-frames in total. Then, by denoting Ỹ_k = [ȳ_k(1), · · · , ȳ_k(τ)] as a matrix of received pilots across τ sub-frames, we have

Ỹ_k = F_k Q + Ñ,   (10)

where Q = [q(1), · · · , q(τ)] and Ñ = [n̄(1), · · · , n̄(τ)].

The channel estimation problem aims to estimate the combined matrix F_k for k = 1, . . . , K. Typically, to ensure that the matrix Q is full rank so that F_k can be recovered successfully, we need at least τ = N + 1, i.e., a total of (N + 1)K pilot symbols are needed. When τ = N + 1, one choice of Q is a DFT matrix as suggested in [16]. For comparison purposes, we also consider the more general case where τ ≠ N + 1. In this case, Q is not a square matrix. One way to construct Q is to first form a d × d DFT matrix Q_0 with d = max(τ, N + 1), then truncate Q_0 to the first τ columns or the first N + 1 rows. Another possibility is to construct the vectors v̄(t), t = 1, · · · , τ, independently at random. Specifically, the phase of [v̄(t)]_i can be drawn from a uniform random variable in [−π, π). In this paper, we use the second approach to construct Q if τ < N + 1, and the first approach in other cases. These are empirically good choices.

B. Conventional Channel Estimation

To estimate the channel matrix F_k from (10), we can use the minimum mean-squared error (MMSE) estimator, obtained by solving the following problem

minimize_{f(·)}  E[ ‖f(Ỹ_k) − F_k‖_F^2 ].   (11)

The optimal solution to problem (11) is given by

f(Ỹ_k) = E[F_k | Ỹ_k].   (12)

However, for general channel fading distributions, the optimal solution is computationally intensive to implement. A low-complexity approach is to constrain the estimator f(·) to be linear, which results in the linear MMSE (LMMSE) method. When the rows of F_k and the rows of Ñ are i.i.d., the LMMSE estimator is as follows [33]

F̂_k = (Ỹ_k − E[Ỹ_k]) E[(Ỹ_k − E[Ỹ_k])^H (Ỹ_k − E[Ỹ_k])]^{−1} E[(Ỹ_k − E[Ỹ_k])^H (F_k − E[F_k])] + E[F_k].   (13)

The estimates of h_{dk}, A_k can then be obtained from F̂_k. Note that the LMMSE estimator is an optimal solution to (11) only if the unknown F_k is Gaussian distributed. We note that this LMMSE channel estimation method for the multiuser MISO system is also proposed in [12].

IV. PROPOSED DEEP LEARNING FRAMEWORK

The conventional channel estimation approach aims to solve (11), i.e., to recover the entries of F_k given the received pilots Ỹ_k using a mean squared error metric. However, recovering the channels is not the goal. Our ultimate objective is to maximize the network utility as in (6). The main idea of this paper is to bypass explicit channel estimation and to solve problem (6) directly.

Specifically, we aim to use a neural network to represent the mapping function g(·) in problem (6), and to pursue a data-driven approach to train the neural network so that it mimics the optimal mapping from the received pilot signals to the beamformers and the phase shifts for network utility maximization. This overall framework is depicted in Fig. 2(a). In this section, we describe the neural network architecture suited for this task.

A. Graphical Representation of Users and IRS

A central task in a multiuser cellular network is the management of the interference between the users. Toward this end, the beamformers at the BS and the phase shifts at the IRS must be coordinated so that the mutual interference is minimized. This paper proposes to use a neural network architecture, called a graph neural network (GNN), which is based on a graph representation of the beamformers for the users and the phase shifts at the IRS, to capture the multiuser interference. The graph consists of K + 1 nodes as shown in Fig. 2(b). The IRS is represented by node 0 and the K users are represented by nodes 1 to K. A representation vector, denoted as z_k, k = 0, 1, · · · , K, is associated with each node. The goal is to encode all the useful information about each corresponding node in these representation vectors. The representation vectors are updated layer by layer in a GNN, taking into account all the representation vectors in the previous layer as input. After multiple layers, the representation vector of each node would contain sufficient information for designing the beamforming vectors and the reflection coefficients. Specifically, the GNN is trained in such a way that the phase shifts of the IRS can be
Fig. 2. (a) Proposed deep learning framework for directly designing the beamformers and phase shifts based on the received pilots; (b) Graph representation of the network. The node representation vector z_0 corresponds to the IRS, and z_1, · · · , z_K correspond to the users.
subsequently obtained from z_0, and the beamforming matrix at the BS can be obtained from z_1, · · · , z_K.

As compared to a fully connected neural network, the GNN more naturally captures the interactions between the users and the IRS. By explicitly embedding these interactions into the neural network architecture, the GNN is better able to learn a mechanism for reducing the interference between users. In particular, the update of each user node is a function of all its neighboring user nodes and the IRS node, which enables the GNN to learn to avoid interference. The update of the IRS node is a function of all the user nodes, which enables the GNN to learn to configure the phase shifts to spatially separate the channels for all the users.

A useful feature of the GNN is that it is able to capture the permutation invariant and permutation equivariant properties [28], [30] of the network utility maximization problem (6). That is, if we permute the index labels of the users in the problem, the neural network should output the same set of beamforming vectors w_k with permuted indices and the same phase shifts v. Here, permutation invariance means that the phase shifts v are independent of the ordering of the user channels, and permutation equivariance means that if the user channels are permuted, the beamforming vectors w_k would be permuted in the same way. These properties are not easy to learn in a fully connected neural network, but are naturally embedded in the GNN.

By tailoring to the problem structure, the GNN also reduces the model complexity as compared to the fully connected neural network. More importantly, the parameters of the GNN can be tied across the users, so that it can be easily generalized to scenarios with different numbers of users. This is in contrast to fully connected neural networks, whose parameters need to scale with the number of users, which makes it difficult to generalize.

B. GNN Architecture

We now describe the proposed GNN architecture and the training process. The overall GNN aims to learn the graph representation vector z_k through an initialization layer, D aggregation and combination layers, and a final layer that includes normalization. It takes the input feature from each user node of the graph as the initial value of the representation vector, denoted as z_k^0, then updates it through the D layers to produce z_k^d, d = 1, · · · , D, and finally applies a linear layer to produce z_k^{D+1}, which is mapped to the beamforming matrix W and the phase shifts v via normalization. The overall architecture is shown in Fig. 3.

1) Initialization Layer: The initialization layer takes the input features from the user nodes as the initial representation vectors z_k^0, k = 1, · · · , K, then trains one layer of the neural network to produce z_k^1, k = 0, · · · , K for the subsequent layers. Note that the IRS node does not have input features, because only the users transmit pilots.

The input features from the user nodes are simply the received pilots, i.e., the vectorized form of the matrix Ỹ_k with real and imaginary components separated:

z_k^0 = [vec(Re{Ỹ_k})^T, vec(Im{Ỹ_k})^T]^T.   (14)

As mentioned earlier, it is easy for the neural network to incorporate additional useful information about each user into the input feature vector z_k^0. For example, if the locations of the users are available, the input feature vector z_k^0 can be

z_k^0 = [vec(Re{Ỹ_k})^T, vec(Im{Ỹ_k})^T, l_k^T]^T,   (15)

where l_k is the three-dimensional vector denoting the coordinates of the location of user k.

Given the input feature vector z_k^0, we use a layer of fully connected neural networks, denoted as f_w^0(·), to produce z_k^1 for the user nodes, i.e.,

z_k^1 = f_w^0(z_k^0),  k = 1, · · · , K.   (16)

For the IRS node, we take inputs from all the user nodes and process them using a permutation invariant function ψ_0(·) first, then a fully connected neural network f_v^0(·) as

z_0^1 = f_v^0( ψ_0( z_1^0, · · · , z_K^0 ) ).   (17)
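As a concrete illustration of the initialization layer (14)–(17), the following numpy sketch builds the per-user input features from the received pilots and locations, applies a shared user network, and mean-pools (a permutation invariant choice of ψ_0) before the IRS network. This is only a sketch: the `dense`/`apply` helpers, all dimensions, and the random untrained weights are hypothetical, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """A single fully connected layer as a (weights, bias) pair (random, untrained)."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

def apply(layer, x):
    weights, bias = layer
    return np.maximum(x @ weights + bias, 0.0)  # ReLU activation

# Hypothetical small dimensions: K users, M BS antennas, tau sub-frames.
K, M, tau, hidden = 3, 4, 5, 16
in_dim = 2 * M * tau + 3          # vec(Re{Y_k}), vec(Im{Y_k}), 3-D location l_k as in (15)

f_w0 = dense(in_dim, hidden)      # user-node network, shared across ALL users
f_v0 = dense(in_dim, hidden)      # IRS-node network

# Input features per user: received pilots plus location, as in eq. (15).
Y = rng.standard_normal((K, M, tau)) + 1j * rng.standard_normal((K, M, tau))
loc = rng.uniform(-35, 35, size=(K, 3))
z0 = np.stack([np.concatenate([Y[k].real.ravel(), Y[k].imag.ravel(), loc[k]])
               for k in range(K)])

# Eq. (16): shared user network. Eq. (17): mean-pool (a permutation
# invariant psi_0) followed by the IRS network.
z1_users = apply(f_w0, z0)                   # shape (K, hidden)
z1_irs = apply(f_v0, z0.mean(axis=0))        # shape (hidden,)

# Shuffling the users permutes the user embeddings in the same way
# (equivariance) and leaves the IRS embedding unchanged (invariance).
perm = rng.permutation(K)
assert np.allclose(apply(f_w0, z0[perm]), z1_users[perm])
assert np.allclose(apply(f_v0, z0[perm].mean(axis=0)), z1_irs)
```

Because the same `f_w0` weights are applied to every user row, adding or removing a user only changes the number of rows in `z0`, which is what makes the architecture reusable for a different K.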
Fig. 3. Overall graph neural network architecture with an initialization layer, D updating layers, and a final normalization layer.
Fig. 4. Aggregation and combination operations of the d-th layer for (a) the IRS node, and (b) the user nodes.
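The aggregation and combination operations shown in Fig. 4, together with the final normalization that maps the last representation vectors to v and W, can be sketched in numpy as follows. This is an illustrative sketch only: the `dense`/`apply`/`apply_linear` helpers, the layer widths, and the sizes K, N, M, P_d are made up, and the weights are random and untrained; only the structural properties (mean pooling at the IRS node, max pooling over the other users, unit-modulus and total-power normalization) are meant to carry over.

```python
import numpy as np

rng = np.random.default_rng(1)

def dense(in_dim, out_dim):
    """A fully connected layer as a (weights, bias) pair (random, untrained)."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

def apply(layer, x):
    weights, bias = layer
    return np.maximum(x @ weights + bias, 0.0)   # ReLU hidden layer

def apply_linear(layer, x):
    weights, bias = layer
    return x @ weights + bias                    # linear output layer

# Hypothetical sizes: K users, N IRS elements, M BS antennas, hidden width h.
K, N, M, h = 3, 6, 4, 16
P_d = 2.0                                        # downlink power budget

f0, f1, f3 = dense(h, h), dense(h, h), dense(h, h)
f2, f4 = dense(2 * h, h), dense(3 * h, h)        # combine concatenated inputs
f_v_out, f_w_out = dense(h, 2 * N), dense(h, 2 * M)

z_irs = rng.standard_normal(h)                   # previous-layer IRS representation
z_users = rng.standard_normal((K, h))            # previous-layer user representations

# IRS node: mean-pool the transformed user vectors, then combine with its own state.
agg_irs = apply(f1, z_users).mean(axis=0)
z_irs_new = apply(f2, np.concatenate([apply(f0, z_irs), agg_irs]))

# User node k: max-pool over the OTHER users, combine with own and IRS states.
z_users_new = np.stack([
    apply(f4, np.concatenate([
        apply(f0, z_irs),
        z_users[k],
        apply(f3, np.delete(z_users, k, axis=0)).max(axis=0),
    ]))
    for k in range(K)
])

# Normalization: unit-modulus reflection coefficients, power-constrained beamformers.
zv = apply_linear(f_v_out, z_irs_new)
vc = zv[:N] + 1j * zv[N:]
v = vc / np.abs(vc)                              # |v_i| = 1 for every element
Zw = apply_linear(f_w_out, z_users_new).T        # shape (2M, K)
Zw = np.sqrt(P_d) * Zw / np.linalg.norm(Zw)      # Frobenius-norm power scaling
W = Zw[:M, :] + 1j * Zw[M:, :]
```

The per-element division and the Frobenius-norm scaling guarantee the unit-modulus and total-power constraints by construction, regardless of what the network weights are.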
In the implementation, we choose ψ_0 as the element-wise mean function, i.e.,

[ψ_0(z_1^0, · · · , z_K^0)]_i = (1/K) Σ_{k=1}^{K} [z_k^0]_i.   (18)

This is a reasonable choice, because all the pilots are reflected through the IRS, so all the received pilots contain an equal amount of information about the IRS.

The representation vectors (z_0^1, z_1^1, · · · , z_K^1) now contain features about the IRS and the user channels, respectively, and are subsequently passed to the D updating layers of the GNN in order to eventually produce the IRS reflective coefficients v from z_0^D and the user beamformers w_k from z_k^D.

2) Updating Layers: The update of the representation vector z_k in the d-th layer is based on combining its previous representation and the aggregation of the representations from its neighboring nodes. In a general GNN, this is given by [34]

z_k^d = f_combine^d ( z_k^{d−1}, f_aggregate^d ( {z_j^{d−1}}_{j∈N(k)} ) ),   (19)

where N(k) denotes the set of neighboring nodes of the node k, and f_aggregate^d(·) and f_combine^d(·) are the aggregation function and combining function of layer d, respectively.

A key in designing the GNN is to choose a suitable aggregation function f_aggregate^d(·) and combining function f_combine^d(·) in (19) so that the GNN is scalable and generalizes well. An efficient implementation of f_aggregate^d(·) takes the following form [30], [35]:

f_aggregate^d ( {z_j^{d−1}}_{j∈N(k)} ) = ψ ( {f_nn^d(z_j^{d−1})}_{j∈N(k)} ),   (20)

where ψ(·) is a function that is invariant to the permutation of the inputs, e.g., element-wise max-pooling or element-wise
mean pooling, and f_nn^d(·) is a fully connected neural network. In addition, the combining function f_combine^d(·) in (19) can also be implemented by a fully connected neural network for complicated optimization problems [30], [35].

We adopt this framework to design the GNN for solving problem (6). We should note that our problem requires permutation equivariance with respect to the user nodes and permutation invariance with respect to the IRS node, so the aggregation and combination operations for the IRS and the user nodes are designed differently.

For the IRS node, we derive z_0^d, the node representation vector in the d-th layer, from the previous layer as follows:

z_0^d = f_2^d ( f_0^d(z_0^{d−1}), ψ_0 ( f_1^d(z_1^{d−1}), · · · , f_1^d(z_K^{d−1}) ) ),   (21)

where f_0^d(·), f_1^d(·) and f_2^d(·) are fully connected neural networks, and the aggregation function ψ_0(·) is chosen to be the element-wise mean function over the users as in (18), which performs well empirically and corresponds to the fact that the IRS reflective pattern needs to serve all users.

For the user nodes, we recognize that aggregation should be with respect to all the other user nodes, excluding the IRS node, and propose the update equation for the node representation vectors z_k^d for k = 1, · · · , K in the d-th layer to take the form

z_k^d = f_4^d ( f_0^d(z_0^{d−1}), z_k^{d−1}, ψ_1 ( {f_3^d(z_j^{d−1})}_{∀j≠0, j≠k} ) ),   (22)

where f_3^d(·), f_4^d(·) are fully connected neural networks and ψ_1(·) is chosen to be the element-wise max-pooling function,

[ψ_1(z_1, · · · , z_K)]_i = max([z_1]_i, · · · , [z_K]_i),   (23)

which performs well empirically and corresponds to the fact that multiuser interference is typically dominated by the strongest user.

The update of the node representation vectors in (21) and (22) is shown in Fig. 4(a) and Fig. 4(b), respectively. We remark that there are many different design choices for the aggregation and combination operations. There is no general theory on how to choose these permutation invariant functions. In most works in the literature, the GNN architectures are designed based on empirical trials. In the simulation section of this paper, we shall see that the adopted GNN framework and the choice of the permutation invariant functions can achieve excellent performance.

3) Normalization Layer: After D update layers, the representation vectors produced by the GNN, i.e., z_k^D, k = 0, · · · , K, are passed to a normalization layer to produce the reflective coefficients v ∈ C^N and the beamforming matrix W ∈ C^{M×K}, while ensuring that the unit modulus constraints on v and the total power constraint on W are satisfied.

To this end, we first take z_0^D as the input to a linear layer f_v^{D+1}(·) with 2N fully connected units as follows:

z_0^{D+1} = f_v^{D+1}(z_0^D) ∈ R^{2N}.   (24)

Then a normalization layer outputs the real and imaginary components of the reflection coefficients v as follows:

Z_v = [z_0^{D+1}(1 : N), z_0^{D+1}(N + 1 : 2N)] ∈ R^{N×2},   (25)

v_i = [Z_v]_{i1} / √([Z_v]_{i1}^2 + [Z_v]_{i2}^2) + j [Z_v]_{i2} / √([Z_v]_{i1}^2 + [Z_v]_{i2}^2),  ∀i,   (26)

where [Z]_{ik} denotes the element in the i-th row and k-th column of the matrix Z, and the notation z(i1 : i2) denotes the subvector of z indexed from i1 to i2.

Similarly, to output the beamforming matrix, we first pass z_k^D through a linear layer f_w^{D+1}(·) with 2M units

z_k^{D+1} = f_w^{D+1}(z_k^D) ∈ R^{2M},  k = 1, · · · , K,   (27)

then use the following normalization steps to produce the beamforming matrix W:

Z_w = [z_1^{D+1}, · · · , z_K^{D+1}] ∈ R^{2M×K},   (28)

Z_w = √P_d · Z_w / ‖Z_w‖_F,   (29)

W = Z_w(1 : M, :) + j Z_w(M + 1 : 2M, :),   (30)

where the notation Z(i1 : i2, :) denotes the submatrix of Z constructed by taking the rows of Z indexed from i1 to i2.

Note that throughout the GNN architecture, we use the same f_w^0(·), f_0^d(·), f_1^d(·), f_2^d(·), f_3^d(·), f_4^d(·), and f_w^{D+1}(·) to update the node representation vectors for all the user nodes. This allows the GNN to be generalizable to an IRS network with an arbitrary number of users. The learned combining and aggregation operations (21) and (22) are independent of the number of users. If we increase or decrease the number of users in the system, we only need to increase or decrease the number of nodes in the graph; the same learned combining and aggregation operations can still be used without having to re-train the neural network.

C. Neural Network Training

Since the existing deep learning software packages do not support complex-valued operations, to compute the network utility during the training phase, we rewrite the achievable rate R_k as a function of the real and imaginary parts of w_k and v as follows:

R_k = log ( 1 + ‖γ_k‖^2 / ( Σ_{i=1, i≠k}^{K} ‖γ_i‖^2 + σ_0^2 ) ),   (31)

where

γ_i = [ Re{w_i^T}  −Im{w_i^T} ; Im{w_i^T}  Re{w_i^T} ] ( [ Re{h_{dk}} ; Im{h_{dk}} ] + [ Re{A_k}  −Im{A_k} ; Im{A_k}  Re{A_k} ] [ Re{v} ; Im{v} ] ).   (32)

Given this real representation of R_k, the loss function of the GNN can be expressed as −E[U(R_1(v, W), . . . , R_K(v, W))]. Note that we need CSI to generate training samples and to compute the network utility, but once trained, the operation of the neural network does not require CSI. We remark that the training of the
neural network is done offline, so it does not affect the run-time complexity of the proposed approach.

During training, the neural network learns to adjust its weights to maximize the network utility, i.e., the objective function of problem (6), in an unsupervised manner, using the stochastic gradient descent method. The updates of the neural network parameters, including the computation of the corresponding gradients, can be automatically implemented using any standard numerical deep learning software package. The overall end-to-end training allows us to jointly design the beamforming matrix and phase shifts from the received pilots directly. As shown in the simulation results in the next two sections, the proposed deep learning method is able to solve problem (6) more efficiently, in the sense that it would need fewer pilots to achieve the same performance as the conventional separated channel estimation and network utility maximization approach.

V. PERFORMANCE FOR SUM-RATE MAXIMIZATION

In this section, we evaluate the performance of the proposed deep learning method for the problem of sum-rate maximization in comparison to the channel estimation based approach. Specifically, the sum-rate maximization problem is given as

maximize_{(W,v)=g(Y)}  E[ Σ_k R_k(v, W) ]
subject to  Σ_k ‖w_k‖^2 ≤ P_d,
            |v_i| = 1, i = 1, 2, · · · , N,   (33)

so the loss function can be set as −E[Σ_k R_k(v, W)].

A. Simulation Setting

We consider an IRS-assisted multiuser MISO communication system as illustrated in Fig. 5, consisting of a BS with 8 antennas and an IRS with 100 passive elements. As shown in Fig. 5, the (x, y, z)-coordinates of the BS and the IRS locations in meters are (100, 100, 0) and (0, 0, 0), respectively. There are 3 users uniformly distributed in a rectangular area [5, 35] × [−35, 35] in the (x, y)-plane with z = −20, as shown in Fig. 5. We assume that the IRS is equipped with a uniform rectangular array placed on the (y, z)-plane in a 10 × 10 configuration. The BS antennas have a uniform linear array configuration parallel to the x-axis.

We assume that the direct channels from the BS to the users follow Rayleigh fading, i.e.,

h_{dk} = β_{0,k} h̃_{dk},   (34)

where h̃_{dk} ∼ CN(0, I), and β_{0,k} denotes the path-loss of the direct link between the BS and user k, modeled (in dB) as 32.6 + 36.7 log(d_k^{BU}), where d_k^{BU} is the distance of the direct link from the BS to user k. We assume that the IRS is deployed at a location where a line-of-sight channel exists between the IRS and the users/BS, so we model the channels h_{rk} between the IRS and user k and the channel G between the BS and the IRS as Rician fading channels:

h_{rk} = β_{1,k} ( √(ε/(1+ε)) h̃_k^{r,LOS} + √(1/(1+ε)) h̃_k^{r,NLOS} ),   (35)

G = β_2 ( √(ε/(1+ε)) G̃^{LOS} + √(1/(1+ε)) G̃^{NLOS} ),   (36)

where the superscript LOS represents the line-of-sight part of the channel and the superscript NLOS represents the non-line-of-sight part, ε is the Rician factor, which is set to be 10 in simulations, and β_{1,k}, β_2 are the path-loss from the IRS to user k and the path-loss from the BS to the IRS, respectively. The entries of G̃^{NLOS} and h̃_k^{r,NLOS} are modeled as i.i.d. standard Gaussian, i.e., [G̃^{NLOS}]_{ij} ∼ CN(0, 1) and [h̃_k^{r,NLOS}]_i ∼ CN(0, 1). The path-losses (in dB) of the BS-IRS link and the IRS-user link are modeled as 30 + 22 log(d^{BI}) and 30 + 22 log(d_k^{IU}), respectively, where d^{BI} is the distance between the BS and the IRS, and d_k^{IU} is the distance between the IRS and user k [5], [7]. The uplink pilot transmit power and the downlink data transmit power are set to be 15dBm and 20dBm, and the uplink noise power is −100dBm and the downlink noise power is −85dBm, unless otherwise stated.

The line-of-sight part of the channel h_{rk} is a function of the IRS/user locations. Specifically, let φ*_{3,k}, θ*_{3,k} denote the azimuth and elevation angles of arrival from user k to the
IRS, as shown in Fig. 5. Then the n-th element of the IRS steering vector a_IRS(φ*_{3,k}, θ*_{3,k}) can be expressed as [36] […]

TABLE I
ARCHITECTURE OF THE FULLY CONNECTED NEURAL NETWORKS.
(a) Sum rate vs. pilot length. (b) Testing sum rate vs. training epochs.
Fig. 6. Performance of the proposed GNN for the IRS system with M = 8, N = 100, K = 3, Pd = 20dBm, and Pu = 15dBm.
(a) Sum rate vs. downlink transmit power for K = 3 and Pu = 15dBm. (b) Sum rate vs. uplink pilot transmit power for K = 3 and Pd = 20dBm. (c) Sum rate vs. number of users for Pu = 15dBm and Pd = 25dBm.
Fig. 7. Generalization performance of the GNN in an IRS system with M = 8, N = 100 and L = 25K, where K is the number of users.
about 96% of the sum rate in Benchmark 1, which assumes perfect CSI. The deep learning approach using 15 pilots still achieves better performance than Benchmark 2 with 120 pilots. For Benchmark 2, 303 pilot symbols are needed for perfect channel reconstruction in the noiseless case. These observations show that the deep learning approach, which directly optimizes the system objective based on the pilots received, can reduce the pilot training overhead significantly.

Further, Fig. 6(a) shows that incorporating the user location information into the input of the proposed GNN can further improve the sum rate if the pilot length is not sufficiently large, as compared to the results without location information. But as the pilot length increases, the improvement becomes marginal. This is because the received pilot signals implicitly contain the location information of the users, so it can be learned by the neural network if the pilot length is sufficiently large.

Fig. 6(a) also illustrates that Benchmark 3 (deep learning based channel estimation with BCD) can outperform the conventional LMMSE based channel estimation approach. However, the proposed deep learning approach for directly maximizing the sum rate achieves even better performance, which shows that the neural network for direct rate maximization is able to extract more pertinent information from the received pilots than the neural network for explicit estimation of the channel matrix F̂_k. Note also that the dimension of the output of the neural network for solving the sum-rate maximization problem is much smaller than the output of the neural network needed for channel estimation. This is another reason that it is advantageous to maximize the sum rate directly instead of recovering the entries of the channel matrix first.

Next, we show how much data is needed to train the proposed neural network. In Fig. 6(b), we plot the sum rate evaluated on testing data against the training epochs for the deep learning approach without location information. Recall that we sample 102,400 training data in each training epoch. It can be seen from Fig. 6(b) that 10 training epochs are sufficient to achieve a satisfactory performance (greater than 90% of the sum rate achieved with 80 training epochs).
TABLE II
SUM RATE (bps/Hz) WITH N = 100, K = 3, Pd = 25dBm, Pu = 15dBm.

 M   L   Deep Learning   LMMSE+BCD   Gain   Perfect CSI
 8   45      7.45           5.83      28%       8.5
     75      7.81           6.59      18%
 16  45      9.02           7.76      16%      11.6
     75      9.70           8.86       9%

To evaluate the performance of the GNN when the BS is equipped with a larger number of antennas, we train the same GNN under the setting N = 100, K = 3, Pd = 25dBm, Pu = 15dBm, for both M = 8 and M = 16. As can be seen from Table II, the proposed GNN always outperforms the benchmark of LMMSE with BCD. As expected, the gain is larger when the pilot length L is smaller. But we also observe that the gain is smaller when the number of antennas M increases from 8 to 16. This is because the problem dimension increases with the number of BS antennas. We would need to increase the number of parameters in the GNN to further improve its performance.

Fig. 8. Empirical cumulative distribution function of the minimum user rate with M = 4, N = 20, K = 3, L = 75, Pd = 20dBm, and Pu = 15dBm.
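The "Gain" column of Table II follows directly from the other two rate columns; a quick numerical check (the row values below are taken from the table):

```python
# Verify the "Gain" column of Table II:
# gain = (deep learning rate - LMMSE+BCD rate) / (LMMSE+BCD rate).
rows = [  # (M, L, deep learning, LMMSE+BCD, reported gain)
    (8, 45, 7.45, 5.83, 0.28),
    (8, 75, 7.81, 6.59, 0.18),
    (16, 45, 9.02, 7.76, 0.16),
    (16, 75, 9.70, 8.86, 0.09),
]
for m, pilot_len, dl, lmmse, reported in rows:
    gain = (dl - lmmse) / lmmse
    # Each computed gain agrees with the reported value to within 1%.
    print(f"M={m:2d}, L={pilot_len}: {gain:.1%} (reported {reported:.0%})")
```
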
D. Generalizability

We now test the trained neural network on different system parameters to show its generalization capability.

1) Generalization to different downlink transmit powers: We fix the pilot length at L = 75, and train two neural networks under the settings with downlink transmit power 15dBm and 25dBm, respectively. The trained neural networks are then tested under different downlink transmit powers. The results are shown in Fig. 7(a). For comparison, we also plot the performance of the neural network with the same training and testing downlink transmit power. All the points are averaged over 1000 channel realizations. From Fig. 7(a), it can be observed that the proposed deep learning approach significantly outperforms Benchmark 2 under all the different downlink transmit powers. Furthermore, training the neural network at either 15dBm or 25dBm (but testing it at different powers) only incurs small losses. This suggests that the proposed neural network generalizes well to different downlink transmit powers.

2) Generalization to different uplink pilot transmit powers: In this scenario, we fix the pilot length L = 75 and the downlink transmit power Pd = 20dBm. We first train a neural network under the setting with uplink pilot transmit power Pu = 15dBm. We then change the pilot transmit power to generate the testing data. The results are shown in Fig. 7(b). We also plot the performance of the neural network with the same training and testing uplink transmit power for comparison. All the points are averaged over 1000 channel realizations. In Fig. 7(b), training the neural network at a fixed pilot power of 15dBm (but testing at different pilot powers) achieves almost identical performance to training and testing at the same powers, which implies that the proposed neural network generalizes well under different uplink pilot transmit powers. Moreover, the proposed deep learning approach is quite robust to pilot transmit power variations, while we observe a significant performance degradation of Benchmark 2 as the uplink transmit power decreases.

3) Generalization to different numbers of users: In this simulation, the pilot length is set to L = 25K, where K is the number of users. The downlink transmit power is set to Pd = 25dBm. We train the proposed neural network with K = 6 and test its performance in settings with different numbers of users. The results are shown in Fig. 7(c). It can be observed that the proposed GNN is able to generalize to different numbers of users, and it always outperforms the explicit channel estimation Benchmark 2.

VI. PERFORMANCE FOR MAXIMIZING MINIMUM RATE

The previous section shows the benefits of bypassing explicit channel estimation for the sum-rate maximization problem. But the sum-rate objective does not provide fairness across the users, which is typically required in practical wireless communication systems. In this section, we consider the problem of maximizing the minimum user rate, i.e., the max-min problem, formulated as

    maximize_{(W, v) = g(Y)}  E[ min_k Rk(v, W) ]
    subject to  Σ_k ||w_k||² ≤ Pd,                          (46)
                |v_i| = 1,  i = 1, 2, ..., N.

For the max-min problem (46), we use the same neural network architecture as in the sum-rate maximization problem, but the loss function is now −E[min_k Rk(v, W)]. We note that the minimum user rate is differentiable with respect to W and v almost everywhere, except at the points where Rk = Rj for k ≠ j; thus the gradient-based back-propagation method can still be applied for training the proposed neural network.

The simulation setting is that of an IRS assisted wireless communication system in which a BS with M = 4 antennas and an IRS with N = 20 elements serve K = 3 users distributed in the rectangular area [5, 15] × [−15, 15] in the (x, y)-plane with z = −20.
The locations of the BS and the IRS remain the same as in Fig. 5. The downlink transmit power is 20dBm, and the uplink pilot length is 75. The training parameters are the same as in the sum-rate maximization problem.

For the baseline with the LMMSE channel estimator, the max-min problem is solved using the BCD algorithm, alternating between the beamformer and the reflective coefficients, as proposed in [12], in which semidefinite relaxation (SDR) is used for designing the reflective coefficients. The complexity of SDR is high; this is why we choose a simulation setting with N = 20.

In Fig. 8, we plot the empirical cumulative distribution function of the minimum user rate over 1000 channel realizations. As can be seen from Fig. 8, the proposed deep learning method outperforms the baselines with either the LMMSE or the deep learning channel estimation. This shows that the proposed deep learning framework can also be applied to the max-min problem.

In Table III, we test the generalization capability of the trained GNN in settings with different numbers of users. The GNN is trained with 3 users, but is also tested in the cases with 2 users and 4 users. It can be observed that the deep learning approach always achieves 87%–90% of the perfect CSI benchmark for L = 25K (where K is the number of users) in all cases, which shows that the GNN generalizes well to settings with different numbers of users. We also observe that the deep learning approach with pilot length 5K can achieve over 94% of the minimum rate achieved by the conventional LMMSE approach with pilot length 25K, which demonstrates the remarkable pilot length reduction achieved by the proposed GNN for the max-min problem.

VII. INTERPRETATION OF SOLUTIONS FROM GNN

To interpret the solutions obtained by the GNN, in this section we visually verify that the learned IRS configuration indeed reflects the signal in the desired directions. We train the proposed neural network architecture for both a single-user case and a multiuser case, and use the array response to illustrate qualitatively the correctness of the beamforming pattern. Let φ1, φ2, φ3 denote the azimuth angle of arrival from the IRS to the BS, the azimuth angle of departure from the IRS to the BS, and the azimuth angle of arrival from the user to the IRS, respectively. The corresponding elevation angles are denoted by θ1, θ2, θ3, respectively.

The array response at the IRS is a function of the incident angle and the reflection angle of the wireless signals, which
and i1(n) = mod(n − 1, 10) and i2(n) = ⌊(n − 1)/10⌋. Similarly, the array response at the BS is a function of the angles φ1, θ1, given by

    fb(φ1, θ1) = |aBS(φ1, θ1)^H w|,                          (48)

where

    [aBS(φ1, θ1)]_m = e^{jπ(m−1)(cos(φ1) cos(θ1))}.          (49)

As there exists a line-of-sight component in the channels between the IRS and the BS/users, we expect the learned beamforming vector w and the learned reflection coefficients v to match the angles of the line-of-sight channels, so that the SNR of the user can be maximized.

In the numerical simulation, the pilot length is set to L = 25K. The locations of the BS and the IRS are (100, −100, 0) and (0, 0, 0), respectively, so that φ1* = 2.356, θ1* = 0, and φ2* = −0.785, θ2* = 0.

We first examine the single-user case, in which the user is located at (30, 20, −20), so that φ3* = 0.588 and θ3* = −0.506. In Fig. 9(a), we plot the learned array response of the BS as a function of φ1 for the case N = 100 and M = 8. It shows that the learned beamforming vector at the BS indeed focuses energy in the direction of the IRS. In Fig. 9(b), we plot the learned array response of the IRS as a function of φ3 and θ3 (with φ2 = φ2* and θ2 = θ2* fixed according to the BS-to-IRS direction). It is observed that the IRS array response is indeed maximized when φ3 ≈ φ3* and θ3 ≈ θ3*. This shows that the learned configuration of the IRS indeed reflects the signal in the correct user direction.

Moreover, we investigate the impact of the number of elements N at the IRS on the reflective pattern. For the user at location (30, 20, −20), the array responses of the IRS with N = 30 and N = 50 elements (placed as a rectangular array with 10 elements horizontally) are shown in Fig. 9(d) and Fig. 9(c), respectively. Combined with Fig. 9(b) for N = 100, it is clear that the array response focuses better as the number of elements at the IRS increases.

Next, we examine a multiuser case in which the neural network is trained to maximize the minimum rate over 3 users located at (5, −12, −20), (5, 0, −20), and (5, 12, −20), so that (φ3*, θ3*) = (−1.176, −0.994), (0, −0.980), (1.176, −0.994) for the three users, respectively. Fig. 10 shows the BS and IRS array responses learned by the GNN. From Fig. 10(b), we indeed see three peaks matching the angles φ3* and θ3* corresponding to the three users. In addition, we observe that the peak corresponding to the first user is weaker than those of the other two users, but we can see from Fig. 10(a) that this user has the
(a) Array response of BS, M = 8. (b) Array response of IRS, N = 100. (c) Array response of IRS, N = 50. (d) Array response of IRS, N = 30.
Fig. 9. Array response of the BS and the IRS obtained from the GNN for the single-user case: (a) and (b) are for N = 100 and M = 8; (c) and (d) are for N = 50, 30 and M = 8. The optimal φ1* = 2.356, φ3* = 0.588, θ3* = −0.506.
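The BS array response in (48)–(49), evaluated in Fig. 9(a), is straightforward to reproduce; a minimal sketch (the `irs_element_index` helper is hypothetical and simply encodes the i1, i2 indexing for the rectangular IRS layout with 10 elements per row described in the text):

```python
import numpy as np

def bs_array_response(w: np.ndarray, phi1: float, theta1: float) -> float:
    """f_b(phi1, theta1) = |a_BS(phi1, theta1)^H w| as in (48)-(49)."""
    m = np.arange(w.size)  # the exponent uses (m - 1) for m = 1, ..., M
    a_bs = np.exp(1j * np.pi * m * np.cos(phi1) * np.cos(theta1))
    return float(np.abs(np.vdot(a_bs, w)))  # np.vdot conjugates a_bs: a^H w

def irs_element_index(n: int):
    """Position of IRS element n (1-based) in the rectangular array:
    i1(n) = mod(n - 1, 10), i2(n) = floor((n - 1) / 10)."""
    return (n - 1) % 10, (n - 1) // 10
```

For instance, a unit-norm beamformer matched to the BS-to-IRS direction, w = a_BS(2.356, 0)/√M, attains the maximum response √M at φ1 = φ1* = 2.356, consistent with the peak location in Fig. 9(a).
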
(a) Array response of the BS. (b) Array response of the IRS.
Fig. 10. Array response of the BS and the IRS obtained from the GNN for maximizing the minimum rate over three users (i.e., K = 3), with N = 100, M = 8. The optimal (φ3*, θ3*) = (−1.176, −0.994), (0, −0.980), (1.176, −0.994) respectively for the three users. The azimuth angle of departure from the BS to the IRS is φ1* = 2.356.
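The angles quoted in Figs. 9 and 10 follow from the node coordinates; a small check, assuming the convention azimuth = atan2(Δy, Δx) and elevation = atan2(Δz, horizontal distance) along the propagation direction:

```python
import numpy as np

def los_angles(src, dst):
    """Azimuth and elevation of the line-of-sight path from src to dst."""
    dx, dy, dz = np.subtract(dst, src).astype(float)
    return np.arctan2(dy, dx), np.arctan2(dz, np.hypot(dx, dy))

phi1, _ = los_angles((100, -100, 0), (0, 0, 0))       # BS toward IRS: phi1 ~ 2.356
phi3, theta3 = los_angles((0, 0, 0), (30, 20, -20))   # IRS toward the single user:
                                                      # phi3 ~ 0.588, theta3 ~ -0.506
phi_u1, theta_u1 = los_angles((0, 0, 0), (5, -12, -20))  # first multiuser location:
                                                         # ~ (-1.176, -0.994)
```
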
strongest array response at the BS. Therefore, the GNN indeed learns to jointly optimize the phase shifts and the beamforming matrix to ensure fairness in this scenario of maximizing the minimum rate across the three users: the weaker IRS array response is compensated by a stronger BS array response. We also note from Fig. 10(a) that the responses of the BS are maximized at different angles for different users. This shows that the BS beamformers have learned to differentiate the three users. Note that the combined BS and IRS array responses need to minimize the interference among the users, not just to maximize each user's direct channel. Note also that the Rayleigh channel components from the BS to the users impact the optimum BS and IRS array response patterns.

VIII. CONCLUSION

Conventional communication system design always involves obtaining accurate CSI first, then designing the optimal transmission scheme according to the CSI. This design strategy is not practical for IRS due to the large number of passive reflective elements involved. This paper proposes an approach that learns to configure the IRS and the beamformers at the BS to maximize the system utility function directly based on the received pilots, in effect bypassing explicit channel estimation. This is accomplished by a generalizable graph neural network architecture that unveils the direct mapping from the received pilots to the desired IRS configuration and the desired per-user beamformers at the BS. Simulation results show that the trained neural network produces interpretable results and can efficiently learn to solve utility maximization problems using much fewer pilots than the conventional approach.

REFERENCES

[1] T. Jiang, H. V. Cheng, and W. Yu, "Learning to beamform for intelligent reflecting surface with implicit channel estimate," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2020.
[2] M. Di Renzo, A. Zappone, M. Debbah, M.-S. Alouini, C. Yuen, J. de Rosny, and S. Tretyakov, "Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and road ahead," IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2450–2525, July 2019.
[3] C. Huang, S. Hu, G. C. Alexandropoulos, A. Zappone, C. Yuen, R. Zhang, M. Di Renzo, and M. Debbah, "Holographic MIMO surfaces for 6G wireless networks: Opportunities, challenges, and trends," IEEE Trans. Wireless Commun., vol. 27, no. 5, pp. 118–125, Oct. 2020.
[4] L. Subrt and P. Pechac, "Intelligent walls as autonomous parts of smart indoor environments," IET Commun., vol. 6, no. 8, pp. 1004–1010, May 2012.
[5] H. Guo, Y.-C. Liang, J. Chen, and E. G. Larsson, "Weighted sum-rate maximization for reconfigurable intelligent surface aided wireless networks," IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3064–3076, May 2020.
[6] X. Yang, C.-K. Wen, and S. Jin, "MIMO detection for reconfigurable intelligent surface-assisted millimeter wave systems," IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1777–1792, Aug. 2020.
[7] Q. Wu and R. Zhang, "Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming," IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394–5409, Nov. 2019.
[8] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, "Reconfigurable intelligent surfaces for energy efficiency in wireless communication," IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, Aug. 2019.
[9] J. Zhu, Y. Huang, J. Wang, K. Navaie, and Z. Ding, "Power efficient IRS-assisted NOMA," IEEE Trans. Commun., vol. 69, no. 2, pp. 900–913, Feb. 2021.
[10] T. Jiang and Y. Shi, "Over-the-air computation via intelligent reflecting surfaces," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2019.
[11] X. Yu, D. Xu, Y. Sun, D. W. K. Ng, and R. Schober, "Robust and secure wireless communications via intelligent reflecting surfaces," IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2637–2652, Nov. 2020.
[12] H. Alwazani, A. Kammoun, A. Chaaban, M. Debbah, and M.-S. Alouini, "Intelligent reflecting surface-assisted multi-user MISO communication: Channel estimation and beamforming design," IEEE Open J. Commun. Soc., vol. 1, pp. 661–680, May 2020.
[13] D. Mishra and H. Johansson, "Channel estimation and low-complexity beamforming design for passive intelligent surface assisted MISO wireless energy transfer," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), Apr. 2019, pp. 4659–4663.
[14] L. Wei, C. Huang, G. C. Alexandropoulos, C. Yuen, Z. Zhang, and M. Debbah, "Channel estimation for RIS-empowered multi-user MISO wireless communications," IEEE Trans. Commun., Early Access, 2021.
[15] E. Björnson, Ö. Özdogan, and E. G. Larsson, "Intelligent reflecting surface vs. decode-and-forward: How large surfaces are needed to beat relaying?" IEEE Wireless Commun. Lett., Feb. 2019.
[16] B. Zheng and R. Zhang, "Intelligent reflecting surface-enhanced OFDM: Channel estimation and reflection optimization," IEEE Wireless Commun. Lett., vol. 9, no. 4, pp. 518–522, Apr. 2020.
[17] J. Chen, Y.-C. Liang, H. V. Cheng, and W. Yu, "Channel estimation for reconfigurable intelligent surface aided multi-user mmWave MIMO systems," to appear in IEEE Trans. Wireless Commun., 2021. [Online]. Available: https://arxiv.org/abs/1912.03619
[18] Z. Wang, L. Liu, and S. Cui, "Channel estimation for intelligent reflecting surface assisted multiuser communications: Framework, algorithms, and analysis," IEEE Trans. Wireless Commun., vol. 19, no. 10, pp. 6607–6620, Oct. 2020.
[19] A. M. Elbir and K. V. Mishra, "A survey of deep learning architectures for intelligent reflecting surfaces," 2020. [Online]. Available: https://arxiv.org/abs/2009.02540
[20] S. Liu, Z. Gao, J. Zhang, M. Di Renzo, and M.-S. Alouini, "Deep denoising neural network assisted compressive channel estimation for mmWave intelligent reflecting surfaces," IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 9223–9228, Aug. 2020.
[21] A. M. Elbir, A. Papazafeiropoulos, P. Kourtessis, and S. Chatzinotas, "Deep channel learning for large intelligent surfaces aided mm-wave massive MIMO systems," IEEE Wireless Commun. Lett., vol. 9, no. 9, pp. 1447–1451, Sep. 2020.
[22] J. Gao, C. Zhong, X. Chen, H. Lin, and Z. Zhang, "Unsupervised learning for passive beamforming," IEEE Commun. Lett., vol. 24, no. 5, pp. 1052–1056, May 2020.
[23] K. Feng, Q. Wang, X. Li, and C.-K. Wen, "Deep reinforcement learning based intelligent reflecting surface optimization for MISO communication systems," IEEE Wireless Commun. Lett., vol. 9, no. 5, pp. 745–749, May 2020.
[24] C. Huang, R. Mo, and C. Yuen, "Reconfigurable intelligent surface assisted multiuser MISO systems exploiting deep reinforcement learning," IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1839–1850, Aug. 2020.
[25] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, "Deep learning coordinated beamforming for highly-mobile millimeter wave systems," IEEE Access, vol. 6, pp. 37328–37348, June 2018.
[26] W. Cui, K. Shen, and W. Yu, "Spatial deep learning for wireless scheduling," IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1248–1261, June 2019.
[27] C. Huang, G. C. Alexandropoulos, C. Yuen, and M. Debbah, "Indoor signal focusing with deep learning designed reconfigurable intelligent surfaces," in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), July 2019.
[28] M. Eisen and A. Ribeiro, "Optimal wireless resource allocation with random edge graph neural networks," IEEE Trans. Signal Process., vol. 68, pp. 2977–2991, Apr. 2020.
[29] M. Lee, G. Yu, and G. Y. Li, "Graph embedding based wireless link scheduling with few training samples," IEEE Trans. Wireless Commun., vol. 20, no. 4, pp. 2282–2294, Apr. 2021.
[30] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, "Graph neural networks for scalable radio resource management: Architecture design and theoretical analysis," IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115, Jan. 2021.
[31] Ö. Özdogan and E. Björnson, "Deep learning-based phase reconfiguration for intelligent reflecting surfaces," in Proc. IEEE Asilomar Conf. Signals, Syst., Comput., Nov. 2020.
[32] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Netw., vol. 2, no. 5, pp. 359–366, 1989.
[33] E. Björnson and B. Ottersten, "A framework for training-based estimation in arbitrarily correlated Rician MIMO channels with Rician disturbance," IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1807–1820, Mar. 2010.
[34] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, "How powerful are graph neural networks?" in Proc. Int. Conf. Learning Representations, May 2019.
[35] M. Fey and J. E. Lenssen, "Fast graph representation learning with PyTorch Geometric," in Proc. Int. Conf. Learning Representations Workshops, May 2019.
[36] E. Björnson and L. Sanguinetti, "Rayleigh fading modeling and channel hardening for reconfigurable intelligent surfaces," IEEE Wireless Commun. Lett., vol. 10, no. 4, pp. 830–834, Apr. 2021.
[37] M. Abadi et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015, software available from tensorflow.org. [Online]. Available: https://www.tensorflow.org/
[38] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. Int. Conf. Learn. Represent., May 2015. [Online]. Available: https://arxiv.org/abs/1412.6980
[39] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, "An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel," IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, Sept. 2011.