Compressed Sensing in Radar Signal Processing

Learn about the most recent theoretical and practical advances in radar signal processing
using tools and techniques from compressive sensing. Providing a broad perspective
that fully demonstrates the impact of these tools, the accessible and tutorial-like
chapters cover topics such as clutter rejection, CFAR detection, adaptive beamforming,
random arrays for radar, space–time adaptive processing, and MIMO radar. Each chapter
includes coverage of theoretical principles, a detailed review of current knowledge, and
discussion of key applications, and also highlights the potential benefits of using
compressed sensing algorithms. A unified notation and numerous cross-references
between chapters make it easy to explore different topics side by side. Written by
leading experts from both academia and industry, this is the ideal text for researchers,
graduate students, and industry professionals working in signal processing and radar.
Antonio De Maio is a professor in the Department of Electrical Engineering and
Information Technology at the University of Naples Federico II, and a Fellow of
the IEEE.
Yonina C. Eldar is a professor at the Weizmann Institute of Science. She has authored
and edited several books, including Sampling Theory: Beyond Bandlimited Systems
and Compressed Sensing: Theory and Applications (Cambridge University Press, 2015;
2012). She is a Fellow of the IEEE and EURASIP, and a member of the Israel National
Academy of Science and Humanities.
Alexander M. Haimovich is a distinguished professor in the Department of Electrical
and Computer Engineering at the New Jersey Institute of Technology, and a Fellow of
the IEEE.
Compressed Sensing in Radar
Signal Processing
Edited by
ANTONIO DE MAIO
University of Naples Federico II
YONINA C. ELDAR
Weizmann Institute of Science
ALEXANDER M. HAIMOVICH
New Jersey Institute of Technology
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906
Cambridge University Press is part of the University of Cambridge.

It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781108428293
DOI: 10.1017/9781108552653
© Cambridge University Press 2020
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2020
Printed in the United Kingdom by TJ International Ltd, Padstow Cornwall
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: De Maio, Antonio, 1974– editor. | Eldar, Yonina C., editor. |
Haimovich, Alexander M., 1954– editor.
Title: Compressed sensing in radar signal processing / edited by Antonio De Maio,
University of Naples Federico II, Yonina C. Eldar, Weizmann Institute of Science,
Alexander M. Haimovich, New Jersey Institute of Technology.
Description: First edition. | Cambridge, United Kingdom ; New York, NY :
Cambridge University Press, [2020] | Includes bibliographical references and index.
Identifiers: LCCN 2019014859 | ISBN 9781108428293 (hardback)
Subjects: LCSH: Radar. | Compressed sensing (Telecommunication)
Classification: LCC TK6580 .C66 2020 | DDC 621.3848/3–dc23
LC record available at https://lccn.loc.gov/2019014859
ISBN 978-1-108-42829-3 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To my daughter Claudia: my light, my hope, my love – ADM
To my husband Shalomi and children Yonatan, Moriah, Tal, Noa, and Roei
for their boundless love and for filling my life with endless happiness – YE
To my students and collaborators for their contributions to my work
on radar – AH
Contents
List of Contributors page xi

Introduction xiv
List of Symbols xx
1 Sub-Nyquist Radar: Principles and Prototypes 1

Kumar Vijay Mishra and Yonina C. Eldar
1.1 Introduction 1
1.2 Prior Art and Historical Notes 3
1.3 Temporal Sub-Nyquist Radar 5
1.4 Doppler Sub-Nyquist Radar 15
1.5 Cognitive Sub-Nyquist Radar and Spectral Coexistence 18
1.6 Spatial Sub-Nyquist: Application to MIMO Radar 29
1.7 Sub-Nyquist SAR 39
1.8 Summary 43
References 44
2 Clutter Rejection and Adaptive Filtering in Compressed Sensing Radar 49

Peter B. Tuuk
2.1 Introduction 49
2.2 Problem Formulation 50
2.3 Interference Sources 53
2.4 Signal Processing Treatment of Clutter 55
2.5 Measurement Compression 58
2.6 Estimating Interference Statistics from Compressed Measurements 59
2.7 Mitigating Clutter in Compressed Sensing Estimation 66
2.8 Summary 68
References 69
3 RFI Mitigation Based on Compressive Sensing Methods for UWB Radar Imaging 72
Tianyi Zhang, Jiaying Ren, Jian Li, David J. Greene, Jeremy A. Johnston, and Lam H. Nguyen
3.1 Introduction 72
3.2 RPCA for RFI Mitigation 75
3.3 CLEAN-BIC for RFI Mitigation 82
3.4 Enhanced Algorithms for RFI Mitigation 91

3.5 Performance Evaluations 92
3.6 Conclusions 101
3.7 Acknowledgment 102
References 102
4 Compressed CFAR Techniques 105

Laura Anitori and Arian Maleki
4.1 Introduction 105
4.2 Radar Signal Model 105
4.3 Classical Radar Detection 106
4.4 CS Radar Detection 110
4.5 Complex Approximate Message Passing (CAMP) Algorithm 112
4.6 Target Detection Using CAMP 115
4.7 Adaptive CAMP Algorithm 118
4.8 Simulation Results 120
4.9 Experimental Results 127
4.10 Conclusions 131
References 132
5 Sparsity-Based Methods for CFAR Target Detection in STAP Random Arrays 135
Haley H. Kim and Alexander M. Haimovich
5.2 STAP Radar Concepts 137
5.3 STAP Detection Problem 145
5.4 Compressive Sensing CFAR Detection 148
5.5 Numerical Results 157
5.6 Summary 161
References 162
6 Fast and Robust Sparsity-Based STAP Methods for Nonhomogeneous Clutter 165
Xiaopeng Yang, Yuze Sun, Xuchen Wu, Teng Long, and Tapan K. Sarkar
6.2 Signal Models 166
6.3 Sparsity Principle Analysis of STAP 168
6.4 Fast and Robust Sparsity-Based STAP Methods 172
6.5 Conclusions 190
References 190
7 Super-Resolution Radar Imaging via Convex Optimization 193

Reinhard Heckel
7.2 Signal Model and Problem Statement 195

7.3 Atomic Norm Minimization and Associated Performance Guarantees 199
7.4 Super-Resolution Radar on a Fine Grid 204
7.5 Proof Outline 207
7.6 MIMO Radar 211
7.7 Discussion and Current and Future Research Directions 219
References 222
8 Adaptive Beamforming via Sparsity-Based Reconstruction of Covariance Matrix 225

Yujie Gu, Nathan A. Goodman, and Yimin D. Zhang
8.2 Adaptive Beamforming Criterion 228
8.3 Covariance Matrix Reconstruction-Based Adaptive Beamforming 234
8.4 Simulation Results 240
8.5 Conclusion 252
References 252
9 Spectrum Sensing for Cognitive Radar via Model Sparsity Exploitation 257
Augusto Aubry, Vincenzo Carotenuto, Antonio De Maio, and Mark A. Govoni
9.2 System Model and Problem Formulation 259
9.3 2-D Radio Environmental Map Recovery Strategies 263
9.4 Performance Analyses 270
9.5 Conclusions 280
References 280
10 Cooperative Spectrum Sharing between Sparse Sensing-Based Radar and Communication Systems 284
Bo Li and Athina P. Petropulu
10.2 MIMO Radars Using Sparse Sensing 286
10.3 Coexistence System Model 293
10.4 Cooperative Spectrum Sharing 297
10.5 Numerical Results 309
References 316
11 Compressed Sensing Methods for Radar Imaging in the Presence of Phase Errors
and Moving Objects 321
Ahmed Shaharyar Khwaja, Naime Ozben Onhon, and Mujdat Cetin
11.1 Introduction and Outline of the Chapter 321
11.2 Compressed Sensing and Radar Imaging 322
11.3 Synthetic Aperture Radar Autofocus and Compressed Sensing 328

11.4 Synthetic Aperture Radar Moving Target Imaging and Compressed Sensing 333
11.5 Inverse Synthetic Aperture Radar Imaging and Compressed Sensing 341
References 349
Index 355
Contributors
Laura Anitori
Netherlands Organisation for Applied Scientific Research (TNO)
Augusto Aubry
Vincenzo Carotenuto
Mujdat Cetin
University of Rochester; Sabanci University
Antonio De Maio
Yonina C. Eldar
Weizmann Institute of Science
David J. Greene
University of Florida
Nathan A. Goodman
University of Oklahoma
Mark A. Govoni
US Army Research Laboratory
Yujie Gu
Temple University
Alexander M. Haimovich
Reinhard Heckel
Rice University
Jeremy A. Johnston
Ahmed Shaharyar Khwaja

Sabanci University
Haley H. Kim
Bo Li
Qualcomm
Jian Li
Teng Long
Beijing Institute of Technology
Arian Maleki
Columbia University
Kumar Vijay Mishra

Technion Israel Institute of Technology
Lam H. Nguyen
US Army Research Laboratory
Naime Ozben Onhon

Turkish-German University
Athina P. Petropulu
Rutgers, State University of New Jersey
Jiaying Ren
Tapan K. Sarkar
Syracuse University
Yuze Sun
Tsinghua University
Peter B. Tuuk
Georgia Tech Research Institute
Xuchen Wu
Xiaopeng Yang
Tianyi Zhang
Yimin D. Zhang
Temple University
Introduction
Digital signal processing (DSP) is a revolutionary paradigm shift that enables processing
of physical data in the digital domain, where design and implementation are consider-
ably simplified. The success of DSP has driven the development of sensing and pro-
cessing systems that are more robust, flexible, cheaper, and, consequently, more widely
used than their analog counterparts. As a result of this success, the amount of data gener-
ated by sensing systems has grown considerably. Furthermore, in modern applications,
signals of wider bandwidth are used in order to convey more information and to enable
high resolution in the context of imaging. Unfortunately, in many important and emerg-
ing applications, the resulting sampling rate is so high that far too many samples need to
be transmitted, stored, and processed. In addition, in applications involving very wide-
band inputs it is often very costly, and sometimes even physically impossible, to build
devices capable of acquiring samples at the necessary rate. Thus, despite extraordinary
advances in sampling theory and computational power, the acquisition and processing
of signals in application areas such as radar, wideband communications, imaging, and
medical imaging continue to pose a tremendous challenge.
Recent advances in compressed sensing (CS) and sampling theory provide a frame-
work to acquire a wide class of analog signals at rates below the Nyquist rate, and
to perform processing at this lower rate as well. Together with the theory, various
prototypes have been developed that demonstrate the feasibility of sampling and pro-
cessing signals at sub-Nyquist rates in a robust and cost-effective fashion. More specif-
ically, CS is a framework that enables acquisition and recovery of sparse vectors from
underdetermined linear systems. This research area has seen enormous growth over the
past decade and has been explored in many areas of applied mathematics, computer
science, statistics, and electrical engineering. At its core, CS enables recovery of sparse
high-dimensional vectors from highly incomplete measurements using very efficient
optimization algorithms. Formally, consider a vector x of length n. The vector
is said to be k-sparse if it has at most k nonzero components. More generally, CS results
apply to signals that are sparse in an appropriate basis or overcomplete representation.
The main idea underlying CS is that the vector x can be recovered from measurements
y = Ax, where y is of length m ≪ n, as long as A satisfies certain mathematical
properties that render it a suitable CS matrix. The number of measurements m can be
chosen on the order of k log n, which in general is much smaller than the length of
the vector x. A large body of work has been published on a variety of optimization
algorithms that can recover x efficiently and robustly when m ≈ k log n. Loosely
speaking, the theory of CS deals with conditions under which the recovery of informa-
tion has vanishing or small errors. The mathematical framework of CS has inspired new
acquisition methods and new signal processing applications in a large variety of areas,
including image processing, analog-to-digital conversion, communication systems, and
radar processing. In many of these examples the basic ideas underlying CS need to be
extended to include, for example, continuous-time inputs, practical sampling methods,
other forms of structure on the input, computational aspects, noise effects, different
metrics for recovery performance, nonlinear acquisition methods, and more.
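As a rough illustration of the recovery problem described above, the following Python sketch draws a random Gaussian matrix A, forms y = Ax for a k-sparse x, and recovers x with orthogonal matching pursuit (OMP). OMP is only one of many possible recovery algorithms and is used here for brevity. The problem sizes, the constant 4 in the choice of m, and the function name omp are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: estimate a k-sparse x from y = A x."""
    m, n = A.shape
    residual = y.copy()
    support = []
    x_hat = np.zeros(n, dtype=A.dtype)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.conj().T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, k = 256, 5
# Number of measurements on the order of k log n (the constant 4 is an assumption).
m = int(np.ceil(4 * k * np.log(n)))
A = rng.standard_normal((m, n)) / np.sqrt(m)

# Build a k-sparse vector with nonzero magnitudes between 1 and 2.
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = (
    rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
)

y = A @ x                      # noiseless compressed measurements
x_hat = omp(A, y, k)
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"m = {m}, n = {n}, relative error = {rel_err:.2e}")
```

In this noiseless setting the measurement vector is less than half the length of x. With noise or a less favorable matrix, a different algorithm or a larger m may be needed.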
Two books devoted to this topic have been published recently, which focus on
many of these aspects, as well as on the underlying mathematical results [1,2]. Their
main emphasis is on the basic underlying theory and its generalizations, optimization
methods, as well as applications primarily to image processing and analog-to-digital
conversion. The latter is also covered in depth in [3].
Radar signal processing represents a fertile field for CS applications. By their very
nature, radars collect data about surveillance volumes (search radars), targets (tracking
radars), terrain and ground targets (imaging radars), or buried objects (radar tomogra-
phy). From radar’s early days in World War II, through the emergence of digital radar in
the 1970s, to today’s advanced systems, the amount of data a radar system has to handle
has increased by orders of magnitude. While early digital radars had to contend with 10s
and 100s of kbps, today’s radars may be faced with data rates in the Gbps range or more,
leading to demanding requirements in cost, hardware, data storage, and processing. The
implications of applying CS to radar are potentially enormous: sampling rates could
be lowered, the number of antenna elements in large arrays might be reduced, and the
computers required to handle the data may be downsized.
This book aims to present the latest theoretical and practical advances in radar signal
processing using tools from CS. In particular, this book offers an up-to-date review of
fundamental and practical aspects of sparse reconstruction in radar and remote sensing,
demonstrating the potential benefits achievable with the CS paradigm. We take a wider
scope than previous edited books on CS-based radars: we do not restrict ourselves to
specific disciplines (such as earth observation as in [4]) or applications (such as urban
sensing as in [5]), but discuss a variety of diverse application fields, including clutter
rejection, constant false alarm rate (CFAR) processing, adaptive beamforming, random
arrays for radar, space–time adaptive processing (STAP), multiple input multiple output
(MIMO) systems, radar super-resolution, cognitive radar [6] applications involving sub-
Nyquist sampling and spectrum sensing, radio frequency interference (RFI) suppres-
sion, and synthetic aperture radar (SAR).
The book is aimed at postgraduate students, PhD students, researchers, and engi-
neers working on signal processing and its applications to radar systems, as well as
researchers in other fields seeking an understanding of the potential applications of
CS. To read and fully understand the content it is assumed that the reader has some
background in probability theory and random processes, matrix theory, linear algebra,
and optimization theory, as well as radar systems. The book is organized into eleven
chapters broadly categorized into five areas: sub-Nyquist radar (Chapter 1); detection,
clutter/interference mitigation, and CFAR techniques (Chapters 2–6); super-resolution
and beamforming (Chapters 7 and 8); radar spectrum sensing/sharing (Chapters 9 and
10); radar imaging (Chapter 11). Each chapter is self-contained and typically covers
three main aspects: fundamental theoretical principles, overview of the current state of
the art, and emerging/future research directions. Some chapters are also complemented
with analyses on real data. Since the chapters are independent, there is flexibility in
selecting material both for university courses and short seminars.
In Chapter 1, the authors review several sub-Nyquist pulse-Doppler radar systems
based on the Xampling framework. Contrary to other CS-based designs, their formu-
lations directly address the reduced-rate analog sampling in space and time, avoid a
prohibitive dictionary size, and are robust in the face of noise and clutter. The chapter
begins by introducing temporal sub-Nyquist processing for estimating the target loca-
tions using less bandwidth than conventional systems. This paves the way to cognitive
radars, which share their transmit spectrum with other communication services, thereby
providing a robust solution for coexistence in spectrally crowded environments. Next,
without impairing Doppler resolution, the authors reduce the dwell time by transmitting
interleaved radar pulses in a sparse manner within a coherent processing interval or
slow time. Then, they consider MIMO array radars and demonstrate spatial sub-Nyquist
processing, which allows the use of few antenna elements without degradation in
angular resolution. Finally, they demonstrate application of sub-Nyquist and cognitive
radars to imaging systems such as SAR. For each setting, the authors present a state-
of-the-art hardware prototype designed to demonstrate the real-time feasibility of
sub-Nyquist radars.
Chapter 2 discusses the problem of clutter mitigation, which has posed challenges to
radar designers and engineers since the early days of radar. Early techniques matured to
current approaches like STAP, which use a coherently processed data cube to estimate
clutter statistics and to perform adaptive filtering. This chapter examines CS techniques
for the mitigation of structured interference, such as clutter. The author first introduces
the relevant sensing model and describes results in uncompressed adaptive filtering.
This paves the way to the development of models for measurement compression of the
coherent data cube and of approaches to estimate and filter clutter from compressed
measurements. The chapter includes recent results showing how clutter second-order
statistics can be reliably estimated from compressed measurements if the clutter has
a well-controlled eigenspectrum. Additionally, the covariance of the interference can be
incorporated into the CS estimation process to improve performance.
RFI poses a serious threat to the proper operation of ultra-wideband (UWB) radar
systems because it severely degrades their imaging and target detection capabilities. RFI
mitigation is a challenging problem, since dynamic RFI sources utilize diverse modulation
schemes and hence are difficult to model precisely. Fortunately, RFI sources
possess certain unique properties that can be exploited for their mitigation. In Chapter 3
the authors propose several sparse signal recovery methods for effective RFI mitigation.
They first show that the RFI sources possess a low rank property and are sparse in the
frequency domain, while in contrast the desired UWB radar echoes are sparse in the time
domain. Therefore, robust principal component analysis (RPCA) can be used to simul-
taneously exploit these properties for effective RFI mitigation. RPCA, however, requires
a fine tuning of a user parameter, which is dependent on the signal-to-interference ratio
(SIR). This parameter tuning is not straightforward in practice due to the lack of prior
knowledge on the RFI sources and on the desired UWB radar echoes. To avoid the
user parameter tuning problem, the authors consider modeling the RFI sources within
a pulse repetition interval (PRI) as a sum of sinusoids. The CLEAN algorithm can
then be used with the Bayesian information criterion (BIC) to determine the number
of sinusoids and to estimate their parameters. They show that CLEAN-BIC is user-
parameter-free and can be used to remove dominant RFI sources effectively. However,
since the sparse property of the UWB radar echoes is not utilized by CLEAN-BIC, the
resulting SAR images appear noisy, especially for low SIR values. To take advantage
of the merits of both RPCA and CLEAN-BIC algorithms, the authors consider using
CLEAN-BIC to estimate SIR, and the estimated SIR value is then used to determine
the user parameter for the RPCA algorithm. Finally, the algorithms are applied to both
simulated and experimentally measured data for performance evaluation.
Chapter 4 is focused on target detection from a set of compressive radar measure-
ments corrupted by additive white Gaussian noise. The complications in the calculation
of false alarm and detection probabilities that are caused by the nonlinear nature of target
recovery schemes in CS have impeded the application of such systems in practice. In
this chapter, the authors aim to show how recent advances in the asymptotic analysis of
CS recovery algorithms help to overcome this challenge. Fully adaptive and practical
CS target detection schemes are provided together with a detailed analysis of their
performance through extensive simulated and experimental data.
In Chapter 5, the authors present CFAR detectors for STAP random arrays. The
problem is formulated as detection of sparse targets given space–time observations
from thinned random arrays. The observations are corrupted by colored Gaussian noise
of an unknown covariance matrix, but secondary data are available for estimating the
covariance matrix. It is shown that the number of elements required to constrain the
peak sidelobe level scales logarithmically with the array aperture, whereas the number
of elements of a uniform linear array (ULA) scales linearly with the array aperture. New
adaptive detectors are developed that cope with the high sidelobes of random arrays.
Performance and complexity analyses demonstrate high performance at a reasonable
computation cost with significantly fewer elements than a ULA.
In Chapter 6, sparsity-based STAP methods are developed by exploiting the intrinsic
sparsity of the clutter spatial-temporal power spectrum and of the space–time adaptive
weight vectors. First, the signal model of received space–time data for an airborne
phased array radar is introduced, and the intrinsic model sparsity for radar STAP is
analyzed. Second, leveraging the sparsity of the clutter spatial-temporal power spectrum,
a robust and fast iterative sparse recovery method is introduced. It can not only alleviate
the effect of noise and dictionary mismatch but can also reduce the computational com-
plexity via recursive inverse matrix calculation. Finally, based on the sparsity of space–
time adaptive weight vectors, a fast STAP method based on projection approximation
subspace tracking (PAST) with a sparse constraint is discussed. It provides a robust
and stable estimation of the clutter subspace when a small set of training samples is
available. Based on both the simulated and actual airborne phased array radar data, it is
verified that the developed methods can provide satisfactory performance with a small
training sample support in a practical complex nonhomogeneous environment.
Chapter 7 considers the use of CS techniques for the resolution of multiple targets.
Estimating the relative angles, delays, and Doppler shifts from the received signals
allows for the determination of the locations and velocities of objects. However, due to
practical constraints, the probing signals have finite bandwidth B, the received signals
are observed over a finite time interval of length T only, and in addition, a radar typically
has only one or a few transmit and receive antennas. Those constraints fundamentally
limit the resolution up to which objects can be localized: the delay and Doppler
resolution is proportional to 1/B and 1/T, respectively, and a radar with $N_T$ transmit and $N_R$ receive
antennas can only achieve an angular resolution proportional to $1/(N_T N_R)$. The author
shows that the continuous angle-delay-Doppler triplets and the corresponding attenuation
factors can be resolved at much finer resolution, using ideas from CS. Specifically,
provided the angle-delay-Doppler triplets are separated either by factors proportional
to $1/(N_T N_R - 1)$ in angle, 1/B in delay, or 1/T in Doppler direction, they can be
recovered at significantly smaller scale or higher resolution.
Traditional adaptive beamformers are very sensitive to model mismatch, especially
when the training samples for adaptive beamformer design are contaminated by the
desired signal. In Chapter 8, the authors propose a strategy to reconstruct a signal-
free interference-plus-noise covariance matrix for adaptive beamformer design. Using
the sparsity of sources, the interference covariance matrix can be reconstructed as a
weighted sum of the tensor outer products of the interference steering vectors, and the
corresponding parameters are estimated from a sparsity-constrained covariance matrix
fitting problem. In contrast to classical CS and sparse reconstruction problems, the for-
mulated sparsity-constrained covariance matrix fitting problem can be effectively solved
by using the a priori information on array structure rather than using convex relaxation.
Simulation results demonstrate that the proposed adaptive beamformer almost always
provides near-optimal performance.
Chapter 9 deals with two-dimensional (2-D) spectrum sensing in the context of a
cognitive radar to gather real-time space–frequency electromagnetic awareness. Assum-
ing a sensor equipped with multiple receive antennas, a formal discrete-time sensing
signal model is developed, and two signal processing techniques capable of recovering
the space–frequency occupancy map via block sparsity exploitation are presented. The
former relies on the iterative adaptive algorithm (IAA) and incorporates a BIC-based
stage to foster block-sparsity in the recovery process. The latter resorts to the regularized
maximum likelihood (RML) estimation paradigm, which automatically promotes block-
sparsity in the 2-D profile evaluation. Some illustrative examples (both on simulated and
real data) are provided to compare the different strategies and highlight the effectiveness
of the developed approaches.
In Chapter 10, a cooperative spectrum-sharing scheme for a MIMO communication
system and a sparse sensing-based MIMO radar is presented. Both the radar and the
communication systems use transmit precoding. The radar transmit precoder, the radar
subsampling scheme, and the communication transmit covariance matrix are jointly
designed in order to maximize the radar SIR, while meeting certain communication
rate and power constraints. The joint design is implemented at a control center, which
is a node with which both systems share physical layer information, and which also
performs data fusion for the radar. Efficient algorithms for solving the correspond-
ing optimization problem are presented. The cooperative design significantly improves
spectrum sharing performance, and the sparse sensing provides opportunities to control
interference.
Chapter 11 discusses applications of CS to radar imaging problems with reference
to SAR and inverse synthetic aperture radar (ISAR) sensors. The authors first provide
the relevant mathematical expressions for CS and SAR necessary to formulate the prob-
lem of CS SAR imaging. Thereafter, they consider the case where unknown motion
errors are present during the SAR acquisition process. Autofocusing, i.e., the blind
compensation of the aforementioned errors, is discussed, and general CS solutions are
presented. The chapter ends with a survey of CS methods for ISAR imaging of targets
with unknown motion.
References
[1] Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications. Cambridge
University Press, 2012.
[2] S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing. Birkhäuser
Basel, 2013.
[3] Y. C. Eldar, Sampling Theory: Beyond Bandlimited Systems. Cambridge University Press,
2015.
[4] C.-H. Chen, Compressive Sensing of Earth Observations. CRC Press, 2017.
[5] M. Amin, Compressive Sensing for Urban Radar. CRC Press, 2014.
[6] A. Farina, A. De Maio, and S. Haykin, The Impact of Cognition on Radar Technology. Scitech
Publishing, Radar, Sonar & Navigation, 2017.
Symbols
A unified notation is used throughout the book.
z column vector (lower case)

Z matrix (upper case)
$z_i$ ith element of z
$Z_{i,l}$ (i,l)-th entry of Z
A sensing matrix
sparsity matrix
= A product
y observed measurement vector
x original signal vector
k sparsity
n ambient dimension
m number of measurements
$\|\cdot\|_p$ p-norm
$(\cdot)^T$ transpose operator
$(\cdot)^*$ conjugate operator
$(\cdot)^H$ conjugate transpose operator
$(\cdot)^\dagger$ pseudo inverse of the matrix argument
tr(·) trace of the square matrix argument
Rank(·) rank of the square matrix argument
$\lambda_{\max}(\cdot)$ maximum eigenvalue of the square matrix argument
$\lambda_{\min}(\cdot)$ minimum eigenvalue of the square matrix argument
diag(x) N-dimensional diagonal matrix whose ith diagonal element is $x_i$, $i = 1, \ldots, N$, with $x \in \mathbb{C}^N$
Range(A) range span of the column vectors of the matrix A
I identity matrix (its size is determined from the context)
0 matrix with zero entries (its size is determined from the context)
$\mathbb{R}^N$ set of N-dimensional vectors of real numbers
$\mathbb{C}^N$ set of N-dimensional vectors of complex numbers
$\mathbb{H}^N$ set of N × N Hermitian matrices
$\succeq$ for any $A \in \mathbb{H}^N$, $A \succeq 0$ means that A is a positive semidefinite matrix
$\succ$ for any $A \in \mathbb{H}^N$, $A \succ 0$ means that A is a positive definite matrix
T standard notation for sets (uppercase letter)

$|T|$ cardinality of a set T
$\hat{x}$ result of $\ell_1$ minimization/recovery algorithm
supp(x) support of vector x
I standard notation for subset of indices
$x_T$ length-$|T|$ sub-vector containing the elements of x corresponding to the indices in T
$A_T$ $m \times |T|$ sub-matrix containing the columns of the $m \times n$ matrix A indexed by T
j imaginary unit
Re(x) real part of the complex number x
Im(x) imaginary part of the complex number x
|x| modulus of the complex number x
arg(x) argument of the complex number x
E [·] statistical expectation
Hadamard product
⊗ Kronecker product
$\dot{y}$, $\partial y/\partial x$, $dy/dx$ first derivative of y with respect to variable x
$\ddot{y}$, $\partial^2 y/\partial x^2$, $d^2 y/dx^2$ second derivative of y with respect to variable x
P[·] probability measure
x(t) continuous time signal
h(t) pulse shape
$x_i$ measurements of x(t)
$\delta_k = \delta_k(A)$ restricted isometry constant.
Statement of restricted isometry property (RIP): a matrix A satisfies the RIP of order
k if
$$(1 - \delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_k)\|x\|_2^2$$
for all x with $\|x\|_0 \le k$.
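To make the RIP statement more concrete, the Python sketch below probes the restricted isometry constant of a random Gaussian matrix numerically. For a fixed support T, the ratio of ‖Ax‖² to ‖x‖² over vectors supported on T is bounded by the squared extreme singular values of the sub-matrix of A with columns in T. The exact constant would require checking every support of size k, which is combinatorial, so the sketch samples random supports and therefore returns only a lower bound on the constant. The function name, matrix sizes, and number of trials are assumptions chosen for illustration.

```python
import numpy as np

def empirical_rip_lower_bound(A, k, trials=1000, seed=0):
    """Lower-bound the order-k restricted isometry constant by sampling supports."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    worst = 0.0
    for _ in range(trials):
        support = rng.choice(n, size=k, replace=False)
        # Singular values of the m x k sub-matrix, sorted in descending order.
        s = np.linalg.svd(A[:, support], compute_uv=False)
        # Largest deviation of ||Ax||^2 / ||x||^2 from 1 on this support.
        worst = max(worst, abs(s[0] ** 2 - 1.0), abs(s[-1] ** 2 - 1.0))
    return worst

rng = np.random.default_rng(0)
m, n, k = 256, 1024, 3
# Scaling by 1/sqrt(m) makes the columns approximately unit norm.
A = rng.standard_normal((m, n)) / np.sqrt(m)
delta_lb = empirical_rip_lower_bound(A, k)
print(f"sampled lower bound on the order-{k} constant: {delta_lb:.3f}")
```

Because only a subset of supports is examined, a small value here does not certify the RIP. It only shows that no sampled support violates the inequality for that value.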
1 Sub-Nyquist Radar: Principles
and Prototypes
Kumar Vijay Mishra∗ and Yonina C. Eldar∗∗
1.1 Introduction
Radar remote sensing has advanced tremendously since 1950 and is now applied to
diverse areas such as military surveillance, meteorology, geology, collision avoidance,
and imaging [1]. In monostatic pulse-Doppler radar systems, an antenna transmits a
periodic train of known narrowband pulses within a defined coherent processing interval
(CPI). When the radiated wave from the radar interacts with moving targets, the ampli-
tude, frequency, and polarization states of the scattered wave change. By monitoring this
change, it is possible to infer the targets’ size, location, and radial Doppler velocity. The
reflected signal received by the radar antenna is a linear combination of echoes from
multiple targets; each of these is an attenuated, time-delayed, and frequency-modulated
version of the transmit signal. The delay in the received signal is linearly proportional
to the target’s range or its distance from the radar. The frequency modulation encodes
the Doppler velocity of the target. The complex amplitude or target’s reflectivity is a
function of the target’s size, geometry, propagation, and scattering mechanism. Radar
signal processing is aimed at detecting the targets and estimating their parameters from
the output of this linear, time-varying system.
Traditional radar signal processing employs matched filtering (MF) or pulse com-
pression [2] in the digital domain, wherein the sampled received signal is correlated
with a replica of the transmit signal in the delay-Doppler plane. The MF maximizes
the signal-to-noise ratio (SNR) in the presence of additive white Gaussian noise. In
some specialized systems, this stage is replaced by a mismatched filter with a different
optimization metric such as minimization of peak-to-sidelobe ratio of the output. Here,
the received signal is correlated with a signal that is close but not identical to the
transmit signal [3–5]. While all of these techniques reliably estimate target parameters,
their resolution is inversely proportional to the support of the ambiguity function of the
transmit pulse, thereby restricting ability to super-resolve targets that are closely spaced.
The digital MF operation requires the signal to be sampled at or above the Nyquist
sampling rate, which guarantees perfect reconstruction of a bandlimited analog signal
[6]. Many modern radar systems use wide bandwidths, typically ranging from hundreds
of MHz to GHz, in order to achieve fine radar range resolution. Since the Nyquist
sampling rate is twice the baseband bandwidth, the radar receiver requires expensive,
high-rate analog-to-digital converters (ADCs). The sampled signal is then also processed
at high rates, resulting in significant power, cost, storage, and computational overhead.
Recently, in order to mitigate this rate bottleneck, new methods have been proposed that
sample signals at sub-Nyquist rates and yet are able to estimate the targets’ parameters
[6,7].
∗ K.V.M. acknowledges partial support via the Andrew and Erna Finci Viterbi Postdoctoral Fellowship and
the Lady Davis Postdoctoral Fellowship.
∗∗ This work is supported by the European Union’s Horizon 2020 research and innovation program under
grant agreement no. 646804-ERC-COG-BNYQ.
Analogous trade-offs arise in other aspects of radar system design. For example, the
number of transmit pulses governs the resolution in Doppler velocity. The estimation
accuracy of target parameters is greatly affected by the radar’s dwell time [1], i.e., the
time duration a directional radar beam spends hitting a particular target. Long dwell
times imply a large number of transmit pulses and, therefore, high Doppler precision.
But, simultaneously, this negatively affects the ability of the radar to look at targets in
other directions. Sub-Nyquist sampling approaches have, therefore, been suggested for
the pulse dimension or “slow-time” domain in order to break the link between dwell
time and Doppler resolution [8–10].
Finally, radars that deploy antenna arrays deal with similar sampling problems in the
spatial domain. A phased array radar antenna consists of several radiating elements
that form a highly directional radiating beam pattern. Without requiring any mechan-
ical motion, a phased array accomplishes beam-steering electronically by adjusting
the relative phase of excitation in the array elements. The operational advantage is
the agile scanning of the target scene, ability to track a large number of targets, and
efficient search-and-track in the regions of interest [11]. The beam pattern of individual
array elements, array geometry, and its size define the overall antenna pattern [12,13],
wherein high spatial resolution is achieved by large array apertures. As per the Nyquist
Theorem, the array must not admit fewer than two signal samples per spatial period (i.e.,
radar’s operating wavelength) [14]. Otherwise, it introduces spatial aliasing or multiple
beams in the antenna pattern, thereby reducing its directivity. Often an exceedingly
large number of radiating elements are required to synthesize a given array aperture in
order to enhance the radar’s ability to unambiguously distinguish closely spaced targets;
the associated cost, weight, and area may be unacceptable. It is therefore desirable to
apply sub-Nyquist techniques to thin a huge array without causing degradation in spatial
resolution [15–17].
Sub-Nyquist sampling leads to the development of low-cost, power-efficient, and
small-size radar systems that can scan faster and acquire larger volumes than traditional
systems. Apart from design benefits, other applications of such systems have been
envisioned recently, including imparting hardware-feasible cognitive abilities to the
radar [18,19], a role in devising spectrally coexistent systems [20], and extension to
imaging [21]. In this chapter, we provide an overview of sub-Nyquist radars, their
applications, and hardware realizations.
The outline of the chapter is as follows. In the next section, we overview various
reduced-rate techniques for radar system design and explain the benefits of our approach
to sub-Nyquist radars. In Section 1.3, we describe the principles, algorithms, and hard-
ware realization of temporal sub-Nyquist monostatic pulse-Doppler radar. Section 1.4
presents an extension of the sub-Nyquist principle to slow time. We then introduce the
cognitive radar concept based on sub-Nyquist reception in Section 1.5 and show an
application to coexistence in a spectrally crowded environment. Section 1.6 is devoted to
spatial sub-Nyquist applications in multiple-input multiple-output (MIMO) array radars.
Finally, we consider sub-Nyquist synthetic aperture radar (SAR) imaging in Section 1.7,
followed by concluding remarks in Section 1.8.
1.2 Prior Art and Historical Notes
There is a large body of literature on reduced-rate sampling techniques for radars. Most
of these works employ compressed sensing (CS) methods, which allow recovery of
sparse, undersampled signals from random linear measurements [7]. A pre-2010 review
of selected applications of CS-based radars can be found in [22]. A qualitative, system-
level commentary from the point of view of operational radar engineers is available
in [23], while CS-based radar imaging studies are summarized in [24]. An excellent
overview on sparsity-based SAR imaging methods is provided in [25]. The review
in [26] recaps major developments in this area from a nonmathematical perspective.
In the following, we review the most significant works relevant to the sub-Nyquist
formulations presented in this chapter.
On-Grid CS The earliest application of CS toward recovering time delays with sub-
Nyquist samples in a noiseless case was formulated in [27]. CS-based parameter esti-
mation for both delay and Doppler shifts was proposed in [28] with samples acquired
at the Nyquist rate. These and similar later works [29–31] discretize the delay-Doppler
domain, assuming that targets lie on a grid. Subsequently, these ideas were extended
to colocated [32,33] and distributed [34] MIMO radars where targets are located on an
angle-Doppler-range grid. In practice, target parameters are typically continuous values
whose discretization may introduce gridding errors [35]. In particular, [28] constructs
a dictionary that exhaustively considers all possible delay-Doppler pairs, thereby ren-
dering the processing computationally expensive. Noise and clutter mitigation are not
considered in this literature. Simulations show that such systems typically have poor
performance in clutter-contaminated noisy environments.
Off-Grid CS A few recent works [36,37] formulate the radar parameter estimation for
off-grid targets using atomic norm minimization [38,39]. However, these methods do
not address direct analog sampling, the presence of noise, and clutter. Further details
on this approach are available in Chapter 7 (Super-resolution radar imaging via convex
optimization) of this book.
Parametric Recovery A different approach was suggested in [40], which treated radar
parameter estimation as the identification of an underlying linear, time-varying system
[41]. The proposed two-stage recovery algorithm, largely based on [42], first estimates
target delays and then utilizes these recovered delays to estimate Doppler velocities and
complex reflectivities. They also provide guarantees for system identification in terms
of the minimum time-bandwidth product of the input signal. However, this method does
not handle noise well.
Matrix Completion In some radar applications, the received signal samples are pro-
cessed as data matrices, which, under certain conditions, are low rank. In this context,
general works have suggested retrieving the missing entries using matrix completion
methods [10,43]. The target parameters are then recovered through classic radar signal
processing. These techniques have not been exhaustively evaluated for different signal
scenarios and their practical implementations have still not been thoroughly examined.
Finite-Rate-of-Innovation (FRI) Sampling The received radar signal from L targets
can be modeled with 3L degrees of freedom because three parameters – time delay,
Doppler shift, and attenuation coefficient – characterize each target. The classes of
signals that have finite degrees of freedom per unit of time are called finite-rate-of-
innovation (FRI) signals [44]. Low-rate sampling and signal recovery strategies for
FRI signals have been studied in detail in the past [6, chapter 15]. In [45], a tempo-
ral sub-Nyquist radar was proposed to recover delays relying on the FRI model. The
Xampling framework [6] was used to obtain Fourier coefficients from low-rate samples
with a practical hardware prototype. Similar techniques were later studied for delay
channel estimation problems in ultra-wideband communication systems [46,47] and for
ultrasound imaging [48]. In [49], Doppler focusing was added to the FRI-Xampling
framework to recover both delays and Dopplers. Doppler focusing is a narrowband
technique that can be interpreted as low-rate beamforming in the frequency domain, and
was applied earlier to ultrasound imaging [50,51]. It considers a chosen center frequency
with a band of frequencies around it and coherently processes multiple echoes in this
focused region to estimate the delays from low-rate samples.
Extensions of Sub-Nyquist Radars The system proposed in [49] reduces samples
only in time and not in the Doppler domain. Since the set of frequencies for Doppler
focusing is usually fixed a priori, the resultant Doppler resolution is limited by the
focusing; it remains inversely proportional to the number of pulses P , as is also the case
with conventional radar. In [8], sub-Nyquist processing in slow time was introduced
to recover the target range and Doppler by simultaneously transmitting few pulses in
the CPI and sampling the received signals at sub-Nyquist rates. Later, [52] proposed a
whitening procedure to mitigate the presence of clutter in a sub-Nyquist radar. Spatial-
domain compressed sensing (SCS) was examined for a MIMO array radar in [16] and
later for phased arrays in [15]. Recently, [17] proposed Xampling in time and space
to recover delay, Doppler, and azimuth of the targets by thinning a colocated MIMO
array and collecting low-rate samples at each receive element. This sub-Nyquist MIMO
radar (SUMMeR) system was also conceptually demonstrated in hardware [18,53].
The formulation in [54] proposes tensor-based 3D sub-Nyquist radar (TenDSuR) that
performs thinning in all three domains and recovers the signal via tensor-based recovery.
Finally, an extension to SAR was demonstrated in [21]. Table 1.1 summarizes these
developments.
Table 1.1 Sub-Nyquist radars and their corresponding reduction domains.
Sub-Nyquist system                     Temporal   Doppler   Spatial
Monostatic pulsed radar [45]           Yes        No        No
Monostatic pulse-Doppler radar [49]    Yes        No        No
Reduced time-on-target radar [8]       Yes        Yes       No
MIMO SCS [16]                          No         No        Yes
Phased array SCS [15]                  No         No        Yes
SUMMeR [15,17]                         Yes        No        Yes
TenDSuR [54]                           Yes        Yes       Yes
Sub-Nyquist SAR [21]                   Yes        No        No
1.3 Temporal Sub-Nyquist Radar
Consider a standard pulse-Doppler radar that transmits a pulse train
$$r_{\mathrm{TX}}(t) = \sum_{p=0}^{P-1} h(t - p\tau), \quad 0 \le t \le P\tau, \tag{1.1}$$
consisting of P uniformly spaced known pulses h(t). The interpulse transmit delay τ
is the pulse repetition interval (PRI) or fast time; its reciprocal is the pulse repetition
frequency (PRF). The entire duration of P pulses in (1.1) is known as the CPI or slow
time. The radar operates at carrier frequency $f_c$ so that its wavelength is $\lambda = c/f_c$,
where $c = 3 \times 10^8$ m/s is the speed of light.
In a conventional pulse-Doppler radar, the pulse $h(t) = h_{\mathrm{Nyq}}(t)$ is a time-limited
baseband function whose continuous-time Fourier transform (CTFT) is
$$H_{\mathrm{Nyq}}(f) = \int_{-\infty}^{\infty} h_{\mathrm{Nyq}}(t)\, e^{-j2\pi f t}\, dt.$$
It is assumed that most of the signal’s energy lies within the
frequencies $\pm B_h/2$, where $B_h$ denotes the effective signal bandwidth, such that the
following approximation holds:
$$h_{\mathrm{Nyq}}(t) \approx \int_{-B_h/2}^{B_h/2} H_{\mathrm{Nyq}}(f)\, e^{j2\pi f t}\, df. \tag{1.2}$$
The total transmit power of the radar is defined as
$$\int_{-B_h/2}^{B_h/2} |H_{\mathrm{Nyq}}(f)|^2\, df = P_T. \tag{1.3}$$
1.3.1 Received Signal Model

Assume that the radar target scene consists of L non-fluctuating point-targets, according
to the Swerling-0 target model [1]. The transmit signal is reflected back by the L targets
and these echoes are received by the radar processor. The latter aims at recovering
the following information about the L targets from the received signal: time delay $\tau_l$,
which is linearly proportional to the target’s range $d_l = c\tau_l/2$; Doppler frequency $\nu_l$,
proportional to the target’s radial velocity $v_l = c\nu_l/(4\pi f_c)$; and complex amplitude $\alpha_l$.
The target locations are defined with respect to the polar coordinate system of the radar
and their range and Doppler are assumed to lie in the unambiguous time-frequency
region, i.e., the time delays are no longer than the PRI, and Doppler frequencies are up
to the PRF.
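The two conversions above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the carrier frequency, PRI, and target values are hypothetical numbers chosen for the example, and the helper names are not from the chapter. The unambiguous-region check uses the region $[0,\tau] \times [-\pi/\tau,\pi/\tau]$ stated in assumption A4 of this section.

```python
import math

C = 3e8  # speed of light in m/s, the value used in the text


def target_range(tau_l):
    """Range d_l = c * tau_l / 2 for a round-trip delay tau_l in seconds."""
    return C * tau_l / 2.0


def radial_velocity(nu_l, fc):
    """Radial velocity v_l = c * nu_l / (4 * pi * fc); nu_l in rad/s, fc in Hz."""
    return C * nu_l / (4.0 * math.pi * fc)


def is_unambiguous(tau_l, nu_l, pri):
    """True if (tau_l, nu_l) lies in [0, pri] x [-pi/pri, pi/pri]."""
    return 0.0 <= tau_l <= pri and abs(nu_l) <= math.pi / pri


# Hypothetical numbers: 10 GHz carrier, 100 microsecond PRI.
fc, pri = 10e9, 100e-6
print(target_range(40e-6))                 # range in metres (about 6 km)
print(radial_velocity(2000.0, fc))         # velocity in m/s (a few m/s)
print(is_unambiguous(40e-6, 2000.0, pri))  # inside the unambiguous region
```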
Typically, the radar assumes the following operating conditions, which lead to a
simplified expression for the received signal [49]:
A1 “Far targets”: assuming $\nu_l \ll 2\pi f_c \tau_l/(P\tau)$, the target-radar distance is large
compared to the distance change during the CPI, so that the attenuation $\alpha_l$ can be
taken as constant over the CPI.
A2 “Slow targets”: assuming $\nu_l \ll 2\pi f_c/(P\tau B_h)$, the target velocity is small enough
to allow for constant $\tau_l$ during the CPI. In this case, the following piecewise-constant
approximation holds: $\nu_l t \approx \nu_l p\tau$ for $t \in [p\tau,(p+1)\tau]$.
A3 “Small acceleration”: assuming $d\nu_l/dt \ll c/(2 f_c (P\tau)^2)$, the target velocity remains
approximately constant during the CPI, allowing for constant $\nu_l$.
A4 “No time or Doppler ambiguities”: The delay-Doppler pairs $(\tau_l,\nu_l)$ for all l ∈
[1,L] lie in the radar’s unambiguous region of the delay-Doppler plane defined by
$[0,\tau] \times [-\pi/\tau,\pi/\tau]$.
A5 The pairs in the set $(\tau_l,\nu_l)$ for all l ∈ [1,L] are unique.
Under these assumptions, the received signal can be written as
$$r_{\mathrm{RX}}(t) = \sum_{p=0}^{P-1} \sum_{l=0}^{L-1} \alpha_l\, h(t - \tau_l - p\tau)\, e^{-j\nu_l t} + w(t), \tag{1.4}$$
for $0 \le t \le P\tau$, where w(t) is a zero mean wide-sense stationary random signal with
autocorrelation $r_w(s) = \sigma^2 \delta(s)$. It is convenient to express $r_{\mathrm{RX}}(t)$ as a sum of single
frames
$$r_{\mathrm{RX}}(t) = \sum_{p=0}^{P-1} r_{\mathrm{RX}}^{p}(t) + w(t), \tag{1.5}$$
where
$$r_{\mathrm{RX}}^{p}(t) = \sum_{l=0}^{L-1} \alpha_l\, h(t - \tau_l - p\tau)\, e^{-j\nu_l p\tau}, \tag{1.6}$$
for $p\tau \le t \le (p+1)\tau$ is the return signal from the pth pulse.

A classical radar signal processor samples each incoming frame $r_{\mathrm{RX}}^{p}(t)$ at the Nyquist
rate $B_h$ to yield the digitized samples $r_{\mathrm{RX}}^{p}[n]$, $0 \le n \le N-1$, where $N = \tau B_h$. The
signal enhancement process employs an MF for the sampled frames $r_{\mathrm{RX}}^{p}[n]$. This is then
followed by Doppler processing where a P-point discrete Fourier transform (DFT) is
performed on slow-time samples. By stacking all the N DFT vectors together, a
delay-Doppler map is obtained for the target scene. Finally, the time delays $\tau_l$ and Doppler
shifts $\nu_l$ of the targets are located on this map using, e.g., a constant false-alarm rate
detector [55].
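The classical chain described above (matched filter per frame, then a P-point DFT across pulses) can be sketched in a few lines. This is a minimal illustration under assumptions that are not in the text: targets lie exactly on the Nyquist delay-Doppler grid, there is no noise, the pulse is a 13-chip Barker code, and the matched filter is implemented as a circular correlation. The function names are hypothetical.

```python
import numpy as np


def simulate_frames(P, N, pulse, targets):
    """Noiseless frames following (1.6); targets = [(alpha, delay_bin, doppler_bin)]."""
    frames = np.zeros((P, N), dtype=complex)
    p = np.arange(P)
    for alpha, r, m in targets:
        echo = np.zeros(N, dtype=complex)
        echo[r:r + len(pulse)] = pulse                # pulse delayed by r samples
        phase = np.exp(-2j * np.pi * m * p / P)       # e^{-j nu_l p tau} on the grid
        frames += alpha * phase[:, None] * echo[None, :]
    return frames


def delay_doppler_map(frames, pulse):
    """Matched filter along fast time, then a P-point DFT along slow time."""
    P, N = frames.shape
    h = np.zeros(N, dtype=complex)
    h[:len(pulse)] = pulse
    # Circular correlation of every frame with the pulse replica.
    mf = np.fft.ifft(np.fft.fft(frames, axis=1) * np.conj(np.fft.fft(h)), axis=1)
    # The e^{+j...} kernel (inverse-DFT sign) compensates the e^{-j nu_l p tau} phase.
    return np.fft.ifft(mf, axis=0) * P


barker13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)
frames = simulate_frames(P=16, N=64, pulse=barker13, targets=[(1.0, 20, 3)])
ddmap = delay_doppler_map(frames, barker13)
doppler_bin, delay_bin = np.unravel_index(np.argmax(np.abs(ddmap)), ddmap.shape)
print(delay_bin, doppler_bin)  # the peak sits at the simulated delay and Doppler bins
```

A detector such as CFAR would then threshold `np.abs(ddmap)`; that stage is omitted here.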
1.3.2 Sub-Nyquist Delay-Doppler Recovery

Traditional radar systems sample the received signal at the Nyquist rate, determined by
the baseband bandwidth of h(t). Our goal is to recover $r_{\mathrm{RX}}^{p}(t)$ from its samples taken
far below this rate. We note that over the interval τ, $r_{\mathrm{RX}}^{p}(t)$ is completely specified by
$\{\alpha_l,\tau_l,\nu_l\}_{l=0}^{L-1}$, and is an FRI signal with rate of innovation 3L/τ. Hence, in the absence
of noise, one would expect to be able to accurately recover $r_{\mathrm{RX}}^{p}(t)$ from only a few
samples per time τ. However, since radar signals tend to be sparse in the time domain, simply
acquiring a few data samples at a low rate will not generally yield adequate recovery.
Indeed, if the separation between samples is larger than the effective spread in time,
then with high probability many of the samples will be close to zero. This implies that
presampling analog processing must be performed on the frequency-domain support of
the radar signal in order to smear the signal in time before low-rate sampling.
The approach we adopt follows the Xampling architecture designed for sampling
and processing of analog inputs at rates far below Nyquist, whose underlying structure
can be modeled as a union of subspaces (UoS). The input signal belongs to a single
subspace, a priori unknown, out of multiple, possibly even infinitely many, candidate
subspaces. Xampling consists of two main functions: low-rate analog-to-digital conver-
sion (ADC), in which the input is compressed in the analog domain prior to sampling
with commercial devices, and low-rate digital signal processing, in which the input
subspace is detected prior to digital signal processing. The resulting sparse recovery
is performed using CS techniques adapted to the analog setting. This concept has been
applied to both communications [56–59] and radar [49,60], among other applications.
Time-varying linear systems, which introduce both time shifts (delays) and frequency
shifts (Doppler shifts), such as those arising in surveillance point-target radar systems,
fit nicely into the UoS model. Here, a sparse target scene is assumed, allowing the
reduction of the sampling rate without sacrificing delay and Doppler resolution. The
Xampling-based system is composed of an ADC, which filters the received signal to pre-
determined frequencies before taking point-wise samples. These compressed samples,
or “Xamples,” contain the information needed to recover the desired signal parameters.
To demonstrate sub-Nyquist sampling, we begin by deriving an expression for the
Fourier coefficients of the received signal and show that the target parameters are
embodied in them. Let $F_R$ and $f_{\mathrm{Nyq}}$ be the set of all frequencies in the received signal
spectrum and the corresponding Nyquist sampling rate, respectively. Consider the
Fourier series representation of the aligned frames $r_{\mathrm{RX}}^{p}(t + p\tau)$, with $r_{\mathrm{RX}}^{p}(t)$ defined
in (1.6):
$$c_p[k] = \int_0^{\tau} r_{\mathrm{RX}}^{p}(t + p\tau)\, e^{-j2\pi kt/\tau}\, dt = \frac{1}{\tau} H[k] \sum_{l=0}^{L-1} \alpha_l\, e^{-j2\pi k\tau_l/\tau}\, e^{-j\nu_l p\tau}, \tag{1.7}$$
for $k \in \kappa$, where $\kappa = \left\{ k = \frac{f}{f_{\mathrm{Nyq}}} N \,\middle|\, f \in F_R \right\}$. From (1.7), we see that the unknown
parameters $\{\alpha_l,\tau_l,\nu_l\}_{l=0}^{L-1}$ are embodied in the Fourier coefficients $c_p[k]$. We can estimate
these parameters using only a small number of Fourier coefficients, which translates
to a low sampling rate.
Figure 1.1 Sub-Nyquist sampling methods: (a) direct sampling; (b) low frequencies only;
(c) multiband sampling.
There are several ways to implement a sub-Nyquist sampler [47,61] in order to obtain
a set of Fourier coefficients from low-rate samples of the signal. For simplicity, consider
|κ| = K such that q = N/K is an integer defining the sampling reduction factor.
In direct sampling (Figure 1.1a), the signal rRX (t) obtained after the anti-aliasing filter
is passed through as many analog chains as the number of sub-Nyquist coefficients
K. Each branch is modulated by a complex exponential, followed by integration over
τ and necessary digital signal processing (DSP). This technique provides the largest
flexibility in choosing the Fourier coefficients, but is expensive in terms of hardware.
Another approach is to limit the bandwidth of the anti-aliasing filter such that only the
lowest K frequencies are free of aliasing (Figure 1.1b). We then sample these lowest
K frequencies. Here, the measurements are correlated and a modification in the analog
hardware is also required so that the anti-aliasing filter has reduced passband. In the
multiband sampling method shown in Figure 1.1c, M disjoint randomly chosen groups
of consecutive Fourier coefficients are obtained such that the total number of sampled
coefficients is K. This translates to splitting the signal across M branches, passing the
downconverted signal through reduced-passband anti-aliasing filters, and then sampling
each band with a low-rate ADC. This method can be easily implemented but requires
M low-rate ADCs. The sub-Nyquist hardware prototypes developed in [45,49] adopt
multiband sampling using four groups of consecutive coefficients. In practice, the spe-
cific Fourier coefficients are chosen through extensive software simulations to provide
low mutual coherence [6] for CS-based signal recovery.
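To make the selection criterion concrete, the sketch below draws M random groups of consecutive Fourier indices, builds the corresponding partial Fourier matrix, and evaluates its mutual coherence, next to the low-frequencies-only choice. It is only an illustration: the block-aligned way of drawing disjoint groups, the sizes N, K, M, and all function names are assumptions made for this example, not the procedure used for the prototypes in [45,49].

```python
import numpy as np


def multiband_indices(N, K, M, rng):
    """Pick M disjoint groups of K // M consecutive indices from 0..N-1."""
    if K % M != 0:
        raise ValueError("K must be divisible by M in this sketch")
    G = K // M                                    # coefficients per group
    blocks = rng.choice(N // G, size=M, replace=False)
    return np.sort(np.concatenate([b * G + np.arange(G) for b in blocks]))


def partial_fourier(N, kappa):
    """Rows of the N x N Fourier matrix indexed by kappa (shape K x N)."""
    return np.exp(-2j * np.pi * np.outer(kappa, np.arange(N)) / N)


def mutual_coherence(A):
    """Largest normalized inner product between two distinct columns of A."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    gram = np.abs(A.conj().T @ A)
    np.fill_diagonal(gram, 0.0)
    return gram.max()


N, K, M = 256, 32, 4
rng = np.random.default_rng(0)
kappa = multiband_indices(N, K, M, rng)
mu_multi = mutual_coherence(partial_fourier(N, kappa))
mu_low = mutual_coherence(partial_fourier(N, np.arange(K)))   # lowest K frequencies
print(kappa)
print(mu_multi, mu_low)
```

Repeating the draw over many seeds and keeping the index set with the smallest coherence is one simple way to mimic the simulation-based selection mentioned above.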
Our goal now is to recover $\{\alpha_l,\tau_l,\nu_l\}_{l=0}^{L-1}$ from $c_p[k]$ for $k \in \kappa$ and $0 \le p \le P-1$.
To that end, [49] adopts the Doppler focusing approach. Consider the DFT of the
coefficients $c_p[k]$ in the slow-time domain:
$$\tilde{\Psi}_\nu[k] = \sum_{p=0}^{P-1} c_p[k]\, e^{j\nu p\tau} = \frac{1}{\tau} H[k] \sum_{l=0}^{L-1} \alpha_l\, e^{-j2\pi k\tau_l/\tau} \sum_{p=0}^{P-1} e^{j(\nu-\nu_l)p\tau}. \tag{1.8}$$
Figure 1.2 Sum of exponents |g(ν|νl )| for P = 200, τ = 1 s, and νl = 0 [20,49]. ©2018 IEEE.
Reprinted, with permission, from [20].
The key to Doppler focusing follows from the approximation:
$$g(\nu|\nu_l) = \sum_{p=0}^{P-1} e^{j(\nu-\nu_l)p\tau} \approx \begin{cases} P & |\nu-\nu_l| < \pi/(P\tau) \\ 0 & |\nu-\nu_l| \ge \pi/(P\tau), \end{cases} \tag{1.9}$$
as illustrated in Figure 1.2. Denote the normalized focused measurements by
$$\Psi_\nu[k] = \frac{\tau}{P H[k]} \tilde{\Psi}_\nu[k]. \tag{1.10}$$
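The focusing step can be checked numerically. The sketch below generates the coefficients $c_p[k]$ from the model in (1.7), focuses them at one target's Doppler frequency, and applies the normalization in (1.10). It assumes a flat pulse spectrum (H[k] = 1), noiseless data, and a second target placed exactly at a null of $g(\nu|\nu_l)$; these choices and the function names are illustrative, not from the chapter.

```python
import numpy as np


def fourier_coefficients(alphas, taus, nus, kappa, P, tau, H):
    """c_p[k] from the model (1.7); returns an array of shape (P, K)."""
    p = np.arange(P)[:, None, None]            # pulse index
    k = np.asarray(kappa)[None, :, None]       # Fourier index
    a = np.asarray(alphas)[None, None, :]
    t = np.asarray(taus)[None, None, :]
    v = np.asarray(nus)[None, None, :]
    terms = a * np.exp(-2j * np.pi * k * t / tau) * np.exp(-1j * v * p * tau)
    return (H[None, :] / tau) * terms.sum(axis=2)


def doppler_focus(c, nu, tau, H):
    """Normalized focused measurements for one Doppler frequency nu (rad/s)."""
    P = c.shape[0]
    p = np.arange(P)[:, None]
    psi_tilde = (c * np.exp(1j * nu * p * tau)).sum(axis=0)   # sum over pulses, as in (1.8)
    return tau / (P * H) * psi_tilde                           # normalization, as in (1.10)


P, tau = 200, 1.0                      # same P and tau as in Figure 1.2
kappa = np.arange(8)
H = np.ones(len(kappa))                # assumed flat pulse spectrum
alphas = [1.0 + 0.5j, 0.7]
taus = [0.31, 0.62]
nu0 = 0.5
nus = [nu0, nu0 + 2 * np.pi * 10 / (P * tau)]   # second target sits on a null of g

c = fourier_coefficients(alphas, taus, nus, kappa, P, tau, H)
psi = doppler_focus(c, nu0, tau, H)
expected = alphas[0] * np.exp(-2j * np.pi * kappa * taus[0] / tau)
print(np.max(np.abs(psi - expected)))  # only the focused target survives
```

For a Doppler offset that is not a multiple of $2\pi/(P\tau)$ the second target leaks into the focused measurements, which is why (1.9) is stated as an approximation.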
As in traditional pulse-Doppler radar, suppose we limit ourselves to the Nyquist grid
so that $\tau_l/\tau = r_l/N$, where $r_l$ is an integer satisfying $0 \le r_l \le N-1$. Then, (1.10) can
be approximately written in vector form as
$$\Psi_\nu = F_\kappa x_\nu, \tag{1.11}$$
where $\Psi_\nu = \left[\Psi_\nu[k_0] \ldots \Psi_\nu[k_{K-1}]\right]^T$, $k_i \in \kappa$ for $0 \le i \le K-1$, $F_\kappa$ is composed of the
K rows of the N × N Fourier matrix indexed by κ, and $x_\nu$ is an L-sparse vector that
contains the values $\alpha_l$ at the indices $r_l$ for the Doppler frequencies $\nu_l$ in the “focus zone,”
that is, $|\nu - \nu_l| < \pi/(P\tau)$. It is convenient to write (1.11) in matrix form, by vertically
concatenating the vectors $\Psi_\nu$, for ν on the Nyquist grid, namely $\nu = -\frac{1}{2\tau} + \frac{p}{P\tau}$ for
$0 \le p \le P-1$, into the K × P matrix $\Psi$, as
$$\Psi = F_\kappa X, \tag{1.12}$$
where X is formed similarly by vertically concatenating the vectors $x_\nu$. Note that the
matrix $F_\kappa$ is not square and, as a result, the system of linear equations (1.12) is
underdetermined. The system in (1.12) can be solved using any CS algorithm, such as
orthogonal matching pursuit (OMP) and ℓ1 minimization [6,7].
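As a small illustration of the sparse recovery step for a single focused vector, the sketch below runs a textbook OMP on a toy partial Fourier system. Everything specific here is an assumption made for the example: N = 13, the hand-picked index set κ = {0, 1, 3, 9} (a perfect difference set modulo 13, so all distinct column pairs have the same low correlation), the two target bins, and their amplitudes. It is not the recovery code of [49].

```python
import numpy as np


def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedy support selection plus least squares."""
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(sparsity):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0                  # never reselect a chosen column
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x, sorted(support)


N = 13
kappa = np.array([0, 1, 3, 9])                   # K = 4 rows, i.e. 2L rows for L = 2
F_kappa = np.exp(-2j * np.pi * np.outer(kappa, np.arange(N)) / N)

x_true = np.zeros(N, dtype=complex)
x_true[2] = 1.0                                  # strong target at delay bin 2
x_true[7] = 0.3j                                 # weaker target at delay bin 7
psi_nu = F_kappa @ x_true                        # noiseless focused measurements

x_hat, support = omp(F_kappa, psi_nu, sparsity=2)
print(support)
print(np.round(x_hat[support], 6))
```

Here the number of targets is passed in directly; in practice a stopping rule, such as a detection threshold on the residual, replaces the known sparsity.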
A Nyquist receiver needs $B_h\tau$ samples to recover the targets. However, as stated
shortly in Theorem 1.3.1, the number of samples required by the sub-Nyquist receiver
depends only on the number of targets present and not on $B_h$. This shows that a
sub-Nyquist radar breaks the link between range resolution and transmit bandwidth. In
general, only a few targets are present in the radar coverage region, leading to a significant
reduction in sampling rate.
Theorem 1.3.1 [49] The minimal number of samples required for perfect recovery of
$\{\alpha_l,\tau_l,\nu_l\}_{l=0}^{L-1}$ in a noiseless environment is $4L^2$. In addition, the number of samples per
period is at least 2L, and the number of periods P ≥ 2L.
The Doppler focusing operation (1.8) is a continuous operation on the variable ν,
and can be performed for any Doppler frequency up to the PRF. With Doppler focusing
there are no inherent blind speeds, i.e., target velocities that are undetectable, as occurs
with a classic moving target indicator [1]. Since strong amplitudes are indicative of true
target existence as opposed to noise, Doppler focusing recovery searches for large mag-
nitude entries and marks them as detections. After detecting each target, its influence
is removed from the set of Fourier coefficients in order to reduce masking of weaker
targets and to remove spurious targets created by processing sidelobes. It is important to
note that our dictionary in (1.12) is indifferent to the Doppler estimation. CS methods,
which estimate delay and Doppler simultaneously [28], require a dictionary that grows
with the number of pulses. Here, by separating delay and Doppler estimation, the CS
dictionary is not a function of P .
Moreover, the performance of the sub-Nyquist radar in the presence of noise improves
with Doppler focusing. The following theorem states that Doppler focusing increases
the per-target SNR by a factor of P . This linear scaling is similar to that obtained using
an MF.
Theorem 1.3.2 [49] Let the prefocusing SNR of the lth target be $|c_p^l[k]|^2 / E[|w_p[k]|^2]$,
where $c_p^l[k]$ and $w_p[k]$ are the signal and white noise Fourier coefficients. Then, the
focused SNR for the lth target at center frequency ν is P times the prefocusing SNR.
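The factor of P can be seen from a short calculation. The following is a sketch, not the proof in [49]; it assumes a single target focused exactly at $\nu = \nu_l$ and noise coefficients that are zero mean and uncorrelated across pulses, with the symbol $\sigma_w^2$ introduced here for their variance.

```latex
% Assumptions: c_p^l[k] = c_0^l[k] e^{-j \nu_l p \tau} (phase-only change across pulses),
% and E[|w_p[k]|^2] = \sigma_w^2 with w_p[k] uncorrelated across p.
\begin{align*}
\sum_{p=0}^{P-1} \left( c_p^l[k] + w_p[k] \right) e^{j\nu_l p\tau}
  &= P\, c_0^l[k] + \sum_{p=0}^{P-1} w_p[k]\, e^{j\nu_l p\tau}, \\
\text{signal power} &= P^2\, |c_0^l[k]|^2, \qquad
\text{noise power} = \sum_{p=0}^{P-1} E\!\left[ |w_p[k]|^2 \right] = P\,\sigma_w^2, \\
\text{focused SNR} &= \frac{P^2\, |c_0^l[k]|^2}{P\,\sigma_w^2}
  = P\, \frac{|c_p^l[k]|^2}{E\!\left[ |w_p[k]|^2 \right]}.
\end{align*}
```

The signal terms add coherently (amplitude grows as P) while the noise terms add incoherently (power grows as P), which is the same mechanism that gives an MF its linear SNR gain.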
A continuous-value parameter recovery using Doppler focusing is described in [49].

For practical considerations of computational efficiency, Doppler focusing can be per-
formed on a uniform grid of frequencies so that focused coefficients are computed
efficiently using the fast Fourier transform (FFT). Algorithm 1 in this section outlines
this approach to solving the P equations (1.12) simultaneously, where, in each itera-
tion, the maximal projection of the observation vectors onto the measurement matrix
is retained. The algorithm termination criterion follows from the generalized likelihood
ratio test (GLRT) framework presented in [62]. For each iteration, the alternative and
null hypotheses in the GLRT problem define the presence or absence of a candidate
target, respectively. In Algorithm 1, $Q_{\chi_2^2(\rho)}$ denotes the right-tail probability of the
chi-square distribution function with 2 degrees of freedom, $\Lambda^C$ is the complementary
set of $\Lambda$, and
$$\rho = \frac{P_T}{\sigma^2 |F_R|} \tag{1.13}$$
is the SNR with $\sigma^2$ the noise variance and $P_T$ the total transmit power.
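Algorithm 1 Sub-Nyquist Radar Delay-Doppler Recovery [20,49]
Input: Observations $c_p[k]$, $0 \le p \le P-1$ and $k \in \kappa$, probability of false alarm $P_{fa}$,
noise variance $\sigma^2$, transmitted power $P_T$, total transmitted bandwidth $|F_R|$
Output: Estimated target parameters $\{\hat{\alpha}_l,\hat{\tau}_l,\hat{\nu}_l\}_{l=0}^{L-1}$
1: Create $\Psi$ from $c_p[k]$ using the FFT (1.8), for $k \in \kappa$ and $\nu = -1/(2\tau) + p/(P\tau)$,
   $0 \le p \le P-1$
2: Compute detection thresholds
   $\rho = \frac{P_T}{\sigma^2 |F_R|}, \quad \gamma = Q^{-1}_{\chi_2^2(\rho)}\left(1 - \sqrt[N]{1 - P_{fa}}\right)$
3: Initialization: residual $R_0 = \Psi$, index set $\Lambda_0 = \emptyset$, t = 1
4: Project residual onto measurement matrix:
   $\Phi = F_\kappa^H R_{t-1}$
5: Find the two indices $\lambda_t = [\lambda_t(1)\ \lambda_t(2)]$ such that
   $[\lambda_t(1)\ \lambda_t(2)] = \arg\max_{i,j} |\Phi_{i,j}|$
6: Compute the test statistic
   $\Gamma = \frac{\left| ((F_\kappa)_{\lambda_t(1)})^H (R_{t-1})_{\lambda_t(2)} \right|^2}{\sigma^2}$
   where $(M)_i$ denotes the ith column of M
7: If $\Gamma > \gamma$ continue; otherwise go to step 12
8: Augment index set $\Lambda_t = \Lambda_{t-1} \cup \{\lambda_t\}$
9: Find the new signal estimate
   $\hat{X}_t|_{\Lambda_t} = (F_\kappa)^{\dagger}_{\Lambda_t} \Psi, \quad \hat{X}_t|_{\Lambda_t^C} = 0$
10: Compute new residual
   $R_t = \Psi - (F_\kappa)_{\Lambda_t} \hat{X}_t$
11: Increment t and return to step 4
12: Estimated support set $\hat{\Lambda} = \Lambda_{t-1}$
13: $\hat{\tau}_l = \frac{\tau}{N}\hat{\Lambda}(l,1)$, $\hat{\nu}_l = \frac{1}{P\tau}\hat{\Lambda}(l,2)$, $\hat{\alpha}_l = \hat{X}_{\hat{\Lambda}(l,1),\hat{\Lambda}(l,2)}$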
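The flow of Algorithm 1 can be sketched end to end on a toy scene. The code below is a simplified illustration, not the implementation of [20,49]: the GLRT threshold test is replaced by a known number of targets, the Doppler grid is taken as $\nu_m = 2\pi m/(P\tau)$ for $m = 0,\ldots,P-1$ rather than the centered grid of step 1, H[k] = 1 is assumed, and the scene (N = 13, κ = {0, 1, 3, 9}, P = 8, two on-grid targets) and all function names are hypothetical.

```python
import numpy as np


def focus_on_grid(c, tau):
    """Focus c (shape P x K) on nu_m = 2*pi*m/(P*tau); returns normalized Psi (K x P)."""
    P = c.shape[0]
    # ifft supplies the e^{+j 2 pi m p / P} kernel; multiplying by P undoes its 1/P factor.
    psi_tilde = (np.fft.ifft(c, axis=0) * P).T
    return tau / P * psi_tilde                  # normalization with H[k] = 1 assumed


def greedy_delay_doppler(psi, F, num_targets):
    """Greedy recovery of sparse X in psi = F @ X (steps 4, 5, 8, 9, 10)."""
    residual = psi.copy()
    support = {}                                # Doppler bin -> list of delay bins
    X = np.zeros((F.shape[1], psi.shape[1]), dtype=complex)
    for _ in range(num_targets):
        proj = np.abs(F.conj().T @ residual)                     # step 4
        r, m = np.unravel_index(np.argmax(proj), proj.shape)     # step 5
        r, m = int(r), int(m)
        support.setdefault(m, []).append(r)                      # step 8
        cols = support[m]
        coef, *_ = np.linalg.lstsq(F[:, cols], psi[:, m], rcond=None)   # step 9
        X[:, m] = 0.0
        X[cols, m] = coef
        residual[:, m] = psi[:, m] - F[:, cols] @ coef           # step 10
    return X


N, P, tau = 13, 8, 1e-3
kappa = np.array([0, 1, 3, 9])
F_kappa = np.exp(-2j * np.pi * np.outer(kappa, np.arange(N)) / N)

# Two on-grid targets: (amplitude, delay bin, Doppler bin).
targets = [(1.0, 2, 1), (0.5j, 7, 5)]
p = np.arange(P)[:, None]
c = np.zeros((P, len(kappa)), dtype=complex)
for alpha, r, m in targets:
    c += (alpha / tau) * np.exp(-2j * np.pi * m * p / P) * F_kappa[:, r][None, :]

psi = focus_on_grid(c, tau)
X_hat = greedy_delay_doppler(psi, F_kappa, num_targets=2)
for r, m in zip(*np.nonzero(np.abs(X_hat) > 1e-9)):
    # Step 13, adapted to the grid assumed in this sketch.
    print(tau * r / N, 2 * np.pi * m / (P * tau), X_hat[r, m])
```

Because the delay dictionary $F_\kappa$ is shared by all Doppler bins, its size does not grow with P, which is the point made about separating delay and Doppler estimation.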
In Section 1.3.4, we introduce a sub-Nyquist prototype implementing the ideas in
this section using simple hardware. Before that, we describe how to account for clutter
mitigation in sub-Nyquist radar.
1.3.3 Sub-Nyquist Clutter Removal

Clutter refers to unwanted echoes from stationary objects such as buildings, trees, chaff,
and ground surface as well as moving elements like weather and sea. Since strong
clutter echoes hamper detection of desired targets, clutter removal has been investigated
intensively. In the context of CS-based radars, [63] provides a general overview of clut-
ter rejection algorithms. In [64], Capon beamforming is used to reject clutter and then
the target is retrieved by exploiting sparse reconstruction methods. On the other hand, a
few works such as [65–67] utilize sparsity of the clutter in the mitigation process. Along
similar lines, [68] assumes sparse clutter and proposes a GLRT detector. However, they
obtain signal samples at the Nyquist rate.
Conventionally, clutter is modeled as a random process with Doppler frequency that
follows a colored Gaussian noise distribution [69–71]. A standard operation to remove
this correlated noise is to use receive filters that maximize the signal-to-clutter-plus-noise
ratio (SCNR). This method is equivalent to first whitening the received signal
samples, and then performing matched-filtering with respect to a whitened pulse. Our
approach [52] to clutter removal in sub-Nyquist radar is based on this philosophy as it
fits well with our Fourier-based analysis.
In the presence of clutter and noise, the received signal r(t) is
$$r(t) = r_{\mathrm{RX}}(t) + y(t), \tag{1.14}$$
where $r_{\mathrm{RX}}(t)$ is the target signal with noise as in (1.5) and
$$y(t) = \sum_{p=0}^{P-1} \sum_{c=0}^{C-1} \alpha_c\, h(t - p\tau - \tau_c)\, e^{-j v_c p\tau} \tag{1.15}$$
is the echo from C clutter targets. We assume that the mean clutter power is
$E[|\alpha_c|^2] = \sigma_c^2$. Further, the delays $\tau_c \sim U(0,\tau)$ and the clutter Doppler frequencies
$v_c \sim N(v_d,\sigma_d^2)$ are independent and identically distributed.
Analogous to the target signal in (1.7), the Fourier series representation of the clutter
signal is given by
$$\tilde{c}_p[k] = \frac{1}{\tau} H[k] \sum_{c=0}^{C-1} \alpha_c\, e^{-j\frac{2\pi}{\tau} k\tau_c}\, e^{-j v_c p\tau}. \tag{1.16}$$
Let the Fourier series coefficients of the noise be $\tilde{w}_p[k]$. We now form a P × K matrix
R with kth column given by the Fourier coefficients $R_p[k] = c_p[k] + \tilde{c}_p[k] + \tilde{w}_p[k]$,
$k \in \kappa$ such that
$$R = X + Y + N = F_P A F_N^K H + B, \tag{1.17}$$
where B = Y + N, $F_P$ is the P × P Fourier matrix with (l,k)th element $e^{-j\frac{2\pi}{P} lk}$, $F_N^K$ is
a submatrix formed by K rows of the N × N Fourier matrix with (l,k)th element $e^{j\frac{2\pi}{N} lk}$,
H = diag(H[k]) is a K × K diagonal matrix, A is a P × K sparse matrix with complex
reflectivity $\alpha_l$ at the L indices $(r_l,s_l)$, and Y and N are P × K matrices with (p,k)th
elements $\tilde{c}_p[k]$ and $\tilde{w}_p[k]$, respectively. As mentioned previously, noise is white over
the indices k (all tones).
Our goal is to extract A from the measurements R. For simplicity, we assume that
|H [k]|2 is unity for all k. The whitening transformation requires information about the
statistics of clutter and noise, which are summarized in the following proposition.
Proposition 1.3.3 [52] The mean of the clutter Fourier coefficients is $E[\tilde{c}_p[k]] = 0$,
and their correlation is given by
$$R_{l_1}[k_1,k_2] = E\left[\tilde{c}_p[k_1]\,\tilde{c}_{p+l_1}[k_2]\right] = C\sigma_c^2\, \delta_{k_1,k_2}\, e^{-j v_d l_1\tau - \frac{1}{2}\sigma_d^2 l_1^2\tau^2}. \tag{1.18}$$
The mean and variance of the Fourier coefficients of the noise are, respectively,
$$E\left[N_p[k]\right] = 0, \quad E\left[N_p[k_1]\,N_{p+l_1}[k_2]\right] = \frac{1}{\tau}\sigma_n^2\, \delta_{k_1,k_2}\, \delta_{l_1}. \tag{1.19}$$
Our clutter mitigation technique is based on whitening all the tones of the measure-
ments R. It follows from Proposition 1.3.3 that the columns of B are uncorrelated and
identically distributed. The covariance matrix M of the columns of B is a Toeplitz matrix
with mth diagonal value
$$M(m) = C\sigma_c^2\, e^{-j v_d m\tau - \frac{1}{2}\sigma_d^2 m^2\tau^2} + \frac{1}{\tau}\sigma_n^2\, \delta_m. \tag{1.20}$$
Therefore, the columns of R can be whitened by multiplying on the left by $M^{-1/2}$:
$$M^{-1/2} R = M^{-1/2} F_P A F_N^K H + M^{-1/2} B, \tag{1.21}$$
where $M^{-1/2} B$ corresponds to white noise. From here, we proceed with Doppler focusing
on $M^{-1/2} R$ by taking a Hermitian transpose of (1.21) and multiplying on the right
by $M^{-1/2} F_P$:
$$\tilde{\Psi} = H (F_N^K)^H A^H F_P^H M^{-1} F_P + B^H M^{-1} F_P = \Psi + W, \tag{1.22}$$
where W is white noise for each focused frequency. This equation represents a sparse
matrix recovery problem. For known matrices $D_1 = H (F_N^K)^H$ and $D_2 = F_P^H M^{-1} F_P$, we
are given measurements $\Psi = D_1 X D_2$, and the goal is to retrieve the sparse matrix
$X = A^H$. These problems are solved by matrix-sketching algorithms, as described in [72]. It
has been shown in [52] that whitened Doppler focusing generally increases the SCNR.
Compared to other CS-based radars [27,28], this technique is robust to the presence of
clutter, despite sampling at low rates.
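The whitening step of (1.20) and (1.21) can be checked numerically. The following Python sketch builds the Toeplitz covariance M from (1.20) and forms $M^{-1/2}$ through an eigendecomposition. The function names and every numerical value (P, τ, C, σ_c², v_d, σ_d², σ_n²) are illustrative assumptions and are not taken from [52].

```python
import numpy as np

def clutter_noise_covariance(P, tau, C, sigma_c2, v_d, sigma_d2, sigma_n2):
    """P x P Toeplitz matrix whose m-th diagonal follows (1.20)."""
    m = np.arange(P)[:, None] - np.arange(P)[None, :]   # lag m = row - column
    clutter = C * sigma_c2 * np.exp(-1j * v_d * m * tau
                                    - 0.5 * sigma_d2 * (m * tau) ** 2)
    noise = (sigma_n2 / tau) * (m == 0)                  # white part, delta_m
    return clutter + noise

def inverse_sqrt(M):
    """Hermitian inverse square root of a positive definite matrix."""
    w, U = np.linalg.eigh(M)
    return (U / np.sqrt(w)) @ U.conj().T

# Hypothetical parameter values, chosen only for illustration.
P, tau = 16, 1e-3
M = clutter_noise_covariance(P, tau, C=1.0, sigma_c2=10.0, v_d=200.0,
                             sigma_d2=1e4, sigma_n2=1e-3)
W = inverse_sqrt(M)

# After whitening, the covariance of a column of M^{-1/2} B is the identity.
whitened_cov = W @ M @ W
print(np.allclose(whitened_cov, np.eye(P)))
```

In practice, M would be estimated or computed from known clutter and noise statistics; the eigendecomposition route is one of several ways to obtain $M^{-1/2}$.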
1.3.4 Sub-Nyquist Hardware Prototype

The first sub-Nyquist radar hardware implementation was presented in [45]. It was
then developed further to incorporate Doppler focusing and clutter removal in [49,52],
respectively. Since sub-Nyquist techniques manifest themselves mostly in the radar
receiver, this prototype emulates receiver processing.
The basic prototype (Figure 1.4) consists of an analog front-end (Figure 1.3), fed
by a synthesized RF signal using National Instruments (NI) hardware and followed by
digital delay-Doppler map recovery. To evaluate the Xampler board, we make use of
NI equipment for both system synchronization and RF signal sources. Figure 1.5 shows
the entire assembly wrapped in the NI chassis. We transmit 50 pulses with bandwidth
20 MHz. At the receiver, a multiple bandpass sampling approach was chosen, where 4
groups of consecutive Fourier coefficient subsets are selected. Each channel is fed by
a local oscillator (LO), which modulates the desired frequency band of the channel to
the central frequency of a narrow band pass filter (BPF) with 80 kHz bandwidth. A fifth
LO, common to all 4 channels, modulates the BPF output to a low-frequency band,
which is then sampled by a standard low-rate ADC operating at 250 kHz. The digital
samples are acquired by the chassis controller and a MATLAB function is launched that
runs Doppler focusing. The digital reconstruction algorithm, performed at a low rate of
250 ksps, allows recovery of the unknown delays and Doppler frequencies of the targets.
A block diagram of the system is shown in Figure 1.6.
Figure 1.3 The four-channel Xampler board [45]. ©2014 IEEE. Reprinted, with permission,
from [45].
Figure 1.4 Sub-Nyquist hardware prototype showing connections between the Xampler board and
NI chassis [8,45,49].
Figure 1.5 NI chassis showing various signal generation and synchronization components.
[Figure 1.6 diagram labels: input signal; LPF (f3dB = 2.5 MHz); 1→4 splitter; per channel, 14 dB
stages, local oscillator (fLO1 to fLO4), crystal filter (fc = 29 MHz, Δf3dB = 80 kHz), and LPF
(fp = 100 kHz); ADC 4 × 250 kHz (1 MHz total); digital processing.]
Figure 1.6 Block diagram of a 4-channel solid-state receiver with 4 up-modulating local
oscillators with respective center frequencies of 28.375, 28.275, 27.65, and 27.391 MHz [45].
©2014 IEEE. Reprinted, with permission, from [45]
Figure 1.7 Sub-Nyquist prototype experiment [52]. Top left to right: Signal corresponding to
targets, clutter, noise, and all three combined. Bottom left to right: Low rate samples at receiver
and delay-Doppler map with true and recovered targets.
Real-time analog experiments show that the system maintains good detection
capabilities while sampling radar signals that require a Nyquist rate of about 30 MHz at a
total rate of 1 MHz, i.e., 1/30th of the Nyquist rate. We conducted several experiments
to test the accuracy of our system under various conditions. For example,
Figure 1.7 shows results for a hardware experiment where the target scene has seven
scatterers with different delays and Doppler frequencies. A few cases of closely spaced
targets in the delay-Doppler plane are also included. Clutter is also added to the scene
and identified by the system. Our low-rate processing rejects the clutter and successfully
detects only targets in the delay-Doppler plane despite sampling at 1/30th of the Nyquist
rate. The digital recovery algorithm is efficient because, after FFT-based Doppler focusing, it
involves only solving 1D delay recovery problems and does not increase the size of
the dictionary.
1.4 Doppler Sub-Nyquist Radar
The temporal sub-Nyquist processing in the fast-time domain described in the previous
section breaks the link between signal bandwidth, sampling rate, and range resolution.
The Xampling framework can also be extended in the slow-time or Doppler-frequency
domain. The Doppler resolution in classical radar processing is given by $2\pi/(P_1\tau)$, where
P1 is the number of pulses transmitted during the CPI. In Doppler domain sub-Nyquist
processing, we nonuniformly transmit P2 < P1 pulses and reduce the power consump-
tion and dwell time in a particular direction without loss of Doppler resolution. The
advantage is gaining the ability to look at other directions within the same CPI by
interleaving transmissions in different directions.
A few other CS-based works [9,10] have considered reduced time-on-target (RToT)
scenarios without addressing analog sampling. The Doppler sub-Nyquist processing that
we review here was introduced in [8], and is based on the prototype and principles
presented in the previous section.
We consider a nonuniformly transmitting pulse-Doppler radar such that the pth pulse
is sent at time $m_p\tau$, where $\{m_p\}_{p=0}^{P_2-1}$ is an ordered set of integers such that $m_p \ge p$.
Then, (1.1) is written as
\[ r_{TX}(t) = \sum_{p=0}^{P_2-1} h(t - m_p\tau), \qquad 0 \le t \le P_1\tau. \tag{1.23} \]
The received signal $r_{RX}(t)$ is accordingly expressed as a sum of single frames
\[ r_{RX}(t) = \sum_{p=0}^{P_2-1} r_{RX}^p(t), \tag{1.24} \]
where
\[ r_{RX}^p(t) = \sum_{l=0}^{L-1} \alpha_l\, h(t - \tau_l - m_p\tau)\, e^{-j\nu_l m_p\tau}, \tag{1.25} \]
for $0 \le t \le P_1\tau$, is the return signal from the pth pulse. Our goal is to recover the
targets' range and Doppler frequency from the received signals $r_{RX}^p(t)$, with a reduced
number of transmit pulses $P_2 < P_1$ as well as low-rate samples per pulse.
1.4.1 Xampling in CPI and Delay-Doppler Recovery

As before, we consider the Fourier series representation of the aligned frames
$r_{RX}^p(t + m_p\tau)$:
\[ X_p[k] = \frac{1}{\tau} H[k] \sum_{l=0}^{L-1} \alpha_l\, e^{-j2\pi k\tau_l/\tau}\, e^{-j\nu_l m_p\tau}, \qquad 0 \le k \le N-1, \tag{1.26} \]
where $N = B_h\tau$. From (1.26), the Fourier coefficients embody all the information about
the unknown parameters $\{\alpha_l, \tau_l, \nu_l\}_{l=0}^{L-1}$. The goal is then to recover these parameters
from $X_p[k]$, $0 \le p \le P_2 - 1$. The low-rate sampling technique is as described earlier
in Section 1.3.2, but the processing steps to recover the target parameters are different
to account for sub-Nyquist sampling in Doppler. Let X be the $K \times P_2$ matrix with pth
column given by the Fourier coefficients $X_p[k]$, k ∈ κ. Then X can be expressed as
\[ X = H F_N^K A \left(F_{P_1}^{P_2}\right)^T, \tag{1.27} \]
where $H = \frac{1}{\tau}\,\mathrm{diag}(H[k])$, $F_N^K$ is a K × N partial Fourier matrix, $F_{P_1}^{P_2}$ is a $P_2 \times P_1$ partial
Fourier matrix indexed by the values of $m_p$, $0 \le p \le P_2 - 1$, and A is an $N \times P_1$ sparse
matrix with $\alpha_l$ values at the L indices $\{s_l, \tau_l\}$. We would like to recover A from the
measurements X.
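To make the structure of (1.27) concrete, the following Python sketch assembles the two partial Fourier matrices and checks the matrix product against a direct evaluation of (1.26). It assumes, purely for illustration, that delays and Doppler frequencies lie on the grids $\tau_l = s_l\tau/N$ and $\nu_l = 2\pi r_l/(P_1\tau)$ and that H[k] = 1; the index names `s`, `r`, `kappa`, `m` and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, P1, P2, L = 32, 8, 16, 6, 3        # illustrative sizes only
tau = 1e-3                               # pulse repetition interval (assumed)

kappa = np.sort(rng.choice(N, size=K, replace=False))   # sampled Fourier indices
m = np.sort(rng.choice(P1, size=P2, replace=False))     # transmitted pulse slots m_p

# On-grid targets: delay bin s_l, Doppler bin r_l, reflectivity alpha_l.
s = rng.choice(N, size=L, replace=False)
r = rng.choice(P1, size=L, replace=False)
alpha = rng.standard_normal(L) + 1j * rng.standard_normal(L)
A = np.zeros((N, P1), dtype=complex)
A[s, r] = alpha

H = np.eye(K) / tau                                          # (1/tau) diag(H[k]), H[k] = 1
F_K = np.exp(-2j * np.pi * np.outer(kappa, np.arange(N)) / N)    # K x N partial DFT
F_P = np.exp(-2j * np.pi * np.outer(m, np.arange(P1)) / P1)      # P2 x P1 partial DFT
X = H @ F_K @ A @ F_P.T                                      # model (1.27), K x P2

# Direct evaluation of (1.26) for the same targets.
tau_l = s * tau / N
nu_l = 2 * np.pi * r / (P1 * tau)
E_delay = np.exp(-2j * np.pi * np.outer(kappa, tau_l) / tau)     # K x L
E_doppler = np.exp(-1j * np.outer(nu_l, m) * tau)                # L x P2
X_direct = (E_delay * alpha) @ E_doppler / tau

print(np.allclose(X, X_direct))
```

Both sensing matrices are row subsets of full DFT matrices, which is the property the text highlights next.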
The system of linear equations (1.27) can be solved by CS techniques. However, this
problem differs from the temporal sub-Nyquist formulation of Section 1.3.2, where
only the range sensing matrix $F_N^K$ is a partial DFT. In Doppler sub-Nyquist radar, both
range and Doppler sensing matrices (i.e., $F_N^K$ and $F_{P_1}^{P_2}$, respectively) are partial DFTs.
Analogous to Theorem 1.3.1, we have the following result for the Doppler sub-Nyquist
radar:
Theorem 1.4.1 [8] The minimal number of samples required for perfect recovery of A
for L targets in noiseless settings is $4L^2$. In addition, the number of samples per period
is at least 2L, and the number of periods $P_2 \ge 2L$.
Note that the number of periods $P_2$ here is for nonuniform transmission, while the
minimum number of periods in Theorem 1.3.1 pertains to uniformly spaced pulses in the
CPI. Theorem 1.4.1 indicates the lower limit of rate reduction in temporal and Doppler
domains.
To solve for the sparse matrix A in (1.27), one can use the matrix version of OMP or
$\ell_1$ minimization [6]. Alternatively, Doppler focusing is still approximately applicable.
The nonuniform discrete Fourier transform of the coefficients $X_p[k]$ is
\[ \Psi_\nu[k] = \sum_{p=0}^{P_2-1} X_p[k]\, e^{j\nu m_p\tau} = \frac{1}{\tau} H[k] \sum_{l=0}^{L-1} \alpha_l\, e^{-j2\pi k\tau_l/\tau} \sum_{p=0}^{P_2-1} e^{j(\nu-\nu_l)m_p\tau}. \tag{1.28} \]
This can be approximated similarly to (1.9) and solved by Algorithm 1, described earlier.
However, this is a poor approximation because the $P_2$ points in the sum of exponents
$\sum_{p=0}^{P_2-1} e^{j(\nu-\nu_l)m_p\tau}$ are not equally spaced over the unit circle.
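The effect of unequal spacing can be seen by evaluating the inner sum of exponents in (1.28) on the Doppler grid $2\pi/(P_1\tau)$. The sketch below compares uniform transmission of all $P_1$ pulses with one nonuniform pattern of $P_2 = 8$ pulses; the pulse pattern, the sizes, and the helper name `focusing_sum` are illustrative assumptions.

```python
import numpy as np

def focusing_sum(m, delta_nu, tau):
    """Return sum_p exp(j * delta_nu * m_p * tau) for each Doppler offset delta_nu."""
    return np.exp(1j * np.outer(delta_nu, m) * tau).sum(axis=1)

P1, tau = 16, 1e-3
grid = 2 * np.pi * np.arange(P1) / (P1 * tau)        # offsets nu - nu_l on the Doppler grid

m_uniform = np.arange(P1)                            # all P1 pulses, m_p = p
m_nonuniform = np.array([0, 1, 3, 4, 7, 10, 12, 15])  # hypothetical pattern, P2 = 8

g_u = np.abs(focusing_sum(m_uniform, grid, tau))
g_n = np.abs(focusing_sum(m_nonuniform, grid, tau))

peak_u, side_u = g_u[0], g_u[1:].max()
peak_n, side_n = g_n[0], g_n[1:].max()
print(f"uniform:    peak {peak_u:.1f}, largest off-grid-point value {side_u:.2e}")
print(f"nonuniform: peak {peak_n:.1f}, largest off-grid-point value {side_n:.2f}")
```

With uniform pulses the sum vanishes at every nonzero grid offset, whereas the nonuniform pattern leaves residual sidelobes, which is why focusing is only approximate in this setting.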
1.4.2 RToT Hardware Prototype

A hardware implementation of the RToT concept is described in [8]. It uses the sub-
Nyquist hardware prototype presented in Section 1.3.4. We evaluated the prototype for
a scenario wherein targets are located at two distinct azimuths. Here, P1 = 50 pulses
were chosen such that a quarter of them were sent in one direction and the rest in another.
The target scenario for both is then simultaneously recovered within the same original
CPI (Figure 1.8). In this experiment, the reduction in temporal domain is the same as in
Section 1.3.4, i.e., 1/30 of the Nyquist rate. In the Doppler domain, pulses in the two
directions are reduced by 75% and 25%, respectively.
Figure 1.8 RToT sub-Nyquist radar prototype [8]. Top left: Targets at two different azimuths.
Bottom left: Echoes from both directions are acquired via nonuniform pulses. Top and bottom
right: delay-Doppler maps showing reconstruction for both directions.
1.5 Cognitive Sub-Nyquist Radar and Spectral Coexistence
In the previous two sections, we focused on processing the received signal. The receiver
design in the sub-Nyquist framework can be exploited to also alter the behavior of
the radar transmitter. In this section, we discuss the opportunistic control of the trans-
mitter to impart cognition to the radar and leverage it for spectrum sharing applica-
tions. For alternative, non-sub-Nyquist approaches to cognition in radars, we refer the
reader to Chapters 9 (“Spectrum sensing for cognitive radar via model sparsity exploita-
tion”) and 10 (“Cooperative spectrum sharing between sparse sensing based radar”) of
this book.
The unhindered operation of a radar that shares its spectrum with communication
systems has captured a great deal of attention within the operational radar community
during the last decade [73]. The interest in such spectrum-sharing radars is largely due
to the electromagnetic spectrum being a scarce resource and almost all services having
a need for greater access to it.
Recent research in spectrum sharing radars has focused on S- and C-bands, where
the spectrum has seen increasing cohabitation by long-term evolution (LTE) cellu-
lar/wireless commercial communication systems. Many synergistic efforts by major
agencies are underway for efficient radio spectrum utilization. A significant recent
development is the announcement of the Shared Spectrum Access for Radar and
Communications (SSPARC) program [74] by the Defense Advanced Research Projects
Agency (DARPA). This program is focused on S-band military radars and views
spectrum sharing as a cooperative arrangement where the radar and communication
services actively exchange information. It defines spectral coexistence as equipping
existing radar systems with spectrum sharing capabilities and spectral codesign as
developing new systems that utilize opportunistic spectrum access [75]. For a review of
spectral interference from different services at IEEE radar bands, see [20].
A variety of system architectures have been proposed for spectrum-sharing radars.
Most put emphasis on optimizing the performance of either radar or communications
while ignoring the performance of the other. The radar-centric architectures [20,76]
usually assume fixed interference levels from communication systems and design the
system for high probability of detection (Pd ). Similarly, the communications-centric
systems attempt to improve performance metrics, like the error vector magnitude and
bit/symbol error rate for interference from radar. With the introduction of the SSPARC
program, joint radar-communication performance is being investigated [77]. In nearly
all cases, real-time exchange of information between radar and communications hardware
has not yet been integrated into the system architectures. In contrast, our
proposed method, described later in this section, incorporates handshaking of spectral
information between the two systems.
Conventional receiver processing techniques to remove RF interference in radar
employ notch filters at hostile frequencies. Typically, spectrum sharing is achieved by
notching out the radar waveform’s bandwidth, causing a decrease in range resolution.
Our spectrum sharing solution departs from this baseline. The approach we adopt
follows the Xampling architecture on which the sub-Nyquist radar prototype described
earlier in Section 1.3 is based. We recall that the sub-Nyquist receiver samples and
processes only small narrow subbands of the received signal. Hence, we capitalize on
the simple observation that if only narrow spectral bands are sampled and processed,
then one can restrict the transmit signal to these bands. The concept of transmitting
only a few subbands that the receiver processes is one way to formulate a cognitive
radar (CRr) [60]. The delay-Doppler recovery is then performed as presented earlier
in Section 1.3. The range resolution obtained through this multiband signal spectrum
fragmentation can be the same as that of a wideband traditional radar. Furthermore, by
concentrating all the available power in the transmitted narrow bands rather than over a
wide bandwidth, the CRr increases SNR, as illustrated in Figure 1.9.
In the CRr system [60], the support of subbands varies with time to allow for
dynamic and flexible adaptation to the environment. Such a system also enables the
radar to disguise the transmitted signal as an electronic countermeasure or cope with
a crowded spectrum by using a smaller interference-free portion. The CRr configuration is
key to spectrum sharing since the radar transceiver adapts its transmission to available
bands, achieving coexistence with communication signals. To detect vacant bands,
a communication receiver is needed that performs spectrum sensing over a large
bandwidth. Such systems have recently received tremendous interest in communications
research, which faces a bottleneck in terms of spectrum availability. To increase the
efficiency of spectrum management, dynamic opportunistic exploitation of temporarily
vacant spectral bands by secondary users has been considered, under the name of
cognitive radio (CRo) [78]. Here, we use a CRo receiver to detect the occupied
communication bands, so that our radar transmitter can exploit the spectral holes.
One of the main challenges of spectrum sensing in the context of CRo is the sampling
rate bottleneck due to the wide signal bandwidth. In this context, we use the Xampling
framework to subsample and process the signal [20,56].
Figure 1.9 A conventional radar with bandwidth $B_h$ transmits in the band $B_h$. A cognitive radar
transmits only in subbands $\{B_i\}_{i=1}^{N_b}$, but with increased in-band power. The sub-Nyquist receiver
samples and processes only these subbands [19].
Denote the set of all frequencies of the available common spectrum by F. The com-
munication and radar systems occupy subsets FC and FR of F, respectively, such that
FC ∩ FR = ∅. Once the CRo receiver has identified FC , it provides the radar with spec-
tral occupancy information. Equipped with this spectral map as well as a known radio
environment map (REM) detailing typical interference, the CRr transmitter chooses
narrow frequency subbands that minimize interference for its transmission. The radar
conveys the frequencies FR to the communication receiver as well, so that it can ignore
the radar bands while sensing the spectrum. The combined CRo-CRr system results in
spectral coexistence via the Xampling (SpeCX) framework, which optimizes the radar’s
performance without interfering with existing communication transmissions. Our hard-
ware prototype for SpeCX, presented in Section 1.5.3, performs real-time recovery of
CRo and CRr signals sharing a common spectrum at SNRs as low as −5 dB.
1.5.1 Cognitive Radio

We first introduce the signal model, processing, and prototype of CRo in the context of
SpeCX. Let $x_C(t)$ be a real-valued continuous-time communication signal, supported on
$F = [-1/(2T_{Nyq}), +1/(2T_{Nyq})]$ and composed of up to $N_{sig}$ transmit waveforms, such that
\[ x_C(t) = \sum_{i=1}^{N_{sig}} s_i(t), \tag{1.29} \]
where $s_i(t)$ has unknown carrier frequency $f_i$, and $X_C(f)$ is the Fourier transform
of $x_C(t)$. We denote by $f_{Nyq} = 1/T_{Nyq}$ the Nyquist rate of $x_C(t)$. The waveforms,
respective carrier frequencies, and bandwidths are unknown. We only assume that the
single-sided bandwidth Bci for the ith transmission does not exceed an upper limit B.
Such sparse wideband signals belong to the so-called multiband signal model [56,79].
Figure 1.10 illustrates the two-sided spectrum of a multiband signal with K = 2Nsig
bands centered around unknown carrier frequencies |fi | ≤ fNyq /2.
Figure 1.10 Multiband model with K = 6 bands [20].

[Figure 1.11 labels: PC MATLAB-based controller (generates RF input in real time); signal
generators (2× NI USRP-2942R RF generator); RF signal x(t); the Cog-Radio card; mixing series
generator (Xilinx VC707 high-speed FPGA) with mixing functions p_i(t) and outputs y_i(t),
i = 1..3; NI PXIe-1065 with DC-coupled 4-channel ADC (PCIe ×4 to MXI ×4), allowing real-time
sampling; PC LabVIEW + MATLAB-based controller.]
Figure 1.11 CRo system [80].
Let FC ⊂ F be the unknown support of xC (t). The goal of the CRo communication
receiver is to retrieve FC , while sampling and processing xC (t) at low rates in order
to reduce system cost and resources. A CRo system was developed earlier [56,80] for
blind sensing (see Figure 1.11). Next, we explain the details on combining this system
with the sub-Nyquist radar to implement SpeCX.
The input signal at the communication receiver of the SpeCX system is
x(t) = xC (t) + xR (t), (1.30)
where xR (t) = rTX (t) + rRX (t) is the radar signal sensed by the communication receiver,
composed of the transmitted and received radar signals defined in (1.1) and (1.4), respec-
tively. Since the frequency support of xC (t) is unknown, a classic processor would
sample such a signal at its Nyquist rate, which can be prohibitively high. In this work,
we instead use the modulated wideband converter (MWC) [56], a sub-Nyquist sampling
technique that achieves the lower sampling rate bound for perfect blind recovery of
multiband signals, namely twice the Landau rate, and is also practically feasible. The
MWC is composed of M parallel channels. In each channel, an analog mixing front
end, where xC (t) is multiplied by a mixing function pi (t), aliases the spectrum, such
that each band appears in baseband. The mixing functions pi (t) are periodic with period
$T_p$ such that $f_p = 1/T_p \ge B$ and thus have the following Fourier expansion:
\[ p_i(t) = \sum_{l=-\infty}^{\infty} c_{il}\, e^{j\frac{2\pi}{T_p} l t}. \tag{1.31} \]
In each channel, the signal next goes through a lowpass filter (LPF) with cut-off
frequency $f_s/2$ and is sampled at rate $f_s \ge f_p$, resulting in samples $z_i[n]$. Define
$N = 2\left\lceil \frac{f_{Nyq}+f_s}{2f_p} \right\rceil$ and $F_s = [-f_s/2, f_s/2]$. Following the calculations in [56], the
relation between the known discrete-time Fourier transform of the samples $z_i[n]$ and
the unknown $X_C(f)$ is given by
\[ z(f) = A\big(x_C(f) + x_R(f)\big), \qquad f \in F_s, \tag{1.32} \]
where z(f) is a vector of length M with ith element $z_i(f) = Z_i(e^{j2\pi f T_s})$ and the
unknown vector $x_C(f)$ is given by
\[ x_{C_i}(f) = X_C\big(f + (i - N/2) f_p\big), \qquad f \in F_s, \tag{1.33} \]
for $1 \le i \le N$. The vector $x_R(f)$ is defined similarly. The M × N matrix A contains
the known coefficients $c_{il}$ such that $A_{il} = c_{i,-l} = c_{il}^*$.
Figure 1.12 Schematic implementation of the MWC analog sampling front end and digital signal
recovery from low-rate samples [20]. The CRo inputs are the communication signal $x_C(t)$ and
radar support $F_R$. The communication support output $F_C$ is shared with the radar transmitter.
The MWC analog mixing front end, shown in Figure 1.12, results in folding the
spectrum to baseband with different weights for each frequency interval. The CRo’s goal
is now to recover the support of xC (f ) from the low-rate samples z(f ). The recovery
of xC (f ) for each f independently is inefficient and not robust to noise. Instead, the
support recovery paradigm from [56] exploits the fact that the bands occupy continuous
spectral intervals so that xC (f ) are jointly sparse for f ∈ Fp . The continuous to finite
block [56] then produces a finite system of equations, called multiple measurement
vectors (MMV) from the infinite number of linear systems (1.32).
From (1.32), we have
\[ Q = A Z A^H, \tag{1.34} \]
where
\[ Q = \int_{f \in F_p} z(f)\, z^H(f)\, df, \qquad Z = \int_{f \in F_p} x(f)\, x^H(f)\, df, \tag{1.35} \]
are M × M and N × N matrices, respectively. Here, $x(f) = x_C(f) + x_R(f)$. The matrix
Q is then decomposed to a frame V such that $Q = VV^H$. Clearly, there are many ways
to select V. One possibility is to construct it by performing an eigendecomposition of Q
and choosing V as the matrix of eigenvectors corresponding to the nonzero eigenvalues.
The finite dimensional MMV system is then given by
V = A(UC + UR ). (1.36)
The support of the unique sparsest solution of (1.36) is the same as the support of our
original set of equations (1.32) [56]. Therefore, the supports of $U_C$ and $U_R$ are disjoint.
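The frame construction used to pass from (1.34) to (1.36) can be sketched as follows. This is a minimal illustration under stated assumptions: the frequency-domain integral defining Q in (1.35) is replaced by a sum over time samples, the eigenvectors are scaled by the square roots of their eigenvalues so that $Q = VV^H$ holds exactly, and the function name `build_frame`, the tolerance, and all sizes are hypothetical.

```python
import numpy as np

def build_frame(z_samples, rel_tol=1e-10):
    """Form Q from samples and return a frame V with Q = V V^H.

    z_samples: M x n array, one row per MWC channel (time-domain samples).
    """
    Q = z_samples @ z_samples.conj().T          # sample version of Q in (1.35)
    w, E = np.linalg.eigh(Q)                    # Q is Hermitian positive semidefinite
    keep = w > rel_tol * w.max()                # discard numerically zero eigenvalues
    V = E[:, keep] * np.sqrt(w[keep])           # scaled eigenvectors
    return Q, V

# Toy data: 6 channels, 12 spectral slices, only slices 2 and 9 active.
rng = np.random.default_rng(0)
n_channels, n_slices, n_samples = 6, 12, 200
A = rng.standard_normal((n_channels, n_slices)) \
    + 1j * rng.standard_normal((n_channels, n_slices))
x = np.zeros((n_slices, n_samples), dtype=complex)
x[[2, 9]] = rng.standard_normal((2, n_samples)) + 1j * rng.standard_normal((2, n_samples))
z = A @ x

Q, V = build_frame(z)
print(V.shape, np.allclose(Q, V @ V.conj().T))
```

The number of columns of V equals the number of active slices here, so the MMV system built from V is much smaller than the original sample set.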
The frequency support FR of xR (t) is known at the communication receiver. From
FR , we derive the support SR of the radar slices xR (f ), which is identical to the support
of UR , such that

\[ S_R = \left\{ n \;:\; \left| n - \frac{f_R^i}{f_p} - \frac{N}{2} \right| < \frac{f_s + B_R^i}{2 f_p} \right\}, \tag{1.37} \]
for $1 \le i \le N_b$. Our goal can then be stated as recovering the support of $U_C$ from V,
given the known support SR of UR . This can be formulated as a sparse recovery with
partial support knowledge, studied under the framework of modified CS [81]. Modified-
CS has been used to adapt CS recovery algorithms to exploit partial known support. In
particular, greedy algorithms, such as OMP, have been modified to OMP with partial
known support [82]. Instead of starting with an initial empty support set, one starts with
SR as the initial support. In the first iteration, we compute the estimate
\[ \hat{U}^1_{S_R} = A^{\dagger}_{S_R} V, \qquad \hat{U}^1_i = 0, \;\; \forall i \notin S_R, \tag{1.38} \]
and residual
\[ V_1 = V - A_{S_R} \hat{U}^1_{S_R}. \tag{1.39} \]
The remainder of the algorithm is then identical to OMP.
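A minimal Python sketch of OMP with partial known support for the MMV system V = AU is given below. It follows the initialization in (1.38) and (1.39) and then proceeds greedily; the simultaneous-OMP selection rule (largest row norm of the correlation), the fixed number of additional atoms as the stopping criterion, the assumption that the known support is nonempty, and the name `omp_partial_support` are choices made for this illustration and are not taken from [82].

```python
import numpy as np

def omp_partial_support(A, V, known_support, n_extra):
    """Greedy MMV recovery of V = A U, initialised with a known (nonempty) support."""
    support = list(known_support)
    col_norms = np.linalg.norm(A, axis=0)
    for it in range(n_extra + 1):
        # Least-squares estimate on the current support, cf. (1.38).
        U_S = np.linalg.pinv(A[:, support]) @ V
        residual = V - A[:, support] @ U_S               # cf. (1.39)
        if it == n_extra:
            break
        # Select the column most correlated with the residual.
        corr = np.linalg.norm(A.conj().T @ residual, axis=1) / col_norms
        corr[support] = 0.0
        support.append(int(np.argmax(corr)))
    U_hat = np.zeros((A.shape[1], V.shape[1]), dtype=complex)
    U_hat[support] = U_S
    return sorted(support), U_hat

# Toy problem: radar support S_R is known, communication support S_C is not.
rng = np.random.default_rng(1)
M, N, d = 64, 80, 4
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
S_R, S_C = [5, 6], [20, 41, 63]
U = np.zeros((N, d), dtype=complex)
U[S_R + S_C] = np.exp(2j * np.pi * rng.random((len(S_R) + len(S_C), d)))
V = A @ U

support, U_hat = omp_partial_support(A, V, S_R, n_extra=len(S_C))
S_C_hat = sorted(set(support) - set(S_R))
print(S_C_hat)
```

Starting from $S_R$ removes the radar contribution before any communication band is selected, so the greedy iterations are spent only on the unknown part of the support.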

Once the overall support $S_C \cup S_R$ is known, we have
\[ \hat{x}_{S_C \cup S_R}[n] = A^{\dagger}_{S_C \cup S_R}\, z[n], \qquad \hat{x}_i[n] = 0, \;\; \forall i \notin S_C \cup S_R. \tag{1.40} \]
Here, $x_{S_C \cup S_R}(f)$ denotes the vector x(f) reduced to its support, $A_{S_C \cup S_R}$ is composed
of the columns of A indexed by $S_C \cup S_R$, and † is the Moore–Penrose pseudo-inverse.
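The occupied communication support is then
\[ F_C = \left\{ f \;:\; \left| f - (i - N/2) f_p \right| \le \frac{f_p}{2}, \text{ for all } i \in S_C \right\}, \tag{1.41} \]
where the slice index i is mapped to frequency with the same offset $(i - N/2) f_p$ as in (1.33).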
1.5.2 Cognitive Radar

After CRo detects the communication signal support, the CRr transmits a pulse h(t)
in the unused parts of the spectrum. The transmit signal is supported over $N_b$ disjoint
frequency bands, with bandwidths $\{B_r^i\}_{i=1}^{N_b}$ centered around the respective frequencies
$\{f_r^i\}_{i=1}^{N_b}$, such that $\sum_{i=1}^{N_b} B_r^i < B_h$. The number of bands $N_b$ is known to the receiver
and does not change during operation. The location and extent of the bands $B_r^i$ and $f_r^i$
are determined by the radar transmitter through an optimization procedure to identify
the least contaminated bands (see Section 1.5.2). The resulting transmitted radar signal
CTFT is
\[ H_R(f) = \begin{cases} \beta_i H_{Nyq}(f), & f \in F_R^i, \text{ for } 1 \le i \le N_b, \\ 0, & \text{otherwise,} \end{cases} \tag{1.42} \]
where $F_R^i = [f_r^i - B_r^i/2,\; f_r^i + B_r^i/2]$ is the set of frequencies in the ith band, such that
$F_R = \bigcup_{i=1}^{N_b} F_R^i$. The parameters $\beta_i > 1$ are chosen such that the total transmit power
$P_T$ of the spectrum-sharing radar remains the same as that of the conventional radar:
\[ \int_{-B_h/2}^{B_h/2} |H_{Nyq}(f)|^2\, df = \sum_{i=1}^{N_b} \int_{F_R^i} |H_R(f)|^2\, df = P_T. \tag{1.43} \]
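As a small worked example of the power constraint (1.43), the sketch below assumes a flat conventional spectrum $|H_{Nyq}(f)| = 1$ over $B_h$ and a common scaling β for all subbands, in which case (1.43) reduces to $\beta^2 \sum_i B_r^i = B_h$. The band centers, bandwidths, and the equal-β choice are hypothetical and are used only to illustrate the relation.

```python
import numpy as np

B_h = 20e6                                        # conventional bandwidth (Hz), assumed
bands = [(-8e6, 1.0e6), (2e6, 0.5e6), (7e6, 1.5e6)]   # hypothetical (center, bandwidth) pairs

total_subband_bw = sum(bw for _, bw in bands)
beta = np.sqrt(B_h / total_subband_bw)            # common scaling under the flat-spectrum assumption

# Numerical check of (1.43) on a fine frequency grid.
f = np.linspace(-B_h / 2, B_h / 2, 200001)
df = f[1] - f[0]
H_nyq = np.ones_like(f)
in_band = np.zeros(f.shape, dtype=bool)
for fc, bw in bands:
    in_band |= np.abs(f - fc) <= bw / 2
H_R = np.where(in_band, beta * H_nyq, 0.0)        # cf. (1.42)

P_nyq = np.sum(np.abs(H_nyq) ** 2) * df
P_R = np.sum(np.abs(H_R) ** 2) * df
gain_db = 20 * np.log10(beta)                     # in-band power density increase
print(f"beta = {beta:.3f}, in-band gain = {gain_db:.1f} dB")
```

Concentrating the same power into 3 MHz instead of 20 MHz raises the in-band power density by the factor $\beta^2$, which is the SNR advantage discussed in the text.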
The radar identifies an appropriate transmit frequency set $F_R \subset F \setminus F_C$ such that
the radar's probability of detection $P_d$ is maximized. For a fixed probability of false
alarm Pfa the Pd increases with higher signal to interference and noise ratio [55]. At
the spectrum-sharing radar receiver, we employ the sub-Nyquist approach described in
Section 1.3.2, where the delay-Doppler map is recovered from the subset of Fourier
coefficients defined by FR .
Optimal Radar Transmit Bands

We now explain the procedure through which a CRr selects transmit subbands that
have minimal spectral interference. The REM is assumed to be known to the radar
transmitter in the form of typical interfering energy levels with respect to frequency
bands, represented by a vector $y \in \mathbb{R}^q$, where q is the number of frequency bands with
bandwidth $b_y = |F|/q$. In addition, the information from the CRo indicates that the
radar waveform must avoid all frequencies in the set $F_C$. Therefore, we set y to be
equal to ∞ in these bands. Our goal is to select subbands from the set $F \setminus F_C$ with
minimal interference. We do that by seeking a block-sparse frequency vector $w \in \mathbb{R}^p$
with unknown block lengths, where p is the number of discretized frequencies, whose
support indicates frequency bands with low interference for the radar. Each entry of w
represents a subband of bandwidth $b_w = |F|/p$.
To this end, we use the structured sparsity framework of [83] based on the one-
dimensional graph sparsity structure whose nodes denote the p frequency points of w. In
order to find the desired block-sparse w, the formulation in [83] replaces the traditional
sparse recovery $\ell_0$ constraint by a more general term c(w), referred to as the coding
complexity, such that c(F) = g log p + |F|, where F ⊂ {1,...,p} is a sparse subset
of the index set of the coefficients of w and g is the number of connected regions or
blocks of F. This coding complexity, which accounts for both the number of discretized
frequencies |F| and the number of connected regions g, favors blocks within the graph.
In our setting, this reduces to solving the following optimization problem for finding the
block-sparse frequency vector w with $(y_{inv})_i = 1/y_i$:
\[ \underset{w}{\text{minimize}} \;\; \|y_{inv} - Dw\|_2^2 + \lambda\, c(w), \tag{1.44} \]
where λ is a regularization parameter and c(w) is defined by $c(w) = \min_F \{c(F) \mid \mathrm{supp}(w) \subset F\}$.
The matrix D is a q × p matrix that maps each discrete frequency in
w to the corresponding band in $y_{inv}$. That is, the (i,j)th entry of D is equal to 1 if the jth
frequency in w belongs to the ith band in y; otherwise, it is equal to 0. Problem (1.44)
can be solved using structured OMP [83].
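The ingredients of (1.44) are simple to construct. The sketch below builds the mapping matrix D, the inverted REM vector $y_{inv}$ with occupied bands set to infinity, and the coding complexity c(F). It is not an implementation of structured OMP [83]. It assumes that p is an integer multiple of q, uses zero-based indices, and takes the logarithm in c(F) to base 2 (the text does not state the base); the function names and the REM values are hypothetical.

```python
import numpy as np

def mapping_matrix(q, p):
    """q x p matrix D with D[i, j] = 1 when discrete frequency j lies in band i."""
    D = np.zeros((q, p))
    D[np.arange(p) // (p // q), np.arange(p)] = 1.0
    return D

def coding_complexity(F, p):
    """c(F) = g * log2(p) + |F|, with g the number of runs of consecutive indices."""
    F = sorted(set(F))
    if not F:
        return 0.0
    g = 1 + sum(1 for a, b in zip(F, F[1:]) if b != a + 1)
    return g * np.log2(p) + len(F)

q, p = 4, 16
D = mapping_matrix(q, p)

# Hypothetical REM energies per band; band 1 is occupied by communications (in F_C).
y = np.array([2.0, np.inf, 0.5, 1.0])
y_inv = 1.0 / y                                   # occupied bands map to 0

one_block = coding_complexity([3, 4, 5, 6], p)    # a single connected region
scattered = coding_complexity([3, 5, 9, 12], p)   # four isolated indices
print(y_inv, one_block, scattered)
```

For the same number of selected frequencies, the contiguous set has a much lower coding complexity than the scattered one, which is how the penalty in (1.44) favors block-sparse solutions.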
Delay-Doppler Recovery
In order to recover the delay-Doppler map from only Nb transmitted narrow bands,
CRr employs a sub-Nyquist receiver that we explained earlier in Section 1.3.2. The
radar receiver first filters the CRr subbands supported on FR and computes the Fourier
coefficients of the received signal. Our resulting spectrum sharing SpeCX framework is
summarized in Algorithm 3.
Algorithm 2 Cognitive Radar Band Selection [20]

Input: REM vector y and subband bandwidth $b_y = |F|/q$, shared support F,
communication support $F_C$, mapping matrix D, number of discretized frequencies
p, number of bands $N_b$
Output: Block-sparse vector w, radar support $F_R$
1: Set $y_i = \infty$ for each ith subband in $F_C$ and compute $(y_{inv})_i = 1/y_i$
2: Initialization $F_0 = \emptyset$, w = 0, t = 1
3: Find the index $\lambda_t$ so that $\lambda_t = \arg\max_i \phi(i)$, where
\[ \phi(i) = \frac{\|P_i (D\hat{w}_{t-1} - y_{inv})\|_2^2}{c(i \cup F_{t-1}) - c(F_{t-1})} \]
with $P_i = D_i (D_i^T D_i)^{\dagger} D_i^T$
4: Augment index set $F_t = \lambda_t \cup F_{t-1}$
5: Find the new estimate $\hat{w}_t|_{F_t} = D_{F_t}^{\dagger} y_{inv}$, $\hat{w}_t|_{F_t^C} = 0$
6: If the number of blocks, or connected regions, $g(w) > N_b$, go to step 7. Otherwise,
return to step 3
7: Remove the last index $\lambda_t$ so that $F_t = F_{t-1}$ and $\hat{w}_t = \hat{w}_{t-1}$
8: Compute the radar support $F_R = \bigcup_{j \in F_t} [\,j b_w - |F|/2,\; (j+1) b_w - |F|/2\,]$ with
$b_w = |F|/p$
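Step 8 of Algorithm 2 converts the selected index set into frequency intervals. A minimal sketch of that conversion is shown below; merging adjacent intervals into contiguous bands, the zero-based indexing, and the function name `radar_support` are choices made for this illustration.

```python
import math

def radar_support(F_t, p, total_bw):
    """Map selected indices j to intervals [j*b_w - |F|/2, (j+1)*b_w - |F|/2], merged when adjacent."""
    b_w = total_bw / p                      # bandwidth represented by each entry of w
    intervals = []
    for j in sorted(set(F_t)):
        lo = j * b_w - total_bw / 2
        hi = (j + 1) * b_w - total_bw / 2
        if intervals and math.isclose(intervals[-1][1], lo, abs_tol=1e-9 * b_w):
            intervals[-1][1] = hi           # extend the current contiguous band
        else:
            intervals.append([lo, hi])      # start a new band
    return [tuple(iv) for iv in intervals]

# Example: 16 discretized frequencies over a 16 MHz shared band (values are illustrative).
F_R = radar_support([2, 3, 4, 10], p=16, total_bw=16.0)
print(F_R)
```

The number of merged intervals equals the number of blocks g of the index set, so it can also be compared against $N_b$ as in step 6.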
Algorithm 3 Spectral Coexistence via Xampling (SpeCX) [20]

Input: Communication signal xC (t)
Output: Estimated target parameters $\{\hat{\alpha}_l, \hat{\tau}_l, \hat{\nu}_l\}_{l=0}^{L-1}$
1: Initialization: perform spectrum sensing at the CRo receiver on xC (t) following the
procedure in Section 1.5.1
2: Choose the least noisy subbands for the radar transmit spectrum with respect to
detected FC using Algorithm 2
3: Send FR to communication and radar receivers
4: Perform target delay and Doppler estimation using Algorithm 1
5: Perform spectrum sensing at the communication receiver on x(t) = xC (t) + xR (t)
following the procedure in Section 1.5.1
6: If FC changes, then the radar transmitter goes back to step 2
For time-delay estimation, [19] compares the performance of conventional and cognitive
radars using the extended Ziv–Zakai lower bound (EZB). In a conventional radar,
the EZB for a single target delay estimate $\hat{\tau}_0$ is
\[ \mathrm{EZB}_R(\hat{\tau}_0) = \sigma_{\tau_0}^2 \cdot 2Q\!\left(\sqrt{\frac{\mathrm{SNR}}{2}}\right) + \frac{\Gamma_{3/2}\!\left(\frac{\mathrm{SNR}}{4}\right)}{\mathrm{SNR} \cdot F^2}, \tag{1.45} \]
where Q(·) denotes the right-tail Gaussian probability function, $\Gamma_a(b)$ is the incomplete
gamma function with parameter a and upper limit b, and F is the root-mean-square
(rms) bandwidth of the full-band signal. The bound for CRr is given in the following
theorem.
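Theorem 1.5.1 [19] The extended Ziv–Zakai lower bound (EZB) for delay estimation
in a cognitive radar is
\[ \mathrm{EZB}_{CRr}(\hat{\tau}_0) = \sigma_{\tau_0}^2 \cdot 2Q\!\left(\sqrt{\frac{\widetilde{\mathrm{SNR}}}{2}}\right) + \frac{\Gamma_{3/2}\!\left(\frac{\widetilde{\mathrm{SNR}}}{4}\right)}{\sum_{i=1}^{N_b} \mathrm{SNR}_i \cdot F_i^2}, \tag{1.46} \]
where $\mathrm{SNR}_i$ and $F_i$ are the in-band SNR and rms bandwidth of the ith subband and
$\widetilde{\mathrm{SNR}}$ is the total SNR.
As noted in [19], since $\bigcup_{i=1}^{N_b} B_r^i \subset B_h$, we have $\widetilde{\mathrm{SNR}} > \mathrm{SNR}$ for given $P_T$. Therefore,
the SNR threshold for asymptotic performance of $\mathrm{EZB}_{CRr}$ is lower than that of $\mathrm{EZB}_R$.
As the noise increases and power remains constant for both radars, the asymptotic
performance of $\mathrm{EZB}_{CRr}$ is more tolerant to noise than that of $\mathrm{EZB}_R$.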
The multiband design strategy, besides allowing a dynamic form of the transmitted
signal spectrum over only a small portion of the whole bandwidth to enable spectrum
sharing, has two additional advantages. First, as we show in hardware experiments
(Section 1.5.3), our CS reconstruction achieves the same resolution as traditional
Nyquist processing over a significantly smaller bandwidth. Second, the entire transmit
power is concentrated in small narrow bands. Therefore, the SNR in the sampled bands
is improved, which leads to better parameter estimation, as indicated by Theorem 1.5.1.
1.5.3 SpeCX Prototype

Figure 1.13 shows our SpeCX prototype, composed of a CRo receiver and a CRr
transceiver. The CRo hardware realizes the system shown in Figure 1.12. At the heart
of the system lies our proprietary MWC board [84] that implements the sub-Nyquist
analog front-end receiver. The card first splits the wideband signal into M = 4 hardware
channels, with an expansion factor of q = 5, yielding Mq = 20 virtual channels after
digital expansion. In each channel, the signal is then mixed with a periodic sequence
$p_i(t)$, generated on a dedicated FPGA, with $f_p = 20$ MHz. The sequences are chosen as
truncated versions of Gold codes. These were heuristically found to give good detection
results [85], primarily due to small bounded cross-correlations within a set.
[Figure 1.13 labels: Signal Generator, Comm Analog Rx, Comm Digital Rx, Comm Display,
Radar Analog Rx, Radar Digital Rx, Radar Display.]
Figure 1.13 Shared spectrum prototype [20]. The system is composed of a signal generator,
a CRo receiver based on the MWC, a communication digital receiver, and a CRr analog and
digital receiver.
Next, the modulated signal passes through a Chebyshev LPF of 7th order with a cut
off frequency (−3 dB) of 50 MHz. Finally, the low-rate analog signal is sampled by
a National Instruments ADC operating at fs = (q + 1)fp = 120 MHz, leading to
a total sampling rate of 480 MHz. The digital receiver is implemented on a National
Instruments PXIe-1065 computer with DC-coupled ADC. Since the digital processing
is performed at the low rate 120 MHz, very low computational load is required in order
to achieve real time recovery. MATLAB and LabVIEW platforms are used for digital
recovery operations.
The prototype is fed with RF signals composed of up to Nsig = 5 real communication
transmissions, namely K = 10 spectral bands with total bandwidth occupancy of up to
200 MHz and varying support, with Nyquist rate of 6 GHz. To test the system’s support
recovery capabilities, an RF input is generated using vector signal generators, each
producing a modulated data channel with individual bandwidth of up to 20 MHz, and
carrier frequencies ranging from 250 MHz up to 3.1 GHz. The input transmissions then
go through an RF combiner, resulting in a dynamic multiband input signal that enables
fast carrier switching for each of the bands. This input is specially designed to allow
testing the system’s ability to rapidly sense the input spectrum and adapt to changes, as
required by modern CRo and shared spectrum standards, e.g., in the SSPARC program.
The system’s effective sampling rate, equal to 480 MHz, is only 8% of the Nyquist
rate and 2.4 times the Landau rate. The main advantage of the Xampling framework,
demonstrated here, is that sensing is performed in real-time from sub-Nyquist samples
for the entire spectral range.
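The rate bookkeeping quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function and key names are ours, and the Landau rate is taken to be the total occupied bandwidth of 200 MHz, as stated in the text.

```python
def mwc_rates(m_channels, q, f_p_hz, nyquist_hz, occupied_bw_hz):
    """Per-channel rate, aggregate rate, and rate ratios (illustrative helper)."""
    f_s = (q + 1) * f_p_hz        # per-channel ADC rate, fs = (q + 1) fp
    total = m_channels * f_s      # aggregate rate over the hardware channels
    return {
        "f_s_hz": f_s,
        "total_hz": total,
        "virtual_channels": m_channels * q,
        "fraction_of_nyquist": total / nyquist_hz,
        "multiple_of_landau": total / occupied_bw_hz,
    }


# Values quoted for the SpeCX prototype: M = 4, q = 5, fp = 20 MHz,
# Nyquist rate 6 GHz, occupied bandwidth up to 200 MHz.
rates = mwc_rates(m_channels=4, q=5, f_p_hz=20e6,
                  nyquist_hz=6e9, occupied_bw_hz=200e6)
print(rates)
```

With the prototype values, the aggregate rate comes out as 480 MHz, which is the figure used in the comparison with the Nyquist and Landau rates.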
Support recovery is digitally performed on the low rate samples. The prototype
successfully recovers the support of the CRo transmitted bands, as demonstrated in
Figure 1.14. The signal is then reconstructed in real-time. Reconstruction does not
require interpolation to the Nyquist rate and the active transmissions are recovered at
the low rate of 20 MHz, corresponding to the bandwidth of the slices z(f) defined
in (1.32). By combining spectrum sensing and signal reconstruction, the MWC serves
as two separate communication devices. The first is a state-of-the-art CRo that performs
real time spectrum sensing at sub-Nyquist rates, and the second is a receiver that is
able to decode multiple data transmissions simultaneously, regardless of their carrier
frequencies, while adapting to real-time spectral changes.
Figure 1.14 SpeCX communication system display [20] showing (a) low rate samples acquired
from one MWC channel at rate 120 MHz, and (b) digital reconstruction of the entire spectrum
from sub-Nyquist samples.
The CRr system is based on the sub-Nyquist radar receiver board described in
Section 1.3.4. The prototype simulates transmission of P = 50 pulses towards L = 9
targets. The CRr transmits over Nb = 4 bands, selected according to the procedure
presented in Section 1.5.2, after the spectrum-sensing process has been completed by
the communication receiver. We compare the target detection performance of our CRr
with a traditional wideband radar with bandwidth Bh = 20 MHz. The CRr-transmitted
bandwidth is thus equal to 3.2% of the wideband.
Figure 1.15 SpeCX radar display [20] showing (a) coexisting CRo and CRr and (b) CRr spectrum
compared with the full-band radar spectrum. The range-Doppler display of detected and true
locations of the targets for the case of (c) CRr (four disjoint bands) and (d) all four transmit
subbands together forming a contiguous 320 kHz band.
Figure 1.15 shows windows from the graphical user interface (GUI) of our CRr
system. Figure 1.15a illustrates the coexistence between the radar transmitted bands
(thick curve) and the existing communication bands (thin curve). The gain in power
is demonstrated in Figure 1.15b, which plots the wideband radar spectrum, CRr, and
noise. The true and recovered range-Doppler maps for the CRr (whose transmit signal
consists of four disjoint subbands) are shown in Figure 1.15c. All 9 targets are perfectly
recovered and clutter is discarded. Figure 1.15d shows the performance when the four
subbands are joined together to result in a 320 kHz contiguous band for the radar
transmitter. There are many missed detections and false alarms in this case. Let the
true and estimated ranges of the ith target be di and d̂i , respectively. Then the rms
localization error (RMSLE) of L targets is given by
\[
\mathrm{RMSLE} = \sqrt{\frac{1}{L}\sum_{i=1}^{L} \left(d_i - \hat{d}_i\right)^2}. \tag{1.47}
\]
In Figure 1.15c–d, the RMSLE is shown as follows: CRr (0.34 km), 320 kHz band
or 4 adjacent bands with same bandwidth (8.1 km), and wideband (1.2 km). The poor
resolution of the 4 adjacent bands scenario is due to its small aperture. The native range
resolution in the 2 MHz wideband scenario is 75 m. In Figure 1.15c, the CRr is able
to detect 9 targets at locations 6.097, 31.764, 35.046, 35.451, 35.479, 81.049, 81.570,
121.442, and 120.922 km. Here, the distance between two closely spaced targets is less
than 75 m.
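The error metric (1.47) and the native range resolution can be evaluated as in the sketch below. It is a minimal illustration rather than the prototype's code: the helper names and the sample ranges are hypothetical, and the resolution uses the standard relation c/(2B) with a rounded speed of light.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)


def rmsle(true_ranges, est_ranges):
    """Root-mean-square localization error of (1.47) for paired range lists."""
    if len(true_ranges) != len(est_ranges) or not true_ranges:
        raise ValueError("need two non-empty sequences of equal length")
    sq = sum((d - d_hat) ** 2 for d, d_hat in zip(true_ranges, est_ranges))
    return math.sqrt(sq / len(true_ranges))


def range_resolution(bandwidth_hz):
    """Native range resolution c / (2B) for a pulse of bandwidth B."""
    return C / (2.0 * bandwidth_hz)


# Hypothetical ranges in km, chosen only to exercise the function.
true_km = [10.0, 20.0, 30.0]
est_km = [10.3, 19.6, 30.0]
error_km = rmsle(true_km, est_km)
print(error_km)
print(range_resolution(2e6))  # the 2 MHz case quoted in the text, in metres
```

The targets listed at 35.451 km and 35.479 km are 28 m apart, which is below the 75 m value returned for a 2 MHz bandwidth.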
1.6 Spatial Sub-Nyquist: Application to MIMO Radar
We now consider extending sub-Nyquist processing to the spatial domain for the
particular case of MIMO radar [86]. MIMO radars use an array of several transmit
and receive antenna elements, with each transmitter radiating a different, mutually
orthogonal waveform. Waveform orthogonality can be in time, frequency or code. Our
system is based on the collocated MIMO configuration [87], in which the elements
are close to each other so that the radar cross section of a target appears identical in
all elements. The MIMO receiver separates and coherently processes the target echoes
that correspond to each transmitter. The angular resolution of MIMO using the classic
virtual ULA is the same as a phased array with equivalent virtual aperture but many
more antenna elements.
Conventional MIMO radar’s spatial (angular) and range resolutions are limited by
the number of elements and the receiver sampling rate, respectively. Here, we extend
the Xampling framework for temporal sub-Nyquist radar in Section 1.3 to both space
and time by simultaneously thinning an antenna array and sampling received signals at
sub-Nyquist rates. This sub-Nyquist collocated MIMO radar (SUMMeR) recovers the
target range, azimuth, and Doppler velocity without loss of any of the aforementioned
radar resolutions. In SUMMeR, the radar antenna elements are randomly placed within
the aperture, and signal orthogonality is achieved by frequency division multiplexing
(FDM). The FDM-based sub-Nyquist MIMO mitigates the range-azimuth coupling by
randomizing the element locations in the aperture [88].
Figure 1.16 Location of transmit (diamonds) and receive (triangles) antenna elements within the
same physical aperture for (a) conventional MIMO array with T = 5 transmitters and R = 4
receivers, (b) virtual ULA with T R = 20 antenna elements, and (c) randomly thinned MIMO
array with M = 4 transmitters and Q = 3 receivers.
1.6.1 Sub-Nyquist Collocated MIMO Radar Model

Let the operating wavelength of the radar be λ and the total number of transmit and
receive elements be T and R respectively. The classic approach to collocated MIMO
adopts a virtual ULA structure, where the receive antennas spaced by λ/2 and transmit
antennas spaced by Rλ/2 form two ULAs (or vice versa). Here, the coherent processing
of a total of TR channels in the receiver creates a virtual equivalent of a phased array
with TR λ/2-spaced receivers and normalized aperture Z = TR/2. This standard array
structure and the corresponding receiver virtual array are illustrated in Figure 1.16a–b
for T = 5 and R = 4.
Consider a collocated MIMO radar system that has M < T transmit and Q < R
receive antennas. The locations of these antennas are chosen uniformly at random within
the aperture of the virtual array mentioned previously, as in Figure 1.16c. The mth
transmitting antenna sends P pulses
P
−1
sm (t) = hm (t − pτ)ej 2πfc t , 0 ≤ t ≤ P τ, (1.48)
p=0
where {hm (t)}M−1

m=0 is a set of narrowband, orthogonal FDM pulses each with CTFT
∞
Hm (ω) = hm (t)e−j ωt dt. (1.49)
−∞
For simplicity, we assume that fc τ is an integer. The pulse time support is denoted
by Tp .
Consider a target scene with L non-fluctuating point targets following the Swerling-0
model [1] whose locations are given by their ranges Rl , Doppler velocity vl , and azimuth
angles θl , 1 ≤ l ≤ L. The pulses transmitted by the radar are reflected back by the
targets and collected at the receive antennas. When the received waveform is downcon-
verted from RF to baseband, we obtain the following signal at the qth antenna,

\[
x_q(t) = \sum_{p=0}^{P-1}\sum_{m=0}^{M-1}\sum_{l=1}^{L} \alpha_l\, h_m(t - p\tau - \tau_l)\, e^{j 2\pi\beta_{mq}\vartheta_l}\, e^{j 2\pi f_l^D p\tau}, \tag{1.50}
\]
where $\alpha_l$ denotes the complex-valued reflectivity of the lth target, $\tau_l = 2R_l/c$ is the
range-time delay of the lth target, $f_l^D = (2v_l/c) f_c$ is the frequency in the Doppler spectrum,
$\vartheta_l = \sin\theta_l$ is the azimuth parameter, and $\beta_{mq}$ is governed by the array structure. We
express $x_q(t)$ as a sum of single frames
\[
x_q(t) = \sum_{p=0}^{P-1} x_q^p(t), \tag{1.51}
\]
where
\[
x_q^p(t) = \sum_{m=0}^{M-1}\sum_{l=1}^{L} \alpha_l\, h_m(t - \tau_l - p\tau)\, e^{j 2\pi\beta_{mq}\vartheta_l}\, e^{j 2\pi f_l^D p\tau}. \tag{1.52}
\]
Our goal is to estimate the time delay $\tau_l$, azimuth $\theta_l$, and Doppler shift $f_l^D$ of each
target from low rate samples of $x_q(t)$, for 0 ≤ q ≤ Q − 1, using a small number of M
channels and Q antennas.
1.6.2 Xampling in Time and Space

The application of Xampling in both space and time enables recovery of range, direction,
and velocity at sub-Nyquist rates. The sampling technique is the same as in Section
1.3.2, but now the low-rate samples are obtained in both range and azimuth domains.
The received signal $x_q(t)$ is separated into M channels, aligned, and then normalized.
The Fourier coefficients of the received signal corresponding to the channel that
processes the mth transmitter echo at the qth receiver are given by
\[
y_{m,q}^p[k] = \sum_{l=1}^{L} \alpha_l\, e^{j 2\pi\beta_{mq}\vartheta_l}\, e^{-j\frac{2\pi}{\tau} k\tau_l}\, e^{-j 2\pi f_m\tau_l}\, e^{j 2\pi f_l^D p\tau}, \tag{1.53}
\]
where $-N/2 \le k \le N/2 - 1$, $f_m$ is the (baseband) carrier frequency of the mth transmitter,
and N is the number of Fourier coefficients per channel.
As in traditional MIMO, assume that the time delays, azimuths, and Doppler frequencies
are aligned to a grid. In particular, $\tau_l = \frac{\tau}{TN} s_l$, $\vartheta_l = -1 + \frac{2}{TR} r_l$, and
$f_l^D = -\frac{1}{2\tau} + \frac{1}{P\tau} u_l$, where $s_l$, $r_l$, and $u_l$ are integers satisfying
$0 \le s_l \le TN - 1$, $0 \le r_l \le TR - 1$, and $0 \le u_l \le P - 1$, respectively. Let $Z^m$ be the
KQ × P matrix whose pth column is given by the vertical concatenation of $y_{m,q}^p[k]$, $k \in \kappa$,
for 0 ≤ q ≤ Q − 1. We can then write $Z^m$ as
\[
Z^m = \left(\bar{B}^m \otimes A^m\right) X_D F_P^H. \tag{1.54}
\]
Here, $A^m$ denotes the K × TN matrix whose (k,n)th element is
$e^{-j\frac{2\pi}{TN}\kappa_k n}\, e^{-j 2\pi\frac{f_m}{B_h}\frac{n}{T}}$ with $\kappa_k$ the kth element in $\kappa$, $B^m$ is the Q × TR matrix
with (q,p)th element $e^{-j 2\pi\beta_{mq}\left(-1+\frac{2}{TR}p\right)}$, and $F_P$ denotes the P × P Fourier matrix.
The matrix $X_D$ is a $T^2NR \times P$ sparse matrix that contains the values $\alpha_l$ at the L
indices $(r_l TN + s_l, u_l)$.
The range and azimuth dictionaries $A^m$ and $B^m$ are not square matrices due to low-rate
sampling of Fourier coefficients at each receiver and reduction in antenna elements,
respectively. Therefore, the system of equations in (1.54) is underdetermined in azimuth
and range. Our goal is to recover $X_D$ from the measurement matrices $Z^m$, 0 ≤ m ≤
M − 1. The temporal, spatial, and frequency resolutions stipulated by $X_D$ are $\frac{1}{TB_h}$, $\frac{2}{TR}$,
and $\frac{1}{P\tau}$, respectively.
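The grid relations above, and the position of each target inside $X_D$, can be made concrete with a short sketch. The helper names and the array sizes below are hypothetical and chosen small for illustration; only the formulas are taken from the text.

```python
def on_grid_parameters(s, r, u, T, R, N, P, tau):
    """Map integer grid indices (s, r, u) to delay, sine-azimuth, and Doppler."""
    if not (0 <= s < T * N and 0 <= r < T * R and 0 <= u < P):
        raise ValueError("grid index out of range")
    delay = tau * s / (T * N)                    # tau_l = (tau / TN) s_l
    sin_az = -1.0 + 2.0 * r / (T * R)            # vartheta_l = -1 + (2 / TR) r_l
    doppler = -1.0 / (2 * tau) + u / (P * tau)   # f_l^D = -1/(2 tau) + u_l/(P tau)
    return delay, sin_az, doppler


def xd_index(s, r, u, T, N):
    """(row, column) of a target's entry in X_D, i.e. (r * TN + s, u)."""
    return r * T * N + s, u


# Hypothetical sizes: T = 5, R = 4, N = 8 coefficients, P = 10 pulses, PRI 100 us.
T, R, N, P, tau = 5, 4, 8, 10, 100e-6
delay, sin_az, doppler = on_grid_parameters(12, 10, 5, T, R, N, P, tau)
row, col = xd_index(12, 10, 5, T, N)
print(delay, sin_az, doppler, (row, col))
```

With these sizes $X_D$ has $T^2NR = 800$ rows, so the largest admissible row index is 799.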
Theorem 1.6.1 [17] The minimal number of transmit and receive array elements, i.e.,
M and Q, respectively, required for perfect recovery of $X_D$ with L targets in a noiseless
setting are determined by MQ ≥ 2L. In addition, the number of samples per receiver is
at least MK ≥ 2L where K is the number of Fourier coefficients sampled per receiver
and the number of pulses per transmitter is P ≥ 2L.
Theorem 1.6.1 shows that the number of SUMMeR transmit and receive elements
as well as samples K depend only on the number of targets present. These design
parameters, therefore, can be substantially smaller than the requirements of a Nyquist
MIMO array. Similar results for temporal and Doppler sub-Nyquist radars were obtained
in Theorems 1.3.1 and 1.6.1.
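As a quick design aid, the three inequalities of Theorem 1.6.1 can be evaluated for a candidate configuration. The sketch below only checks the stated bounds; the function names and the example configuration are hypothetical, and meeting the bounds says nothing about performance in noise.

```python
def summer_conditions(M, Q, K, P, L):
    """Evaluate the three noiseless-recovery bounds of Theorem 1.6.1."""
    return {
        "MQ >= 2L": M * Q >= 2 * L,
        "MK >= 2L": M * K >= 2 * L,
        "P >= 2L": P >= 2 * L,
    }


def satisfies_theorem(M, Q, K, P, L):
    """True when all three bounds hold."""
    return all(summer_conditions(M, Q, K, P, L).values())


# Hypothetical setup: 4 transmitters, 5 receivers, K = 8 coefficients, P = 10 pulses.
five_targets = satisfies_theorem(M=4, Q=5, K=8, P=10, L=5)
six_targets = summer_conditions(M=4, Q=5, K=8, P=10, L=6)
print(five_targets)
print(six_targets)
```

In this example the pulse count is the binding constraint: with P = 10, the bound P ≥ 2L fails as soon as L exceeds 5.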
1.6.3 Range-Azimuth-Doppler Recovery

To jointly recover the range, azimuth, and Doppler frequency of the targets, we apply the
concept of Doppler focusing from Section 1.3.2 to our MIMO setting. Doppler focusing
for a specific frequency ν yields
\[
\Phi_{m,q}^{\nu}[k] = \sum_{p=0}^{P-1} y_{m,q}^p[k]\, e^{-j 2\pi\nu p\tau}
= \sum_{l=1}^{L} \alpha_l\, e^{j 2\pi\beta_{mq}\vartheta_l}\, e^{-j\frac{2\pi}{\tau}(k + f_m\tau)\tau_l} \sum_{p=0}^{P-1} e^{j 2\pi(f_l^D - \nu)p\tau}, \tag{1.55}
\]
for $-N/2 \le k \le N/2 - 1$. Following Section 1.3.2, it holds that
\[
\sum_{p=0}^{P-1} e^{j 2\pi(f_l^D - \nu)p\tau} \cong
\begin{cases} P & |f_l^D - \nu| < \frac{1}{2P\tau}, \\ 0 & \text{otherwise.} \end{cases} \tag{1.56}
\]
Then, for each focused frequency ν, (1.55) reduces to a 2D problem, which can be
solved using CS recovery techniques, as summarized in Algorithm 4. Note that step 1
can be performed using the FFT. In the algorithm description, vec(Z) concatenates the
columns of $Z^m$, for 0 ≤ m ≤ M − 1, $e_t(l) = \left[(e_t^0(l))^T \cdots (e_t^{M-1}(l))^T\right]^T$ where
\[
e_t^m(l) = \mathrm{vec}\left( \left(\bar{B}^m \otimes A^m\right)_{\Lambda_t(l,2)TN + \Lambda_t(l,1)} \left( \left(\bar{F}^m\right)_{\Lambda_t(l,3)} \right)^T \right), \tag{1.57}
\]
with $\Lambda_t(l,i)$ the (l,i)th element in the index set $\Lambda_t$ at the tth iteration, and
$E_t = [e_t(1) \ldots e_t(t)]$. Once $X_D$ is recovered, the delays, azimuths, and Dopplers are
estimated as
\[
\hat{\tau}_l = \frac{\tau \Lambda_L(l,1)}{TN}, \quad
\hat{\vartheta}_l = -1 + \frac{2\Lambda_L(l,2)}{TR}, \quad
\hat{f}_l^D = -\frac{1}{2\tau} + \frac{\Lambda_L(l,3)}{P\tau}. \tag{1.58}
\]
Since in real scenarios, target delays, Dopplers, and azimuths are not necessarily
aligned to a grid, a finer grid can be used around detection points on the coarse grid
to reduce quantization error. This technique adds a step after support detection in each
iteration (step 4 in Algorithm 4).
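The focusing property (1.56) is easy to verify numerically for a single on-grid target. The sketch below is a toy illustration with hypothetical parameters (P = 16 pulses, a 100 μs PRI, one unit-amplitude target); it keeps only the Doppler factor of (1.53) and is not the recovery algorithm itself.

```python
import cmath


def focus(y, nu, tau):
    """Doppler-focus the pulse sequence y[p] at nu: sum_p y[p] e^{-j 2 pi nu p tau}."""
    return sum(y_p * cmath.exp(-2j * cmath.pi * nu * p * tau)
               for p, y_p in enumerate(y))


P, tau = 16, 1e-4                                          # hypothetical values
grid = [-1 / (2 * tau) + u / (P * tau) for u in range(P)]  # Doppler grid of Section 1.6.2
f_true = grid[11]                                          # target Doppler, on the grid
y = [cmath.exp(2j * cmath.pi * f_true * p * tau) for p in range(P)]

# Focus at every grid frequency and pick the strongest response.
magnitudes = [abs(focus(y, nu, tau)) for nu in grid]
u_hat = max(range(P), key=magnitudes.__getitem__)
print(u_hat, magnitudes[u_hat])
```

At the matching grid point the P terms add coherently, while at every other grid point they cancel, which is what allows each focused frequency to be treated as a separate 2D problem.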
1.6.4 Multi-Carrier and Cognitive Transmission

The frequency bands left vacant can be exploited to increase the system’s performance
without expanding the total bandwidth of $B_{tot} = TB_h$. Denote by γ = T/M the
compression ratio of the number of transmitters. In multi-carrier SUMMeR, every transmit
antenna sends γ pulses, each belonging to a different frequency band, in one PRI. The
total bandwidth of the used bands is $M\gamma B_h = TB_h$. The ith pulse of the pth PRI is
transmitted at time $i\tau/\gamma + p\tau$, for 0 ≤ i < γ and 0 ≤ p ≤ P − 1. The samples are then acquired
and processed as described in Sections 1.6.2 and 1.6.3. Besides increasing the detection
performance, this method multiplies the Doppler dynamic range by a factor of γ with
the same Doppler resolution since the CPI, equal to Pτ, is unchanged. Preserving the
CPI allows us to maintain the targets’ stationarity.
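The pulse schedule of multi-carrier SUMMeR can be sketched as follows. This is an illustration under a stated assumption: the γ pulses of each PRI are taken to be uniformly spaced by τ/γ, which is consistent with the γ-fold increase in Doppler dynamic range described above. The function name and the numerical values are hypothetical.

```python
def pulse_times(T, M, P, tau):
    """Transmit instants for one antenna, assuming pulse spacing tau / gamma."""
    if T % M != 0:
        raise ValueError("this sketch assumes gamma = T / M is an integer")
    gamma = T // M
    times = [i * tau / gamma + p * tau for p in range(P) for i in range(gamma)]
    return gamma, times


# Hypothetical values: T = 8 bands, M = 4 transmitters, P = 3 PRIs of 100 us.
gamma, times = pulse_times(T=8, M=4, P=3, tau=100e-6)
cpi = 3 * 100e-6                 # CPI = P * tau, unchanged by the multi-carrier scheme
doppler_span = gamma / 100e-6    # unambiguous Doppler span 1 / (tau / gamma), in Hz
print(gamma, times, cpi, doppler_span)
```

The effective pulse spacing shrinks from τ to τ/γ while the last pulse still falls within the original CPI, so the Doppler span grows and the Doppler resolution 1/(Pτ) stays the same.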
Cognitive transmission described in Section 1.5.2 can also be extended to a SUMMeR
system wherein the spectrum of each of the transmitted waveforms is limited to a
few nonoverlapping frequency bands while keeping the transmit power per transmitter
the same. Cognitive transmission imparts two advantages to the SUMMeR hardware.
First, the spatial sub-Nyquist processing of large arrays can be easily designed without
replicating the pre-filtering operation for each subband in the hardware. Second, since
the total transmit power remains the same, a cognitive signal has more in-band power
resulting in an increase in SNR as discussed in Section 1.5.2.
1.6.5 Cognitive SUMMeR Hardware Prototype

A cognitive SUMMeR prototype was first presented in [89]. The system realizes a
receiver with a maximum of 8 transmit (Tx) and 10 receive (Rx) antenna elements.
A scenario includes modeling of pulse transmission, accurate power loss due to wave
propagation in a realistic medium, and interaction of a transmit signal with the target.
A large variety of scenarios, consisting of different target parameters, i.e., delays,
Doppler frequencies, and amplitudes, and array configurations, i.e., number of
transmitters and receivers and antenna locations, can be examined using the prototype. The
waveform generator board produces an analog signal corresponding to the synthesized
radar environment, which is amplified and routed to the MIMO radar receiver board.
The prototype then samples and processes the signal in real time. The physical array
aperture and simulated target response correspond to an X-band (fc = 10 GHz) radar.

Algorithm 4 Simultaneous sparse 3D recovery based OMP with focusing [17]

Input: Observation matrices $Z^m$, measurement matrices $A^m$, $B^m$, for all 0 ≤ m ≤ M − 1
Output: Index set $\Lambda$ containing the locations of the nonzero indices of X, estimate
for sparse matrix $\hat{X}$
1: Perform Doppler focusing for 0 ≤ i ≤ K − 1 and 0 ≤ j ≤ Q − 1:
   \[ \Phi_{i,j}^{(m,\nu)} = \sum_{p=0}^{P-1} Z^m_{i+jK,\,p}\, e^{-j 2\pi\nu p\tau}. \]
2: Initialization: residual $R_0^{(m,\nu)} = \Phi^{(m,\nu)}$, index set $\Lambda_0 = \emptyset$, t = 1
3: Project residual onto measurement matrices for 0 ≤ p ≤ P − 1:
   \[ \Psi^{\nu} = A^H R^{\nu} B, \]
   where $A = \left[(A^0)^T\ (A^1)^T \cdots (A^{M-1})^T\right]^T$, $B = \left[(B^0)^T\ (B^1)^T \cdots (B^{M-1})^T\right]^T$, and
   $R^{\nu} = \mathrm{diag}\left(\left[R_{t-1}^{(0,\nu)} \cdots R_{t-1}^{(M-1,\nu)}\right]\right)$ is block diagonal
4: Find the three indices $\lambda_t = [\lambda_t(1)\ \lambda_t(2)\ \lambda_t(3)]$ such that
   \[ [\lambda_t(1)\ \lambda_t(2)\ \lambda_t(3)] = \arg\max_{i,j,\nu} \left|\Psi_{i,j}^{\nu}\right| \]
5: Augment index set $\Lambda_t = \Lambda_{t-1} \cup \{\lambda_t\}$
6: Find the new signal estimate
   \[ \hat{\alpha} = [\hat{\alpha}_1 \ldots \hat{\alpha}_t]^T = (E_t^T E_t)^{-1} E_t^T \mathrm{vec}(Z) \]
7: Compute new residual
   \[ R_t^{(m,\nu)} = Z^m - \sum_{l=1}^{t} \alpha_l\, e^{j 2\pi\left(-\frac{1}{2} + \frac{\Lambda_t(l,3)}{P}\right)p}\, a^m_{\Lambda_t(l,1)} \left(\bar{b}^m_{\Lambda_t(l,2)}\right)^T \]
8: If t < L, increment t and return to step 3; otherwise stop
9: Estimated support set $\hat{\Lambda} = \Lambda_L$
10: Estimated matrix $\hat{X}_D$: the $(\Lambda_L(l,2)TN + \Lambda_L(l,1), \Lambda_L(l,3))$-th component is given by
    $\hat{\alpha}_l$ while the rest of the elements are zero
A conventional 8 × 10 MIMO radar receiver would require simultaneous hardware
processing of 80 (or 160 I/Q) data streams. Since a separate sub-Nyquist receiver for
each of these 80 channels is expensive, we implement the 8-channel analog processing
chain for only 1 receive antenna element, and serialize the received signals of all
10 elements through this chain. This approach allows the prototype to implement a
number of receivers greater than 10 as the 8-channel hardware only limits the number
of transmitters.

Table 1.2 Technical characteristics of the cognitive SUMMeR prototype.
Parameters                          Mode 1     Mode 2     Mode 3     Mode 4
#Tx, #Rx                            8, 10      8, 10      4, 5       8, 10
Element placement                   Uniform    Random     Random     Random
Equivalent aperture                 8 × 10     8 × 10     8 × 10     20 × 20
Angular resolution (sine of DoA)    0.025      0.025      0.025      0.005
Range resolution                    1.25 m (all modes)
Signal bandwidth per Tx             12 MHz (15 MHz including guard-bands) (all modes)
Pulse width                         4.2 μs (all modes)
Carrier frequency                   10 GHz (all modes)
Unambiguous range                   15 km (all modes)
Unambiguous DoA                     180° (from −90° to 90°) (all modes)
PRI                                 100 μs (all modes)
Pulses per CPI                      10 (all modes)
Unambiguous Doppler                 from −75 to 75 m/s (all modes)
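Several entries of Table 1.2 follow from the carrier frequency, PRI, and bandwidth through standard radar relations, as the sketch below illustrates. It assumes that the range resolution is set by the aggregate FDM bandwidth of 8 × 15 MHz, which reproduces the tabulated 1.25 m; the function name and the rounded speed of light are ours.

```python
C = 3.0e8  # speed of light in m/s (rounded)


def derived_characteristics(n_tx, bw_per_tx_hz, pri_s, fc_hz, virt_tx, virt_rx):
    """Derive resolution and ambiguity figures from basic radar parameters."""
    total_bw = n_tx * bw_per_tx_hz    # FDM bands stacked side by side (assumption)
    wavelength = C / fc_hz
    return {
        "range_resolution_m": C / (2 * total_bw),
        "unambiguous_range_m": C * pri_s / 2,
        "max_velocity_mps": wavelength / (4 * pri_s),    # +/- limit
        "angular_resolution": 2 / (virt_tx * virt_rx),   # in sine of DoA
    }


# Mode 1 (8 x 10 equivalent aperture) and Mode 4 (20 x 20 equivalent aperture).
mode1 = derived_characteristics(8, 15e6, 100e-6, 10e9, 8, 10)
mode4 = derived_characteristics(8, 15e6, 100e-6, 10e9, 20, 20)
print(mode1)
print(mode4["angular_resolution"])
```

The angular resolution uses the 2/(TR) expression of Section 1.6.2 with the equivalent aperture dimensions in place of T and R.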
If we use the same pre-filtering approach as in Section 1.3.4 for each of the eight
channels of our sub-Nyquist MIMO prototype, then the hardware design would need a
total of 4 × 8 = 32 BPFs and ADCs excluding the analog filters to separate transmit
channels. We sidestep this requirement by adopting cognitive transmission wherein
the analog signal of each channel exists only in certain predetermined subbands and
consequently, a BPF stage is not required. More importantly, for each channel, a single
low-rate ADC subsamples this narrow-band signal as long as the subbands are coset
bands so that they do not alias after sampling [46]. This implementation needs only
eight low-rate ADCs, one per channel. Another advantage of this approach is flexibility
of the prototype in selecting the Xampling slices. Unlike in Section 1.3.4, the number
and spectral locations of slices are not permanently fixed, and they can be changed.
Figure 1.17 Tx and Rx element locations for the hardware prototype modes over a 6 m antenna
aperture. Mode 4’s virtual array equivalent is the 20 × 20 ULA [18].
Figure 1.18 Sub-Nyquist MIMO prototype and user interface. The analog preprocessor module
consists of two cards mounted on opposite sides of a common chassis [18].
Figure 1.19 The normalized one-sided spectrum of one channel of a given receiver (a) before and
(b) after subsampling with a 7.5 MHz ADC. Each of the subbands spans 375 kHz and is marked
with a numeric label. In a noncognitive processing, the signal occupies the entire 15 MHz
spectrum before sampling [18].
Table 1.2 lists detailed technical characteristics of the prototype. The system can
be configured to operate in various array configurations or modes. Modes 3 and 4 are
sub-Nyquist MIMO modes; the hardware switches off the inactive channels and does
not sample any data over the corresponding ADCs. Figure 1.18 shows the sub-Nyquist
MIMO prototype, user interface and radar display. As shown in Figure 1.19a, the
cognitive radar signal occupies only certain subbands in a 15 MHz band. Here, the sliced
transmit signal has eight subbands each of width 375 kHz with the frequency range of
1.63–2, 2.16–2.53, 3.05–3.42, 3.88–4.25, 5.66–6.03, 6.51–6.88, 8.64–9.01, and
12.32–12.69 MHz before subsampling. The total signal bandwidth is 0.375 × 8 = 3 MHz.
This signal is subsampled at 7.5 MHz and the subband locations were chosen so that
there is no aliasing between different subbands (Figure 1.19b). A noncognitive signal
would have occupied the entire 15 MHz spectrum requiring a Nyquist sampling rate of
30 MHz. Therefore, the use of cognitive transmission enables spectral sampling
reduction by a factor of 4 (= 30 MHz/7.5 MHz) for each channel. Depending on whether
the guard-bands of the noncognitive transmission are included in the computation or
not, the effective signal bandwidth is reduced by a factor of 5 (= 15 MHz/3 MHz) or 4
(= 12 MHz/3 MHz) respectively for each channel. Mode 3 has 50% spatial sampling
reduction when compared with Mode 1 or 2. Table 1.3 summarizes the reduction of
various resources in Mode 3 when compared with Mode 1.
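The claim that the chosen subbands do not alias onto each other can be checked numerically. The sketch below makes an assumption that the text does not state explicitly: it models complex (I/Q) sampling at 7.5 MHz, so that every frequency folds modulo the sampling rate. The band edges are the rounded values quoted above, expressed in kHz, and the helper names are ours.

```python
def fold(band, fs):
    """Alias a band (lo, hi) into [0, fs), assuming complex sampling at rate fs."""
    lo, hi = band
    lo_f, hi_f = lo % fs, hi % fs
    if hi_f <= lo_f:
        raise ValueError("band wraps around the sampling boundary")
    return lo_f, hi_f


def alias_free(bands, fs):
    """Return (flag, folded bands); flag is True when no folded bands overlap."""
    folded = sorted(fold(b, fs) for b in bands)
    pairs = zip(folded, folded[1:])
    ok = all(prev[1] <= nxt[0] for prev, nxt in pairs)
    return ok, folded


# Subband edges from the text, in kHz.
bands_khz = [(1630, 2000), (2160, 2530), (3050, 3420), (3880, 4250),
             (5660, 6030), (6510, 6880), (8640, 9010), (12320, 12690)]
ok, folded = alias_free(bands_khz, fs=7500)
rate_reduction = 30e6 / 7.5e6    # Nyquist rate over subsampling rate
print(ok, folded, rate_reduction)
```

Under this folding model the two subbands above 7.5 MHz land in the gaps left between the lower six, so all eight remain separable after subsampling.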
Table 1.3 Cognitive SUMMeR Prototype: comparison of resource reduction.
Resource                                            Nyquist Mode 1    Sub-Nyquist Mode 3    Reduction
Bandwidth usage per Tx (including guard-bands)      15 MHz            3 MHz                 80%
Bandwidth usage per Tx (excluding guard-bands)      12 MHz            3 MHz                 75%
Temporal sampling rate per channel                  30 MHz            7.5 MHz               75%
Spatial sampling rate                               8 × 10            4 × 5                 50%
Tx/Rx hardware channels                             80                20                    75%
Figure 1.20 Plan position indicator (PPI) display of results for (a) Mode 1 (b) Mode 2 (c) Mode 3
and (d) Mode 4. The origin is the location of the radar. The dark dot indicates the north direction
relative to the radar. Positive (negative) distances along the horizontal axis correspond to the east
(west) of the radar. Similarly, positive (negative) distances along the vertical axis correspond to
the north (south) of the radar. The estimated targets are plotted over the ground truth [18,53].
Figure 1.21 Range-azimuth-Doppler map for the target configurations shown in Figure 1.20 for
(a) Mode 1 (b) Mode 2 (c) Mode 3 and (d) Mode 4. The lower axes represent Cartesian
coordinates of the polar representation of the PPI plots from Figure 1.20. The vertical axis
represents the Doppler spectrum [18,53].
Figure 1.22 PPI plots as in Figure 1.20 for (a) Mode 1 (b) Mode 2 (c) Mode 3 and (d) Mode 4.
Only Mode 3 is operating cognitively. All modes have the same overall transmit power per
transmitter. The inset plots show the selected region in each PPI display on a magnified scale.
Figure 1.23 Range-azimuth-Doppler maps as in Figure 1.21 for (a) Mode 1 (b) Mode 2 (c) Mode 3
and (d) Mode 4. Only Mode 3 is operating cognitively. All modes have the same overall transmit
power per transmitter. The inset plots show the selected region in each map on a magnified scale.
We evaluated the performance of all modes through hardware experiments. We
transmitted P = 10 pulses at a PRI of 100 μs and all modes were evaluated against identical
target scenarios. In the first experiment, when the angular spacing (in terms of the
sine of azimuth) between any two targets was greater than 0.025 and the signal SNR
= −8 dB, the recovery performance of the thinned 4 × 5 array in Mode 3 was not
worse than Modes 1 and 2. For this experiment, Figures 1.20 and 1.21 show the plan
position indicator (PPI) plot and range-azimuth-Doppler maps of all the modes. Here, a
successful detection (circle with light fill and no boundary) occurs when the estimated
target is within one range cell, one azimuth bin, and one Doppler bin of the ground truth
(circle with dark boundary and no fill); otherwise, the estimated target is labeled as a
false alarm (circle with dark fill). When a target remains undetected, we label the ground
truth location as a missed detection (circle with hatched fill).
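The labeling rule for detections, false alarms, and missed detections can be written down compactly. The sketch below is one possible reading of that rule, not the prototype's software: targets are integer (range cell, azimuth bin, Doppler bin) triples, each ground-truth target can be matched at most once, and the greedy matching order is our assumption. The example scene is hypothetical.

```python
def classify(truth, estimates):
    """Split estimates into detections and false alarms; also list missed targets."""
    def close(a, b):
        # Within one cell or bin in every dimension.
        return all(abs(x - y) <= 1 for x, y in zip(a, b))

    unmatched = list(truth)
    detections, false_alarms = [], []
    for est in estimates:
        match = next((t for t in unmatched if close(est, t)), None)
        if match is None:
            false_alarms.append(est)
        else:
            unmatched.remove(match)
            detections.append(est)
    return detections, false_alarms, unmatched   # unmatched = missed detections


truth = [(10, 4, 2), (30, 7, 5), (55, 1, 8)]     # hypothetical ground truth
estimates = [(11, 4, 2), (30, 9, 5)]             # hypothetical recovery output
detections, false_alarms, missed = classify(truth, estimates)
print(detections, false_alarms, missed)
```

In this example the second estimate is two azimuth bins away from its nearest target, so it counts as a false alarm and that target is also reported as missed.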
Finally, we considered a high-noise scenario with SNR = −15 dB. We operated only
Mode 3 cognitively and kept all other modes in noncognitive mode. We observed that the
noncognitive Nyquist 8 × 10 Mode 1 array exhibits false alarms while the cognitive
sub-Nyquist 4 × 5 Mode 3 array is still able to detect all the targets (Figures 1.22 and 1.23),
thereby demonstrating robustness to low SNR.
1.7 Sub-Nyquist SAR
Synthetic aperture radar (SAR) and other similar radar techniques were among the
first applications of CS methods (see reviews in [22,25]). SAR imaging data are not
naturally sparse in the range-time domain. However, they are often sparse in other
domains, such as wavelets. Our motivation to apply sub-Nyquist methods here is to
address the following SAR processing challenge. Among the several algorithms that are
available to process SAR data, the range-Doppler algorithm (RDA) is most widely used
to obtain high-resolution images [90]. Its performance is, however, limited by the range
cell migration correction (RCMC) step, which requires oversampled data in order to
decouple range and azimuth axes.
Recently, [21] proposed a sub-Nyquist SAR that replaces RDA by a Fourier domain
method that achieves non-integer nonconstant shifts in the RCMC interpolation via
the Fourier series coefficients. This avoids the interpolation step in RCMC and fur-
ther allows sub-Nyquist sampling that follows the Fourier-domain analysis presented in
previous sections. A similar technique was earlier employed in ultrasound imaging [48]
to dramatically reduce sampling and processing rates.
In this section, we present this Fourier domain RDA processing as a framework
for sub-Nyquist sampling of SAR signals. The first part of the sub-Nyquist algorithm
exploits the relationship between the signals before and after RCMC in the Fourier
domain. We show analytically that a single Fourier coefficient after RCMC can be
computed using a small number of Fourier coefficients of the raw data, which translates
into low rate sampling as shown in Section 1.3.2. Having the partial Fourier samples
after RCMC, the second part of the algorithm is aimed at solving a 2D CS problem in
order to reconstruct the image from the low rate samples. Finally, we show that cognitive
transmission can also be extended to SAR. We end by demonstrating a prototype that
we designed and developed to realize concepts of cognitive SAR (CoSAR) [21].
1.7.1 Traditional SAR Processing via RDA

Consider a radar that travels along a path with velocity ν and transmits a time-limited
pulse h(t) at PRI T. The pulse has negligible energy at frequencies beyond the
bandwidth $B_h/2$. The transmitted pulses are sent from M different locations, $\{x_m\}_{m=0}^{M-1}$, where
$x_0$ is the origin and $\|x_m - x_0\| = m|\nu|T$. The pulses are transmitted into a scene with
reflectivity σ(r). The received signal, after coherent demodulation, is given by
\[
d_m(t) = \int \sigma(r)\, h\!\left(t - 2\|r - x_m\|/c\right) w_a(x_m, r)\, e^{-j 4\pi f_c \|r - x_m\|/c}\, dr, \tag{1.59}
\]
where $\|r - x_m\|$ is the distance from the radar to a scatter point and $w_a(x_m, r)$ is the
antenna beam pattern, which varies depending on the SAR operation mode [90]. The
main goal of SAR data processing is to construct the scene’s reflectivity map, σ(r),
from the raw data. The reflected signal $d_m(t)$ at a point m requires sampling at least
at the bandwidth $B_h$, as per the Nyquist sampling theorem. The resulting discrete-time
signal is $d[n,m] = d_m(nT_s)$, with $0 \le n < N = Tf_s$, where $f_s = 1/T_s$ is the
sampling rate.
RDA processing consists of the following steps. First the sampled raw data is
compressed in the range dimension:
\[
s[n,m] = d[n,m] * h^*[-n], \tag{1.60}
\]
where h[n] is the sampled transmit signal. This data is then transformed to the
range-Doppler domain using DFT along the azimuth:
\[
S[n,k] = \sum_{m=0}^{M-1} s[n,m]\, e^{-j 2\pi km/M}. \tag{1.61}
\]
RCMC is applied assuming a far-field approximation. The purpose of RCMC is to
compensate for the effect of range cell migration due to the varied satellite-scatterer
distance and to correct the hyperbolic behavior of the target trajectories. The RCMC
operator can be written as
\[
\tilde{C}[n,k] = S[n + nak^2, k]. \tag{1.62}
\]
For every Doppler frequency k, the range axis is scaled by $1 + ak^2$. The value of a
is predetermined depending on the observation mode. For example, in stripmap SAR,
$a = \frac{\lambda^2}{8|\nu|^2 T^2 M^2}$. This range-variant shift requires values that fall outside the discrete grid.
An MF then achieves compression in azimuth via
\[
Y[n,k] = \tilde{C}[n,k]\, e^{-j\pi\frac{k^2}{K_a[n]}}, \tag{1.63}
\]
where $K_a[n]$ is the range dependent azimuth chirp rate. Finally, an inverse DFT in the
azimuth direction yields the focused image:
\[
I[n,m] = \frac{1}{M}\sum_{k=0}^{M-1} Y[n,k]\, e^{j 2\pi mk/M}. \tag{1.64}
\]
There are two ways to implement RCMC: In the first option, RCMC is performed by
range interpolation in the range-Doppler domain. However, this interpolation is time-
consuming and computationally demanding. The second approach involves the assump-
tion that the range cell migration is range invariant, at least over a finite range block. In
this case, RCMC is implemented using a DFT, linear phase multiply, and inverse DFT
per block. However, this implementation’s disadvantage is that samples should overlap
in range, and the efficiency gain may not be worth the added complexity.
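The need for interpolation in RCMC can be seen by listing the range positions that the operator (1.62) reads. In the sketch below the value of a is hypothetical and the helper names are ours; it simply counts how many of the requested positions n(1 + ak²) miss the integer sampling grid.

```python
def rcmc_positions(n_range, k, a):
    """Range positions n (1 + a k^2) read by the RCMC operator at Doppler bin k."""
    scale = 1.0 + a * k * k
    return [n * scale for n in range(n_range)]


def off_grid_fraction(positions, tol=1e-9):
    """Fraction of positions that do not coincide with an integer sample index."""
    off = sum(1 for x in positions if abs(x - round(x)) > tol)
    return off / len(positions)


a = 1e-4   # hypothetical value; in practice a depends on the observation mode
frac_k0 = off_grid_fraction(rcmc_positions(64, k=0, a=a))    # no migration at k = 0
frac_k10 = off_grid_fraction(rcmc_positions(64, k=10, a=a))  # scale factor 1.01
print(frac_k0, frac_k10)
```

Even a 1% scaling of the range axis moves almost every sample off the grid, which is why the time-domain implementation has to interpolate.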
1.7.2 Fourier Domain RDA and Sub-Nyquist SAR

In this section, we introduce a new RDA processing technique implemented in fre-
quency using the Fourier series coefficients of the raw data. This paves the way for
substantial reduction in the number of samples in the time-domain interpolation needed
to obtain the same image quality and without any assumptions on the signal structure or
the invariance of range blocks.
We begin with the continuous version of (1.62):
\[
C_k(t) = S_k\!\left(t(1 + ak^2)\right), \tag{1.65}
\]
where $S_k(nT_s) = S[n,k]$. The Fourier series coefficients of $C_k(t)$ over the interval [0,T)
can be expressed as [21]
\[
C_k[l] = \frac{1}{T}\int_0^T S_k(t)\, q_{k,l}(t)\, dt, \tag{1.66}
\]
where $q_{k,l}(t)$ approximates the scaling operation in (1.62). The Fourier coefficients of
the continuous-time signals $S_k(t)$ and $q_{k,l}(t)$ are, respectively, $S_k[n]$ and
\[
Q_{k,l}[n] = \frac{1}{1 + ak^2}\, e^{-j\pi\left(n + \frac{l}{1 + ak^2}\right)}\, \mathrm{sinc}\!\left(n + \frac{l}{1 + ak^2}\right). \tag{1.67}
\]
It can be shown that most of the energy of the set $Q_{k,l}[n]$ is concentrated around
a specific component $n_{k,l}$. Thus, for every Doppler frequency k, the Fourier series
coefficients of the scaled signal, $C_k(t)$, can be calculated as a linear combination of
a local choice of Fourier series coefficients of $S_k(t)$
\[
C_k[l] = \sum_{n \in \nu(k,l)} S_k[n]\, Q_{k,l}[-n], \tag{1.68}
\]
where ν(k,l) is the set of indices that dictate the decay property of $Q_{k,l}[n]$.
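The concentration of $Q_{k,l}[n]$ around a single index can be observed numerically. The sketch below evaluates (1.67) with the normalized sinc, sin(πx)/(πx), which is our assumption about the convention; the values of a, k, and l, the index range, and the ±5 window are hypothetical choices for illustration.

```python
import cmath
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def q_coeff(n, l, k, a):
    """Q_{k,l}[n] of (1.67)."""
    scale = 1.0 + a * k * k
    x = n + l / scale
    return cmath.exp(-1j * math.pi * x) * sinc(x) / scale


a, k, l = 1e-4, 10, 50                      # hypothetical values
energy = {n: abs(q_coeff(n, l, k, a)) ** 2 for n in range(-300, 201)}
n_peak = max(energy, key=energy.get)        # index carrying the most energy
local = sum(e for n, e in energy.items() if abs(n - n_peak) <= 5)
fraction = local / sum(energy.values())     # energy share within +/- 5 of the peak
print(n_peak, fraction)
```

The peak sits at the integer closest to −l/(1 + ak²), and a window of a few coefficients around it captures most of the energy, which is what justifies restricting the sum in (1.68) to a small index set.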
Assuming the Fourier series coefficients $D_m[l]$ of the raw data $d_m(t)$ can be acquired
directly, the range compression is achieved in the Fourier domain as
\[
\tilde{D}_m[l] = T D_m[l] H^*[l], \tag{1.69}
\]
where H[l] are the Fourier series coefficients of the transmitted pulse h(t). Applying
azimuth DFT gives $S_k[l]$ that can be used in (1.68) to perform Fourier domain RCMC.
The inverse DFT on the coefficients Ck [l] provides the corrected sampled signal after
RCMC. One could then proceed with the remaining steps of RDA, i.e., (1.63) and (1.64),
to complete the processing.
The number of Fourier coefficients required can be further reduced if a basis (e.g.,
wavelet) is found in which the desired image is sparse. Then, the relationship between
Ck [l] and the raw data samples Dm [l] can be exploited to solve for the coefficients
in the sparse basis using fewer Fourier coefficients. In [21], it was suggested that we
modify a fast iterative shrinkage-thresholding algorithm (FISTA) to solve this problem
and achieve full data reconstruction from the partial measurements with reasonable
computational load. Similar to cognitive pulse-Doppler radar (Section 1.5.2) and cog-
nitive SUMMeR (Section 1.6.5), sub-Nyquist SAR systems can also be modified to
fit cognitive radar requirements and allow for dynamic transmission and reception of
several narrow frequency bands. We present the hardware prototype of such a system in
the next subsection.
Figure 1.24 Cognitive SAR (CoSAR) prototype and (inset) analog preprocessor.
Figure 1.25 CoSAR GUI showing the cognitive and noncognitive chirp waveforms along with the
sampled subbands at top right.
1.7.3 Hardware Prototype

We designed and developed a hardware prototype of a CoSAR system and evaluated
Fourier domain RDA processing in real time. Figure 1.24 shows the entire setup. The
PRI is 51.2 μs and the carrier frequency of the signal is 90 MHz. A control interface
(Figure 1.25) activates the prototype, which generates the desired I/Q signal and feeds it
to the analog preprocessor (inset). The analog preprocessor filters have 30 dB stopband
attenuation in order to filter out interference from neighboring channels. The digital
receiver obtains and processes samples at low rates. The processed image is then shown
on the radar display. We used a 5 MHz cognitive chirp signal, of which only 4 narrow
subbands of 625 kHz bandwidth each were sampled and processed by the digital receiver. The
Xampling and RCMC are performed at one-fourth and one-eighth of the Nyquist rate, respectively.
Similar to the cognitive SUMMeR system, our CoSAR prototype can operate in
both cognitive and noncognitive modes. Figure 1.26 shows results of these modes at
Nyquist and sub-Nyquist sampling rates at SNR = 2 dB. The range and cross-range
(azimuth) resolutions are 30 and 10 m, respectively. Compared with the Nyquist
rate of 10 MHz, the combined sampling rate of the 4 slices is 2.5 MHz, a 75%
reduction in rate. We note that CoSAR reconstruction exhibits smaller error
than the noncognitive Nyquist processing in low SNR scenarios, despite sampling at
a significantly reduced rate. Further, the prototype demonstrates operation
of SAR using narrow subbands that can be adaptively changed. This opens up the
possibility of spectral coexistence of SAR with other satellite-borne services.
Figure 1.26 Comparison of prototype outputs for an image of a ship.
1.8 Summary
In this chapter, we reviewed sub-Nyquist radar principles, algorithms, prototypes,
and applications. Our focus was on pulse-Doppler systems for which sub-Nyquist
processing can be individually applied to temporal, Doppler, and spatial domains. Our
approach has distinct advantages over several past CS-based designs. The proposed
sub-Nyquist radar receivers perform low-rate sampling and processing, which can be
implemented with simple hardware, impose no restrictions on the transmitter, use a CS
dictionary that does not scale up with the problem size, and exhibit robustness to clutter
and noise.
We presented colocated MIMO radar as an application where joint spatiotemporal
sub-Nyquist processing leads to reduction in antenna elements and savings in signal
bandwidth. In SAR imaging, sub-Nyquist processing in the Fourier domain leads to
sampling rate reduction without compromising high-quality and high-resolution imag-
ing. We demonstrated that sub-Nyquist receivers lead to the feasibility of cognitive
radar, which transmits thinned spectrum signals. This development was significant in
making the spectral coexistence of radar with a communication service possible. We
also extended cognition ideas based on sub-Nyquist processing to MIMO and SAR
systems.
Most importantly, we emphasized that sub-Nyquist radars are realizable in hardware
for each of the systems described in this chapter. The hardware prototypes were in-house
and custom-made using many off-the-shelf components. The systems operate
in real time and their performance is robust to high noise and clutter. We believe that
such practical implementations pave the way to delivering the promise of reduced-rate
processing in radar remote sensing.
References
[1] M. I. Skolnik, Radar Handbook, 3rd edn. McGraw-Hill, 2008.
[2] P. Z. Peebles, Radar Principles. Wiley-Interscience, 1998.
[3] J. E. Cilliers and J. C. Smit, “Pulse compression sidelobe reduction by minimization of
l_p-norms,” IEEE Transactions on Aerospace and Electronic Systems, vol. 43, no. 3,
pp. 1238–1247, 2007.
[4] J. George, K. V. Mishra, C. M. Nguyen, and V. Chandrasekar, “Implementation of blind
zone and range-velocity ambiguity mitigation for solid-state weather radar,” in IEEE Radar
Conference, 2010, pp. 1434–1438.
[5] K. V. Mishra, V. Chandrasekar, C. Nguyen, and M. Vega, “The signal processor system for
the NASA dual-frequency dual-polarized Doppler radar,” in IEEE International Geoscience
and Remote Sensing Symposium, 2012, pp. 4774–4777.
[6] Y. C. Eldar, Sampling Theory: Beyond Bandlimited Systems. Cambridge University Press,
2015.
[7] Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications. Cambridge
University Press, 2012.
[8] D. Cohen and Y. C. Eldar, “Reduced time-on-target in pulse Doppler radar: Slow time
domain compressed sensing,” in IEEE Radar Conference, 2016, pp. 1–4.
[9] J. Akhtar, B. Torvik, and K. E. Olsen, “Compressed sensing with interleaving slow-time
pulses and hybrid sparse image reconstruction,” in IEEE Radar Conference, 2017, pp. 0006–
0010.
[10] K. V. Mishra, A. Kruger, and W. F. Krajewski, “Compressed sensing applied to weather
radar,” in IEEE International Geoscience and Remote Sensing Symposium, 2014, pp. 1832–
1835.
[11] R. P. Shenoy, “Phased array antennas,” in Advanced Radar Techniques and Systems,
G. Galati, Ed. Peter Peregrinus, 1993.
[12] E. T. Bayliss, “Design of monopulse antenna difference patterns with low sidelobes,” Bell
System Technical Journal, vol. 47, no. 5, pp. 623–650, 1968.
[13] D. K. Cheng, “Optimization techniques for antenna arrays,” Proceedings of the IEEE,
vol. 59, no. 12, pp. 1664–1674, 1971.
[14] R. L. Haupt, Timed Arrays: Wideband and Time Varying Antenna Arrays. John Wiley &
Sons, 2015.
[15] K. V. Mishra, I. Kahane, A. Kaufmann, and Y. C. Eldar, “High spatial resolution radar using
thinned arrays,” in IEEE Radar Conference, 2017, pp. 1119–1124.
[16] M. Rossi, A. M. Haimovich, and Y. C. Eldar, “Spatial compressive sensing for MIMO radar,”
IEEE Transactions on Signal Processing, vol. 62, no. 2, pp. 419–430, 2014.
[17] D. Cohen, D. Cohen, Y. C. Eldar, and A. M. Haimovich, “SUMMeR: Sub-Nyquist MIMO
radar,” IEEE Transactions on Signal Processing, vol. 66, no. 16, pp. 4315–4330, 2018.
[18] K. V. Mishra, E. Shoshan, M. Namer et al., “Cognitive sub-Nyquist hardware prototype of
a collocated MIMO radar,” in International Workshop on Compressed Sensing Theory and
its Applications to Radar, Sonar and Remote Sensing, 2016, pp. 56–60.
[19] K. V. Mishra and Y. C. Eldar, “Performance of time delay estimation in a cognitive radar,”
in IEEE International Conference on Acoustics, Speech and Signal Processing, 2017,
pp. 3141–3145.
[20] D. Cohen, K. V. Mishra, and Y. C. Eldar, “Spectrum sharing radar: Coexistence via
Xampling,” IEEE Transactions on Aerospace and Electronic Systems, vol. 29, no. 3,
pp. 1279–1296, 2018.
[21] K. Aberman and Y. C. Eldar, “Sub-Nyquist SAR via Fourier domain range-Doppler
processing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 11,
pp. 6228–6244, 2017.
[22] J. H. Ender, “On compressive sensing applied to radar,” Signal Processing, vol. 90, no. 5,
pp. 1402–1414, 2010.
[23] N. A. Goodman and L. C. Potter, “Pitfalls and possibilities of radar compressive sensing,”
Applied Optics, vol. 54, no. 8, pp. C1–C13, 2015.
[24] L. Zhao, L. Wang, L. Yang, A. M. Zoubir, and G. Bi, “The race to improve radar imagery: An
overview of recent progress in statistical sparsity-based techniques,” IEEE Signal Processing
Magazine, vol. 33, no. 6, pp. 85–102, 2016.
[25] M. Cetin, I. Stojanovic, O. Onhon, et al., “Sparsity-driven synthetic aperture radar imaging:
Reconstruction, autofocusing, moving targets, and compressed sensing,” IEEE Signal
Processing Magazine, vol. 31, no. 4, pp. 27–40, 2014.
[26] M. A. Hadi, S. Alshebeili, K. Jamil, and F. E. A. El-Samie, “Compressive sensing applied to
radar systems: An overview,” Signal, Image and Video Processing, vol. 9, no. 1, pp. 25–39,
2015.
[27] R. Baraniuk and P. Steeghs, “Compressive radar imaging,” in IEEE Radar Conference, 2007,
pp. 128–133.
[28] M. A. Herman and T. Strohmer, “High-resolution radar via compressed sensing,” IEEE
Transactions on Signal Processing, vol. 57, no. 6, pp. 2275–2284, 2009.
[29] Y.-S. Yoon and M. G. Amin, “Compressed sensing technique for high-resolution radar
imaging,” in Signal Processing, Sensor Fusion, and Target Recognition XVII, vol. 6968,
2008, p. 69681A.
[30] X. Tan, W. Roberts, J. Li, and P. Stoica, “Range-Doppler imaging via a train of probing
pulses,” IEEE Transactions on Signal Processing, vol. 57, no. 3, pp. 1084–1097, 2009.
[31] J. Zhang, D. Zhu, and G. Zhang, “Adaptive compressed sensing radar oriented toward cog-
nitive detection in dynamic sparse target scene,” IEEE Transactions on Signal Processing,
vol. 60, no. 4, pp. 1718–1729, 2012.
[32] C.-Y. Chen, “Signal processing algorithms for MIMO radar,” PhD dissertation, California
Institute of Technology, 2009.
[33] Y. Yu, A. P. Petropulu, and H. V. Poor, “Measurement matrix design for compressive
sensing–based MIMO radar,” IEEE Transactions on Signal Processing, vol. 59, no. 11,
pp. 5338–5352, 2011.
[34] Y. Yu, A. P. Petropulu, and H. V. Poor, “MIMO radar using compressive sampling,” IEEE
Journal on Selected Topics in Signal Processing, vol. 4, no. 1, pp. 146–163, 2010.
[35] Y. Chi, L. L. Scharf, A. Pezeshki, and A. R. Calderbank, “Sensitivity to basis mismatch in
compressed sensing,” IEEE Transactions on Signal Processing, vol. 59, no. 5, pp. 2182–
2195, 2011.
[36] R. Heckel, V. I. Morgenshtern, and M. Soltanolkotabi, “Super-resolution radar,” Information
and Inference: A Journal of the IMA, vol. 5, no. 1, pp. 22–75, 2016.
[37] R. Heckel, “Super-resolution MIMO radar,” in IEEE International Symposium on Informa-
tion Theory, 2016, pp. 1416–1420.
[38] G. Tang, B. N. Bhaskar, P. Shah, and B. Recht, “Compressed sensing off-the-grid,” IEEE
Transactions on Information Theory, vol. 59, no. 11, pp. 7465–7490, 2013.
[39] K. V. Mishra, M. Cho, A. Kruger, and W. Xu, “Super-resolution line spectrum estimation
with block priors,” in Asilomar Conference on Signals, Systems and Computers, 2014, pp.
1211–1215.
[40] W. U. Bajwa, K. Gedalyahu, and Y. C. Eldar, “Identification of parametric underspread linear
systems and super-resolution radar,” IEEE Transactions on Signal Processing, vol. 59, no. 6,
pp. 2548–2561, 2011.
[41] W. Kozek and G. E. Pfander, “Identification of operators with bandlimited symbols,” SIAM
Journal on Mathematical Analysis, vol. 37, no. 3, pp. 867–888, 2005.
[42] K. Gedalyahu and Y. C. Eldar, “Time-delay estimation from low-rate samples: A union of
subspaces approach,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3017–
3031, 2010.
[43] S. Sun, W. U. Bajwa, and A. P. Petropulu, “MIMO-MC radar: A MIMO radar approach
based on matrix completion,” IEEE Transactions on Aerospace and Electronic Systems,
vol. 51, no. 3, pp. 1839–1852, 2015.
[44] M. Vetterli, P. Marziliano, and T. Blu, “Sampling signals with finite rate of innovation,”
[45] E. Baransky, G. Itzhak, I. Shmuel et al., “A sub-Nyquist radar prototype: Hardware and
algorithms,” IEEE Transactions on Aerospace and Electronic Systems, vol. 50, pp. 809–822,
2014.
[46] K. M. Cohen, C. Attias, B. Farbman, I. Tselniker, and Y. C. Eldar, “Channel estimation in
UWB channels using compressed sensing,” in IEEE International Conference on Acoustics,
Speech and Signal Processing, 2014, pp. 1966–1970.
[47] K. V. Mishra and Y. C. Eldar, “Sub-Nyquist channel estimation over IEEE 802.11ad link,” in
IEEE International Conference on Sampling Theory and Applications, 2017, pp. 355–359.
[48] T. Chernyakova and Y. C. Eldar, “Fourier-domain beamforming: The path to compressed
ultrasound imaging,” IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, vol. 61, no. 8, pp. 1252–1267, 2014.
[49] O. Bar-Ilan and Y. C. Eldar, “Sub-Nyquist radar via Doppler focusing,” IEEE Transactions
on Signal Processing, vol. 62, pp. 1796–1811, 2014.
[50] R. Tur, Y. C. Eldar, and Z. Friedman, “Innovation rate sampling of pulse streams with
application to ultrasound imaging,” IEEE Transactions on Signal Processing, vol. 59, no. 4,
pp. 1827–1842, 2011.
[51] N. Wagner, Y. C. Eldar, and Z. Friedman, “Compressed beamforming in ultrasound
imaging,” IEEE Transactions on Signal Processing, vol. 60, no. 9, pp. 4643–4657, 2012.
[52] Y. C. Eldar, R. Levi, and A. Cohen, “Clutter removal in sub-Nyquist radar,” IEEE Signal
Processing Letters, vol. 22, no. 2, pp. 177–181, 2015.
[53] D. Cohen, K. V. Mishra, D. Cohen et al., “Sub-Nyquist MIMO radar prototype with Doppler
processing,” in IEEE Radar Conference, 2017, pp. 1179–1184.
[54] S. Na, K. V. Mishra, Y. Liu, Y. C. Eldar, and X. Wang, “TenDSuR: Tensor-based 3D sub-
Nyquist radar,” IEEE Signal Processing Letters, 2018, in press.
[55] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume 2: Detection Theory.
Prentice Hall, 1998.
[56] M. Mishali and Y. C. Eldar, “From theory to practice: Sub-Nyquist sampling of sparse
wideband analog signals,” IEEE Journal on Selected Topics in Signal Processing, vol. 4,
no. 2, pp. 375–391, 2010.
[57] M. Mishali and Y. C. Eldar, “Sub-Nyquist sampling: Bridging theory and practice,” IEEE
Signal Processing Magazine, vol. 28, no. 6, pp. 98–124, 2011.
[58] D. Cohen and Y. C. Eldar, “Sub-Nyquist sampling for power spectrum sensing in cognitive
radios: A unified approach,” IEEE Transactions on Signal Processing, vol. 62, no. 15, pp.
3897–3910, 2014.
[59] D. Cohen and Y. C. Eldar, “Sub-Nyquist cyclostationary detection for cognitive radio,” IEEE
[60] D. Cohen, A. Dikopoltsev, R. Ifraimov, and Y. C. Eldar, “Towards sub-Nyquist cognitive
radar,” in IEEE Radar Conference, 2016, pp. 1–4.
[61] K. Gedalyahu, R. Tur, and Y. C. Eldar, “Multichannel sampling of pulse streams at the rate
of innovation,” IEEE Transactions on Signal Processing, vol. 59, no. 4, pp. 1491–1504,
2011.
[62] L. L. Scharf and B. Friedlander, “Matched subspace detectors,” IEEE Transactions on Signal
Processing, vol. 42, no. 8, pp. 2146–2157, 1994.
[63] P. B. Tuuk and S. L. Marple, “Compressed sensing radar amid noise and clutter using
interference covariance information,” IEEE Transactions on Aerospace and Electronic
Systems, vol. 50, no. 2, pp. 887–897, 2014.
[64] Y. Yu, S. Sun, and A. P. Petropulu, “A Capon beamforming method for clutter suppression
in colocated compressive sensing based MIMO radars,” in SPIE Defense, Security, and
Sensing, 2013, p. 87170J.
[65] K. Sun, H. Zhang, G. Li, H. Meng, and X. Wang, “A novel STAP algorithm using sparse
recovery technique,” in IEEE International Geoscience and Remote Sensing Symposium,
vol. 5, 2009, pp. V–336.
[66] X. Yang, Y. Sun, T. Zeng, and T. Long, “Iterative robust sparse recovery method based on
FOCUSS for space-time adaptive processing,” in IET International Radar Conference 2015,
2015, pp. 1–6.
[67] Z. Ma, Y. Liu, H. Meng, and X. Wang, “Jointly sparse recovery of multiple snapshots in
STAP,” in IEEE Radar Conference, 2013, pp. 1–4.
[68] Z. Wang, H. Li, and B. Himed, “A sparsity based GLRT for moving target detection in
distributed MIMO radar on moving platforms,” in Asilomar Conference on Signals, Systems
and Computers, 2015, pp. 90–94.
[69] S. Kay, “Optimal signal design for detection of Gaussian point targets in stationary Gaussian
clutter/reverberation,” IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 1,
pp. 31–41, 2007.
[70] L. E. Brennan and I. S. Reed, “Theory of adaptive radar,” IEEE Transactions on Aerospace
and Electronic Systems, no. 2, pp. 237–252, 1973.
[71] L. E. Brennan and I. S. Reed, “Optimum processing of unequally spaced radar pulse
trains for clutter rejection,” IEEE Transactions on Aerospace and Electronic Systems, no. 3,
pp. 474–477, 1968.
[72] T. Wimalajeewa, Y. C. Eldar, and P. K. Varshney, “Recovery of sparse matrices via matrix
sketching,” arXiv preprint arXiv:1311.2448, 2013.
[73] H. Griffiths, L. Cohen, S. Watts, E. Mokole, C. Baker, M. Wicks, and S. Blunt, “Radar
spectrum engineering and management: Technical and regulatory issues,” Proceedings of
the IEEE, vol. 103, no. 1, pp. 85–102, 2015.
[74] G. M. Jacyna, B. Fell, and D. McLemore, “A high-level overview of fundamental limits
studies for the DARPA SSPARC program,” in IEEE Radar Conference, 2016, pp. 1–6.
[75] J. R. Guerci, R. M. Guerci, A. Lackpour, and D. Moskowitz, “Joint design and operation of
shared spectrum access for radar and communications,” in IEEE Radar Conference, 2015,
pp. 0761–0766.
[76] K. V. Mishra, A. Zhitnikov, and Y. C. Eldar, “Spectrum sharing solution for automotive
radar,” in IEEE 85th Vehicular Technology Conference, 2017, pp. 1–5.
[77] A. R. Chiriyath, B. Paul, G. M. Jacyna, and D. W. Bliss, “Inner bounds on performance of
radar and communications co-existence,” IEEE Transactions on Signal Processing, vol. 64,
no. 2, pp. 464–474, 2016.
[78] D. Cohen, S. Tsiper, and Y. C. Eldar, “Analog to digital cognitive radio: Sampling, detection
and hardware,” IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 137–166, 2018.
[79] M. Mishali and Y. C. Eldar, “Blind multi-band signal reconstruction: Compressed sensing
for analog signals,” IEEE Transactions on Signal Processing, vol. 57, no. 3, pp. 993–1009,
2009.
[80] D. Cohen, S. Tsiper, and Y. C. Eldar, “Analog to digital cognitive radio,” in Handbook of
Cognitive Radio, W. Zhang, Ed. Springer Singapore, 2017.
[81] N. Vaswani and W. Lu, “Modified-CS: Modifying compressive sensing for problems with
partially known support,” IEEE Transactions on Signal Processing, vol. 58, no. 9, pp. 4595–
4607, 2010.
[82] V. Stanković, L. Stanković, and S. Cheng, “Compressive image sampling with side information,”
in IEEE International Conference Image Processing, 2009, pp. 3037–3040.
[83] J. Huang, T. Zhang, and D. Metaxas, “Learning with structured sparsity,” Journal of Machine
Learning Research, vol. 12, no. 11, pp. 3371–3412, 2011.
[84] M. Mishali, Y. C. Eldar, O. Dounaevsky, and E. Shoshan, “Xampling: Analog to digital at
sub-Nyquist rates,” IET Circuits, Devices & Systems, vol. 5, pp. 8–20, 2011.
[85] M. Mishali and Y. C. Eldar, “Expected RIP: Conditioning of the modulated wideband
converter,” in IEEE Information Theory Workshop, 2009, pp. 343–347.
[86] E. Fishler, A. Haimovich, R. Blum, D. Chizhik, L. Cimini, and R. Valenzuela, “MIMO radar:
An idea whose time has come,” in IEEE Radar Conference, 2004, pp. 71–78.
[87] J. Li and P. Stoica, “MIMO radar with colocated antennas,” IEEE Signal Processing
Magazine, vol. 24, no. 5, pp. 106–114, 2007.
[88] D. Cohen, D. Cohen, and Y. C. Eldar, “High resolution FDMA MIMO radar,” arXiv preprint
arXiv:1711.06560, 2017.
[89] K. V. Mishra, Y. C. Eldar, E. Shoshan, M. Namer, and M. Meltsin, “A cognitive sub-Nyquist
MIMO radar prototype,” arXiv preprint arXiv:1807.09126, 2018.
[90] C. F. Barnes, Synthetic Aperture Radar: Wave Theory Foundations: Analysis and Algo-
rithms. Barnes (self-published), 2014.
2 Clutter Rejection and Adaptive Filtering in Compressed Sensing Radar
Peter B. Tuuk
2.1 Introduction
Clutter returns have posed challenges to radar designers and engineers since the early
days of technology development and use. Indeed, as early as the Second World War,
attention was paid to mitigating unwanted detections from terrain [1]. Generally, clutter
returns were mitigated by constructing the observation geometry so that targets were
above the radar and sensed against the background of sky. Early techniques included a
simple notch filter at the transmitted frequency that removed returns with zero Doppler
shift. This was effective in some cases for stationary radar systems, but did not com-
pensate for the effects of platform motion that shifted the frequency of clutter returns.
A subsequent development, pioneered in the 1950s, was the displaced phase center
technique that phase-shifted the returns from a series of pulses to align them, allowing
for more effective cancellation [2]. The next major development in airborne radar was
a revolution: pulse-Doppler radar, enabled by high-accuracy timing circuitry and early
digital memory, which coherently processed a set of pulses for clutter rejection and other
purposes. These techniques allowed effective airborne early warning development and
look-down, shoot-down modes for fighter aircraft [3]. One such early pulse-Doppler
radar, the AWG-10, was employed on the McDonnell-Douglas F-4 Phantom.
In the late 1970s and 1980s, space–time adaptive processing (STAP) was introduced
[4]. STAP takes advantage of improvements in digital signal processing to extend the
clutter cancellation to the two-dimensional domain. It does so by introducing a spa-
tial or array channel dimension. These additional degrees of freedom allow improved
cancellation in the joint domain and extend work on clutter cancellation to that of
other structured interference sources. In the years since, STAP theory and practice
have improved with the introduction of more array channels, prior-knowledge-aided
processing, and the introduction of computationally expensive matrix decompositions.
As STAP becomes a more mature technology, it is migrating to smaller platforms.
But cost, size, weight, power, and other considerations make large, multi-channel, high-
bandwidth array antennas infeasible in these settings. Compressed sensing (CS) offers
the hope that lower sampling requirements and data volumes could simplify data acqui-
sition requirements and allow advanced techniques on lower-end platforms. In some
contexts the computational costs of CS reconstruction are prohibitive today. But in oth-
ers, the signal acquisition problem is intractable under traditional Nyquist-rate sampling.
As CS techniques, approaches, and technologies mature, the need to consider additional
sources of interference beyond noise becomes more pressing.
It is at this point that this chapter picks up the thread, examining the topic of CS in
radar with a focus on mitigating structured interference, such as clutter, in the CS context.
To do so, we introduce work from adaptive filtering and low-rank matrix approximation.
Recent results in this area show that if the interference has low rank, statistics
of the covariance can be reliably estimated from highly compressed measurements. In
addition, the covariance of the interference can be incorporated into the CS estimation
process to improve performance.
2.2 Problem Formulation
The radar identifies objects within its field of regard by transmitting radio frequency
electromagnetic energy into the surrounding medium. This energy propagates through
the medium impinging on objects in that environment. The objects reflect some portion
of the energy back to the radar where it is processed to estimate characteristics of the
environment.
2.2.1 Data Cube

The most basic use of a radar system is to calculate range to a target by measuring
the time between transmission of a pulse and the time the reflection from the target is
received. Another fundamental measurement that may be made with a radar receiver
is to calculate target velocity by measuring the Doppler shift of the reflected pulse.
For multi-pulse radar the Doppler shift is calculated over multiple pulses to increase
the observation time, thereby improving the Doppler resolution. Multichannel digital
receivers may estimate the angle of arrival of reflected energy using the differential time
or phase delay between measurements at the sampled channels. These three sampling
dimensions correspond to three dimensions in the target space:
1. Receiver Channel: Elements of the antenna array are separated into some number
of channels. The received energy collected by the elements of a channel is coher-
ently combined and sampled. This sampling dimension is used to determine the
direction of arrival of signals from targets.
2. Slow Time: The radar transmits a series of pulses, samples the returns from each,
and processes these samples coherently. This set of pulses constitutes a coherent
processing interval (CPI), and the pulses are transmitted with some frequency, the
pulse repetition frequency (PRF). This sampling dimension is used to determine
the range rate of targets.
3. Fast Time: The analog-to-digital converter samples the incoming radio-frequency
signal at a rate determined by the bandwidth of the transmitted waveform. This
sampling dimension is used to determine the range of targets.
This three-dimensional conception of the received signal is known as the data cube [5].
Let the dimensionality of this cube be Nc × Ns × Nf , for the channel, slow time, and
fast time sampling dimensions. Further define n = Nc Ns Nf . This data cube can be
vectorized as y ∈ C^n and approximated by a linear combination of basis elements in a
sensing operator:
y = Sx, (2.1)
where y contains the samples in time and space. The vector x is the unknown vector
that describes the target scene; it gives the radar cross section of scatterers
at each location in the observation extent. Generally this vector is sparse because many
grid points contain no target.
2.2.2 Linear Sensing Model

The sensing operator S is a linear transform from the discretized target space (angle,
radial velocity, and range of dimension Na × Nv × Nr ) to the sampled data cube. The
full sensing matrix can be constructed from the bases that describe the response along
the range, angle, and Doppler dimensions:
S = Sr ⊗ Sa ⊗ Sd, (2.2)
where ⊗ is the Kronecker product.
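To make the structure of (2.2) concrete, the sketch below builds a tiny sensing matrix from three small bases and checks that the column for one target cell is the Kronecker product of the matching columns of the individual bases. The bases are random placeholders, the grid sizes are hypothetical, and the row-major ordering (range, then angle, then Doppler) is an assumption implied by the order of the factors in (2.2).

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(rows, cols):
    """Random complex matrix used as a placeholder basis."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

# Hypothetical tiny sizes: target grid (range, angle, Doppler) and data cube
N_r, N_a, N_v = 4, 3, 5
N_f, N_c, N_s = 6, 3, 5

S_r, S_a, S_d = crandn(N_f, N_r), crandn(N_c, N_a), crandn(N_s, N_v)
S = np.kron(np.kron(S_r, S_a), S_d)          # S = S_r (x) S_a (x) S_d as in (2.2)

# A single unit-amplitude scatterer at grid cell (r, a, v)
r, a, v = 2, 1, 3
col = int(np.ravel_multi_index((r, a, v), (N_r, N_a, N_v)))
x = np.zeros(N_r * N_a * N_v, dtype=complex)
x[col] = 1.0

y = S @ x                                    # vectorized data cube for that scatterer
expected = np.kron(np.kron(S_r[:, r], S_a[:, a]), S_d[:, v])
print(S.shape, col, np.allclose(y, expected))
```

Each column of S is therefore the separable response of one range-angle-Doppler cell, which is what later allows the operator to be applied one dimension at a time.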
These models describe simple propagation phenomena. Let there be a target at
range $r_i$ from the antenna and angle $\theta_i$ from the array boresight with range rate $v_i$.
(Range rate is the time derivative of range, $v_i = \frac{d}{dt} r_i$.) This target is illuminated
by a series of $n_s$ identical waveforms with carrier frequency $f_0$, i.e., each waveform is
$w(t) = e^{2\pi j f_0 t} e^{2\pi j \phi(t)}$, with pulse repetition interval $T_s$. The waveform $w(t)$ has phase
$\phi(t)$ and bandwidth $\beta$, whether by swept frequency chirp, phase code sequence, or some
other modulation function. The illumination experienced at the $i$th target is then
$$e_i(t) = \alpha_i \sum_{q=0}^{n_s-1} w\!\left(t - qT_s - \frac{r_i + v_i q T_s}{c}\right) \tag{2.3}$$
for some scalar $\alpha_i$, where $c$ is the speed of light.

A moving target imparts a Doppler frequency shift on the waveform proportional
to its radial velocity (positive shift for decreasing range), and some of this energy is
reflected back to the antenna array to be received. The array consists of $n_e$ individual
array elements uniformly separated by a distance $d$. We neglect the element pattern
of any array element, and instead model them as isotropic receivers. Each of these
array elements makes $n_f$ uniformly spaced fast-time samples on in-phase and quadrature
channels, i.e., these samples are points in the complex plane. The signal reflected from
target $i$ received at element $k$ is
$$f_{i,k}(t) = \gamma_i \sum_{q=0}^{n_s-1} w\!\left(t - qT_s - \frac{2(r_i + q v_i T_s) - k d \sin\theta_i}{c}\right) e^{2\pi j \frac{v_i}{f_0 c}} \tag{2.4}$$
for some scalar $\gamma_i$. In a digital receiver $f_{i,k}(t)$ is sampled in fast time every $T_s = 1/\beta$ seconds.
To express this in the range basis, $S_r$, each column is a shifted copy of the transmitted
waveform, $w(t)$, with leading and trailing zeros. The first column is the response from
a target at the minimum range in the range window and in each subsequent column the
waveform is shifted by one entry.
$$S_r = \begin{bmatrix}
w(T_s) & 0 & 0 & \cdots & 0 \\
w(2T_s) & w(T_s) & 0 & \cdots & 0 \\
w(3T_s) & w(2T_s) & w(T_s) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & w(n_r T_s)
\end{bmatrix} \tag{2.5}$$
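The shifted-copy structure of the range basis can be sketched in a few lines. The function name `range_basis`, the chirp used for the waveform samples, and the sizes are hypothetical choices for illustration; the only property taken from the text is that each column holds the sampled waveform moved down by one entry relative to the previous column.

```python
import numpy as np

def range_basis(w, n_range):
    """Tall matrix whose j-th column is w delayed by j samples, zero elsewhere."""
    n_w = len(w)
    S_r = np.zeros((n_range + n_w - 1, n_range), dtype=complex)
    for j in range(n_range):
        S_r[j:j + n_w, j] = w                 # leading zeros above, trailing zeros below
    return S_r

# Hypothetical sampled waveform: a short baseband chirp of 8 samples
n_w = 8
k = np.arange(n_w)
w = np.exp(1j * np.pi * 0.5 * k ** 2 / n_w)

S_r = range_basis(w, n_range=5)

# Applying S_r to a range profile is a linear convolution with the waveform
x = np.array([0.0, 1.0, 0.0, 0.5j, 0.0])
y = S_r @ x
print(S_r.shape, np.allclose(y, np.convolve(x, w)))
```

Seen this way, the range dimension of the model never needs the matrix itself, only a convolution with the waveform.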
The angle and Doppler bases, $S_a$ and $S_d$, are both frequency bases. The Doppler
frequency is evaluated across pulses in the coherent processing interval. The angle
basis is the relative phase delay introduced at the elements of the antenna array as an
incoming planar wavefront reaches each element sequentially. The time delay between
receivers becomes a simple phase shift for narrowband signals. For wideband signals,
the spatial phase pattern becomes frequency-dependent. To coherently process signals
in a wideband setting, other approaches can be used to minimize the losses due to
phase mismatch. These include true time delay units or corrections applied in the
digital domain to counteract the known phase errors introduced. For radar systems
made up of subarrays, true phase combining can be used at the subarray level if the
subarray is small enough or the bandwidth is small enough to adequately control phase
migration.
This linear model can be represented explicitly as a matrix, with each column of
the matrix S being the return from a target at the corresponding range-angle-Doppler
position in space. But the size of the matrix grows rapidly as the dimensions of the
sample space and search space increase. For instance, for a system with 8 channels, 128
pulses, and 512 range samples, the matrix S has 2.8 × 10^11 elements. Therefore, the
most efficient way to implement this model is not by explicitly storing the matrix but by
performing discrete Fourier transforms along the angle and velocity dimensions and a
convolution in the range dimension. For computational gains, these can be implemented
using the fast Fourier transform (FFT). By this means the storage requirements and
processing time can be reduced considerably.
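A matrix-free version of the model might look like the following sketch, which applies FFTs along the angle and Doppler dimensions and an FFT-based linear convolution along range, then compares the result with the explicit Kronecker matrix on a grid small enough to store. The function names, the DFT sign convention for the angle and Doppler bases, the waveform samples, and the tiny grid are all assumptions made for the example. The element count assumes a square S, that is, a target grid with as many cells as the data cube has samples.

```python
import numpy as np

def apply_S(X, w):
    """Matrix-free S @ vec(X) for X indexed (range, angle, Doppler)."""
    n_f = X.shape[0] + len(w) - 1             # fast-time length after linear convolution
    Y = np.fft.fft(X, axis=1)                 # angle dimension -> receiver channels
    Y = np.fft.fft(Y, axis=2)                 # Doppler dimension -> pulses (slow time)
    W = np.fft.fft(w, n_f)                    # zero-padded waveform spectrum
    Y = np.fft.ifft(np.fft.fft(Y, n_f, axis=0) * W[:, None, None], axis=0)
    return Y                                  # shape (fast time, channel, slow time)

def explicit_S(w, n_r, n_a, n_v):
    """Explicit Kronecker matrix; feasible only for tiny grids."""
    n_f = n_r + len(w) - 1
    S_r = np.zeros((n_f, n_r), dtype=complex)
    for j in range(n_r):
        S_r[j:j + len(w), j] = w              # shifted copies of the waveform
    S_a = np.fft.fft(np.eye(n_a))             # DFT matrix as the angle basis
    S_d = np.fft.fft(np.eye(n_v))             # DFT matrix as the Doppler basis
    return np.kron(np.kron(S_r, S_a), S_d)

# Element count for 8 channels, 128 pulses, and 512 range samples
n = 8 * 128 * 512
dense_elements = n * n                        # about 2.75e11 entries if stored densely

# Consistency check on a hypothetical tiny grid
rng = np.random.default_rng(0)
w = np.exp(1j * np.pi * 0.5 * np.arange(4) ** 2 / 4)
X = rng.standard_normal((6, 3, 5)) + 1j * rng.standard_normal((6, 3, 5))
y_fast = apply_S(X, w).ravel()
y_dense = explicit_S(w, 6, 3, 5) @ X.ravel()
max_diff = float(np.max(np.abs(y_fast - y_dense)))
print(dense_elements, max_diff)
```

The fast path stores only the waveform and the grid itself, so its memory use grows linearly with the grid size rather than quadratically.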
This linear model suffices for analysis and estimation. Of course targets move through
continuous space, and so any discretization will necessarily be only an approximation.
Finer discretization can reduce the associated errors, but at the cost of increasing the
correlation between columns of the matrix. With any processing there are diminishing
returns as the discretization becomes finer than the fundamental resolution of the sensor.
This chapter expresses many of these concepts in a single dimension assuming a
uniform linear array, but all these results are generalizable to planar and nonuniform
arrays with somewhat more complicated notation. In this book we neglect that fourth
dimension in the interest of clarity and computational tractability. Furthermore, any
polarization effects are neglected in this model. For systems that record dual polarization
an added dimension could be used to represent that variable.
[Figure 2.1 diagram labels: Antenna Subarray Elements; phase shifters (ϕ); A/D; Signal and Data Processor]
Figure 2.1 A simplified block diagram of a single subarray of a large airborne phased array radar.
The signal path includes a low-noise amplifier, analog filtering, phase shifting, RF combining,
mixing to IF, and sampling. The digital samples from this and all other subarrays feed into the
digital signal and data processor for pulse compression, beamforming, detection, association,
tracking, prediction, and other functions.
2.2.3 Matched Filtering

The received-signal processing chain is built from a number of subsystems: antenna, analog
signal processing, digital signal processing, and data processing. These subsystems
sequentially refine the input signal. This includes using a set of matched filters to
generate an estimate of the true range profile. These matched filters can be expressed as
the conjugate transpose of the sensing model:
$$\hat{x}_{\mathrm{mf}} = S^H y. \tag{2.6}$$
The matched filters in the spatial and Doppler dimensions amount to Fourier transforms,
and the matched filter in the fast-time dimension is a convolution, which can be per-
formed in the Fourier domain as well. The upstream portion of this processing is shown
in Figure 2.1 for a notional radar subarray.
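The claim that the Doppler matched filter amounts to a Fourier transform can be checked with a short sketch. It assumes, for illustration only, a Doppler basis whose columns are complex exponentials on the DFT grid; the number of pulses and the target bin are hypothetical.

```python
import numpy as np

n_s = 16                                      # hypothetical number of pulses in the CPI
q = np.arange(n_s)

# Assumed Doppler basis: column v is the pulse-to-pulse phase ramp of Doppler bin v
S_d = np.exp(2j * np.pi * np.outer(q, q) / n_s)

# One unit-amplitude target in Doppler bin 5
x = np.zeros(n_s, dtype=complex)
x[5] = 1.0
y = S_d @ x

# Matched filter as in (2.6), and the same result from an FFT across pulses
x_mf = S_d.conj().T @ y
x_fft = np.fft.fft(y)

peak_bin = int(np.argmax(np.abs(x_mf)))
print(peak_bin, np.allclose(x_mf, x_fft))
```

With this basis the conjugate transpose is exactly the forward DFT, so the explicit matrix product can be replaced by an FFT along the slow-time dimension.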
2.3 Interference Sources
2.3.1 Measurement Noise

White noise is the simplest and most commonly treated interference type. In some
applications it is the dominant interference source, particularly for targets at long ranges
and for radars that detect targets against an empty background. Thermal noise
accumulates along the analog processing chain and at the analog-to-digital converter
(ADC). This noise can be represented as a circular Gaussian vector $n \in \mathbb{C}^n$ with a
variance set by the noise level of the receivers and other elements in the processing
chain. Noise can be introduced into the sensing model as
$$y = Sx + n. \tag{2.7}$$
Quantization noise, though deterministically related to the sampled signal, can also be
treated as Gaussian for ADCs with typical resolution.
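A short sketch of how the circular Gaussian noise vector in (2.7) might be drawn in simulation. The noise power, sample count, and seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000          # number of samples (illustrative)
noise_var = 0.5    # hypothetical noise power per complex sample

# Circular complex Gaussian: independent real and imaginary parts,
# each carrying half of the total variance.
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

power = np.mean(np.abs(noise) ** 2)   # estimates E|n|^2
pseudo_var = np.mean(noise ** 2)      # estimates E[n^2], zero for circular noise
print(f"power ~ {power:.3f}, |pseudo-variance| ~ {abs(pseudo_var):.3f}")
```

The near-zero pseudo-variance is what distinguishes circular noise from a complex vector whose real and imaginary parts are correlated or unequally scaled.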
2.3.2 Correlated Interference

Other sources of interference exist, some of which exhibit structure according to the
mechanism by which they are caused. Mitigating these sources of interference requires
a different approach than using longer waveforms or more pulses; such techniques
increase the target energy but also increase the correlated interference energy. This
interference is said to be correlated because its structure can be described statistically
by its expected autocorrelation. Correlated interference takes on a number of forms,
including distributed ground clutter, in-band radio frequency interference (RFI), and
differences between the true and the modeled sensing system.
Clutter returns come from energy that is reflected back to the radar by objects other
than the intended targets. Thus, the definition of clutter is application-specific. For
a system designed to detect motor vehicles in forested terrain, the trees and ground
are the clutter. However, for the purpose of performing geographic land use surveys,
the ground and trees are the desired scene. Clutter and other structured interference
degrade performance in this and other applications. This work considers the case where
the targets are moving objects, especially motor vehicles, and the clutter consists of
terrain, foliage, and buildings.
To include this in the sensing model, a clutter vector c is introduced, which contains
the geometry- and terrain-dependent clutter reflectivity at the grid locations:
$$y = S(x + c) + n. \tag{2.8}$$
The clutter is illuminated by the same waveform as the targets themselves, and thus
signal processing that is used to improve the signal-to-noise ratio of the targets will
accentuate the clutter as well. More energetic waveforms or more integration will
therefore not improve the signal-to-clutter ratio (SCR). By contrast,
RFI exhibits strong spatial correlation but little or no correlation between pulses, so
for this interference source the signal-to-interference ratio can be improved by longer
integration times or higher power.
Clutter can result from land or sea surface reflections, and the statistics of the
interference depend strongly on the particularities of the terrain being surveilled. Much
research has focused on describing the expected returns from clutter as a function
of terrain, radar band, radar resolution, and other parameters. The simplest model is
to assume a Gaussian distribution; for low-bandwidth radars this assumption is often
sufficient. A common distribution for clutter amplitude is the gamma distribution [6,7]. But
detailed analysis of several collected datasets shows significant skewness and kurtosis
that do not match either the normal or gamma distributions. Having examined K,
log-normal, Weibull, and Rayleigh distributions, work in [8] showed that the Weibull
distribution best described the data collected in a flight test over open farmland in
Saskatchewan. In [9] research led to a compound distribution composed of two
separate gamma distributions that describe the modulation and speckle observed in
high-resolution radar.
[Plot: probability of occurrence versus amplitude value (scaled to mean, 0 to 4); curves: One Realization, Underlying Distribution]
Figure 2.2 The sampled distribution of the observed clutter in one realization, along with the
underlying gamma distribution, which is defined by a shape parameter of 10/3.
With all this in mind, we have elected to use a constant gamma-distributed random
variable model as in [6], with a shape factor of 10/3 in our examples. This model
approximates a terrain with relatively open land and the absence of man-made scatterers.
Figure 2.2 shows the amplitude of clutter observed in a realization of this clutter model.
Other models may be more appropriate at very low grazing angles and for other types
of terrain [8,9].
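One realization of such a clutter model can be sketched as follows. The shape factor 10/3 is taken from the text; the scale (chosen so that the mean is one, matching the "scaled to mean" axis of Figure 2.2), the number of cells, and the seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = 10.0 / 3.0      # shape factor quoted in the text
n_cells = 50000         # number of clutter cells (illustrative)

# A gamma variable with shape k and scale 1/k has unit mean and variance 1/k.
amplitude = rng.gamma(shape, scale=1.0 / shape, size=n_cells)

sample_mean = amplitude.mean()
sample_var = amplitude.var()

# Empirical density on the same amplitude axis as Figure 2.2 (0 to 4).
density, edges = np.histogram(amplitude, bins=40, range=(0.0, 4.0), density=True)
print(f"mean ~ {sample_mean:.3f}, variance ~ {sample_var:.3f}")
```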
2.4 Signal Processing Treatment of Clutter
Processing that uses only matched filtering applies a nonadaptive technique to the target
scene. However, the inter-bin correlation structure can be used to improve target
detection performance. Several approaches to treating the correlated clutter problem are
described in this section.
2.4.1 Early Techniques

The two-pulse canceler is a filtering technique in which the returns from two consecutive
pulses are subtracted. If the clutter is stationary from pulse to pulse, it will be nulled
by this filtering. If targets exhibit radial motion, their return will not be subject to as
much nulling. The null produced by this technique can be quite broad, and might hide
slow-moving targets or targets moving nearly perpendicular to the radar’s radial vector.
In addition, this filter introduces nulls at Doppler frequencies equal to integer multi-
ples of the PRF, which correspond to blind velocities [10]. The canceler technique
can be employed noncoherently with reduced performance if radar system phase tol-
erances are not sufficient to support pulse-to-pulse coherent processing. A number of
improvements can be made to this basic filtering concept. Differencing filters can be
cascaded to improve filter null depth. Additionally, the PRF can be staggered within
a CPI to resolve blind velocities and null the clutter at zero frequency while passing
moving targets.
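The null structure described above can be checked numerically. This sketch assumes an idealized canceler that subtracts consecutive pulses; the PRF and Doppler values are hypothetical.

```python
import numpy as np

def canceler_gain(doppler_hz, prf_hz, stages=1):
    """Magnitude response of `stages` cascaded two-pulse cancelers.

    One stage subtracts consecutive pulses, y[p] = x[p] - x[p-1], so its
    response at Doppler f is 1 - exp(-j*2*pi*f/PRF).
    """
    h = 1.0 - np.exp(-2j * np.pi * np.asarray(doppler_hz, dtype=float) / prf_hz)
    return np.abs(h) ** stages

prf = 1000.0  # hypothetical PRF in Hz
f = np.array([0.0, 10.0, 500.0, 1000.0, 2000.0])
single = canceler_gain(f, prf)            # nulls at 0, PRF, 2*PRF
double = canceler_gain(f, prf, stages=2)  # deeper null near zero Doppler

# Time-domain check: stationary clutter cancels, a moving target survives.
pulses = np.arange(8)
clutter = 5.0 * np.ones(8, dtype=complex)
target = np.exp(2j * np.pi * 250.0 * pulses / prf)
residual = np.diff(clutter + target)
```

The single-stage gain is 2|sin(πf/PRF)|, so a slow target at 10 Hz is attenuated heavily, and cascading a second stage deepens the null at the cost of attenuating slow targets even further.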
2.4.2 Space–Time Adaptive Processing

The principal technique for reducing the effects of structured interference in a mul-
tichannel radar system is adaptive filtering. This technique estimates the interference
statistics to produce a filter, which is applied to the received signals. In the multichan-
nel pulse-Doppler radar case under examination, a two-dimensional filter, STAP, may
be employed. STAP uses the three-dimensional data cube to estimate the interference
structure over the joint angle-velocity space using the range samples as training data
[4,11–13]. Especially in downward-looking airborne radar, the clutter may interfere with
the target in either the Doppler or the angle dimension, but when the two are processed
jointly the target often falls outside the clutter support.
The STAP filter defines the optimal filter for maximizing signal-to-noise ratio in
Gaussian interference. The measurements defined in (2.8) with noise, clutter, and other
interference y = S(x + c) + n can be broken into signal and interference portions:
y = (Sx) + (Sc + n).
Let the measurement data y be reorganized as $Y = [y_1 | \ldots | y_{N_f}] \in \mathbb{C}^{N_{cs} \times N_f}$, where
$N_{cs} = N_c N_s$. Putting aside any target energy, each of these vectors can be approximated
by a complex normal distribution $y_i \sim \mathcal{CN}(0_{N_{cs}}, R)$. This interference distribution
has both clutter and noise components: the noise contributes a diagonal component to
the covariance matrix, while the clutter contributes a component that is low-rank (or
approximately so) as a result of the sensing geometry and the timeline. Thus
$$R = R_n + R_c = \nu^2 I_{N_{cs}} + V D V^H, \tag{2.9}$$
where $\nu^2$ is the measurement noise variance, $V \in \mathbb{C}^{N_{cs} \times k}$ has structure determined
by the characteristics of the observed clutter, and D is diagonal with entries $\sigma_1^2, \ldots, \sigma_k^2$.
Here we model the clutter as having rank $k \le N_{cs}$, though the degree to which this
low-rank model holds true will be examined in greater detail in this chapter. Clutter is
stronger than noise (necessitating adaptive filtering), though this may be quite variable.
The sample covariance matrix used in STAP is then defined as
$$\hat{R} = N_f^{-1} Y Y^H. \tag{2.10}$$
As more samples from this distribution are collected ($N_f / N_{cs} \to \infty$), this estimate
converges to the true covariance R. This estimate of the interference statistics can
be inverted to form a signal-to-interference-plus-noise ratio (SINR)-maximizing linear
filter [12], which leads to an estimate of the target scene for range bin i as
$$\hat{x}_i = \kappa S^H \hat{R}^{-1} y_i, \tag{2.11}$$
where the scaling factor is $\kappa = (S^H R^{-1} S)^{-1}$.
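The following sketch illustrates (2.9) to (2.11) for a single steering vector s rather than the full sensing matrix S, which is a simplifying assumption. The dimensions, clutter rank, powers, and Fourier-type steering vectors are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_dof, n_train = 16, 200          # stand-ins for N_cs and N_f (illustrative)
k = np.arange(n_dof)

def steer(freq_bins):
    """Unit-norm Fourier steering vector at a (possibly fractional) bin."""
    return np.exp(2j * np.pi * freq_bins * k / n_dof) / np.sqrt(n_dof)

def cgauss(shape, power):
    """Circular complex Gaussian samples with the given power."""
    return np.sqrt(power / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

# Rank-2 clutter plus white noise, following the structure of (2.9).
V = np.column_stack([steer(0.0), steer(1.0)])
clutter_pow, noise_pow = 1000.0, 1.0
R_true = noise_pow * np.eye(n_dof) + clutter_pow * (V @ V.conj().T)

# Target-free training snapshots and the sample covariance of (2.10).
Y = V @ cgauss((2, n_train), clutter_pow) + cgauss((n_dof, n_train), noise_pow)
R_hat = Y @ Y.conj().T / n_train

def adaptive_estimate(R, s, y):
    """Single-steering-vector form of (2.11): (s^H R^-1 y) / (s^H R^-1 s)."""
    Rinv_s = np.linalg.solve(R, s)
    return (Rinv_s.conj() @ y) / (Rinv_s.conj() @ s)

s = steer(2.5)                    # target steering vector, not orthogonal to clutter
y = 2.0 * s + 30.0 * V[:, 0]      # target of amplitude 2 plus a strong clutter return

x_mf = s.conj() @ y                        # matched filter, clutter leaks in
x_true = adaptive_estimate(R_true, s, y)   # filter built from the true covariance
x_smi = adaptive_estimate(R_hat, s, y)     # filter built from the sample covariance
print(abs(x_mf - 2.0), abs(x_true - 2.0), abs(x_smi - 2.0))
```

The normalization by s^H R^-1 s keeps unit gain on the target, so the printed values are the residual clutter leakage for each filter.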

The quantity of training data available to estimate the covariance, as in (2.10), is
often limited. The statistics of the interference must be estimated from data in adjacent
or nearby range bins to avoid using nonrepresentative statistics in nonhomogeneous
environments. The Reed–Mallet–Brennan rule states that a stable estimate can be made
from a set of training data with twice as many snapshots as degrees of freedom in the
covariance [14]. Various approaches that have been applied in attempts to maximize
performance with limited data are discussed next.
Diagonal loading is a commonly used technique to improve sample covariance accuracy
and stability [15]. It adds a diagonal (usually a scaled identity) component to
regularize the covariance matrix estimate before inversion. After introducing the diagonal
loading term δ, the sample covariance matrix estimate becomes
$$\hat{R}_{\mathrm{full}} = N_f^{-1} Y Y^H + \delta I_{N_{cs}}, \tag{2.12}$$
which is identified as the “full” estimate because it uses the full (Nyquist-rate) set
of data, and to distinguish it from other estimators to be introduced in this chapter.
Diagonal loading has the effect of reducing the matrix condition number if the matrix is
badly conditioned. It also places a limit on the filter null depth. The optimal loading
level depends on the type of covariance estimate and is highly sensitive to
the clutter-to-noise ratio and even the temporal sampling rate. Optimal selection is
largely based on heuristic approaches and empirical results. For this work we use a
diagonal loading factor of $10^{-6}$, which will appropriately regularize the inverse without
degrading performance by overwhelming the observed sample structure.
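The effect of the loading term on conditioning can be seen in a small sketch. It assumes a deliberately rank-deficient sample covariance (fewer snapshots than degrees of freedom); the sizes and seed are illustrative, and only the loading level 10^-6 comes from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n_dof, n_train = 32, 16     # fewer snapshots than degrees of freedom (illustrative)
delta = 1e-6                # loading level quoted in the text

Y = (rng.standard_normal((n_dof, n_train))
     + 1j * rng.standard_normal((n_dof, n_train))) / np.sqrt(2)
R_hat = Y @ Y.conj().T / n_train           # rank-deficient sample covariance
R_loaded = R_hat + delta * np.eye(n_dof)   # diagonally loaded estimate, as in (2.12)

eig = np.linalg.eigvalsh(R_hat)            # real eigenvalues of the Hermitian estimate
cond_loaded = np.linalg.cond(R_loaded)
bound = (eig.max() + delta) / delta        # loading caps the condition number here

# The loaded matrix can be used safely in a linear solve.
w = np.linalg.solve(R_loaded, np.ones(n_dof, dtype=complex))
print(f"smallest eigenvalue before loading: {eig.min():.2e}")
print(f"condition number after loading:     {cond_loaded:.2e}")
```

Since every eigenvalue is shifted up by δ, the smallest eigenvalue of the loaded matrix is about δ, which is also why loading limits how deep a null the resulting filter can place.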
Figures 2.3 and 2.4 illustrate the ability of the STAP filter to reduce the contribution
of clutter in the estimate while maintaining target detectability in uncluttered regions.
Figure 2.3 The matched filter estimate reshaped into the range-angle-Doppler cube and projected
along each of the three dimensions. The true target location is r = 290 m, θ = −30°, and
v = 1 m/s. The marker indicates the true target location in each view.
Figure 2.4 The STAP estimate reshaped into the range-angle-Doppler cube and projected along
each of the three dimensions. The true target location is r = 290 m, θ = −30°, and v = 1 m/s.
The marker indicates the true target location in each view.
The line of clutter appears as a slice in the third frame of the figures due to the interaction
of the sensor platform motion with the stationary ground clutter. In the matched-filter-
only estimate that clutter line is such a strong ridge in the angle-Doppler plane that it
renders the target invisible. In the STAP estimate, the clutter has been nulled by the filter
and the target can be easily located in all three projections. This is a high SCR example
to enhance the visibility of the target relative to the clutter. This same framework will be
used to generate experimental results in the latter portion of this chapter, although with
different parameters for signal, clutter, and noise power.
The accuracy of the estimate of the interference covariance decreases with the
number of degrees of freedom in the estimate (for a fixed training data set). To improve
the stability of the estimates at the expense of resolution, reduced-rank STAP is utilized.
These techniques work well because the interference information can be described
using relatively few basis vectors. In [16], it is shown that signal-dependent rank
reduction can improve performance. The cross-spectral metric for designing the rank
reducer incorporates information on the desired signal steering vectors, not just the
interference statistics, to improve performance in cases of limited training data. In [17],
it is shown that reduced-rank STAP techniques offer better performance than full-rank
estimation under constraints on training time and data. In [18], it is shown how one
may select a dimensionality reduction basis efficiently. In addition to a performance
improvement given limited training data, these reduced-rank techniques also offer
a reduced computational burden relative to a full-rank algorithm. The optimal rank
may be set by processing timeline constraints, but even in the unconstrained case,
learning the optimal number of dimensions to use for estimating the covariance is not
straightforward.
2.5 Measurement Compression
If y or Y is the set of Nyquist-sampled measurements, then it represents all the
(bandlimited) electromagnetic information passing over the measurement aperture during
the period of observation (a single coherent processing interval). The compressed
measurements, which undersample the incident signals by a factor of u, are modeled as
another vector
$$z = Cy = \mathcal{C}(Y), \tag{2.13}$$
where $z \in \mathbb{C}^m$, $C \in \mathbb{C}^{m \times n}$, $m = N_c N_s N_f / u = n/u$, and the linear operator
$$\mathcal{C}(\cdot) : \mathbb{C}^{N_{cs} \times N_f} \to \mathbb{C}^m \tag{2.14}$$
has adjoint $\mathcal{C}^*(\cdot)$. To be effective in a CS estimation framework, these compressive
measurements must be incoherent with the sparsifying basis S.
A simple estimate of the target vector from the compressed measurements z can be
computed by performing a compressed matched filter, which applies the matched filter
to the available measured data:
$$\hat{x}_{\mathrm{cmf}} = S^H C^H z. \tag{2.15}$$
This method has low computational cost but does not necessarily yield a sparse solution.
A CS estimate of the target vector can be computed by solving a convex
optimization problem, such as an $\ell_1$-regularized least-squares problem
$$\hat{x}_{\mathrm{cs}} = \arg\min_x \|z - CSx\|_2^2 + \lambda \|x\|_1, \tag{2.16}$$
where the first term of the minimization objective is the squared Euclidean norm of the
residual, which enforces fidelity to the measured data, and the second term is the $\ell_1$ norm
of the estimate, which promotes sparsity in the solution. The parameter λ enables a
trade-off between these competing priorities.
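One simple way to solve a problem of the form (2.16) is iterative soft thresholding (ISTA). The sketch below is not the solver used in this chapter; it assumes a toy unitary DFT sensing model, a random row-selection compression matrix, and arbitrary values for the grid size, under-sampling factor, target amplitudes, and λ.

```python
import numpy as np

rng = np.random.default_rng(4)
n, u = 64, 2                         # grid size and under-sampling factor (illustrative)
m = n // u
k = np.arange(n)
S = np.exp(2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)   # toy unitary sensing model
rows = np.sort(rng.choice(n, size=m, replace=False))       # hypothetical random sampling
C = np.eye(n)[rows]
A = C @ S                            # combined model, z = C S x

x_true = np.zeros(n, dtype=complex)
x_true[[5, 20, 41]] = [1.0, -0.8j, 1.2]
z = A @ x_true                       # noise-free compressed measurements

def soft(v, t):
    """Complex soft threshold: shrink magnitudes by t and keep the phase."""
    mag = np.abs(v)
    return v * np.maximum(mag - t, 0.0) / np.maximum(mag, 1e-30)

def objective(x, lam):
    """Objective of (2.16)."""
    return np.linalg.norm(z - A @ x) ** 2 + lam * np.sum(np.abs(x))

def ista(lam, n_iter=500):
    L = 2.0 * np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the data-term gradient
    x = np.zeros(n, dtype=complex)
    for _ in range(n_iter):
        grad = 2.0 * A.conj().T @ (A @ x - z)
        x = soft(x - grad / L, lam / L)
    return x

lam = 0.02
x_cs = ista(lam)
x_cmf = A.conj().T @ z               # compressed matched filter of (2.15), for comparison
top3 = sorted(int(i) for i in np.argsort(np.abs(x_cs))[-3:])
print("largest CS entries at bins", top3)
```

Comparing `np.abs(x_cmf)` with `np.abs(x_cs)` shows the qualitative difference described in the text: the compressed matched filter spreads energy over many bins, while the ℓ1 penalty drives most entries to zero.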
Much attention has been paid to how the random sampling at the core of CS can be
realized in hardware. The exact content of the compression matrix C will depend on the
measurement process it describes. In pulse-Doppler radar, this process may introduce
incoherence in the following ways:
• in fast time by mixing incoming signals with pseudo-random modulation sequences
  before low-pass filtering and sampling slowly [19]
• in slow time by staggering the pulse repetition interval [20]
• in the spatial domain using a random measurement array [21,22] or coprime
  thinned array [23].
2.6 Estimating Interference Statistics from Compressed Measurements
We have seen that interference covariance can be estimated from samples via the STAP
approach. The central question of this work is how well samples produced from a
compression operation C, as in (2.13), can be used to form an estimate of the covariance
matrix. The simplest approach is compressed sample matrix inversion (SMI), in which
the adjoint of the compression operator is used to bring the samples back to the full
ambient signal space before forming the covariance estimate:
$$\hat{R}_{\mathrm{comp}} = N_f^{-1} \mathcal{C}^*(z) \mathcal{C}^*(z)^H + \delta I_{N_{cs}}. \tag{2.17}$$
As in (2.12), a diagonal loading term is introduced to regularize the matrix inverse.
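A sketch of compressed SMI, assuming the simplest possible compression operator: keeping a random subset of data-cube entries, whose adjoint zero-fills the discarded entries. The operator, sizes, and seed are assumptions made for illustration; the chapter's own compression operators may differ.

```python
import numpy as np

rng = np.random.default_rng(5)
n_cs, n_f, u = 8, 40, 4      # illustrative sizes and under-sampling factor
delta = 1e-6                 # diagonal loading level used in the text

Y = (rng.standard_normal((n_cs, n_f)) + 1j * rng.standard_normal((n_cs, n_f))) / np.sqrt(2)

# Hypothetical compression operator: keep a random subset of data-cube entries.
keep = rng.choice(n_cs * n_f, size=n_cs * n_f // u, replace=False)
mask = np.zeros(n_cs * n_f, dtype=bool)
mask[keep] = True
mask = mask.reshape(n_cs, n_f)

def compress(Y, mask):
    """Forward operator: stack the retained entries into a vector."""
    return Y[mask]

def adjoint(z, mask):
    """Adjoint operator: put the samples back and zero-fill the rest."""
    out = np.zeros(mask.shape, dtype=complex)
    out[mask] = z
    return out

def loaded_smi(Yb, delta):
    """Diagonally loaded sample covariance of the columns of Yb."""
    return Yb @ Yb.conj().T / Yb.shape[1] + delta * np.eye(Yb.shape[0])

z = compress(Y, mask)
R_comp = loaded_smi(adjoint(z, mask), delta)   # form of (2.17)
R_full = loaded_smi(Y, delta)                  # form of (2.12)
```

With no compression (a mask of all ones) the two estimates coincide; as u grows, the zero-filled entries bias `R_comp` away from `R_full`.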
2.6.1 Matrix Completion

Related to CS recovery of sparse vectors is the recovery of low-rank matrices from fewer
samples than matrix elements. This area of work is known as matrix completion. In the
same way that the $\ell_1$ norm offers a convex relaxation of a direct sparsity ($\ell_0$ “norm”)
objective, the nuclear norm (denoted as $\|\cdot\|_*$), which is the sum of the matrix singular
values, offers a convex relaxation of the rank(·) objective.
Let some unknown matrix $X \in \mathbb{C}^{n \times n}$ exist with rank r. Some number m of observations,
$M_{i,j}$, of this matrix are available only at the support set $(i,j) \in \Omega$. In [24] it is shown
that low-rank matrix recovery from few measurements is not ill-posed and can be posed as a
convex problem. The convex optimization problem
$$\min \|X\|_* \quad \text{s.t.} \quad X_{i,j} = M_{i,j}, \; (i,j) \in \Omega \tag{2.18}$$
will recover the matrix X exactly with high probability if $m \ge C n^{1.2} r \log n$, for a
specified constant C.
In [25], the topic of matrix completion is surveyed and it is shown that n × n matrices
of rank r can be recovered from m noise-corrupted direct samples via nuclear norm
minimization with high probability if $m \ge C n r \log^2 n$, with an error on the order of
the noise level. In [26], this idea is expanded to show that matrices can be recovered
from expansion coefficients with respect to a known matrix basis as long as that basis is
not coherent with the matrix being recovered. This is directly analogous to the required
incoherence between the sparsifying basis and the sensing basis in CS theory. The work
in [27] provides information-theoretic lower bounds on the number of samples needed
to recover certain types of low-rank matrices. One result is that $m = C n r \log n$ is the
lower limit on the number of samples needed to recover a random n × n matrix with
rank r.
The work in [28] treats the topic of estimating simultaneously sparse and low-rank
matrices from rank-one measurements. This measurement model consists of a series of
sketches of the underlying matrix. In [29], this same “sketching” measurement model is
considered, and it is shown specifically that the number of measurements required for stable
estimation scales linearly with the rank of the matrix and the sparsity of the matrix and
with the logarithm of the number of rows. Though this model differs from that which we
will develop in this chapter, the results provide a basis for confidence that by exploiting
the low-rank nature of the covariance, an improvement in performance can be expected.
In [30], it is shown that the low-rank assumption can be used to improve the esti-
mation accuracy of the covariance matrix using a standard STAP benchmark dataset in
the case of limited training data. Their proposed algorithm outperforms other algorithms
of lower computational complexity by using a dictionary learning approach.
In [31,32], a CS–STAP technique is developed in which a small amount of training
data, in some cases one snapshot, can be used to estimate the covariance statistics and
build the whitening filter. This technique assumes direct sparsity in the interference
covariance matrix. The validity of this assumption depends on the type of interference
being described. For certain types of electromagnetic interference, or for certain types
of man-made clutter, a concentration in this domain can be very pronounced. For other
types of natural ground cover, the spectrum can be more distributed. This distribution is
especially pronounced for foliage being blown in the wind.
In [33] an approach is proposed to estimate the sample covariance matrix (that would
be obtained from the uncompressed data) from a set of compressed measurements.
The goal of estimating the sample covariance matrix is slightly different than that of
estimating the underlying interference statistics, but does isolate the two stages of sam-
pling limits: first, a sample covariance matrix deviates from the true covariance matrix
because it is based on a limited number of realizations, and second, the compressed
sample covariance matrix differs from the full sample covariance matrix because of the
dimensionality reduction. The nature of the second limit depends on the manner in
which the samples are compressed and the estimator used to generate the sample
covariance matrix. Also, we note that approaches from the low-rank matrix approxi-
mation body of theory have been applied to the moving-target detection problem, for
example in [34].
2.6.2 Iterative Singular Value Thresholding

In [35], an algorithm for efficient matrix completion is provided; this approach uses
iterative singular value thresholding (SVT) and projection back onto the observation
set. This algorithm is able to recover large matrices in low run-time relative to
interior-point methods. Starting from M, the observations, and P(·), the projection onto the
observation domain, the iteration involves repeated application of
$$\begin{aligned} X &= \mathrm{shrink}(Y; \tau) \\ Y &= Y + \delta P(M - X) \end{aligned} \tag{2.19}$$
returning X. In this iteration, shrinkage to the threshold τ imposes low rank (spectral
sparsity) by reducing the magnitude of the singular values, as specified in (2.20).
A larger value of τ results in a lower-rank representation. The shrink(·) operation is
soft-thresholding of the singular values of its argument. When the singular value
decomposition (SVD) of a matrix A is $U S V^H$, where $S = \mathrm{diag}(s) = \mathrm{diag}([s_1, \ldots, s_n]^T)$,
$$\begin{aligned} \mathrm{shrink}(A; \tau) &:= U \, \mathrm{diag}(\mathrm{soft}(s; \tau)) \, V^H, \\ \mathrm{soft}(s; \tau) &:= [\mathrm{soft}(s_i; \tau), \; i \in 1 \ldots n]^T, \\ \mathrm{soft}(s_i; \tau) &:= \frac{s_i}{|s_i|} \max(0, |s_i| - \tau). \end{aligned} \tag{2.20}$$
The iteration in (2.19) solves
$$\min \; \tau \|X\|_* + \frac{1}{2} \|X\|_F^2 \quad \text{s.t.} \quad P(X) = P(M). \tag{2.21}$$
This iteration approaches the direct nuclear norm objective for large values of τ, and
smaller τ improves the stability of the solution. This technique solves for large 1,000 ×
1,000 matrices in several seconds on a personal computer. This technique is the basis
for the low-rank covariance estimation technique developed in this chapter.
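The iteration (2.19) with the shrinkage operator (2.20) can be sketched in a few lines. This assumes entry-wise observations, so that P(·) is multiplication by a binary mask; the test matrix, sampling fraction, τ, δ, and iteration count are illustrative choices rather than values from this chapter.

```python
import numpy as np

def shrink(A, tau):
    """Soft-threshold the singular values of A, as in (2.20)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    # Singular values are nonnegative, so soft(s_i; tau) = max(0, s_i - tau).
    return (U * np.maximum(s - tau, 0.0)) @ Vh

def svt_complete(M, mask, tau, delta, n_iter):
    """Run the iteration (2.19); `mask` marks the observed entries."""
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    for _ in range(n_iter):
        X = shrink(Y, tau)
        Y = Y + delta * mask * (M - X)   # P(.) zeroes the unobserved entries
    return X

# Illustrative run on a random rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(6)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.6
X_hat = svt_complete(M, mask, tau=5.0 * n, delta=1.5, n_iter=300)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
print(f"relative recovery error: {rel_err:.3f}")
```

How small the printed error becomes depends on τ, δ, the sampling fraction, and the number of iterations, so these values should be treated as a starting point for experimentation.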
2.6.3 Performance Evaluation

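To review, the identified covariance estimates are:
• $R_{\mathrm{true}}$: true second-order statistics of the interference in (2.9)
• $\hat{R}_{\mathrm{full}} = N_f^{-1} Y Y^H + \delta I_{N_{cs}}$ in (2.12)
• $\hat{R}_{\mathrm{svt}}$: following the iteration in (2.19)
• $\hat{R}_{\mathrm{comp}} = N_f^{-1} \mathcal{C}^*(z) \mathcal{C}^*(z)^H + \delta I_{N_{cs}}$ in (2.17)
• $\hat{R}_{\mathrm{none}} = I_{N_{cs}}$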
[Plot: eigenvalue magnitude (log scale, 10^-3 to 10^0) versus eigenvalue index (0 to 30); curves: Structure 1: Simple Plateau; Structure 2: Triple Plateau; Structure 3: Exponential Decay; Structure 4: Plateau and Decay]
Figure 2.5 Synthetic structured interference eigenvalue decay functions. These functions are
shown with a width parameter of 10. For other values of this parameter, the spectral structure is
stretched or compressed proportionally.
We test four different spectrally structured synthetic data matrices to better understand
the performance of the SVT algorithm under various conditions. For an input rank
width r:
1. The first decay function has r equal eigenvalues.
2. The second decay function has r eigenvalues with unity magnitude, r with
magnitude 1/4, and r with magnitude 1/16.
3. The third decay function has eigenvalues decaying, with the ith eigenvalue
having magnitude
$$\sigma_i = \frac{\exp\left(-(35i/r)^{7/10}\right)}{\exp\left(-(35/r)^{7/10}\right)}. \tag{2.22}$$
4. The fourth decay function has eigenvalues decaying, with the ith eigenvalue
having magnitude
$$\sigma_i = \begin{cases} 1, & i \le r \\ \dfrac{\exp\left(-(35(i-r)/r)^{7/10}\right)}{\exp\left(-(35/r)^{7/10}\right)}, & \text{otherwise.} \end{cases} \tag{2.23}$$
These four synthetic interference structures are illustrated in Figure 2.5.
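The four decay functions can be generated as follows. This sketch reads the exponent in (2.22) and (2.23) as 7/10 and assumes that eigenvalues beyond those listed for structures 1 and 2 are zero; both points, along with the function name, are assumptions.

```python
import numpy as np

def decay_profile(structure, r, n):
    """Eigenvalue magnitudes sigma_1..sigma_n for structures 1-4 with width r."""
    i = np.arange(1, n + 1, dtype=float)
    ref = np.exp(-(35.0 / r) ** 0.7)      # normalizer shared by (2.22) and (2.23)
    if structure == 1:
        # r equal eigenvalues; the remainder are assumed to be zero.
        return np.where(i <= r, 1.0, 0.0)
    if structure == 2:
        # Three plateaus of width r at 1, 1/4, and 1/16; the remainder assumed zero.
        return np.where(i <= r, 1.0,
                        np.where(i <= 2 * r, 0.25,
                                 np.where(i <= 3 * r, 0.0625, 0.0)))
    if structure == 3:
        return np.exp(-(35.0 * i / r) ** 0.7) / ref
    if structure == 4:
        # np.maximum keeps the base positive where the plateau branch is selected.
        tail = np.exp(-(35.0 * np.maximum(i - r, 1.0) / r) ** 0.7) / ref
        return np.where(i <= r, 1.0, tail)
    raise ValueError("structure must be 1, 2, 3, or 4")

r, n = 10, 30
profiles = {sid: decay_profile(sid, r, n) for sid in (1, 2, 3, 4)}
```

A synthetic clutter covariance with one of these spectra could then be formed as V diag(σ) V^H for any orthonormal V, in the manner of (2.9).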

This set of spectral structures includes two with exponential decay in the true eigen-
values of the interference. This condition is included in an attempt to represent true
sensing problems in which the interference does not vanish. Selected parameters used
for these simulations are provided in Table 2.1.
To build up to the performance summary statistics we first include details for a
particular example of generating these estimates and using them to filter interference. In
this case, the true interference structure is structure 4, shown in Figure 2.5. The number
of snapshots is 500, the number of spatial channels is 8, the number of pulses is 128,
the CNR is 15 dB, and the rank width parameter is 10. The various covariance estimates
exhibit varying spectral structure and are shown in Figure 2.6.

Table 2.1 Parameters used for synthetic structured interference experiments.

Parameter                      Value
Range bins                     100–500
Pulses                         128
Spatial channels               8
Clutter spectral decay width   5–20
Clutter-to-noise ratio         0–40 dB
Signal-to-interference ratio   0 dB

[Plot: eigenvalue magnitude versus eigenvalue index (log-log axes); curves: True Cov., Full SMI, Compr. SMI, Compr. SVT]
Figure 2.6 The spectral decay structures of the various covariance estimates of the synthetic
structured interference. The SVT estimate more closely matches the truth than the compressed
SMI does, which uses the same data as input.
To illustrate performance over a broader problem space we conducted a parametric
sweep over several relevant problem input variables: under-sampling factor, clutter-to-
noise ratio (CNR), number of range samples (snapshots), and clutter eigenvalue decay
function. These results are all generated using Na = 8 spatial channels and Ns = 128
pulses. The results of this evaluation are shown in the following figures.
Figure 2.7 shows how performance varies as the under-sampling factor varies. It is
evident that for low and moderate under-sampling factors the SVT estimator is able to
perform as well as the full-data SMI, but under these same conditions the performance of
compressed SMI falls off sharply. If the experiment is limited to favorable interference
conditions (low rank, simple structure), the SVT estimate performs very well, out to an
under-sampling factor of 20, as shown in Figure 2.8.
Figure 2.9 shows how performance varies as the number of pre-compression range
bins varies. It shows that the SVT estimate is best able to take advantage of a longer
observation interval to improve the estimation accuracy: even if those range bins will
be compressed to reduce the number of samples, a larger ambient dimension improves the
estimation performance. This somewhat challenges at least one motivation for reducing
the number of samples: nonstationary interference forcing a reduced range sample
space. But for cases in which acquiring the samples themselves is a greater challenge
than inherent interference constraints, this result shows that the SVT estimator could
provide improved performance.
Figure 2.10 shows how performance varies over the different clutter eigenspectra
shown in Figure 2.5.

[Plot: probability of detection (0 to 1) versus under-sampling factor (1, 2, 5, 10, 20); curves: True Cov., Full SMI, Compress SVT, Compress SMI, No Cov.]
Figure 2.7 Average probability of detection as a function of the data under-sampling factor for
cases of synthetic structured interference. As the under-sampling factor increases and less data is
available for the two compressed estimates, the accuracy of those estimates degrades. Notably,
the SVT estimate maintains much better performance than compressed SMI as the USF
increases.

[Plot: probability of detection (0 to 1) versus under-sampling factor (1, 2, 5, 10, 20); curves: True Cov., Full SMI, Compress SVT, Compress SMI, No Cov.]
Figure 2.8 Average probability of detection as a function of the data under-sampling factor
for cases with favorable interference structure. The SVT estimate based on the compressed data
performs better than the diagonally loaded estimate based on the full data and nearly as well as the
true covariance matrix. The interference rank is 5 and the structure is simple (structure ID = 1).

[Plot: probability of detection versus number of range samples (100 to 500); curves: True Cov., Full SMI, Compress SVT, Compress SMI, No Cov.]
Figure 2.9 Average probability of detection as a function of the number of range samples for
cases of synthetic structured interference. As the number of range samples (snapshots) increases,
the two compressed estimates, which use compressed data (compressed snapshots), both
improve in estimation accuracy.

[Plot: probability of detection versus clutter structure ID; curves: True Cov., Full SMI, Compress SVT, Compress SMI, No Cov.]
Figure 2.10 Average probability of detection as a function of interference structure for cases of
synthetic structured interference. The performance of various estimators is tested with four
different clutter structures, having the four eigenspectra illustrated in Figure 2.5. The
performance of the estimators varies as a function of the spectra, with lower rank clutter being
easier to filter out and higher rank clutter being more difficult.
2.7 Mitigating Clutter in Compressed Sensing Estimation
Estimating the statistics of clutter from compressed measurements is necessary, but not
sufficient, for the detection of targets embedded therein. A variety of techniques exist
for estimation of the scene, including approaches that use an estimated interference
covariance matrix and other approaches that do not.
One of the first approaches to this problem, from [36], uses a mask over the presumed
clutter ridge of the angle-Doppler domain. This allows traditional CS-based solution
techniques to be applied outside that mask where targets are presumed sparse. The extent
of the clutter region may be reasonably estimated from platform motion and sensing
geometry. However, this limits minimum detectable velocity and does not readily extend
to electromagnetic interference (EMI) and other structured interference sources.
In [37], the clutter covariance is assumed known and a set of Capon beamforming
weights is built into the sparsity matrix to favor detection of targets and suppress detec-
tion of clutter. The objective vector is assumed to be block-sparse, with high response in
all snapshots. This is combined with an arbitrary random sensing matrix and tools from
convex optimization.
The approach in [38] uses the clutter covariance to modify the optimization norm
or, equivalently, to modify the system model. For a combined sensing and sampling
model matrix A and an interference covariance matrix R whose inverse $R^{-1}$ has
a Cholesky decomposition $P^H P$, a modified model matrix $\bar{A} = PA$ and a modified
measurement vector $\bar{y} = Py$ are introduced. These can be used in a standard sparse
recovery formulation:
$$\hat{x} = \arg\min_x \|\bar{A}x - \bar{y}\|_2^2 + \lambda \|x\|_1.$$
This is solved with the FISTA algorithm in the published work, but any desired
optimization routine could be applied.
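The equivalence between modifying the norm and modifying the model can be verified numerically. The sketch below uses a made-up covariance and random A, x, and y; the only structural assumption is that P is obtained from a Cholesky factorization of the inverse covariance, so that P^H P equals that inverse.

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 12, 20     # illustrative sizes

A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Hypothetical interference covariance: unit noise floor plus one strong component.
v = np.exp(2j * np.pi * 0.1 * np.arange(m)) / np.sqrt(m)
R = np.eye(m) + 100.0 * np.outer(v, v.conj())

R_inv = np.linalg.inv(R)
R_inv = (R_inv + R_inv.conj().T) / 2      # remove rounding asymmetry
L = np.linalg.cholesky(R_inv)             # NumPy returns lower L with R_inv = L L^H
P = L.conj().T                            # hence R_inv = P^H P

A_bar, y_bar = P @ A, P @ y               # modified model and measurements

r = A @ x - y
weighted = np.real(r.conj() @ R_inv @ r)              # covariance-weighted residual
whitened = np.linalg.norm(A_bar @ x - y_bar) ** 2     # plain residual in the new model
print(weighted, whitened)
```

Because P whitens the interference (P R P^H is the identity), any standard sparse solver can be applied to the modified pair without further changes.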
Using the statistics of the clutter to develop a whitening filter is also the approach
taken in [39]. In this case, the whitening filter is derived in the frequency domain without
any spatial degrees of freedom. The resultant technique is applied to an experimental
CS-based radar system and provides improvement up to a factor of the number of pulses.
None of the mentioned CS solution methods, as described, are equipped to handle
structured interference. If $\hat{R}$ is some estimate of the space–time covariance matrix of the
fully sampled interference, and $\tilde{R} = \hat{R} \otimes I_{N_f}$ is its expansion into the full measurement
domain, then the covariance matrix of the interference in the compressed domain can be
expressed as $\hat{R}_c = C \tilde{R} C^H$. This use of the dimensionally reduced covariance matrix
is a non-data-adaptive reduced-dimension STAP formulation. The literature of
reduced-dimension STAP provides a natural way to obtain an estimate of the scene from the
compressed measurements while including the covariance information. Define
$$\hat{x}_{\mathrm{cstap}} = S^H C^H \hat{R}_c^{-1} z. \tag{2.24}$$
This is analogous to the x̂ stap solution in that the matched filter is post-multiplied by
the covariance matrix inverse to whiten the interference. It is identical to the STAP
solution in the case that C is the identity matrix (i.e., the fully sampled case). To compute
this estimate, one incurs the cost of inverting the covariance matrix (or parametrically
estimating the inverse), but gains a good deal of clutter suppression as we will show in
our results. However, this technique does not necessarily favor sparse solutions.
Work in [40,41] proposes a covariance-aware CS (CA CS) that accounts for structured
interference in a CS framework. As in STAP methods, this is accomplished via the
interference covariance matrix inverse. The original $\ell_1$-regularized least-squares problem
of (2.16) can be modified as follows:
$$\hat{x}_{\mathrm{cacs}} = \arg\min_x \; (z - CSx)^H \hat{R}_c^{-1} (z - CSx) + \gamma \|x\|_1. \tag{2.25}$$
This generalization of (2.16) is identical (for some value of γ) in the clutter-free case.
In that case, $\hat{R}_c^{-1}$ is a scaled identity matrix. Here γ is set relative to the entries
in $\hat{R}_c$. Specifically, $\gamma = \lambda \|\hat{R}_c^{-1}\|_F$, using the Frobenius norm, where
$\|Q\|_F = \sqrt{\mathrm{Tr}(Q Q^H)}$ and Tr(·) gives the trace of a matrix. This relationship is used so as
to provide similar performance as in (2.16) for a selected value of λ.
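The clutter-free reduction can be checked with a short sketch. It assumes a compressed covariance equal to ν² times the identity and uses arbitrary random data; the function names are hypothetical. In that case the objective of (2.25) equals the objective of (2.16), with λ replaced by λ√m, divided by ν², so the two share a minimizer.

```python
import numpy as np

def cacs_gamma(R_c, lam):
    """gamma = lam * ||R_c^{-1}||_F, the scaling described in the text."""
    return lam * np.linalg.norm(np.linalg.inv(R_c), "fro")

def cacs_objective(x, z, A, R_c, gamma):
    """Objective of (2.25) with A standing in for the product C S."""
    r = z - A @ x
    return np.real(r.conj() @ np.linalg.solve(R_c, r)) + gamma * np.sum(np.abs(x))

def cs_objective(x, z, A, lam):
    """Objective of (2.16) with A standing in for the product C S."""
    return np.linalg.norm(z - A @ x) ** 2 + lam * np.sum(np.abs(x))

rng = np.random.default_rng(8)
m, n, lam, nu2 = 10, 25, 0.1, 0.5    # illustrative sizes, weight, and noise variance
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
z = rng.standard_normal(m) + 1j * rng.standard_normal(m)

R_c = nu2 * np.eye(m)                # clutter-free case: scaled identity
gamma = cacs_gamma(R_c, lam)
lhs = cacs_objective(x, z, A, R_c, gamma)
rhs = cs_objective(x, z, A, lam * np.sqrt(m)) / nu2
print(gamma, lhs, rhs)
```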
The first term in the objective function penalizes deviations from the measurements
in interference-free regions, but less so in interference regions. This term is akin to a
Mahalanobis distance. The second term penalizes large entries in the solution. Thus
[Figure 2.11 plot: horizontal axis Input SCR (dB), from −50 to 0; curves for CA CS (usf = 20), Full STAP, Full Adjoint, CS (usf = 20), CSTAP (usf = 20), and Comp. MF (usf = 20).]
Figure 2.11 The CA CS method subsamples the data just as the standard CS method does,
however it takes into account the covariance matrix that describes the interference structure.
By doing so, it improves the probability of detection over the CS case as well as beyond the
fully sampled, matched filter case that does not use the covariance information. These results
are shown with a probability of false alarm of 0.005.
the entries in the interfered region are unconstrained by the first term and they are
allowed to be driven to zero by the second term. The first term maintains fidelity to
the measurements in the clear areas while the second term promotes sparsity. Here the
γ parameter serves the same purpose as λ in (2.16): balancing the weight of the sparsity-
promoting $\ell_1$ norm against the fidelity-preserving $\ell_2$ norm.
Figure 2.11 compares the performance of various estimation techniques on problems
with clutter as the dominant interference. Here CA CS shows success beyond that of
the CS estimates. By using the covariance information, CA CS can even surpass the
performance of the fully sampled (but covariance-ignorant) matched filter. Still, the fully
sampled adaptive STAP filter remains the gold standard. Of particular interest is the 20×
under-sampled CA CS solution that achieves nearly the same detection performance
as the STAP solution. Also notable is the fact that the performance of the classically
formulated estimation methods (compressed STAP and compressed adjoint) is equal to
that of the CS-formulated ones (CA CS and CS).
When compared to the compressed STAP (CSTAP) methods, there is no advantage to
using CA CS. It is also evident that the compressed adjoint performs as well as the CS
solution in the interference limited case. This result holds for a range of under-sampling
factors, u, as well as for both the covariance-aware (CA CS vs. CSTAP) and covariance-
unaware (CS vs. compressed adjoint) cases. Furthermore, this performance equivalence
holds over a range of tested probability of false alarm settings.
2.8 Summary
As has been shown, clutter can be of great detriment to terrestrial and airborne radar sys-
tems. And mitigation of clutter can be a significant driver of radar design requirements,
in terms of dynamic range and space–time degrees of freedom. These statements hold
true whether in a classical or CS radar system. A rich body of literature that describes
techniques for mitigating clutter has led to the STAP filter that estimates interference
statistics for a range bin of interest from samples of nearby bins. This estimate is used to
whiten the bin under test prior to subsequent processing. This approach, leavened with
ideas from sparse estimation and low-rank matrix approximation, can be extended to the
compressed domain.
For the estimation of interference covariance information, the mechanisms of low-
rank matrix approximation have given new perspective to prior work in rank-reduced
and dimensionality-reduced STAP estimation. And recent results show that even in the
presence of sampling compression the interference covariance matrix can be accurately
estimated if it has a compact eigen-structure. This structure can be used with singular
value thresholding or other low-rank approximation techniques to recover the full and
the compressed covariance matrices to high accuracy.
The covariance information can, in turn, be used in the recovery of a sparse scene from
compressed measurements. The whitening approach taken in STAP can be extended into
the sparsity-favoring convex objective function by a reshaping of the distance norm. This
allows the full machinery of CS to be applied to the problem.
However, there is significant computational cost associated with estimation of a large
spatiotemporal covariance matrix through low-rank matrix approximation and
application of that covariance matrix to estimate the targets in an iterative optimization
algorithm. This cost is, at present, too high to see widespread adoption of compressed
measurements in the airborne radar domain.
So work remains in these areas. A number of promising avenues exist, including the
application of advances in online and streaming CS to clutter estimation and mitigation
problems, further incorporation of fundamental results in random matrix theory to help
set algorithm thresholds, and the improvement of random sampling for large arrays in
both the time and space domains.
References
[1] L. Brown, A Radar History of World War II: Technical and Military Imperatives. Institute
of Physics Publishing, 1999.
[2] F. R. Dickey, M. Labitt, and F. M. Staudaher, “Development of airborne moving target
radar for long range surveillance,” Aerospace and Electronic Systems, IEEE Transactions
on, vol. 27, no. 6, pp. 959–972, Nov. 1991.
[3] L. P. Goetz and J. D. Albright, “Airborne pulse-doppler radar,” IRE Transactions on Military
Electronics, vol. MIL-5, no. 2, pp. 116–126, Apr. 1961.
[4] L. Brennan and L. Reed, “Theory of adaptive radar,” Aerospace and Electronic Systems,
IEEE Transactions on, vol. 9, no. 2, pp. 237–252, Mar. 1973.
[5] M. A. Richards, Fundamentals of Radar Signal Processing. McGraw-Hill, 2005.
[6] D. Shnidman, “Radar detection in clutter,” Aerospace and Electronic Systems, IEEE
Transactions on, vol. 41, no. 3, pp. 1056–1067, July 2005.
[7] D. Shnidman, “Generalized radar clutter model,” Aerospace and Electronic Systems, IEEE
Transactions on, vol. 35, no. 3, pp. 857–865, July 1999.
[8] J. Billingsley, A. Farina, F. Gini, M. Greco, and L. Verrazzani, “Statistical analyses of
measured radar ground clutter data,” Aerospace and Electronic Systems, IEEE Transactions
on, vol. 35, no. 2, pp. 579–593, Apr. 1999.
[9] V. Anastassopoulos, G. Lampropoulos, A. Drosopoulos, and N. Rey, “High resolution radar
clutter statistics,” Aerospace and Electronic Systems, IEEE Transactions on, vol. 35, no. 1,
pp. 43–60, Jan. 1999.
[10] R. McAulay, “A theory for optimal MTI digital signal processing part i. receiver
synthesis,” Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA,
Tech. Rep. 1972-14, Feb. 1972. [Online]. Available:
www.ll.mit.edu/mission/aviation/publications/publication-files/technical_notes/McAulay_1972_TN-1972-14i_WW-18358.pdf.
[11] J. Ward, “Space-time adaptive processing for airborne radar,” MIT Lincoln Laboratory,
Lexington, MA, Tech. Rep. 1015, 1994.
[12] W. Melvin, “A STAP overview,” Aerospace and Electronic Systems Magazine, IEEE, vol. 19,
no. 1, pp. 19–35, Jan. 2004.
[13] J. Guerci, Space-Time Adaptive Processing for Radar. Artech House, 2003.
[14] I. Reed, J. Mallett, and L. Brennan, “Rapid convergence rate in adaptive arrays,” Aerospace
and Electronic Systems, IEEE Transactions on, vol. 10, no. 6, pp. 853–863, Nov. 1974.
[15] B. D. Carlson, “Covariance matrix estimation errors and diagonal loading in adaptive
arrays,” Aerospace and Electronic Systems, IEEE Transactions on, vol. 24, no. 4, pp. 397–
401, July 1988.
[16] J. Guerci, J. Goldstein, and I. Reed, “Optimal and adaptive reduced-rank STAP,” Aerospace
and Electronic Systems, IEEE Transactions on, vol. 36, no. 2, pp. 647–663, Apr. 2000.
[17] C. Peckham, A. Haimovich, T. Ayoub, J. Goldstein, and I. Reid, “Reduced-rank STAP
performance analysis,” Aerospace and Electronic Systems, IEEE Transactions on, vol. 36,
no. 2, pp. 664–676, Apr. 2000.
[18] R. Fa, R. de Lamare, and L. Wang, “Reduced-rank STAP schemes for airborne radar based
on switched joint interpolation, decimation and filtering algorithm,” Signal Processing,
IEEE Transactions on, vol. 58, no. 8, pp. 4182–4194, Aug. 2010.
[19] J. Tropp, J. Laska, M. Duarte, J. Romberg, and R. Baraniuk, “Beyond Nyquist: Efficient
sampling of sparse bandlimited signals,” Information Theory, IEEE Transactions on, vol. 56,
no. 1, pp. 520–544, Jan. 2010.
[20] L. Zhen, W. Xizhang, and L. Xiang, “CS-based moving target detection in random PRI
radar,” in Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE International,
July 2012, pp. 7476–7479.
[21] L. Carin, “On the relationship between compressive sensing and random sensor arrays,”
Antennas and Propagation Magazine, IEEE, vol. 51, no. 5, pp. 72–81, Oct. 2009.
[22] L. Carin, D. Liu, and B. Guo, “Coherence, compressive sensing, and random sensor arrays,”
IEEE Antennas and Propagation Magazine, vol. 53, no. 4, pp. 28–39, Aug. 2011.
[23] P. P. Vaidyanathan and P. Pal, “Sparse sensing with co-prime samplers and arrays,” Signal
Processing, IEEE Transactions on, vol. 59, no. 2, pp. 573–586, Feb. 2011.
[24] E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations
of Computational Mathematics, vol. 9, no. 6, pp. 717–772, Dec. 2009.
[25] E. J. Candès and Y. Plan, “Matrix completion with noise,” Proceedings of the IEEE, vol. 98,
no. 6, pp. 925–936, Apr. 2010.
[26] D. Gross, “Recovering low-rank matrices from few coefficients in any basis,” Information
Theory, IEEE Transactions on, vol. 57, no. 3, pp. 1548–1566, Mar. 2011.
[27] E. J. Candès and T. Tao, “The power of convex relaxation: Near-optimal matrix completion,”
Information Theory, IEEE Transactions on, vol. 56, no. 5, pp. 2053–2080, May 2010.
[28] Y. Chen, Y. Chi, and A. J. Goldsmith, “Exact and stable covariance estimation from
quadratic sampling via convex programming,” Information Theory, IEEE Transactions on,
vol. 61, no. 7, pp. 4034–4059, July 2015.
[29] S. Bahmani and J. Romberg, “Near-optimal estimation of simultaneously sparse and low-
rank matrices from nested linear measurements,” Information and Inference: A Journal of
the IMA, vol. 5, no. 3, p. 331, Sept. 2016.
[30] L. Bai, S. Roy, and M. Rangaswamy, “Compressive radar clutter subspace estimation using
dictionary learning,” in Radar Conference (RADAR), 2013 IEEE, Apr. 2013, pp. 1–6.
[31] K. Sun, H. Zhang, G. Li, H. Meng, and X. Wang, “A novel STAP algorithm using
sparse recovery technique,” in Geoscience and Remote Sensing Symposium, 2009 IEEE
International, vol. 5, July 2009, pp. V-336-V-339.
[32] K. Sun, H. Meng, Y. Wang, and X. Wang, “Direct data domain STAP using sparse
representation of clutter spectrum,” Signal Processing, vol. 91, no. 9, pp. 2222–2236, Sept.
2011.
[33] F. Pourkamali-Anaraki, “Estimation of the sample covariance matrix from compressive
measurements,” IET Signal Processing, vol. 10, no. 9, pp. 1089–1095, Dec. 2016.
[34] S. Sen, “Low-rank matrix decomposition and spatio-temporal sparse recovery for stap
radar,” Selected Topics in Signal Processing, IEEE Journal of, vol. 9, no. 8, pp. 1510–1523,
Dec. 2015.
[35] J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix
completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, Mar. 2010.
[36] I. Selesnick, S. Pillai, K. Y. Li, and B. Himed, “Angle-doppler processing using sparse regu-
larization,” in Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International
Conference on, Mar. 2010, pp. 2750–2753.
[37] Y. Yu, S. Sun, and A. P. Petropulu, “A capon beamforming method for clutter suppression in
colocated compressive sensing based MIMO radars,” in SPIE Conference Proceedings, vol.
8717, May 2013, p. 87170J. [Online]. Available: https://doi.org/10.1117/12.2015635.
[38] J. T. Parker and L. C. Potter, “A Bayesian perspective on sparse regularization for STAP
post-processing,” in Radar Conference, 2010 IEEE, May 2010, pp. 1471–1475.
[39] Y. C. Eldar, R. Levi, and A. Cohen, “Clutter removal in sub-nyquist radar,” IEEE Signal
Processing Letters, vol. 22, no. 2, pp. 177–181, Feb. 2015.
[40] P. B. Tuuk and S. L. Marple, “Compressed sensing radar amid noise and clutter using
interference covariance information,” Aerospace and Electronic Systems, IEEE Transactions
on, vol. 50, no. 2, pp. 887–897, Apr. 2014.
[41] P. B. Tuuk and S. L. Marple, “Compressed sensing radar amid noise and clutter,” in Signals,
Systems and Computers, 2012 Conference Record of the Forty-Sixth Asilomar Conference
on, Nov. 2012, pp. 446–450.
3 RFI Mitigation Based on Compressive
Sensing Methods for UWB
Radar Imaging
Tianyi Zhang, Jiaying Ren, Jian Li, David J. Greene, Jeremy A. Johnston, and Lam
H. Nguyen
3.1 Introduction
Ultra-wideband (UWB) radar operating at low frequencies (for example, from under
100 MHz to several GHz) has been used in a wide range of applications, including
landmine and unexploded ordnance (UXO) detection using ground-penetrating radar
(GPR), imaging with foliage-penetrating (FOPEN) radar, as well as detecting hidden
humans or objects via through-wall imaging [1]. Examples of such radar systems,
including the US Army Research Laboratory (ARL)’s synchronous impulse reconstruc-
tion (SIRE) system, which is a forward-looking GPR (FLGPR) system, and the ARL’s
BoomSAR system, are shown in [1]. These UWB systems are important to both military
and civilian applications.
One significant challenge for the proper operation of a UWB radar system is dealing
with the severe radio frequency interference (RFI) it encounters, since there are
many competing users within the UWB frequency range in which it operates. Typical
RFI sources include FM radio transmitters, TV broadcast transmitters, and cellular phones;
their operating frequency bands tend to overlap with those of UWB radar systems [2].
Figure 3.1 shows an example of the spectrum of a radar echo signal from a single pulse
repetition interval (PRI), the spectrum of an RFI source (containing strong narrowband
interferers), and the spectrum of the radar echo signal merged with the RFI. Note that
the presence of RFI causes significant distortions to the spectrum of the received signal.
The RFI therefore poses a significant hindrance to the proper operation of
UWB radar in terms of reduced signal-to-noise ratio and degraded radar image quality,
and effective RFI mitigation is critically important for the proper functioning of
a UWB radar system.
RFI mitigation is a notoriously challenging problem because RFI signals are dif-
ficult to predict and model accurately due to their dynamic range and diverse mod-
ulation schemes. Many methods have been developed for RFI mitigation, including
RFI suppression via filtering techniques and RFI extraction based on RFI estimation.
The former suppression approaches, including notch-filtering, subband filtering and
adaptive filtering, though popular due to their simplicity, usually introduce sidelobes
in the time-domain and suffer from filter transients and reduced data length [2–6]. The
latter RFI extraction methods are composed of techniques based on, for example, para-
metric modeling [7], spectral decomposition [8], eigensubspace decomposition [9,10],
[Figure 3.1 plot: Amplitude (dB) versus Frequency (MHz), from 0 to 1500 MHz; curves for Original SAR Data, RFI data, and SAR+RFI.]
Figure 3.1 Comparison of spectrum obtained from experimental data collected by ARL’s
BoomSAR.
and independent component analysis (ICA) [10,11]. Most of these methods perform
satisfactorily only under certain assumptions or constraints, which are often invalid
in practice because complex RFI sources are difficult to model.
For instance, principal-component-based techniques, such as ICA
and eigendecomposition, heavily depend on orthogonal subspaces and have difficulty in
distinguishing between RFI sources and UWB radar echoes when they have similar
power levels within the same subspace [12].
The recent development of compressive sensing (CS) theory has stimulated numerous
investigations on exploiting sparsity and low-rank properties for RFI mitigation. Early
sparsity-based recovery methods solve the RFI mitigation problem by modeling both
the desired UWB radar echo signal and the RFI sources as sparse with respect to well-
designed dictionaries [1,12]. These methods work well but suffer from a significant
drawback: they require an additional dictionary-learning step based on the measured or
estimated data. In [13,14], an improved method based on a joint sparse and low-rank
model is proposed. By simply exploiting the low-rank property of the RFI sources, the
method does not require any specific prior knowledge on the interferences. The robust
principal component analysis (RPCA) approach can be used to extract RFI sources from
the observed data [15]. More specifically, this approach models the RFI contamination
across multiple PRIs within a coherent processing interval (CPI) using a general, low-
rank structure while treating the UWB radar echo signals as sparse impulse outliers.
Unlike the previous CS-based approaches, this technique is completely adaptive to
highly time-varying operating environments and does not require any prior knowledge
on the dictionary for the desired UWB radar signals and the unwanted RFI sources.
Additionally, the RPCA-based RFI mitigation method can be easily incorporated into
diverse UWB radar systems as a preprocessing stage before further signal processing
and image formation [15]. However, RPCA requires the fine-tuning of a user parameter,
which is nontrivial in practical applications due to the lack of prior information on the
RFI sources and the desired radar echo signals.
Since the desired UWB radar echo signals are relatively weak with a flat spectrum and
RFI sources are typically strong, with sparse, narrow spectral peaks, the RFI sources can
be approximately modeled as sparse spectral lines. The parameters of RFI sources can
be estimated using the CLEAN algorithm, which is a conceptually and computationally
simple approach widely used in diverse applications [16–19]. After parameterization,
the RFI sources can be extracted out from the observed RFI-contaminated data. More-
over, the CLEAN algorithm can be used with the Bayesian information criterion (BIC)
[20–22] to estimate the number of spectral lines needed to model the RFI sources.
Unlike the aforementioned RPCA method, the CLEAN-BIC approach does not require
the selection of any user parameter and hence can be easily used in practical applica-
tions. However, the recovered UWB radar echoes that were obtained by the CLEAN-
BIC approach are not sparsified, and the resulting radar images appear noisy in the
presence of severe RFI and noise contaminations.
To take advantage of the merits of both RPCA and CLEAN-BIC, we consider a hybrid
method, referred to as HM. We first utilize the CLEAN-BIC algorithm to estimate the
signal-to-interference ratio (SIR) of the received RFI-contaminated data. If the estimated
SIR is above a certain threshold, we use CLEAN-BIC to recover the desired UWB radar
echo signals. Otherwise, we use RPCA with the user parameter recommended in [23]
for RFI mitigation.
Furthermore, we introduce a framework of choosing the proper user parameter for the
RPCA method for RFI mitigation via first using CLEAN-BIC to estimate the SIR. The
RPCA algorithm, with its user parameter determined by the estimated SIR, is then used
for RFI mitigation. We refer to this approach as RPCA-CB.
Finally, both simulated and experimental results are presented to evaluate the RFI mit-
igation performance of the aforementioned algorithms in the presence of various levels
of RFI contaminations. Specifically, our experiments are conducted using the measured
RFI-free radar echo data set with two different RFI data sets: a simulated RFI-only
data set and measured RFI-only data set. The measured UWB data set was collected by
the ARL using their impulse-based, low-frequency UWB BoomSAR system covering
a frequency band from approximately 50 MHz to 1150 MHz. The UWB BoomSAR was
mounted on a platform that emulated an airborne geometry. Two transmitters and two
receivers were used to collect data in different polarizations. The measured data used
in these experiments was configured in a horizontal transmit, horizontal receive (HH)
polarization. The simulated RFI-only data set has the RFI sources modeled as a sum
of sinusoids, whereas the measured RFI-only data set is collected by the ARL radar
receiver with the antenna pointing toward Washington, DC [1,12]. We show that RPCA-
CB outperforms the aforementioned RPCA, CLEAN-BIC, and HM algorithms.
Notation: Most of the notation used in this chapter is listed in the list of Symbols
in the Preface. For this chapter, we need to add some definitions. $\det(\cdot)$ denotes the
determinant of a matrix. $x_{k\cdot}$ and $x_{\cdot k}$ refer to the k-th row and k-th column of matrix
X, respectively. $R \in \mathbb{C}^{N \times M}$ denotes the complex-valued N × M matrix. For a matrix,
$\|\cdot\|_p$ means the element-wise $\ell_p$ norm of this matrix, $\|\cdot\|_F$ is the Frobenius norm of a
matrix, $\|\cdot\|_\infty$ means the infinity norm of a matrix, and $\|\cdot\|_*$ denotes the nuclear norm of
a matrix, that is, the sum of the singular values of the matrix. The subscript of I denotes
the size of the identity matrix. To avoid confusion, we also use $\hat{(\cdot)}$ to denote the estimate
of a parameter. $\langle X, Y \rangle = \mathrm{tr}(X^T Y)$ denotes the inner product of two matrices X and Y.
3.2 RPCA for RFI Mitigation
3.2.1 Problem Formulation

Consider an RFI-contaminated UWB radar system, with M PRIs within a CPI and N
samples, referred to as fast-time samples, per PRI. Then, the observed data, denoted by
matrix $Y \in \mathbb{R}^{N \times M}$, can be modeled as follows:
\[ Y = X + R, \tag{3.1} \]
where each column $y_{\cdot m}$ of Y denotes a data vector within a PRI with N fast-time
samples, composed of the desired UWB radar echo signal $x_{\cdot m}$ and the RFI-only signal
$r_{\cdot m}$, $m = 1, 2, \ldots, M$. Here, m is the PRI index, i.e., the slow-time index. Our goal is
to extract the desired UWB radar signal X and the RFI signal R from the observed
RFI-contaminated data Y .
We exploit two main assumptions to accomplish the RFI mitigation task: (1) the
desired UWB radar echo matrix X is sparse and (2) the RFI matrix R is low rank [15].
The sparse nature of the desired UWB radar echo matrix is confirmed and illustrated in
Figure 3.2, where a typical example of the UWB impulse radar echoes from multiple
PRIs, i.e., across multiple slow-time indices, within a CPI, as measured by an ARL
radar receiver in the absence of RFI, is depicted. These measured UWB radar echo
signals contain occasional narrow backscattered pulses, which indicate the distances
between the targets and the radar and the reflection coefficients of the targets. Due to
the sparsity of strong targets, X is in general quite sparse in the fast-time domain. It
is worth mentioning that this sparse property of the radar echo signal is also valid for
stepped-frequency or chirp UWB radar systems, since the echoes received by these radar
systems can be easily converted into sparse narrow pulses through pulse compression.
The low-rank property of the RFI sources has been observed and utilized in [13,14].
Figure 3.3 gives the fast-time RFI-only spectra from multiple PRIs within a CPI using
data measured by the ARL radar. RFI sources, such as FM radios, digital TV and cellular
phones, tend to have strong sparse peaks in the fast-time frequency domain, whereas, in
contrast, the entire frequency band of the UWB radar system is occupied by the radar
echoes. Figure 3.4 shows that the singular values of the measured RFI-only data matrix
R decrease rapidly and the bottom 90% of these singular values are zero or close to
zero, confirming again that R possesses the low-rank property. Within a small time
window, such as within a CPI, the low-rank property of the RFI sources appears to be
due to the sinusoidal carrier approximations resulting from various modulation schemes
popular in today’s wireless communications systems [15].
[Figure 3.2 image: Fast-Time Samples (vertical axis, 200 to 2000) versus Slow-Time Index (horizontal axis, 200 to 1800); scale from 0 to −35 dB.]
Figure 3.2 An example of a measured ARL UWB impulse radar echo signal in the fast-time vs.
slow-time domain in the absence of RFI.
[Figure 3.3 image: Frequency (Hz, ×10^8, vertical axis, 0 to 15) versus Slow-Time Index (horizontal axis, 200 to 1800); scale from 0 to −50 dB.]
Figure 3.3 An example of fast-time RFI spectrum vs. slow-time index of the RFI-only data
measured by the ARL radar receiver.
[Figure 3.4 plot: Singular Value (vertical axis) over a horizontal axis running from 0 to 1800.]
Figure 3.4 The singular values of the RFI-only matrix R measured by the ARL radar receiver.
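The link between sinusoidal carriers and low rank can be checked on synthetic data. The sketch below is a toy model only: it assumes K carriers whose frequencies stay fixed over the CPI while the amplitude and phase vary from PRI to PRI, and the sizes and frequency values are arbitrary choices. Under that assumption each carrier contributes a cosine and a sine component, so the rank of the matrix cannot exceed 2K.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 256, 64, 3                       # fast-time samples, PRIs, carriers (assumed)
omega = np.array([0.3, 1.1, 2.0])          # carrier frequencies in rad/sample (assumed)
n = np.arange(N)[:, None]                  # column of fast-time indices

R = np.zeros((N, M))
for k in range(K):
    amp = rng.uniform(0.5, 2.0, size=M)            # per-PRI amplitude
    phase = rng.uniform(0.0, 2.0 * np.pi, size=M)  # per-PRI phase
    R += amp * np.cos(omega[k] * n + phase)        # broadcasts to an N x M block

singular_values = np.linalg.svd(R, compute_uv=False)
numerical_rank = int(np.sum(singular_values > 1e-10 * singular_values[0]))
print(numerical_rank, "of", min(N, M))
```

Measured RFI is only approximately of this form, which is consistent with singular values that decay quickly rather than vanish exactly.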
3.2.2 RPCA
RPCA has been widely used to recover the low-rank and sparse components from their
mixtures [23]. A natural choice of extracting X and R from the collected data Y is to
use the following RPCA optimization metric [15]:
\[ \min_{X,R} \; \|R\|_* + \zeta \|X\|_1 \quad \text{s.t.} \quad Y = X + R, \tag{3.2} \]
where $\|\cdot\|_*$ denotes the nuclear norm that promotes the low rank property, $\|\cdot\|_1$
represents the element-wise $\ell_1$ norm, which is a sparsity-enforcing metric, and ζ is a
user parameter used to balance the trade-off between the two components. The user
parameter ζ is recommended to be $\zeta_0 = 1/\sqrt{\max\{N, M\}}$ in [24].
Following from [24], we employ the augmented Lagrange multiplier (ALM) method
to solve the RPCA problem efficiently. First, we rewrite the RPCA problem (3.2) as
follows:
\[ \min_{X,R} \; \|R\|_* + \zeta \|X\|_1 \quad \text{s.t.} \quad Y - X - R = 0. \tag{3.3} \]
Then, the Lagrangian function is given by:
\[ L(X, R, Z, \mu) = \|R\|_* + \zeta \|X\|_1 + \langle Z, Y - X - R \rangle + \frac{\mu}{2} \|Y - X - R\|_F^2, \tag{3.4} \]
where Z is the Lagrange multiplier and μ > 0 is the penalty parameter.
Algorithm 1 (RPCA via the Inexact ALM method)
Input: Observation matrix $Y \in \mathbb{R}^{N \times M}$, and tuning parameter $\zeta$
1: $Z_0^* = Y / J(Y)$, where $J(Y) = \max(\|Y\|_2, \zeta^{-1} \|Y\|_\infty)$;
2: $X_0 = 0$; $\mu_0 > 0$; $k = 0$.
3: while not converged do
4: // Lines 5-6 solve $R_{k+1}^* = \arg\min_R L(X_k, R, Z_k, \mu_k)$.
5: $(U, \Sigma, V) = \mathrm{svd}(Y - X_k + \mu_k^{-1} Z_k)$;
6: $R_{k+1} = U S_{\mu_k^{-1}}[\Sigma] V^T$, where $S_\varepsilon[\Sigma]_{ij} = \max\{1 - \varepsilon / |\Sigma_{ij}|, 0\} \, \Sigma_{ij}$;
7: // Line 8 solves $X_{k+1} = \arg\min_X L(X, R_{k+1}, Z_k, \mu_k)$.
8: $X_{k+1} = S_{\zeta/\mu_k}[Y - R_{k+1} + \mu_k^{-1} Z_k]$
9: $Z_{k+1} = Z_k + \mu_k (Y - R_{k+1} - X_{k+1})$.
10: Update $\mu_k$ to $\mu_{k+1}$
11: $k \leftarrow k + 1$
12: end while
Output: $(X_k, R_k)$.
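The steps of Algorithm 1 can be sketched in NumPy as follows. Several details are assumptions rather than part of the algorithm as stated: the penalty update is taken to be geometric (μ multiplied by `rho = 1.5` each iteration), the initial value is `mu = 1.25 / ||Y||_2`, the stopping rule is a relative Frobenius-norm residual, and the infinity norm in J(Y) is interpreted as the largest absolute entry. The function names and the synthetic test matrix are hypothetical.

```python
import numpy as np

def shrink(A, eps):
    """Element-wise soft threshold S_eps[A] for real-valued input."""
    return np.sign(A) * np.maximum(np.abs(A) - eps, 0.0)

def rpca_inexact_alm(Y, zeta=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split Y into sparse X and low-rank R following the steps of Algorithm 1."""
    N, M = Y.shape
    if zeta is None:
        zeta = 1.0 / np.sqrt(max(N, M))          # recommended default zeta_0
    norm_two = np.linalg.norm(Y, 2)              # largest singular value
    norm_inf = np.abs(Y).max()                   # assumed meaning of ||Y||_inf
    Z = Y / max(norm_two, norm_inf / zeta)       # line 1
    X = np.zeros_like(Y)                         # line 2
    mu = 1.25 / norm_two                         # assumed initial penalty
    norm_fro = np.linalg.norm(Y, "fro")
    for _ in range(max_iter):
        # Lines 5-6: singular value thresholding gives the low-rank update.
        U, s, Vt = np.linalg.svd(Y - X + Z / mu, full_matrices=False)
        R = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Line 8: element-wise shrinkage gives the sparse update.
        X = shrink(Y - R + Z / mu, zeta / mu)
        residual = Y - R - X
        Z = Z + mu * residual                    # line 9
        mu = rho * mu                            # line 10 (assumed update rule)
        if np.linalg.norm(residual, "fro") <= tol * norm_fro:
            break
    return X, R

# Synthetic check (assumed sizes): rank-2 matrix plus 3% sparse outliers.
rng = np.random.default_rng(3)
N, M = 80, 60
R_true = rng.standard_normal((N, 2)) @ rng.standard_normal((2, M))
X_true = np.zeros((N, M))
mask = rng.random((N, M)) < 0.03
X_true[mask] = rng.choice([-10.0, 10.0], size=int(mask.sum()))
Y = X_true + R_true
X_hat, R_hat = rpca_inexact_alm(Y)
rel_err = np.linalg.norm(R_hat - R_true) / np.linalg.norm(R_true)
```

In the RFI setting, `R_hat` plays the role of the estimated interference and `X_hat` that of the recovered radar echoes; passing a different `zeta` changes the balance between the two terms.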
There are two types of ALM algorithms that can be used to solve the RPCA problem:
the exact ALM algorithm and the inexact ALM algorithm [24]. Compared with the
inexact ALM algorithm, which updates $X_k^*$ and $R_k^*$ once when solving the sub-problem:
\[ (X_{k+1}^*, R_{k+1}^*) = \arg\min_{X,R} L(X, R, Z_k^*, \mu_k), \tag{3.5} \]
the exact ALM method, though performing slightly better due to using the iterative
thresholding approach to solve this sub-problem, requires a much longer computation
time even for moderate N and M. Therefore, we focus herein on using the inexact
ALM method. The detailed steps of the inexact ALM algorithm are summarized in
Algorithm 1 [24]. More implementation details can be found in [24].
In this section we provide initial evaluations of the RFI mitigation performance of
the RPCA algorithm using the measured data collected by ARL’s BoomSAR system.
(Further performance evaluations are given in Section 3.5.) The measured data consists
of the measured RFI-free radar echo data (see Figure 3.2) and the measured RFI-only
data (see Figure 3.3). We scale the RFI-only data based on the desired SIR value, i.e.,
$\|X\|_F^2 / \|R\|_F^2$, and add the scaled version of the RFI-only data to the RFI-free data X to obtain
the RFI-contaminated data Y.
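The scaling step can be written out as a small helper. This is a minimal sketch that assumes the SIR is quoted in decibels as ten times the base-10 logarithm of the power ratio above; the function name `scale_rfi_to_sir` and the random placeholder matrices are hypothetical and stand in for the measured data.

```python
import numpy as np

def scale_rfi_to_sir(X, R, sir_db):
    """Scale R so that 10*log10(||X||_F^2 / ||c R||_F^2) equals sir_db."""
    power_x = np.linalg.norm(X, "fro") ** 2
    power_r = np.linalg.norm(R, "fro") ** 2
    c = np.sqrt(power_x / (power_r * 10.0 ** (sir_db / 10.0)))
    return c * R

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 40))      # placeholder for the RFI-free echoes
R = rng.standard_normal((100, 40))      # placeholder for the RFI-only data
R_scaled = scale_rfi_to_sir(X, R, sir_db=-10.0)
Y = X + R_scaled                        # RFI-contaminated data
achieved_db = 10.0 * np.log10(np.linalg.norm(X) ** 2 / np.linalg.norm(R_scaled) ** 2)
```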
By using all the data within the CPI, we can form synthetic aperture radar (SAR)
images. Figure 3.5 shows the original RFI-free SAR image that was obtained from
the data in Figure 3.2. Figure 3.6 shows the RFI-contaminated SAR image that was
obtained from Y for SIR = −10 dB. Figures 3.7 and 3.8 compare the recovered
SAR images after RFI mitigation using the RPCA method with different ζ values for
SIR = −10 dB. Visually, for SIR at −10 dB, RPCA with ζ = 0.4ζ0 significantly
outperforms RPCA with the recommended ζ0 , since the recovered SAR image that
was obtained by the latter is much sparser than the original RFI-free SAR image, with
many small targets missing. Consider now the case of more severe RFI; Figure 3.9
shows the RFI-contaminated SAR image for SIR = −30 dB. Figures 3.10 and 3.11
[Figures 3.5 to 3.11 are SAR images plotted as Down Range (meters), 40 to 110, versus Cross Range (meters), −740 to −540, on a scale from 0 to −35 dB.]
Figure 3.5 Original RFI-free SAR image obtained by using the back projection algorithm on the
measured RFI-free ARL radar data.
Figure 3.6 SAR image of the measured SAR data set contaminated by measured RFI signals with
a SIR of −10 dB.
Figure 3.7 Recovered SAR image from the measured SAR data set contaminated by measured
RFI signals with a SIR of −10 dB obtained by using RPCA with $\zeta = 0.4\zeta_0$.
Figure 3.8 Recovered SAR image from the measured SAR data set contaminated by measured
RFI signals with a SIR of −10 dB, obtained by using RPCA with $\zeta = \zeta_0$.
Figure 3.9 SAR image from the measured SAR data set contaminated by measured RFI signals
with a SIR of −30 dB.
Figure 3.10 Recovered SAR image from the measured SAR data set contaminated by measured
RFI signals with a SIR of −30 dB obtained by using RPCA with $\zeta = 0.4\zeta_0$.
Figure 3.11 Recovered SAR image from the measured SAR data set contaminated by measured
RFI signals with a SIR of −30 dB obtained by using RPCA with $\zeta = \zeta_0$.
compare the recovered SAR images after RFI mitigation using the RPCA method with
different ζ values for SIR = −30 dB. Note that most of the target features are still
retained in the resulting SAR image that was obtained by RPCA with $\zeta = \zeta_0$, whereas
the SAR image obtained by RPCA with $\zeta = 0.4\zeta_0$ has a fairly high noise level, with
only a few strong targets discernible. Therefore, it appears that the choice of the user
parameter ζ has a significant impact on the RFI mitigation performance of the RPCA
algorithm and the resulting SAR image quality. Thus, a fine-tuning of ζ based on the
SIR value is warranted. However, this parameter selection is not straightforward in
practical applications, since the SIR value is usually not known a priori.
3.3 CLEAN-BIC for RFI Mitigation
3.3.1 RFI Sinusoidal Model

Using the fact that the RFI sources tend to have strong, narrow peaks, we now consider
approximately modeling the narrowband RFI sources within each PRI as sparse spectral
lines in the frequency domain (see Figure 3.1). Since the UWB radar target echo signals
are relatively weak and have flat spectrum in the fast-time frequency domain, they, along
with other uncertainties and approximations, are modeled as noise. Within a PRI, the
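observed data, denoted by vector $y \in \mathbb{R}^N$, can be modeled as follows:
\[ y = x + r, \tag{3.6} \]
where $r \in \mathbb{R}^N$ denotes the RFI signal, which can be modeled as a sum of sinusoids:
\[
\begin{aligned}
r(n) &= \sum_{k=1}^{K_c} a_k \cos(\omega_k n + \phi_k) \\
     &= \sum_{k=1}^{K_c} \left[ \frac{a_k}{2} e^{j(\omega_k n + \phi_k)} + \frac{a_k}{2} e^{-j(\omega_k n + \phi_k)} \right] \\
     &= \sum_{k=1}^{2K_c} \tilde{a}_k e^{j(\tilde{\omega}_k n + \tilde{\phi}_k)}, \quad n = 0, 1, \ldots, N - 1
\end{aligned}
\tag{3.7}
\]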
with r(n) denoting the n-th element of r, $a_k$, $\omega_k$, $\phi_k$ denoting the amplitude, frequency
and phase of the k-th RFI source, and $K_c$ denoting the number of the RFI sources.
Since r(n) is real-valued, we have $\tilde{a}_k = \tilde{a}_{K_c+k} = a_k/2$, $\tilde{\omega}_k = -\tilde{\omega}_{K_c+k} = \omega_k$,
$\tilde{\phi}_k = -\tilde{\phi}_{K_c+k} = \phi_k$, $k = 1, \ldots, K_c$. In (3.6), $x \in \mathbb{R}^N$ denotes the desired UWB radar echo
signal within one PRI plus other uncertainties and approximations, and x will be treated
as additive noise when estimating the sinusoidal parameters in r. Our goal is to recover
x by estimating the sinusoidal parameters in r and removing the estimate $\hat{r}$ of r from
the RFI-contaminated data y.
To reduce approximation errors, the sinusoidal parameters are allowed to change from
one PRI to another. Hence the CLEAN-BIC algorithm presented in the next subsection
is applied to one PRI at a time.
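The equivalence between the real-valued and complex-valued forms in (3.7) can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the function names `rfi_real` and `rfi_complex` and the example amplitudes, frequencies, and phases are hypothetical values chosen only for illustration.

```python
import numpy as np

def rfi_real(n, a, omega, phi):
    """r(n) = sum_k a_k cos(omega_k n + phi_k), the first line of (3.7)."""
    arg = np.outer(omega, n) + phi[:, None]
    return (a[:, None] * np.cos(arg)).sum(axis=0)

def rfi_complex(n, a, omega, phi):
    """The same signal as 2*K_c complex exponentials, the last line of (3.7)."""
    a_t = np.concatenate([a / 2, a / 2])        # a~_k = a~_{Kc+k} = a_k / 2
    w_t = np.concatenate([omega, -omega])       # w~_k = -w~_{Kc+k} = w_k
    p_t = np.concatenate([phi, -phi])           # phi~_k = -phi~_{Kc+k} = phi_k
    arg = np.outer(w_t, n) + p_t[:, None]
    return (a_t[:, None] * np.exp(1j * arg)).sum(axis=0)

# Hypothetical example: K_c = 3 sources, N = 256 fast-time samples.
n = np.arange(256)
a = np.array([1.0, 0.5, 0.2])
omega = np.array([0.3, 1.1, 2.4])
phi = np.array([0.0, 0.7, -1.2])

r_real = rfi_real(n, a, omega, phi)
r_cplx = rfi_complex(n, a, omega, phi)   # imaginary part cancels up to rounding
```

The two representations agree because each conjugate pair of exponentials sums to one real cosine, which is why the complex-valued fit used later can still be mapped back to a real-valued RFI estimate.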
3.3.2 CLEAN-BIC
To remove the RFI sources, modeled as a sum of sinusoids, from the observed RFI-
contaminated data, we first estimate the sinusoidal parameters of these RFI sources.
CLEAN, with its name meaning “cleaning the undesired component from the data,” is
a simple and robust sinusoidal parameter estimation algorithm [18,19], first introduced
in [16]. From as early as 1988, CLEAN has been used for the extraction of a single
interference signal [17]. Since then, several works have considered using extensions of
CLEAN for RFI mitigation [25–28].
The basic idea of CLEAN is to first estimate the parameters of the strongest sinusoid
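by minimizing the following nonlinear least-squares (NLS) criterion:
$$\{\hat{\tilde{\omega}}_1, \hat{\tilde{\alpha}}_1\} = \arg\min_{\{\tilde{\omega}_1, \tilde{\alpha}_1\}} \|\mathbf{y} - \mathbf{w}(\tilde{\omega}_1)\tilde{\alpha}_1\|_2^2, \tag{3.8}$$
where $\tilde{\alpha}_k = \tilde{a}_k e^{j\tilde{\phi}_k}$, $k = 1, 2, \ldots, 2K_c$, and $\mathbf{w}(\tilde{\omega}) = [1 \;\; e^{j\tilde{\omega}} \;\; \cdots \;\; e^{j\tilde{\omega}(N-1)}]^T$. The minimization
of the cost function above with respect to $\tilde{\omega}_1$, $\tilde{\alpha}_1$ yields:
$$\hat{\tilde{\omega}}_1 = \arg\max_{\tilde{\omega}_1} |\mathbf{w}^H(\tilde{\omega}_1)\mathbf{y}|^2, \tag{3.9}$$
$$\hat{\tilde{\alpha}}_1 = \left. \frac{\mathbf{w}^H(\tilde{\omega}_1)\mathbf{y}}{N} \right|_{\tilde{\omega}_1 = \hat{\tilde{\omega}}_1}. \tag{3.10}$$
Hence, $\hat{\tilde{\omega}}_1$ is obtained as the location of the dominant peak of the scaled periodogram,
$|\mathbf{w}^H(\tilde{\omega}_1)\mathbf{y}|^2$, which can be efficiently computed by using the FFT with the data sequence
$\mathbf{y}$ padded with zeros. Note that we consider fitting $\mathbf{y}$ to a complex-valued sinusoid in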

(3.8) because we can implement (3.9) using computationally efficient FFT operations.
Note also that the accuracy of the frequency estimate of CLEAN can be improved
by using Newton’s method following the FFT so that the accuracy of the frequency
estimate of CLEAN is not limited to the grid size determined by the zero-padding that
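is used for the FFT operations. Once $\hat{\tilde{\omega}}_1$ is determined, calculating $\hat{\tilde{\alpha}}_1$ using (3.10) is
straightforward. We can obtain $\hat{\tilde{a}}_1$ and $\hat{\tilde{\phi}}_1$ from $\hat{\tilde{\alpha}}_1$ easily, too. Next, since we have
mentioned earlier that $\tilde{a}_k = \tilde{a}_{K_c+k} = a_k/2$, $\tilde{\omega}_k = -\tilde{\omega}_{K_c+k} = \omega_k$, $\tilde{\phi}_k = -\tilde{\phi}_{K_c+k} = \phi_k$,
$k = 1, \ldots, K_c$, due to the real-valued data vector $\mathbf{r}$, the dominant RFI source can be
estimated as
$$
\begin{aligned}
\hat{r}_1(n) &= \hat{\tilde{a}}_1 e^{j(\hat{\tilde{\omega}}_1 n + \hat{\tilde{\phi}}_1)} + \hat{\tilde{a}}_1 e^{-j(\hat{\tilde{\omega}}_1 n + \hat{\tilde{\phi}}_1)} \\
             &= \hat{a}_1 \cos(\hat{\omega}_1 n + \hat{\phi}_1), \qquad n = 0, 1, \ldots, N-1.
\end{aligned}
\tag{3.11}
$$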
This estimate is then subtracted from the RFI-contaminated data to obtain the residue data
as follows:
$$\mathbf{y}_2 = \mathbf{y} - \hat{\mathbf{r}}_1, \tag{3.12}$$
where $\hat{\mathbf{r}}_1$ is a vector formed from $\{\hat{r}_1(n)\}_{n=0}^{N-1}$. Then, the same process is repeated on
the residue data $\mathbf{y}_2$ to estimate the second-strongest sinusoid. For $k = 3, \ldots, K_c$, let
$\mathbf{y}_k = \mathbf{y}_{k-1} - \hat{\mathbf{r}}_{k-1}$. This process is repeated, to obtain $\hat{\mathbf{r}}_k$ from $\mathbf{y}_k$, until the desired
number of sinusoids Kc is reached.
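As an illustration, the CLEAN iteration in (3.8)–(3.12) can be sketched in Python with NumPy. This is a minimal sketch rather than the authors' implementation: the function names `clean_step` and `clean`, the zero-padding factor of 8, the restriction of the peak search to positive frequencies, and the test signal are assumptions made here. The Newton refinement mentioned above is omitted, so the frequency accuracy is limited to the FFT grid.

```python
import numpy as np

def clean_step(y, nfft):
    """Estimate the strongest real sinusoid in y from the zero-padded FFT peak."""
    N = len(y)
    Y = np.fft.fft(y, nfft)                      # w^H(omega) y on the grid 2*pi*m/nfft
    # Assumption: search positive frequencies only (the spectrum of real y is symmetric).
    m = int(np.argmax(np.abs(Y[1:nfft // 2]) ** 2)) + 1      # peak location, (3.9)
    omega = 2 * np.pi * m / nfft
    alpha = Y[m] / N                             # complex amplitude, (3.10)
    a, phi = 2 * np.abs(alpha), np.angle(alpha)  # a_1 = 2 * a~_1, phi_1 = phi~_1
    r_hat = a * np.cos(omega * np.arange(N) + phi)            # real sinusoid, (3.11)
    return a, omega, phi, r_hat

def clean(y, K, nfft=None):
    """Remove K sinusoids one at a time; return their parameters and the residue."""
    residue = np.asarray(y, dtype=float).copy()
    nfft = nfft or 8 * len(residue)              # assumed zero-padding factor
    params = []
    for _ in range(K):
        a, omega, phi, r_hat = clean_step(residue, nfft)
        params.append((a, omega, phi))
        residue = residue - r_hat                # y_k = y_{k-1} - r_hat_{k-1}, (3.12)
    return params, residue

# Hypothetical test signal: two sinusoids plus weak white noise.
rng = np.random.default_rng(0)
n = np.arange(512)
y = 1.0 * np.cos(0.8 * n + 0.3) + 0.4 * np.cos(2.0 * n - 1.0) + 0.01 * rng.standard_normal(512)
params, residue = clean(y, K=2)
```

Because the frequency estimate is confined to the FFT grid in this sketch, a small part of each sinusoid remains in the residue; a finer grid or a Newton step after the FFT would reduce that leftover.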
The CLEAN algorithm can be used with the BIC [20–22] to estimate the number of
sinusoids needed to model the RFI sources.
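The BIC rule, when assuming $K$ sinusoids in zero-mean white Gaussian noise with
unknown variance $\sigma^2$, has the following form [20,29,30]:
$$\mathrm{BIC}(K) = -2 \ln p(\mathbf{y}, \hat{\boldsymbol{\theta}}_K) + \ln[\det(\hat{\mathbf{J}}_K)], \tag{3.13}$$
where $p(\mathbf{y}, \hat{\boldsymbol{\theta}}_K)$ denotes the likelihood function of $\mathbf{y}$, $\boldsymbol{\theta}_K = [\boldsymbol{\gamma}_K^T, \sigma_K^2]^T$ denotes the
unknown parameter vector of $\mathbf{y}$, $\boldsymbol{\gamma}_K = [a_1, \ldots, a_K, \omega_1, \ldots, \omega_K, \phi_1, \ldots, \phi_K]^T$ denotes
the unknown sinusoidal parameter vector, and $\mathbf{J}_K$ is the following Fisher information
matrix (FIM):
$$\mathbf{J}_K = -E\left[ \frac{\partial^2 \ln p(\mathbf{y}, \boldsymbol{\theta}_K)}{\partial \boldsymbol{\theta}_K \, \partial \boldsymbol{\theta}_K^T} \right]. \tag{3.14}$$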
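The negative log-likelihood function of $\mathbf{y}$ has the form:
$$-2 \ln p(\mathbf{y}, \boldsymbol{\theta}_K) = N \ln 2\pi + N \ln \sigma_K^2 + \frac{\|\mathbf{y} - \breve{\mathbf{r}}_K\|_2^2}{\sigma_K^2}, \tag{3.15}$$
where $\breve{\mathbf{r}}_K$ is the same as $\mathbf{r}$, except that it corresponds to the assumed $K$ (instead of $K_c$)
sinusoids. The maximum likelihood (ML) estimates of $\boldsymbol{\gamma}_K$ and $\sigma_K^2$ can be obtained by
minimizing the negative log-likelihood function in (3.15):
$$\hat{\boldsymbol{\gamma}}_K = \arg\min_{\boldsymbol{\gamma}_K} \|\mathbf{y} - \breve{\mathbf{r}}_K\|_2^2, \tag{3.16}$$
$$\hat{\sigma}_K^2 = \left. \frac{1}{N} \|\mathbf{y} - \breve{\mathbf{r}}_K\|_2^2 \right|_{\boldsymbol{\gamma}_K = \hat{\boldsymbol{\gamma}}_K}. \tag{3.17}$$
Then, the corresponding value of the likelihood function is given by [20]:
$$-2 \ln p(\mathbf{y}, \hat{\boldsymbol{\theta}}_K) = \text{constant} + N \ln \hat{\sigma}_K^2, \tag{3.18}$$
where
$$\hat{\sigma}_K^2 = \frac{1}{N} \sum_{n=1}^{N} \left[ y(n) - \sum_{k=1}^{K} \hat{a}_k \cos(\hat{\omega}_k n + \hat{\phi}_k) \right]^2. \tag{3.19}$$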
Note that in the case of sinusoidal signals, we have the following approximation for
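sufficiently large values of $N$ [20,31,32]:
$$\mathbf{K}_N \hat{\mathbf{J}}_K \mathbf{K}_N \approx \mathbf{K}_N \mathbf{J}_K \mathbf{K}_N = O(1), \tag{3.20}$$
where
$$
\mathbf{K}_N =
\begin{bmatrix}
\dfrac{1}{N^{3/2}} \mathbf{I}_K & \mathbf{0} \\
\mathbf{0} & \dfrac{1}{N^{1/2}} \mathbf{I}_{2K+1}
\end{bmatrix},
\tag{3.21}
$$
with $\mathbf{I}_K$ denoting the $K \times K$ identity matrix. Then we obtain by a simple calculation:
$$
\begin{aligned}
\ln[\det(\hat{\mathbf{J}}_K)] &= \ln[\det(\mathbf{K}_N^{-2})] + \ln[\det(\mathbf{K}_N \hat{\mathbf{J}}_K \mathbf{K}_N)] \\
                              &= (2K + 1) \ln N + 3K \ln N + O(1) \\
                              &= (5K + 1) \ln N + O(1).
\end{aligned}
\tag{3.22}
$$
Hence, the BIC metric takes on the following form [20] for our problem of interest:
$$\mathrm{BIC}(K) = N \ln \hat{\sigma}_K^2 + 5K \ln N, \tag{3.23}$$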
where we have kept only the terms that depend on K.
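The metric in (3.23) is simple to evaluate once the residue after removing $K$ sinusoids is available. The following Python sketch, assuming NumPy, computes it from a residue vector; the function name `bic` is hypothetical.

```python
import numpy as np

def bic(residue, K):
    """BIC(K) = N ln(sigma_K^2) + 5 K ln N, with sigma_K^2 the mean squared residue."""
    residue = np.asarray(residue, dtype=float)
    N = residue.size
    sigma2 = np.mean(residue ** 2)      # residual variance estimate, as in (3.19)
    return N * np.log(sigma2) + 5 * K * np.log(N)
```

From (3.23), adding one more sinusoid lowers the metric only if $N \ln(\hat{\sigma}_{K+1}^2 / \hat{\sigma}_K^2) < -5 \ln N$, that is, only if the residual variance drops by more than the factor $N^{-5/N}$. This is how the penalty term guards against fitting sinusoids to noise.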

Because obtaining the ML estimate of γ is computationally expensive, we use
CLEAN instead. In practical applications, where the sinusoidal data model is in itself
an approximation, our empirical experience suggests that using CLEAN in lieu of ML
yields similar RFI extraction performance. Combined with the BIC rule, the steps of the
CLEAN-BIC algorithm for RFI removal can be summarized as follows:
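Algorithm 2 (CLEAN-BIC)
Input: Observation vector $\mathbf{y} \in \mathbb{R}^N$, maximum number of sinusoids $K_{\max}$.

1: Assume $K = 1$. Let $\mathbf{y}_1 = \mathbf{y}$. Estimate the sinusoidal parameters $\{\hat{a}_1, \hat{\omega}_1, \hat{\phi}_1\}$ from the
   original data vector $\mathbf{y}_1$ using the FFT followed by Newton's method for fine search.
   Determine $\hat{\mathbf{r}}_1$.
2: Calculate BIC(1) by (3.23).
3: for $K = 2 : K_{\max}$
4:    Compute $\mathbf{y}_K = \mathbf{y}_{K-1} - \hat{\mathbf{r}}_{K-1}$.
5:    Estimate the sinusoidal parameters $\{\hat{a}_K, \hat{\omega}_K, \hat{\phi}_K\}$ from $\mathbf{y}_K$.
6:    Calculate BIC($K$).
7: end
8: Determine $\hat{K}_c$ that minimizes BIC($K$), $K = 1, \ldots, K_{\max}$.
9: Obtain $\hat{\boldsymbol{\gamma}}_{\hat{K}_c}$, the estimate of $\boldsymbol{\gamma}$, and the corresponding $\hat{\mathbf{r}}_{\hat{K}_c}$, the estimate of $\mathbf{r}$.
10: $\hat{\mathbf{x}} = \mathbf{y} - \hat{\mathbf{r}}_{\hat{K}_c}$.
Output: $(\hat{\mathbf{x}}, \hat{\mathbf{r}}_{\hat{K}_c})$.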
Figure 3.12 BIC curve obtained via CLEAN-BIC for one PRI using the RFI-contaminated data
measured by the ARL radar for SIR = −30 dB. (Axes: BIC(K), in units of 10^4, versus K.)
Note that we need to iterate the algorithm until the iteration number Kmax is reached,
since the BIC curve is usually not a smooth convex function of K. Figure 3.12 shows
BIC(K) versus K for one PRI via CLEAN-BIC on the RFI-contaminated data that was
measured by the ARL radar when SIR = −30 dB.
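The steps of Algorithm 2 can be sketched in Python with NumPy as follows. This is a simplified sketch under stated assumptions, not the authors' code: the names `strongest_sinusoid` and `clean_bic` are hypothetical, the frequency search uses only a zero-padded FFT (factor 8, an arbitrary choice) without the Newton refinement, and the test data are synthetic white noise standing in for the radar echo.

```python
import numpy as np

def strongest_sinusoid(y, nfft):
    """Real sinusoid fitted at the zero-padded FFT peak, following (3.9)-(3.11)."""
    N = len(y)
    Y = np.fft.fft(y, nfft)
    m = int(np.argmax(np.abs(Y[1:nfft // 2]))) + 1   # positive frequencies only (assumption)
    alpha = Y[m] / N
    return 2 * np.abs(alpha) * np.cos(2 * np.pi * m / nfft * np.arange(N) + np.angle(alpha))

def clean_bic(y, k_max, nfft=None):
    """Run CLEAN for K = 1..k_max and keep the K that minimizes BIC(K) in (3.23)."""
    y = np.asarray(y, dtype=float)
    N = len(y)
    nfft = nfft or 8 * N
    residue, r_hat = y.copy(), np.zeros(N)
    bic_values, estimates = [], []
    for K in range(1, k_max + 1):
        r_k = strongest_sinusoid(residue, nfft)      # steps 1 and 5
        r_hat = r_hat + r_k
        residue = residue - r_k                      # step 4
        sigma2 = np.mean(residue ** 2)
        bic_values.append(N * np.log(sigma2) + 5 * K * np.log(N))   # steps 2 and 6
        estimates.append(r_hat.copy())
    # The full loop is needed because BIC(K) is not guaranteed to be convex in K.
    k_hat = int(np.argmin(bic_values)) + 1           # step 8
    r_best = estimates[k_hat - 1]                    # step 9
    return y - r_best, r_best, k_hat, np.array(bic_values)   # step 10

# Synthetic example: three strong sinusoids (RFI) on top of white noise (echo stand-in).
rng = np.random.default_rng(1)
n = np.arange(512)
x = 0.5 * rng.standard_normal(512)
r = 5 * np.cos(0.5 * n + 0.1) + 3 * np.cos(1.3 * n - 0.4) + 2 * np.cos(2.2 * n + 1.0)
y = x + r
x_hat, r_hat, k_hat, bic_values = clean_bic(y, k_max=10)
```

In this sketch the selected order can exceed the true number of sinusoids, because the grid-limited frequency estimates leave a small remainder of each sinusoid that later iterations may pick up.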
Figure 3.13 Fast-time RFI spectrum vs. slow-time index of the noise-free, simulated RFI data.
(Axes: frequency (Hz), in units of 10^8, versus slow-time index; color scale in dB.)
We now evaluate the performance of the CLEAN-BIC algorithm using the same
measured data set as used in Section 3.2. First, we simulate a set of noise-free RFI
data using 10 sinusoids. Figure 3.13 shows the fast-time spectrum versus slow-time
index image of the simulated RFI sources. The RFI-contaminated data is generated by
combining the measured radar echo data with the scaled simulated RFI data based on
the desired SIR. Figure 3.14 shows the contaminated SAR image at SIR = −10 dB.
Figure 3.15 shows the recovered SAR image at SIR = −10 dB. Figure 3.16 shows the
contaminated SAR image at SIR = −30 dB. Figure 3.17 shows the recovered SAR
image at SIR = −30 dB. Comparing Figures 3.16 and 3.17 with the original RFI-free
SAR image in Figure 3.5, we note that for the noise-free simulated RFI data, most targets
in the recovered SAR images are discernible and the noise levels of the recovered SAR
images are low, meaning that CLEAN-BIC does an excellent job of eliminating the
simple simulated and noise-free RFI sources even at very low SIR levels.
Next, we use CLEAN-BIC on the UWB radar echo data contaminated by the mea-
sured RFI signal (see Figure 3.3). Figures 3.18 and 3.19 show the recovered SAR images
after RFI mitigation using CLEAN-BIC for SIR = −10 dB and −30 dB, respectively.
Visually, for SIR = −10 dB, CLEAN-BIC performs well because it removes most of
the RFI signals and retains most targets. However, with a SIR of −30 dB, the recovered
SAR image that was obtained by CLEAN-BIC suffers from a high noise level, as
compared to its RPCA counterpart in Figure 3.11. The high noise level is due to the
fact that the measured RFI data inevitably contains noise (including thermal noise,
model approximation errors, and other uncertainties), which is amplified when the SIR
is decreased. Moreover, unlike RPCA, CLEAN-BIC does not make any attempt to
sparsify the radar echo signal, and returns a recovered SAR image with a high noise
level, especially when the SIR is low.

Figure 3.14 SAR image of the measured SAR data set contaminated by simulated RFI signals
with a SIR of −10 dB.
Figure 3.15 Recovered SAR image of the measured SAR data set contaminated by simulated RFI
signals with a SIR of −10 dB, obtained with CLEAN-BIC.
Figure 3.16 SAR image of the measured SAR data set contaminated by simulated RFI signals
with a SIR of −30 dB.
Figure 3.17 Recovered SAR image of the measured SAR data set contaminated by simulated RFI
signals with a SIR of −30 dB obtained by using CLEAN-BIC.
Figure 3.18 Recovered SAR image of the measured SAR data set contaminated by measured RFI
signals with a SIR of −10 dB, obtained by using CLEAN-BIC.
Figure 3.19 Recovered SAR image of the measured SAR data set contaminated by measured RFI
signals with a SIR of −30 dB, obtained by using CLEAN-BIC.
(Figures 3.14–3.19: vertical axis, down range in meters; color scale in dB.)
3.4 Enhanced Algorithms for RFI Mitigation
3.4.1 The Hybrid Method

To combine the merits of both RPCA and CLEAN-BIC, we consider a hybrid method
(HM). We first obtain the estimate of the RFI sources, $\hat{\mathbf{R}}$, via CLEAN-BIC. Then, the
SIR is estimated as $\|\mathbf{Y} - \hat{\mathbf{R}}\|_F^2 / \|\hat{\mathbf{R}}\|_F^2$. In our previous analysis of RPCA, we have shown
that RPCA with the recommended $\zeta_0$ works well when the SIR is low. We have also
shown that CLEAN-BIC does a satisfactory job at high SIR. Therefore, if the estimated
SIR is high, such as −10 dB or higher, we use the $\hat{\mathbf{R}}$ that was obtained via CLEAN-BIC
to recover the desired UWB radar echo signals via $\hat{\mathbf{X}} = \mathbf{Y} - \hat{\mathbf{R}}$. If the estimated SIR is
lower than −10 dB, we then use RPCA with the recommended $\zeta_0$ to obtain $\hat{\mathbf{X}}$.
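The switching rule of the hybrid method can be sketched as follows in Python with NumPy. This is only an illustration under assumptions: the function names are hypothetical, the estimated SIR is converted to dB as 10 log10 of the energy ratio, and `rpca` is a caller-supplied function that maps the data matrix to an echo estimate (no RPCA solver is implemented here).

```python
import numpy as np

def estimate_sir_db(Y, R_hat):
    """Estimated SIR in dB: 10 log10(||Y - R_hat||_F^2 / ||R_hat||_F^2)."""
    num = np.linalg.norm(Y - R_hat, 'fro') ** 2
    den = np.linalg.norm(R_hat, 'fro') ** 2
    return 10 * np.log10(num / den)

def hybrid_method(Y, R_hat, rpca, threshold_db=-10.0):
    """Use the CLEAN-BIC result at high estimated SIR, and RPCA otherwise."""
    sir_db = estimate_sir_db(Y, R_hat)
    if sir_db >= threshold_db:
        return Y - R_hat, sir_db, "CLEAN-BIC"    # X_hat = Y - R_hat
    return rpca(Y), sir_db, "RPCA"               # rpca is supplied by the caller

# Toy example with a placeholder in place of a real RPCA solver (assumption).
placeholder_rpca = lambda Y: np.zeros_like(Y)
X = np.ones((4, 4))
x_hi, sir_hi, used_hi = hybrid_method(X + np.ones((4, 4)), np.ones((4, 4)), placeholder_rpca)
x_lo, sir_lo, used_lo = hybrid_method(X + 10 * np.ones((4, 4)), 10 * np.ones((4, 4)), placeholder_rpca)
```

In the toy example the first call has an estimated SIR of 0 dB and keeps the CLEAN-BIC result, while the second has an estimated SIR of −20 dB and falls through to the RPCA branch.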
3.4.2 RPCA-CB
We further consider using CLEAN-BIC to determine the user parameter $\zeta$ for RPCA,
since it appears that $\zeta$ should be tuned based on the SIR. This method is referred to
as RPCA-CB (RPCA with its user parameter determined via CLEAN-BIC). Similar to
the HM method, we first use CLEAN-BIC to obtain $\hat{\mathbf{R}}$. Then, the user parameter $\zeta$
for RPCA is selected based on the estimated SIR. The process of RPCA-CB is
illustrated in Figure 3.20. More details on the selection of $\zeta$ based on the estimated SIR
are given in Section 3.5.
Figure 3.20 A flow graph depicting the connections between RPCA and CLEAN-BIC in
RPCA-CB.
3.5 Performance Evaluations
We evaluate the RFI mitigation performance of the aforementioned four methods and
compare them to each other in this section, using both simulated and experimentally
measured data.
3.5.1 Data Sets

Our performance evaluations are conducted using the measured RFI-free UWB radar
echo data set with two different RFI-only data sets: a realistically simulated RFI-only
data set and a measured noisy RFI-only data set. The measured RFI-free UWB radar
echo data set (see Figures 3.2 and 3.5) is collected by using the ARL BoomSAR system,
with a frequency band spanning from approximately 50 MHz to 1150 MHz. The UWB
BoomSAR is mounted on a platform that emulates an airborne geometry. Two transmitters
and two receivers are used to collect data in different polarizations. The measured
data used herein is configured in horizontal transmit, horizontal receive (HH)
polarization. The measured RFI-only data set (see Figure 3.3) is collected by the ARL
radar receiver with the antenna pointing toward Washington, DC [1,12]. Since the ARL
experimental hardware system is used to measure the RFI-only data (or more precisely,
the data that is free of the UWB radar echoes), the measured RFI-only data is inevitably
contaminated only by the hardware system noise. The realistically simulated RFI-only
data set (see Figure 3.21) is noise-free and composed of sinusoids estimated by using
the CLEAN-BIC algorithm on the measured ARL RFI-only data. Note that the realistically
simulated RFI-only data utilized herein is much more complex than that used in
Section 3.3 and consists of about 80 sinusoids per PRI. We note that Figure 3.21 resembles
its counterpart, obtained from measured data. All data sets have N = 2048 fast-time
samples and M = 1892 slow-time samples.

Figure 3.21 Fast-time RFI spectrum vs. slow-time index for the noise-free realistically simulated
RFI data. (Axes: frequency (Hz), in units of 10^8, versus slow-time index; color scale in dB.)
The received RFI-contaminated radar data is obtained by adjusting the power of the
RFI-only data based on the desired SIR and adding it to the RFI-free UWB radar echo
signal. Since the measured RFI-only data set contains noise, as we lower the SIR, we
are increasing the noise level of the received RFI-contaminated radar data as well. Note
that we have no prior information on the noise in the measured data.
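The construction of the RFI-contaminated data can be sketched as follows in Python with NumPy. The sketch assumes that the SIR is the ratio of the echo energy to the RFI energy expressed in dB as 10 log10, consistent with the SIR estimate used in Section 3.4.1; the function name `contaminate` and the random test matrices are hypothetical.

```python
import numpy as np

def contaminate(X, R, sir_db):
    """Scale R so that 10 log10(||X||_F^2 / ||c R||_F^2) = sir_db; return (X + c R, c R)."""
    px = np.linalg.norm(X) ** 2                  # Frobenius norm for 2-D arrays
    pr = np.linalg.norm(R) ** 2
    c = np.sqrt(px / (pr * 10 ** (sir_db / 10)))
    return X + c * R, c * R

# Hypothetical stand-ins for the RFI-free echo data and the RFI-only data.
rng = np.random.default_rng(2)
X = rng.standard_normal((8, 16))
R = rng.standard_normal((8, 16))
Y, R_scaled = contaminate(X, R, sir_db=-30.0)
achieved = 10 * np.log10(np.linalg.norm(X) ** 2 / np.linalg.norm(R_scaled) ** 2)
```

Since the whole RFI-only matrix is multiplied by the same factor, any noise contained in measured RFI-only data is scaled up together with the interference, which is the effect described in the paragraph above.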
3.5.2 Evaluation Metric

The SAR image is obtained using the back projection algorithm on the recovered radar
data after RFI mitigation. We utilize the signal-to-noise ratio (SNR) metric of the recovered
SAR image to benchmark the RFI mitigation performance:
$$\text{SNR Metric} = 20 \log_{10} \frac{\|\mathbf{Z}\|_F}{\|\mathbf{Z} - \hat{\mathbf{Z}}\|_F}, \tag{3.24}$$
where $\mathbf{Z}$ is the normalized, original, and RFI-free SAR image, and $\hat{\mathbf{Z}}$ is the normalized,
recovered SAR image.
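The metric in (3.24) can be computed directly from the two images. The following Python sketch assumes NumPy and assumes that both images have already been normalized by the caller, since the normalization itself is not specified here; the function name `snr_metric_db` is hypothetical.

```python
import numpy as np

def snr_metric_db(Z, Z_hat):
    """SNR metric of (3.24): 20 log10(||Z||_F / ||Z - Z_hat||_F), in dB."""
    Z = np.asarray(Z)
    Z_hat = np.asarray(Z_hat)
    return 20 * np.log10(np.linalg.norm(Z) / np.linalg.norm(Z - Z_hat))

# Toy example: a uniform 10% amplitude error gives an SNR metric of about 20 dB.
Z = np.ones((10, 10))
snr = snr_metric_db(Z, 0.9 * Z)
```

A larger value means that the recovered image is closer to the RFI-free reference; a negative value means that the error energy exceeds the energy of the reference image.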
3.5.3 Analysis of RPCA

First, we investigate the performance of RPCA by varying the user parameter ζ for vari-
ous SIR levels. Tables 3.1 and 3.2 compare the SNR metric of the recovered SAR images
that were obtained by applying RPCA to RFI-contaminated data over a wide range of
SIR levels. Apparently, for both the simulated and measured RFI-only data, the choice
of the user parameter ζ has a significant impact on the RFI mitigation performance. For
the given data sets, the proper selection of ζ (boldfaced in Tables 3.1 and 3.2) is related
to the SIR levels.
Table 3.1 The SNR metric of the recovered SAR images obtained by using RPCA with data obtained from
the ARL RFI-free radar data and the scaled simulated RFI-only data.

SNR Metric (dB), by user parameter ζ (rows) and SIR (columns)

ζ          SIR = 0 dB    SIR = −10 dB    SIR = −20 dB    SIR = −30 dB
0.4ζ0      9.22552       3.42188         −3.42708        −12.12427
0.6ζ0      10.76778      7.92373         4.78473         0.56811
0.8ζ0      7.47976       6.84925         5.93832         4.57629
1.0ζ0      4.94639       4.66251         4.38511         4.11642
1.2ζ0      3.49384       3.24972         2.99793         2.93142
Table 3.2 The SNR metric of the recovered SAR images obtained by using RPCA with data obtained from
the ARL RFI-free radar data and the scaled measured ARL RFI-only data.

SNR Metric (dB), by user parameter ζ (rows) and SIR (columns)

ζ          SIR = 0 dB    SIR = −10 dB    SIR = −20 dB    SIR = −30 dB
0.4ζ0      9.16536       3.09982         −4.20045        −13.2707
0.6ζ0      10.7068       7.40891         2.49460         −5.52660
0.8ζ0      7.49281       6.72988         4.22138         −3.01897
1.0ζ0      4.98311       4.76060         3.92003         −1.74045
1.2ζ0      3.53173       3.39350         3.11635         −0.95133
Figure 3.22 Exponential curve to fit the relationship between the SIR of the observed data and the
corresponding proper user parameter ζ. Data was obtained from combining the ARL RFI-free
radar data with the scaled measured ARL RFI-only data (ζ = βζ0, where β is the parameter
coefficient). (Axes: parameter coefficient β versus SIR.)
Note that the relationship between the proper ζ and the SIR should be nonlinear
because the appropriate ζ should approach infinity as the SIR approaches negative
infinity, and zero as the SIR approaches infinity. A natural choice is to use the following
exponential equation to express the relationship between the proper ζ denoted as ζ =
βζ0 and the SIR:
β = ηexp(−γSIR), (3.25)
where η and γ are the parameters that can be obtained by using, for example, the MATLAB
Curve Fitting Toolbox. Figure 3.22 gives the curve fitting for the RFI-contaminated
data that was obtained using the measured RFI-only data with the measured RFI-free
radar echo data, based on (3.25) (η = 0.5027, γ = 0.02761). It can be demonstrated that
the curve in Figure 3.22 also works well for the realistically simulated RFI-only data.
For the RPCA-CB algorithm, we use this curve to determine the RPCA user parameter
based on the estimated SIR that was obtained via CLEAN-BIC.

Table 3.3 The SNR metric of recovered SAR images obtained by using the
CLEAN-BIC algorithm. Data was obtained from combining the ARL RFI-free radar
data with the scaled, realistically simulated RFI-only data.

SIR (dB)           0          −10        −20        −30
SNR Metric (dB)    9.51874    6.54657    5.21395    1.40243

Table 3.4 The SNR metric of recovered SAR images obtained by using the
CLEAN-BIC algorithm. Data was obtained from combining the ARL RFI-free radar
data with the scaled, measured ARL RFI-only data.

SIR (dB)           0          −10        −20        −30
SNR Metric (dB)    9.44348    5.90590    1.41839    −6.64253
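The parameter selection rule of RPCA-CB, based on the fitted curve (3.25), can be sketched in Python with NumPy as follows. The constants are the fitted values quoted in the text (η = 0.5027, γ = 0.02761); the function names `beta_from_sir` and `zeta_from_sir` are hypothetical, and the value of ζ0 must be supplied by the caller.

```python
import numpy as np

ETA = 0.5027      # fitted value of eta quoted in the text
GAMMA = 0.02761   # fitted value of gamma quoted in the text

def beta_from_sir(sir_db, eta=ETA, gamma=GAMMA):
    """Parameter coefficient beta = eta * exp(-gamma * SIR), with SIR in dB, as in (3.25)."""
    return eta * np.exp(-gamma * np.asarray(sir_db, dtype=float))

def zeta_from_sir(sir_db, zeta0):
    """RPCA user parameter zeta = beta * zeta0 for a given (estimated) SIR in dB."""
    return beta_from_sir(sir_db) * zeta0

# Coefficients over the SIR range shown in Figure 3.22.
betas = beta_from_sir([-30.0, -20.0, -10.0, 0.0])
```

The coefficient grows as the SIR decreases, so stronger interference leads to a larger ζ. In RPCA-CB the input to this rule is the SIR estimated via CLEAN-BIC rather than the true SIR.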
3.5.4 Analysis of CLEAN-BIC

Next, Tables 3.3 and 3.4 compare the RFI mitigation performance using the SNR metric
for the CLEAN-BIC algorithm over a wide range of SIR levels. Compared with the
simple simulated RFI-only data used in Section 3.3, the realistically simulated RFI-only
data used here is composed of many sinusoids and the frequencies of some of the sinu-
soids are very closely spaced. Due to the limited resolution of CLEAN-BIC, it cannot
remove the sinusoids completely, even though the sinusoids are obtained by CLEAN-BIC
from the measured RFI-only data. As a result, the SNR metric of the recovered
SAR images of CLEAN-BIC decreases as the SIR level decreases. Moreover, in the
case of measured RFI-only data, the SNR metric of the recovered SAR images is more
severely affected by the SIR levels due to the presence of noise in the measured RFI-only
data and the sinusoidal model error.
3.5.5 Analysis of Enhanced Methods and Comparisons

Finally, we evaluate the RFI mitigation performance of the enhanced algorithms,
including HM and RPCA-CB, and compare their performance with their RPCA and
CLEAN-BIC counterparts. Herein, the user parameter ζ in RPCA is set as ζ0 , as
recommended in [24]. Since both of the enhanced algorithms are based on the estimated
SIR values that were obtained via CLEAN-BIC, we first show the SIR estimation
results of CLEAN-BIC in Table 3.5. Apparently, the CLEAN-BIC algorithm can
provide sufficiently accurate SIR estimates for the enhanced algorithms.
Table 3.5 The estimated SIR obtained via CLEAN-BIC for data obtained from combining the ARL RFI-free
radar data with the scaled realistically simulated and measured ARL RFI-only data.

Estimated SIR (dB), by RFI type (rows) and true SIR (columns)

True SIR (dB)     0          −10        −20         −30
Simulated RFI     1.6975     −8.7983    −18.8360    −26.5150
Measured RFI      1.8753     −8.3260    −16.1593    −19.6084
Figure 3.23 Recovered SAR image of the measured SAR data set contaminated by realistically
simulated RFI-only signals with a SIR of −10 dB obtained by using HM.
Figures 3.23–3.26 show the recovered SAR images using these two enhanced algo-
rithms with data obtained from combining the measured ARL radar data with realisti-
cally simulated RFI-only data for SIR = −10 dB and SIR = −30 dB, respectively. Com-
pared with the RFI-free case in Figure 3.5, we see that the enhanced algorithms work
well for both strong and weak realistically simulated RFI cases. Figure 3.27 compares
the SNR metric of the recovered SAR images for various SIR levels for the realistically
simulated RFI case. Clearly, RPCA achieves a relatively low SNR value with a SIR
value higher than −10 dB, and CLEAN-BIC has a low SNR value in the case of severe
RFI. However, the enhanced algorithms, which combine the merits of both RPCA and
CLEAN-BIC, have satisfactory performance over a wide range of SIR levels. Moreover,
the RPCA-CB algorithm has the best performance among all four algorithms.
Figure 3.24 Recovered SAR image of the measured SAR data set contaminated by realistically
simulated RFI-only signals with a SIR of −10 dB obtained by using RPCA-CB.
Figure 3.25 Recovered SAR image of the measured SAR data set contaminated by realistically
simulated RFI-only signals with a SIR of −30 dB obtained by using HM.
Figure 3.26 Recovered SAR image of the measured SAR data set contaminated by realistically
simulated RFI-only signals with a SIR of −30 dB obtained by using RPCA-CB.
Figure 3.27 Comparison of RFI mitigation performances for various SIR values using the
simulated RFI-only data and the measured ARL RFI-free UWB radar data.
(Axes: SNR (dB) versus SIR (dB); curves: RPCA, CLEAN-BIC, HM, RPCA-CB.)

Figures 3.28–3.31 present the recovered SAR images using these two enhanced algorithms
with the RFI-contaminated data that was obtained by combining the measured
RFI-free SAR data with the scaled measured RFI-only data for SIR = −10 dB and
SIR = −30 dB, respectively. Similar to the realistically simulated RFI case discussed
in the previous paragraph, the enhanced algorithms perform better than their RPCA
counterparts (see Figures 3.8 and 3.11) and their CLEAN-BIC counterparts (see Figures
3.18 and 3.19). Figure 3.32 compares the RFI mitigation performance using the SNR
metric of the recovered SAR images using all four methods for various SIR values for
the case of measured RFI data. Again, the enhanced algorithms outperform their RPCA
and CLEAN-BIC counterparts. Note also that RPCA-CB outperforms HM slightly at
high SIR values.

Figure 3.28 Recovered SAR image of the measured SAR data set contaminated by measured
RFI-only signals with a SIR of −10 dB obtained by using HM.
Figure 3.29 Recovered SAR image of the measured SAR data set contaminated by measured
RFI-only signals with a SIR of −10 dB obtained by using RPCA-CB.
Figure 3.30 Recovered SAR image of the measured SAR data set contaminated by measured
RFI-only signals with a SIR of −30 dB obtained by using HM.
Figure 3.31 Recovered SAR image of the measured SAR data set contaminated by measured
RFI-only signals with a SIR of −30 dB obtained by using RPCA-CB.
Figure 3.32 Comparison of RFI mitigation performances for various SIR values using the
measured ARL RFI-only data and the measured ARL RFI-free UWB radar data.
(Axes: SNR (dB) versus SIR (dB); curves: RPCA, CLEAN, HM, RPCA-CB.)
3.6 Conclusions
In this chapter, we proposed several sparse signal recovery methods for effective RFI
mitigation. We first demonstrated that the RFI sources are low-rank and sparse in the
frequency domain; in contrast, the desired UWB radar echoes are sparse in the time
domain. RPCA can be used to exploit these properties for effective RFI mitigation;
however, RPCA requires fine tuning of a user parameter that is nontrivial in practical
applications due to the lack of prior knowledge about the RFI sources and UWB radar
echoes. To avoid the user parameter tuning problem, we introduced the CLEAN-BIC
algorithm for RFI mitigation, via modeling the RFI sources within a PRI as the sum
of sinusoids. We have shown that CLEAN-BIC can be used to remove dominant RFI
sources effectively; however, since the sparse property of the desired UWB radar echoes
is not utilized by CLEAN-BIC, the resulting SAR images contain high noise levels,
especially at low SIR levels. To combine the merits of both RPCA and CLEAN-BIC, we
proposed a hybrid method, or HM, wherein the use of RPCA or CLEAN-BIC is based
on the estimated SIR value obtained via CLEAN-BIC. Furthermore, we have considered
determining the user parameter for the RPCA algorithm based on using the estimated
SIR value that was obtained via CLEAN-BIC; the resulting algorithm is referred to as
RPCA-CB. These enhanced algorithms were applied to RFI-contaminated data that was
obtained from combining the measured RFI-free radar echo data with either realistically
simulated or experimentally measured RFI-only data for performance evaluations. We
have shown that the enhanced methods outperformed both RPCA and CLEAN-BIC.
Finally, RPCA-CB has been shown to outperform HM slightly, especially at high SIR
levels.
3.7 Acknowledgment
This material is based upon work supported in part by the US Army Research Labora-
tory and the US Army Research Office under grant number W911NF-16-2-0223.
References
[1] L. H. Nguyen, T. Tran, and T. Do, “Sparse models and sparse recovery for ultra-wideband
SAR applications,” IEEE Transactions on Aerospace and Electronic Systems, vol. 50, no. 2,
pp. 940–958, 2014.
[2] T. Koutsoudis and L. A. Lovas, “RF interference suppression in ultrawideband radar
receivers,” in Algorithms for Synthetic Aperture Radar Imagery II. International Society
for Optics and Photonics, 1995, pp. 107–119.
[3] D. O. Carhoun, “Adaptive nulling and spatial spectral estimation using an iterated principal
components decomposition,” in 1991 International Conference on Acoustics, Speech, and
Signal Processing, 1991, pp. 3309–3312.
[4] H. Subbaram and K. Abend, “Interference suppression via orthogonal projections: a
performance analysis,” IEEE Transactions on Antennas and Propagation, vol. 41, no. 9,
pp. 1187–1194, 1993.
[5] V. T. Vu, T. K. Sjögren, M. I. Pettersson, and L. Håkasson, “An approach to suppress
RFI in ultrawideband low frequency SAR,” in Radar Conference, 2010 IEEE. IEEE, 2010,
pp. 1381–1385.
[6] X. Luo, L. M. H. Ulander, J. Askne, G. Smith, and P. O. Frolind, “RFI suppression in ultra-
wideband SAR systems using LMS filters in frequency domain,” Electronics Letters, vol. 37,
no. 4, pp. 241–243, 2001.
[7] T. Miller, L. Potter, and J. McCorkle, “RFI suppression for ultra wideband radar,” IEEE
Transactions on Aerospace and Electronic Systems, vol. 33, no. 4, pp. 1142–1156, 1997.
[8] X. Wang, W. Yu, X. Qi, and Y. Liu, “RFI suppression in SAR based on approximated spectral
decomposition algorithm,” Electronics Letters, vol. 48, no. 10, pp. 594–596, 2012.
[9] F. Zhou, R. Wu, M. Xing, and Z. Bao, “Eigensubspace-based filtering with application
in narrow-band interference suppression for SAR,” IEEE Geoscience and Remote Sensing
Letters, vol. 4, no. 1, pp. 75–79, 2007.
[10] F. Zhou and M. Tao, “Research on methods for narrow-band interference suppression
in synthetic aperture radar data,” IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing, vol. 8, no. 7, pp. 3476–3485, 2015.
[11] F. Zhou, M. Tao, X. Bai, and J. Liu, “Narrow-band interference suppression for SAR
based on independent component analysis,” IEEE Transactions on Geoscience and Remote
Sensing, vol. 51, no. 10, pp. 4952–4960, 2013.
[12] L. H. Nguyen and T. D. Tran, “Efficient and robust RFI extraction via sparse recovery,”
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9,
no. 6, pp. 2104–2117, 2016.
[13] L. H. Nguyen, M. D. Dao, and T. D. Tran, “Estimating unknown sparsity in compressed
sensing,” in 2014 IEEE International Conference on Image Processing (ICIP), 2014,
pp. 116–120.
[14] L. H. Nguyen, M. D. Dao, and T. D. Tran, “Joint sparse and low-rank model for radio-
frequency interference suppression in ultra-wideband radar applications,” in 2014 48th
Asilomar Conference on Signals, Systems and Computers, 2014, pp. 864–868.
[15] L. H. Nguyen and T. D. Tran, “RFI-radar signal separation via simultaneous low-rank and
sparse recovery,” in 2016 IEEE Radar Conference (RadarConf), 2016, pp. 1–5.
[16] J. A. Högbom, “Aperture synthesis with a non-regular distribution of interferometer
baselines,” Astronomy and Astrophysics Supplement Series, vol. 15, p. 417, 1974.
[17] J. Tsao and B. D. Steinberg, “Reduction of sidelobe and speckle artifacts in microwave
imaging: The CLEAN technique,” IEEE Transactions on Antennas and Propagation,
vol. 36, no. 4, pp. 543–556, 1988.
[18] P. T. Gough, “A fast spectral estimation algorithm based on the FFT,” IEEE Transactions on
Signal Processing, vol. 42, no. 6, pp. 1317–1322, 1994.
[19] J. Li and P. Stoica, “Efficient mixed-spectrum estimation with applications to target feature
extraction,” IEEE Transactions on Signal Processing, vol. 44, no. 2, pp. 281–295, 1996.
[20] P. Stoica and Y. Selen, “Model-order selection: a review of information criterion rules,”
IEEE Signal Processing Magazine, vol. 21, no. 4, pp. 36–47, 2004.
[21] T. Yardibi, J. Li, P. Stoica, M. Xue, and A. B. Baggeroer, “Source localization and
sensing: A nonparametric iterative adaptive approach based on weighted least squares,”
IEEE Transactions on Aerospace and Electronic Systems, vol. 46, no. 1, pp. 425–443, 2010.
[22] P. Stoica and P. Babu, “On the proper forms of BIC for model order selection,” IEEE
[23] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal
of the ACM, vol. 58, p. 11, 2011.
[24] Z. Lin, M. Chen, and Y. Ma, “The augmented Lagrange multiplier method for exact recovery
of corrupted low-rank matrices,” arXiv preprint arXiv:1009.5055, 2010.
[25] P. A. Fridman and W. A. Baan, “RFI mitigation methods in radio astronomy,” Astronomy &
Astrophysics, vol. 378, no. 1, pp. 327–344, 2001.
[26] A. Camps, J. Gourrion, J. Miguel Tarongí et al., “RFI analysis in SMOS imagery,” in
Geoscience and Remote Sensing Symposium (IGARSS), 2010 IEEE International. IEEE,
2010, pp. 2007–2010.
[27] X. Peng, F. Hu, F. He, D. Zhu, Y. Cheng, H. Hu, and T. Zheng, “An improved CLEAN
algorithm for RFI mitigation in aperture synthesis radiometers,” in 2017 IEEE International
Geoscience and Remote Sensing Symposium (IGARSS). IEEE, 2017, pp. 3448–3451.
[28] F. Hu, X. Peng, F. He, et al., “RFI mitigation in aperture synthesis radiometers using a
modified CLEAN algorithm,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 1,
pp. 13–17, 2017.
[29] G. Schwarz, “Estimating the dimension of a model,” The Annals of Statistics, vol. 6, no. 2,
pp. 461–464, 1978.
[30] J. Rissanen, “Modeling by shortest data description,” Automatica, vol. 14, no. 5,
pp. 465–471, 1978.
[31] P. Stoica and R. L. Moses, Introduction to Spectral Analysis. Prentice Hall, 1997, vol. 1.
[32] P. M. Djuric, “Asymptotic MAP criteria for model selection,” IEEE Transactions on Signal
4 Compressed CFAR Techniques
Laura Anitori and Arian Maleki
4.1 Introduction
In this chapter we study the problem of target detection from a set of compressive radar
measurements that are corrupted by additive white Gaussian noise. The complications in
the calculation of false alarm and detection probabilities that are caused by the nonlinear
nature of target recovery schemes in compressed sensing represent a major limitation in
the application of such techniques in real radar systems. In this chapter, we show how
recent advances in the asymptotic analysis of the recovery algorithms help us overcome
this challenge.
We first review some standard tools that are used in classical radar detection, such
as the matched filter and the constant false alarm rate (CFAR) processor. Then, we
clarify the main challenges one would face in implementing such schemes for target
detection from compressed measurements. We show how the recently developed
asymptotic analyses of recovery algorithms enable us to overcome these challenges, and
develop fully adaptive target detection schemes. Finally, we evaluate the performance
of these schemes through theoretical analyses and extensive simulations on synthetic
and experimental data.1
4.2 Radar Signal Model
Consider a one-dimensional radar, which measures the target echoes as a function of
range (or time delay) $x(t)$ over an observation interval $t \in (T_{\min}, T_{\max})$.² The radar
transmits a radio frequency (RF) signal modulated by a waveform $a(t)$, and, in the
absence of noise and interference, the received and demodulated signal can be
mathematically represented by the convolution
$$y(t) = C \int a(t - \tau)\, x(\tau)\, d\tau, \tag{4.1}$$
where C is a complex constant including the target radar cross section (RCS), phase
terms and propagation effects. The time delay variable t can be mapped to the range via
1 This chapter is based on chapters 2, 3, and 4 of [1] and ©[2013] IEEE. Reprinted, with permission,
from [2].
2 The extension to azimuth and Doppler domains can be easily obtained in a similar way; see for example
[3,4].
the equation $t = 2r/c$, where $c$ is the speed of light and $r$ is range (or distance). If there are
$k$ point targets located at ranges $r_i$, $i = 1,\ldots,k$, corresponding to time delays $\tau_i = 2r_i/c$,
the target reflectivity distribution can be expressed as $x(t) = \sum_{i=1}^{k} x_i\,\delta(t - \tau_i)$, where
$x_i$ is the $i$th target RCS [5–7]. Hence, the (complex) baseband received signal can be
rewritten as
$$y(t) = \sum_{i=1}^{k} x_i\, a(t - \tau_i), \tag{4.2}$$
where $x_i$, $i = 1,\ldots,k$, is a (complex) amplitude proportional to, among others, the target
RCS, target distance and transmitted power [8]. In the remainder of this chapter we
consider $|x_i|^2$ as the power received from a target at position $i$.
It is often convenient to discretize the range and represent signals in vector form.
To this end, let the vector $x$ represent the target response (or scene) at discretized range
bins,³ i.e., $r = [r_1,\ldots,r_n]$, with $r_1 = cT_{\min}/2$, $r_n = cT_{\max}/2 = r_1 + n\Delta R$, where $\Delta R$ is
the range bin size. Furthermore, assume that targets can only be present at locations
corresponding exactly to discrete grid points. Using the Nyquist sampling theorem, the
received signal $y(t)$ is sampled at a rate $f_s \ge B$, where $B$ is the bandwidth of the
transmitted signal. Then, the sampled received signal $y(t_l)$, $l = 1,\ldots,L$, in (4.2) can be
rewritten in vector form as
$$y = Ax = \sum_{i=1}^{k} x_i a_i, \tag{4.3}$$
where each column $a_i$, $i = 1,\ldots,n$, of the matrix $A$ is a time-delayed version of the
sampled transmitted waveform corresponding to the received signal from a target at
range bin $i$ (the $i$th model), and $x$ is a length-$n$ vector with amplitude $x_i$ at the indices $i$
corresponding to targets located at ranges $r_i$ and zero elsewhere. Taking the noise into
account, we obtain
$$y = Ax + n, \tag{4.4}$$
where we consider $n \sim \mathcal{CN}(0,\sigma^2 I)$.⁴
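The discrete model (4.3)–(4.4) can be illustrated with a small simulation. The following is a minimal sketch, assuming a hypothetical 13-chip random-phase waveform and a received window long enough to hold every delayed copy; the helper name `delayed_waveform_matrix` and all numerical values are illustrative choices, not taken from the chapter.

```python
import numpy as np

def delayed_waveform_matrix(a, n):
    """Return A whose i-th column is the sampled waveform `a` delayed by i samples.

    Columns are scaled to unit norm (an assumption made here for convenience).
    """
    L = a.size + n - 1                      # window long enough for every delay
    A = np.zeros((L, n), dtype=complex)
    for i in range(n):
        A[i:i + a.size, i] = a
    return A / np.linalg.norm(a)

rng = np.random.default_rng(0)
n, k, sigma = 64, 3, 0.1                    # range bins, targets, noise std (illustrative)

# Hypothetical waveform: random-phase code with 13 chips.
a = np.exp(2j * np.pi * rng.random(13))
A = delayed_waveform_matrix(a, n)

# Sparse scene: k point targets located exactly on grid points.
x = np.zeros(n, dtype=complex)
support = rng.choice(n, size=k, replace=False)
x[support] = np.exp(2j * np.pi * rng.random(k))

# Circular complex Gaussian noise with E|n_l|^2 = sigma^2, as in (4.4).
L = A.shape[0]
noise = sigma * (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
y = A @ x + noise
print(A.shape, np.count_nonzero(x))
```

Here each column of `A` is one "model" in the sense of (4.3), and only the `k` columns indexed by `support` contribute to `y`.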
4.3 Classical Radar Detection
4.3.1 The Matched Filter

It is well known that the matched filter (MF) is the filter that optimizes the signal-to-
noise ratio (SNR) of a known signal in white Gaussian noise; see, e.g., [9]. The impulse
response of the MF is a time-reversed and conjugated copy of the (known) transmitted
signal, and the range response of a point target after MF is given by the autocorrelation
function of the transmitted waveform. For a radar transmitting an unmodulated pulse,
3 Because of the relation between time delay and range, we will use the two interchangeably.
4 In case of clutter or correlated noise, prewhitening filters are often applied to obtain the formulation in
(4.4). In that case, the matrix A includes also the pre-whitening.
the range resolution is given by $\delta R = cT/2$, where $T$ is the pulse duration. Hence, improving the range resolution
requires shortening the pulse duration, and results in reduced transmitted energy (for
a given fixed peak power). A common way to improve resolution without reducing the
pulse length is to use frequency or phase modulated pulses such as linear frequency
modulation (LFM) or chirp waveforms, Barker codes, and pseudorandom noise (PN)
sequences. In this case, the output of the MF is a compressed pulse with resolution
$\delta R = c/(2B)$, where $B$ is the bandwidth of the transmitted pulse. Because of this property,
the operation of matched filtering is referred to as pulse compression. The SNR gain of
the MF after pulse compression is given by the time-bandwidth product $BT$. Although
pulse compression by matched filtering results in significant range resolution improve-
ment compared to unmodulated pulses, one of the issues that needs to be addressed is
the sidelobe level (SLL). For example, the autocorrelation function of an LFM pulse
exhibits large sidelobes (about −13 dB) with respect to the mainlobe [10]. Large side-
lobes of strong targets may result in masking of weaker targets in a multiple targets
scenario and in a severe increase of the false alarm rate.
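The sidelobe behaviour of a compressed LFM pulse can be checked numerically. The sketch below is a minimal illustration, assuming hypothetical values $B = 10$ MHz and $T = 10$ µs (time-bandwidth product 100) and a sampling rate of $4B$ chosen only to resolve the sidelobes; it computes the MF response to a single point target as the autocorrelation of the pulse and reports the peak sidelobe level.

```python
import numpy as np

B, T = 10e6, 10e-6            # hypothetical bandwidth [Hz] and pulse length [s]
fs = 4 * B                    # oversampled so the sidelobe structure is resolved
N = int(round(T * fs))
t = (np.arange(N) - N / 2) / fs
chirp = np.exp(1j * np.pi * (B / T) * t**2)     # baseband LFM pulse

# MF output for one point target = autocorrelation of the transmitted pulse.
# np.correlate conjugates its second argument.
mf_out = np.abs(np.correlate(chirp, chirp, mode="full"))
peak = int(np.argmax(mf_out))

# Walk down the mainlobe to its first null, then take the largest sidelobe.
i = peak
while mf_out[i + 1] < mf_out[i]:
    i += 1
psl_db = 20 * np.log10(mf_out[i:].max() / mf_out[peak])

print(f"time-bandwidth product: {B * T:.0f}")
print(f"peak sidelobe level   : {psl_db:.1f} dB")
```

The reported level should lie close to the roughly −13 dB figure quoted above; applying an amplitude weighting to the filter would lower it at the cost of SNR and mainlobe width.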
4.3.2 Target Detection

For the detection of a single target (with known parameters) embedded in white Gaus-
sian noise with known variance, the use of statistical decision theory shows that the
optimum (Neyman–Pearson) receiver consists of an MF followed by a fixed-threshold
detector [11–13]. Using the Neyman–Pearson theorem, it is possible to design the detec-
tor threshold to achieve a false alarm probability not exceeding a predetermined value
of, say, α. Note that the optimum detector, based on MF, is derived for the case of
one known signal in white Gaussian noise. This is hardly the case in practical opera-
tions, where the target amplitude, phase, time delay (range), and Doppler frequency are
unknown. When the target parameters are unknown, a common approach is to set up a
generalized likelihood ratio test (GLRT) for each discrete time delay τi on a predefined
grid. If the target’s phase has a uniform distribution, then we obtain the test statistic [13]
$$\left| \int_0^T a^*(t - \tau_i)\, y(t)\, dt \right| \underset{H_0}{\overset{H_1}{\gtrless}} \gamma, \tag{4.5}$$
where T is the received signal length. In other words, GLRT computes the envelope (or
power) of the MF output at all discrete time delays and compares it with a threshold
to determine the presence (declare hypothesis H1 ) or absence (declare hypothesis H0 )
of a target at time delay τi . Using the discrete linear model introduced in (4.4), we can
rewrite the test statistic in (4.5) as
$$|\hat{x}_i| = |a_i^H y|, \quad i = 1,\ldots,n, \tag{4.6}$$
where superscript $H$ indicates the conjugate transpose of a vector. In words, the MF
computes the cross-correlation of the received signal with time-delayed versions of the
transmitted waveform. Combining the MF outputs for all time delays, we can write the
MF discrete output signal in vector form as $\hat{x}_{MF} = [\hat{x}_1,\ldots,\hat{x}_n]^T$, where
$$\hat{x}_{MF} = A^H y. \tag{4.7}$$
The envelope (or power) of each component of the vector $\hat{x}_{MF}$ is compared to the
threshold $\gamma$ to decide upon the presence of a target. For the envelope detector, the
threshold $\gamma$ should be set to
$$\gamma = \sqrt{-\sigma^2 \ln \alpha} \tag{4.8}$$
to achieve a false alarm probability (FAP) equal to $\alpha$ [9].
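For completeness, the envelope-detector threshold can be obtained from the Rayleigh tail of the noise-only MF output. The short derivation below is a sketch, assuming unit-norm columns $a_i$ so that $a_i^H n \sim \mathcal{CN}(0,\sigma^2)$ under $H_0$:

```latex
\begin{align*}
P_{fa} &= \Pr\big(|a_i^H n| > \gamma \,\big|\, H_0\big)
        = \int_{\gamma}^{\infty} \frac{2r}{\sigma^2}\, e^{-r^2/\sigma^2}\, dr
        = e^{-\gamma^2/\sigma^2}, \\
P_{fa} = \alpha \;&\Longrightarrow\; \gamma = \sigma\sqrt{-\ln \alpha} = \sqrt{-\sigma^2 \ln \alpha}.
\end{align*}
```

As a numerical example, $\sigma = 1$ and $\alpha = 10^{-4}$ give $\gamma = \sqrt{9.21} \approx 3.03$.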
4.3.3 Detection of Multiple Targets

Often in real radar applications, there is more than one target in the received window.
The task of a radar is to determine how many targets are present and to estimate their
parameters. If an MF is used at the receiver, expanding (4.7) we obtain
$$\hat{x}_{MF} = A^H y = A^H (Ax + n) = x + (A^H A - I)x + A^H n. \tag{4.9}$$
It can be seen in (4.9) that each entry $\hat{x}_i$ of the MF output is the sum of the true target
response at location $i$, $x_i$, plus the interference caused by the presence of other possible
targets at locations $j \neq i$ ($\sum_{j=1,\, j \neq i}^{n} x_j\, a_i^H a_j$), plus Gaussian noise.
Note that the interference is proportional to the cross-correlation between the
time-delayed versions of the transmitted waveform [14]; in the ideal case that the
time-delayed versions of the transmitted waveforms are orthogonal and have unit norm,
i.e., $A^H A = I$, we obtain
$$\hat{x}_{MF} = x + z, \tag{4.10}$$
where $z \sim \mathcal{CN}(0,\sigma^2 I)$.
Therefore, if each target is exactly on a grid point and the matrix $A$ is orthogonal,
the components of the vector $\hat{x}_{MF}$ are independent of one another and each range bin
can be treated independently. One can then apply hypothesis tests to each time delay to
estimate simultaneously the number of targets and their time delays.
For practical frequency or phase-modulated waveforms, the orthogonality condition
is never met. Hence, each target interferes, through its sidelobes, with the detection of the
others. This phenomenon is known as target masking. A reduction of the SLL can be
accomplished by applying a weighting function during matched filtering [9]. However,
the weighted MF output is no longer matched to the transmitted signal, and therefore,
in addition to reducing the SLL, the output SNR is also reduced. Alternatively, one can
design waveforms with low SLLs [15–17] or use different pulse compression filters,
such as adaptive pulse compression (APC) [5,6] or mismatched filters [18–22]. A review
of mismatched filters can be found in [14]. Note that the design of mismatched filters is
based on iterative algorithms and the optimum filter weights have to be estimated sepa-
rately for each range bin. Therefore, such schemes are computationally more demanding
than the MF.
Performing an independent binary GLRT at each range bin does not take into
account the interaction between targets. Hence, when the sidelobes are not sufficiently
suppressed, a better detection/estimation strategy would be the multiple hypothesis
test. In this approach one considers all possible combinations of the number of targets
and their locations, thus taking into account the interaction among all possible targets.
Note that this approach requires $2^n$ different tests, which is computationally
prohibitive for $n > 30$. In Section 4.4, we introduce the $\ell_1$-norm minimization as an
alternative for estimating the location and magnitudes of one or multiple targets.
4.3.4 Constant False Alarm Rate Detectors

In (4.8) and all the discussion so far we have assumed that the noise power is known,
and can be used to set the threshold γ in (4.5). This assumption is not always met in
practice, since the noise plus interference power is varying and not known in advance.
Hence, in classical radar detectors a constant false alarm rate (CFAR) processor is
employed. In CFAR schemes, the cell under test (CUT) $\hat{x}_i$ (which corresponds to the
output of the receive filter at time delay $\tau_i$) is tested for the presence of a target
against a threshold that is derived from an estimated clutter plus noise power. The
$2N_w$ cells (CFAR window) surrounding the CUT are used to derive an estimate of the
local background and they are assumed to be target-free. Commonly, $2N_G$ guard cells
immediately adjacent to the CUT are excluded from the CFAR window to deal with
extended targets and large sidelobes. The advantage of CFAR schemes is that they are
able to maintain a constant false alarm rate via adaptation of the threshold to a changing environment.
The general form of a CFAR test is
$$X \underset{H_0}{\overset{H_1}{\gtrless}} \beta Y, \tag{4.11}$$
where the random variable $X$ represents some function (generally envelope or power)
of the CUT $\hat{x}_i$, $\beta$ is a threshold multiplier that controls the false alarm rate, and
$Y$ is also a random variable that is obtained from the cells in the CFAR window
$[\hat{x}_{i-N_w-N_G},\ldots,\hat{x}_{i-N_G-1},\hat{x}_{i+N_G+1},\ldots,\hat{x}_{i+N_w+N_G}]$. A cell-averaging CFAR (CA-CFAR)
detector is a well-known CFAR scheme in which Y is the average of the power (or
envelope) of the cells in the CFAR window [23–28]. The CA-CFAR detector is optimal
for the detection of a target in the presence of homogeneous i.i.d. Gaussian noise.
However, when clutter changes rapidly, interfering targets are present in the CFAR
window, or the clutter and noise distributions are not Gaussian, the CA-CFAR detector
performance degrades severely. For this reason many alternative CFAR schemes
have been devised, such as greatest of (GO), smallest of (SO), trimmed mean (TM),
logarithmic (LOG), and order statistic (OS) CFAR processors [26,29–36]. Each CFAR
scheme is suitable for a specific clutter and interference scenario. In most CFAR
detectors it is assumed the distribution of the noise follows a parametric model, such
as Gaussian, with unknown mean and variance, and that the parameters of the model
can be estimated from the data. Depending on the characteristics of the expected noise
and interference scenario, the most appropriate CFAR scheme can be designed. Clearly,
one has to know the relation between the CFAR threshold multiplier and the probability
of false alarm, so that β can be adjusted to maintain a constant false alarm probability
during the observation time.
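The CFAR test (4.11) with a cell-averaging reference can be sketched in a few lines. The example below is a minimal illustration, assuming a square-law (power) detector in homogeneous complex Gaussian noise, for which the standard CA-CFAR relation $P_{fa} = (1 + \beta/N)^{-N}$ with $N = 2N_w$ reference cells yields the multiplier used in the code. The function name `ca_cfar` and the numerical values are illustrative choices.

```python
import numpy as np

def ca_cfar(power, n_w, n_g, alpha):
    """Cell-averaging CFAR on a power profile.

    n_w reference cells and n_g guard cells are used on each side of the CUT.
    Returns a boolean detection mask and the threshold multiplier beta.
    """
    n_ref = 2 * n_w
    # Multiplier for a square-law detector in exponential (Gaussian-noise) power.
    beta = n_ref * (alpha ** (-1.0 / n_ref) - 1.0)
    half = n_w + n_g
    detections = np.zeros(power.size, dtype=bool)
    for i in range(half, power.size - half):
        leading = power[i - half : i - n_g]             # cells i-Nw-NG ... i-NG-1
        lagging = power[i + n_g + 1 : i + half + 1]     # cells i+NG+1 ... i+Nw+NG
        y_bar = (leading.sum() + lagging.sum()) / n_ref
        detections[i] = power[i] > beta * y_bar
    return detections, beta

rng = np.random.default_rng(1)
n, alpha, n_w, n_g = 60_000, 1e-2, 8, 2
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
power = np.abs(noise) ** 2
power[30_000] = 100.0                                   # one strong target (20 dB)

det, beta = ca_cfar(power, n_w, n_g, alpha)

# Empirical false alarm rate over the tested noise-only cells.
noise_only = np.ones(n, dtype=bool)
noise_only[: n_w + n_g] = False
noise_only[-(n_w + n_g):] = False
noise_only[30_000] = False
p_fa = det[noise_only].mean()
print(f"beta = {beta:.2f}, empirical Pfa = {p_fa:.4f}, target detected: {det[30_000]}")
```

Because the threshold scales with the local average, multiplying `power` by any positive constant leaves the detections unchanged, which is the CFAR property discussed above.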
4.4 CS Radar Detection
4.4.1 ℓ1-Norm Minimization

Compressive sensing (CS) is a novel technique for data acquisition and processing that
allows reconstruction of sparse signals from a number of measurements $m$ much smaller
than the one dictated by the Shannon–Nyquist sampling theorem. Thus, CS uses the
same model as in (4.4), except that now the number of measurements is reduced from
$n$ to $m$, and the sensing matrix $A$ is of size $m \times n$, with $m < n$. Since we want to
estimate the target response $x$ of size $n$ from a set of $m$ measurements, the problem
is ill-posed. However, exploiting the prior information about sparsity of the scene, it
has been proved [37–39] that a good estimate of $x$ can be obtained from the noisy,
subsampled measurements $y$ by solving the $\ell_1$-regularized least squares problem, also known as
the LASSO or basis pursuit denoising (BPDN) [40,41], given by
$$\hat{x} = \arg\min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1, \tag{4.12}$$
where $\lambda$ is called the regularization parameter, and controls the trade-off between the
sparsity of the solution and the $\ell_2$-norm of the residual. Alternatively, one can solve the
constrained problem [39,42]
$$\min_x \|x\|_1 \quad \text{s.t.} \quad \|y - Ax\|_2 \le \varepsilon, \tag{4.13}$$
where $\varepsilon$ is a threshold proportional to the noise variance. The relation between $\lambda$ and $\varepsilon$
that makes the two problems equivalent is data-dependent, and does not have an explicit
form.
Standard techniques, such as interior point methods, can be used for solving (4.12).
However, the computational costs of such methods have encouraged researchers to
consider iterative algorithms with inexpensive per-iteration computations; see, for example,
[43,44] and references therein. These iterative algorithms exploit the fact that the
optimization problem $\arg\min_x \frac{1}{2}\|u - x\|_2^2 + \lambda\|x\|_1$ has an explicit solution $x = \eta(u;\lambda) \triangleq
(|u| - \lambda)\, e^{j\angle u}\, \mathbb{1}(|u| > \lambda)$, where $\eta(\cdot;\lambda)$ is called the complex soft thresholding function
and $\mathbb{1}$ is the indicator function. The complex soft thresholding function acts (componentwise)
on the amplitudes of the input vector $u$ and produces a sparse signal by shrinking
to zero all the elements of $u$ whose amplitude is below the threshold, or regularization
parameter, $\lambda$, thus enforcing sparsity on the solution. The components of $u$ that are
above the threshold are shrunk toward zero by an amount equal to the threshold
$\lambda$, and their phase is unchanged. The complex soft thresholding function is shown in
Figure 4.1.
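The complex soft thresholding function described above takes only a few lines of code. The sketch below is a minimal illustration; the function name `complex_soft_threshold` is an illustrative choice.

```python
import numpy as np

def complex_soft_threshold(u, lam):
    """Componentwise eta(u; lam) = (|u| - lam) * exp(j*angle(u)) * 1(|u| > lam)."""
    u = np.asarray(u, dtype=complex)
    mag = np.abs(u)
    safe_mag = np.where(mag > 0, mag, 1.0)          # avoid 0/0 for exact zeros
    # Multiplying u by (1 - lam/|u|) reduces the amplitude by lam and keeps the phase.
    shrink = np.where(mag > lam, 1.0 - lam / safe_mag, 0.0)
    return shrink * u

u = np.array([3 + 4j, 0.3 - 0.4j, -2j])
print(complex_soft_threshold(u, 1.0))
```

With threshold 1, the entry $3 + 4j$ (amplitude 5) is mapped to $2.4 + 3.2j$ (amplitude 4, same phase), the entry $0.3 - 0.4j$ (amplitude 0.5) is set to zero, and $-2j$ becomes $-j$.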
Figure 4.1 Complex soft thresholding function: (a) amplitude, (b) phase.
Iterative soft thresholding (IST) uses the following iterations to solve (4.12). Starting
with $\hat{x}^0 = 0$, at each iteration $t$ the estimate $\hat{x}$ of the vector $x$ is updated using
$$\hat{x}^{t+1} = \eta\big(\hat{x}^t + A^H (y - A\hat{x}^t);\lambda\big). \tag{4.14}$$
Therefore, at each iteration, the current residual is projected along the waveforms, and
added to the previous solution. In other words, at each iteration of (4.14), the algorithm
moves along the negative gradient of the least-squares term of the objective function, $A^H (y - A\hat{x}^t)$,
and then applies the soft thresholding function to enforce sparsity on the solution. At
convergence, the estimated sparse signal $\hat{x}$ will contain many zero components, and a
few nonzeros. As $\lambda$ increases, the solution to (4.14) will become more sparse and, if
$\lambda > \|A^H y\|_\infty$, the only solution is $\hat{x} = 0$.
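The IST recursion (4.14) can be sketched as follows. This is a minimal illustration, assuming a partial Fourier sensing matrix built from randomly selected rows of the unitary DFT, so that the spectral norm of $A$ equals 1 and the unit-step iteration does not diverge; the helper names and all numerical values are illustrative choices.

```python
import numpy as np

def soft(u, lam):
    """Complex soft thresholding, applied componentwise."""
    mag = np.abs(u)
    return np.where(mag > lam, 1.0 - lam / np.maximum(mag, 1e-300), 0.0) * u

def ist(y, A, lam, n_iter=200):
    """Iterative soft thresholding for the LASSO objective in (4.12)."""
    x_hat = np.zeros(A.shape[1], dtype=complex)
    objective = []
    for _ in range(n_iter):
        residual = y - A @ x_hat
        objective.append(0.5 * np.linalg.norm(residual) ** 2
                         + lam * np.abs(x_hat).sum())
        # Step along A^H (y - A x_hat), then threshold, as in (4.14).
        x_hat = soft(x_hat + A.conj().T @ residual, lam)
    return x_hat, np.array(objective)

rng = np.random.default_rng(0)
n, m, k, sigma = 256, 128, 8, 0.01

# Assumed sensing matrix: m random rows of the unitary DFT (spectral norm 1).
rows = rng.choice(n, size=m, replace=False)
A = np.fft.fft(np.eye(n), norm="ortho")[rows, :]

x = np.zeros(n, dtype=complex)
x[rng.choice(n, size=k, replace=False)] = np.exp(2j * np.pi * rng.random(k))
y = A @ x + sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)

x_hat, obj = ist(y, A, lam=0.05)
x_zero, _ = ist(y, A, lam=1.01 * np.abs(A.conj().T @ y).max())

print("objective: first", obj[0], "last", obj[-1])
print("nonzeros with large lambda:", np.count_nonzero(x_zero))
```

The second call illustrates the remark above: once $\lambda$ exceeds $\|A^H y\|_\infty$, the first thresholding step already returns the all-zero vector and the iteration stays there. For sensing matrices with spectral norm larger than 1, a step size would have to be introduced, which is outside the scope of this sketch.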
It is clear that $\ell_1$-minimization is implicitly performing target detection; we can
interpret the nonzero elements of $\hat{x}$ as detected targets. However, note that if the recovered
amplitude at a target location, say $\hat{x}_i$, is zero, then there is no way to recover the target,
not even in any subsequent processing stage. In other words, if the estimated amplitude
resulting from $\ell_1$-norm minimization is zero, then the target is simply lost, i.e., we have
a missed detection. Furthermore, at locations not containing the targets, we expect most
of the estimates resulting from $\ell_1$-norm minimization to have zero amplitude. However,
depending on the SNR and the ratio $m/n$, some of the noise samples are reconstructed
with nonzero amplitude, and therefore, if no further processing is applied, they will
produce false targets in the recovered range profile. Increasing the threshold $\lambda$ will
reduce the number of reconstructed noise samples, but it may also result in suppressing
the targets. In this sense, the sparse estimate produced by $\ell_1$-norm minimization is
comparable to the output of a classical detector, with the parameters $\lambda$ or $\varepsilon$ used in
the recovery controlling both the detection and false alarm probabilities.
Despite this similarity with classical radar detection, we should emphasize that the
relation between $\lambda$ and false alarm probability in each cell is extremely complicated. In
particular, especially for $m < n$, the interference among targets’ sidelobes is not
negligible and is not easy to model either, due to the nonlinear nature of the $\ell_1$-minimization.
These complications make the design of CFAR schemes very challenging for CS radar
systems. In the rest of this chapter, we will focus on this problem and describe how the
asymptotic theory of compressed sensing can help us address this challenge.
4.4.2 Main Challenge: The Design of CS Radar Detectors

In this chapter, we focus on the design and analysis of CS radar detectors. Particularly,
we are interested in (1) determining a strategy for the optimal detection of targets from
CS measurements, and (2) designing an adaptive CFAR detector to achieve a desired
pair of detection probability $P_d$ and false alarm probability $P_{fa}$. To design an adaptive
detector in the CS framework, one has to deal with a number of issues that are related
mostly to the nonlinearity of the $\ell_1$-recovery. As explained in the previous sections,
classical radar architectures (without CS) use well-established signal processing and
detection schemes, such as MF and CFAR processors. Unlike classical radars, in which
the relation between the threshold parameter and the false alarm is straightforward,
in CS the relation between $\lambda$ (or $\varepsilon$) and the false alarm rate is data-dependent and
complicated. For the design of possible CS radar detectors, it is essential to determine
the (statistical) properties of the signal obtained from $\ell_1$-norm minimization algorithms.
To solve the CS detection problem, we employ some of the recent advances in the
asymptotic analysis of compressed sensing algorithms. We will show that a debiasing
step can make the problem of CS radar detection similar to that of classical radar
detection. We will then combine this debiasing step with classical tools such as CFAR to
build practical CS radar detection schemes. A key ingredient for obtaining the debiasing
step is the complex approximate message passing (CAMP) algorithm [44], which will
be introduced in the next section.
4.5 Complex Approximate Message Passing (CAMP) Algorithm
The complex approximate message passing [44] is an iterative algorithm for solving
(4.12). CAMP is an extension of the original approximate message passing (AMP)
algorithm, first proposed in [45] for real-valued signals, to the case of signals and
measurements in the complex domain. The AMP algorithm and its properties have been
thoroughly investigated in [46–52]. However, in radar it is most common to work with
signals in the complex domain. Therefore, in the remainder of this chapter we will
concentrate exclusively on CAMP. As we will see in the next sections, properties of
CAMP enable us to achieve the following objectives:
• characterize the distribution of the noise after $\ell_1$-norm minimization;
• establish the relation between the regularization parameter λ of LASSO and the
quality, in terms of SNR, of the recovered solution;
• adaptively set the regularization parameter λ in a way that optimizes the recovery
SNR;
• design a fully adaptive CS radar detector that can be combined with classical
CFAR processing.
We first review the iterations of the CAMP algorithm, which is given in Table 4.1. In the
algorithm, $\langle \cdot \rangle$ denotes the average of a vector, $\eta^I$ and $\eta^R$ are the imaginary and real
parts of the complex soft thresholding function, $\partial \eta^R / \partial x_R$ is the partial derivative of $\eta^R$ with
respect to the real part of the input, $\partial \eta^I / \partial x_I$ is the partial derivative of $\eta^I$ with respect to the
imaginary part of the input, $\delta = m/n$ is the compression factor, and maxiter is the (user
specified) maximum number of iterations. Furthermore, note that the soft thresholding
function is applied with threshold parameter $\tau\sigma_t$, where $\sigma_t$ is the standard deviation of
the noise at iteration $t$ and $\tau$ is a (fixed) user specified threshold. We will see in the
next subsection how the parameter $\tau$ of CAMP relates to the regularization parameter
$\lambda$ in (4.12).
Table 4.1 Pseudocode for the ideal CAMP algorithm. ©[2013] IEEE.
Ideal CAMP Algorithm
Input: $y$, $A$, $\tau$, $x$, $\delta = m/n$
Initialization: $\hat{x}^0 = 0$, $z^0 = y$
for $t = 1 : \text{maxiter}$
    $\tilde{x}^t = A^\dagger z^{t-1} + \hat{x}^{t-1}$
    $\sigma_t = \mathrm{std}(\tilde{x}^t - x)$
    $z^t = y - A\tilde{x}^{t-1} + z^{t-1}\, \frac{1}{2\delta} \left( \left\langle \frac{\partial \eta^R}{\partial x_R}(\tilde{x}^t;\tau\sigma_t) \right\rangle + \left\langle \frac{\partial \eta^I}{\partial x_I}(\tilde{x}^t;\tau\sigma_t) \right\rangle \right)$
    $\hat{x}^t = \eta(\tilde{x}^t;\tau\sigma_t)$
end
Output: $\tilde{x}$, $\hat{x}$, $\sigma_*$
We now explain each variable in the CAMP algorithm:
(i) $\hat{x}^t$: the estimate of $x$ at iteration $t$. Under a proper tuning of $\tau$, $\hat{x}^t \to \hat{x}(\lambda)$ as
$t \to \infty$. The relation between $\tau$ and $\lambda$ is given later in (4.17).
(ii) $\tilde{x}^t$: the non-sparse and “noisy” estimate of $x$. Define the vector $w^t = \tilde{x}^t - x$,
which represents the “noise vector” at iteration $t$ of CAMP. The distribution of $w^t$
is “close” to a zero-mean Gaussian probability density function. This property of
$w^t$ will be clarified and proved in Section 4.5.1.
(iii) $\sigma_t$: the standard deviation of $w^t$. Furthermore, we define $\sigma_* \triangleq \lim_{t\to\infty} \sigma_t$.
Using this terminology, the operations executed in CAMP can be explained as
follows. First, CAMP calculates a noisy, non-sparse estimate of the signal $x$, which is
given by $\tilde{x}^t = A^\dagger z^{t-1} + \hat{x}^{t-1}$. Then, to make this estimate sparse, the soft thresholding
function is applied to $\tilde{x}^t$ to obtain the sparse vector $\hat{x}^t = \eta(\tilde{x}^t;\tau\sigma_t)$. This algorithm is
referred to as ideal CAMP. The word ideal in the algorithm’s name refers to the fact
that the sought $x$ is used inside the iterations of CAMP for estimating the noise standard
deviation $\sigma_t$. The algorithm is therefore not practical, since its iterations use
the vector that it is trying to estimate; this assumption is made here only for the
clarity of presentation. A practical scheme for estimating $\sigma_t$ is described in Section 4.7.
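The ideal CAMP iteration can be sketched in a few lines. The code below is a minimal illustration under several assumptions that are not taken from Table 4.1 verbatim: $A^\dagger$ is implemented as the conjugate transpose, the residual is formed from the newly thresholded estimate (the usual AMP ordering, which may differ in indexing from the table), $\sigma_t$ is computed as the root mean square of $\tilde{x}^t - x$, and the averaged derivative term uses the closed form $\frac{1}{2}(\partial\eta^R/\partial x_R + \partial\eta^I/\partial x_I) = 1 - \theta/(2|u|)$ for $|u| > \theta$. All names and numerical values are illustrative.

```python
import numpy as np

def soft(u, theta):
    """Complex soft thresholding eta(u; theta)."""
    mag = np.maximum(np.abs(u), 1e-300)
    return np.where(mag > theta, 1.0 - theta / mag, 0.0) * u

def mean_half_divergence(u, theta):
    """Average of (d eta_R/d x_R + d eta_I/d x_I)/2 = 1 - theta/(2|u|) on |u| > theta."""
    mag = np.maximum(np.abs(u), 1e-300)
    return np.where(mag > theta, 1.0 - theta / (2.0 * mag), 0.0).mean()

def ideal_camp(y, A, tau, x_true, maxiter=30):
    """Ideal CAMP: the true x is used only to measure sigma_t at each iteration."""
    m, n = A.shape
    delta = m / n
    x_hat = np.zeros(n, dtype=complex)
    z = y.astype(complex)
    for _ in range(maxiter):
        x_tilde = A.conj().T @ z + x_hat                       # noisy, non-sparse estimate
        sigma_t = np.sqrt(np.mean(np.abs(x_tilde - x_true) ** 2))
        x_hat = soft(x_tilde, tau * sigma_t)                   # sparse estimate
        # Residual plus the correction term that distinguishes CAMP from IST.
        z = y - A @ x_hat + z * mean_half_divergence(x_tilde, tau * sigma_t) / delta
    return x_tilde, x_hat, sigma_t

rng = np.random.default_rng(0)
n, m, k, sigma, tau = 1000, 600, 60, 0.1, 2.0                  # delta = 0.6, rho = 0.1

# i.i.d. complex Gaussian sensing matrix with (approximately) unit-norm columns.
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
x = np.zeros(n, dtype=complex)
x[rng.choice(n, size=k, replace=False)] = np.exp(2j * np.pi * rng.random(k))
y = A @ x + sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)

x_tilde, x_hat, sigma_star = ideal_camp(y, A, tau, x)
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"sigma_* ~ {sigma_star:.3f}, relative error of x_hat ~ {rel_err:.3f}")
```

The two returned estimates play different roles later in the chapter: `x_hat` is the sparse LASSO-type solution, while `x_tilde` is the non-sparse estimate whose error behaves approximately like Gaussian noise with standard deviation `sigma_star`.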
4.5.1 State Evolution: A Framework for the Analysis of CAMP

Before we proceed to the design of practical CS radar detectors, we need to answer
the following three questions: (i) Under what conditions is the Gaussianity of $w^t$
accurate? (ii) Can the performance of CAMP be predicted theoretically? (iii) What is the
connection between CAMP and LASSO that was defined in (4.12)?
The first question has been carefully studied in [44,48,50,53]. It is proved that under
the asymptotic setting $n \to \infty$, while $\delta = m/n$ and $\rho = k/m$ are fixed, the Gaussianity
heuristic is correct. To clarify this claim consider the following definition from [50].
Definition 4.1 For a given $(\delta,\rho) \in [0,1]^2$, a sequence of instances $\{x(n),A(n),n(n)\}$
is called a converging sequence if the following conditions hold:
• The empirical distribution of $x(n) \in \mathbb{R}^n$ converges weakly to a probability
measure $p_X$ with bounded second moment as $n \to \infty$.
• The empirical distribution of $n(n) \in \mathbb{R}^m$ ($m = \delta n$) converges weakly to a probability
measure $p_n$ with bounded second moment as $n \to \infty$.
• The elements of $A(n) \in \mathbb{R}^{m \times n}$ are drawn i.i.d. from a Gaussian distribution.
Theorem 4.2 [50] Let $\{x(n),A(n),n(n)\}$ be a converging sequence, and let $\tilde{x}^t(n)$ be
the estimate provided by the CAMP algorithm. The empirical law of $w^t(n) = \tilde{x}^t(n) -
x(n)$ converges to a zero-mean Gaussian distribution almost surely as $n \to \infty$.
Theorem 4.2 has been proved for Gaussian measurement matrices. However,
empirical studies have already confirmed that this theoretical prediction also holds for sensing
matrices with i.i.d. non-Gaussian elements [45]. Also, in Section 4.8.1 we study
the validity of Theorem 4.2 for partial Fourier matrices, which are of particular interest
in radar applications.
Using the Gaussianity of the noise vector $w^t$, we can answer the second question
raised at the beginning of this subsection, regarding the theoretical performance of the
algorithm. In fact, given that the noise has a zero-mean Gaussian distribution, to predict
the performance of CAMP we only need to track the standard deviation of the noise $\sigma_t$
across the iterations of the algorithm. This is performed through what is called the state
evolution (SE). Under the asymptotic setting, the value of the standard deviation at time
$t + 1$ is calculated from $\sigma_t$ according to the following equation:
$$\sigma_{t+1}^2 = \Psi(\sigma_t), \tag{4.15}$$
where
$$\Psi(\sigma_t) = \sigma^2 + \frac{1}{\delta}\, \mathbb{E}\left[ \left| \eta(X + \sigma_t Z;\tau\sigma_t) - X \right|^2 \right], \tag{4.16}$$
$Z \sim \mathcal{CN}(0,1)$, $\sigma^2$ is the input noise variance and the expectation is with respect to
the two independent random variables $X \sim p_X$ and $Z$, where $p_X$ denotes the marginal
distribution of $x$. It has been proved in [44] that the function $\Psi$ is a concave function
of $\sigma_t^2$, and therefore the iteration (4.15) has at most one stable fixed point, which we
refer to as $\sigma_*^2$. Also, CAMP converges to this fixed point exponentially fast (linear
convergence according to optimization literature). An example of how the function $\Psi$
can be calculated in closed form for a given distribution $p_X$ of $X$ is provided in appendix
A of [2].
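When a closed form is not at hand, the state evolution recursion (4.15)–(4.16) can be approximated by replacing the expectation with a sample average. The sketch below is a minimal illustration, assuming that a fraction $k/n = \rho\delta$ of the entries of $x$ is nonzero with unit amplitude and uniform phase; the function name, the Monte Carlo sample size, and the initialization $\sigma_1^2 = \sigma^2 + \mathbb{E}|X|^2/\delta$ (corresponding to $\hat{x}^0 = 0$) are illustrative choices.

```python
import numpy as np

def soft(u, theta):
    """Complex soft thresholding eta(u; theta)."""
    mag = np.maximum(np.abs(u), 1e-300)
    return np.where(mag > theta, 1.0 - theta / mag, 0.0) * u

def state_evolution(tau, delta, rho, sigma, n_mc=200_000, n_iter=100, seed=0):
    """Iterate sigma_{t+1}^2 = Psi(sigma_t) with a Monte Carlo estimate of Psi."""
    rng = np.random.default_rng(seed)
    eps = rho * delta                                  # fraction of nonzeros, k/n
    nonzero = rng.random(n_mc) < eps
    X = np.where(nonzero, np.exp(2j * np.pi * rng.random(n_mc)), 0.0)
    Z = (rng.standard_normal(n_mc) + 1j * rng.standard_normal(n_mc)) / np.sqrt(2)

    def psi(s):
        mse = np.mean(np.abs(soft(X + s * Z, tau * s) - X) ** 2)
        return sigma ** 2 + mse / delta

    s = np.sqrt(sigma ** 2 + np.mean(np.abs(X) ** 2) / delta)   # first iteration
    for _ in range(n_iter):
        s = np.sqrt(psi(s))
    return s, psi

sigma_star, psi = state_evolution(tau=1.5, delta=0.6, rho=0.1, sigma=0.23)
print(f"fixed point sigma_* ~ {sigma_star:.3f}")
```

Because the same samples of $X$ and $Z$ are reused at every iteration, the sampled map is deterministic and the recursion settles on a fixed point of the sampled $\Psi$. Sweeping `tau` over a grid and recording `sigma_star` gives curves of the kind shown in Figure 4.3.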
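Finally, let us answer the third question we raised at the beginning of this subsection
regarding the connection between LASSO and CAMP. In [44] it is proved that if $\tau$
satisfies
$$\lambda = \tau\sigma_* \left( 1 - \frac{1}{2\delta}\, \mathbb{E}\left[ \frac{\partial \eta^R}{\partial x_R}(X + \sigma_* Z;\tau\sigma_*) + \frac{\partial \eta^I}{\partial x_I}(X + \sigma_* Z;\tau\sigma_*) \right] \right), \tag{4.17}$$
then CAMP with threshold $\tau$ solves the LASSO in (4.12) with parameter $\lambda$.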
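The derivative terms appearing in Table 4.1 and in (4.17) can be written in closed form for the complex soft thresholding function. The following is a short worked derivation, writing $u = u_R + ju_I$ and $\eta(u;\theta) = (1 - \theta/|u|)\,u$ for $|u| > \theta$; the symbols $u$, $\theta$, and $U$ are introduced here only for illustration.

```latex
\begin{align*}
\frac{\partial \eta^R}{\partial u_R} &= 1 - \frac{\theta\, u_I^2}{|u|^3}, \qquad
\frac{\partial \eta^I}{\partial u_I} = 1 - \frac{\theta\, u_R^2}{|u|^3}, \qquad |u| > \theta, \\
\frac{1}{2}\left( \frac{\partial \eta^R}{\partial u_R} + \frac{\partial \eta^I}{\partial u_I} \right)
  &= \left( 1 - \frac{\theta}{2|u|} \right) \mathbb{1}(|u| > \theta), \\
\lambda &= \tau\sigma_* \left( 1 - \frac{1}{\delta}\,
  \mathbb{E}\left[ \left( 1 - \frac{\tau\sigma_*}{2|U|} \right) \mathbb{1}(|U| > \tau\sigma_*) \right] \right),
  \qquad U = X + \sigma_* Z.
\end{align*}
```

Since the expectation in the last line is at most $\mathrm{Prob}(|U| > \tau\sigma_*)$, the factor multiplying $\tau\sigma_*$ is positive whenever this probability is smaller than $\delta$, so that $0 < \lambda < \tau\sigma_*$ in that regime.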
4.6 Target Detection Using CAMP
Using the properties of the CAMP algorithm described earlier, in this section we propose
two CS target detection schemes and analyze their performance through their receiver
operating characteristic (ROC) curves. Let k be the number of targets, i.e., the number
of nonzero coefficients in x, and define G as the distribution of the nonzero elements
of x. In this section k, G, and σ are assumed to be known. A more realistic case in which
none of these parameters are known will be studied in Section 4.7.
The two architectures we consider are displayed in Figure 4.2. In Architecture 1,
the measurements $y$ are given to a recovery algorithm (say CAMP or LASSO). This
algorithm returns a sparse vector $\hat{x}$ that has a few nonzero values. In this case the
nonzero elements of $\hat{x}$ can be considered as detected targets, where the soft
thresholding operation inside the recovery algorithm performs the detection function. Since
the estimated $\hat{x}$ (support and amplitude of the nonzero entries) depends on the threshold
parameter $\tau$ in CAMP or the regularization parameter $\lambda$ in LASSO, this parameter
will also automatically control both the false alarm $\alpha$ (or $P_{fa}$) and detection ($P_d$)
probabilities.
Proposition 1 Consider the CAMP iteration with threshold $\tau_\alpha = \sqrt{-\ln \alpha}$. If
$\{A(n),x(n),w(n)\}$ is a converging sequence, then
$$\lim_{t\to\infty} \lim_{n\to\infty} \frac{1}{n-k} \sum_{i=1}^{n} \mathbb{1}_{\{\hat{x}_i^t \neq 0,\, x_i = 0\}} = \alpha \tag{4.18}$$
almost surely, where $\mathbb{1}_{\{\cdot,\cdot\}}$ denotes the indicator function. Also, $\tau_\alpha$ is the only value of $\tau$
for which (4.18) holds.
Proof Define $z \sim \mathcal{CN}(0,1)$. According to [53] we have
$$\lim_{t\to\infty} \lim_{n\to\infty} \frac{1}{n-k} \sum_{i=1}^{n} \mathbb{1}_{\{\hat{x}_i^t(n) \neq 0,\, x_i(n) = 0\}} = \mathrm{Prob}(|\sigma_* z| > \tau_\alpha \sigma_*) = e^{-\tau_\alpha^2} = \alpha.$$
Note that this proposition does not provide any information on the relation between $\tau$
and the detection probability. This issue will be discussed later.
In contrast to Architecture 1, Architecture 2 is inspired by classical radar detection
schemes. In fact, classical radar detection usually comprises two sequential stages: in the
first stage, called the estimation stage here, a noisy estimate of the signal is computed,
often through matched filtering, with the goal of maximizing the output SNR (and
therefore $P_d$).⁵ In the second stage, the noisy estimate obtained from the first stage is fed
to a detection block, whose threshold parameter is set to achieve a predefined $P_{fa} = \alpha$.
Using a similar approach, in Architecture 2 we propose to first use CAMP to obtain a
noisy, non-sparse estimate of the signal $\tilde{x} = x + w$. Similarly to the matched filter in
5 As explained in Section 4.3.2, the MF is optimal (in terms of SNR) only for the case of a single target in
white Gaussian noise. For the case of multiple targets, the optimality is only satisfied if the matrix A is
orthogonal.
(a) Architecture 1
(b) Architecture 2
Figure 4.2 Block diagrams of the proposed architectures for radar detection in the CS framework.
©[2013] IEEE. Reprinted, with permission, from [2].
Figure 4.3 Fixed point σ∗ versus threshold τ for ideal CAMP with σ = 0.23, δ = 0.2 (dashed
line), and δ = 0.6 (solid line), ρ = 0.1. The nonzero entries in x all have amplitudes equal to 1
and phase uniformly distributed between −π and π. The sensing matrix A has i.i.d. Gaussian
entries. These curves are obtained using the analytical equation derived in appendix A of [2].
classical radar, in the first stage we aim to maximize the output SNR. This can be done
by setting the CAMP threshold $\tau$ to a value that minimizes $\sigma_*$. Figure 4.3 exhibits the
dependence of $\sigma_*$ on $\tau$ for two distinct values of $\delta$. As is clear from the figure, there
is a value of $\tau$, say $\tau_o$, for which $\sigma_*$ is minimized. In Section 4.7, we will propose a
simple method for calculating $\tau_o$.
In the second stage, a detection block with fixed threshold $\kappa = \sigma_* \sqrt{-\ln(\alpha)}$ is applied
to the noisy estimate $\tilde{x}$ to control the false alarm rate. From the Gaussianity of $w$, in the
asymptotic setting this choice of $\kappa$ results in the false alarm probability $\alpha$ as derived in
Proposition 1.
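The second stage of Architecture 2 can be illustrated without running CAMP itself. The sketch below is a minimal illustration, assuming that the CAMP output is replaced by a synthetic stand-in $\tilde{x} = x + w$ with $w \sim \mathcal{CN}(0,\sigma_*^2 I)$, which is the Gaussian model described above; the value $\sigma_* = 0.3$, the unit-amplitude targets, and the function name `detect` are illustrative choices.

```python
import numpy as np

def detect(x_tilde, sigma_star, alpha):
    """Fixed-threshold detector on the non-sparse estimate (Architecture 2)."""
    kappa = sigma_star * np.sqrt(-np.log(alpha))
    return np.abs(x_tilde) > kappa

rng = np.random.default_rng(0)
n, k, sigma_star, alpha = 200_000, 2_000, 0.3, 1e-2

# Scene: the first k cells hold unit-amplitude targets with uniform phase.
x = np.zeros(n, dtype=complex)
x[:k] = np.exp(2j * np.pi * rng.random(k))

# Synthetic stand-in for the CAMP output: x plus circular Gaussian noise.
w = sigma_star * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
x_tilde = x + w

hits = detect(x_tilde, sigma_star, alpha)
p_fa = hits[k:].mean()          # empirical false alarm rate on target-free cells
p_d = hits[:k].mean()           # empirical detection rate on target cells
print(f"empirical Pfa = {p_fa:.4f} (design value {alpha}), empirical Pd = {p_d:.3f}")
```

In this model the false alarm rate depends only on $\alpha$, whereas the detection rate improves as $\sigma_*$ decreases, which is why the first stage is tuned to minimize $\sigma_*$.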
With respect to Architecture 1, Architecture 2 has two properties that are very useful
for practical radar applications. Namely:
1. All the parameters can be optimized and estimated adaptively and efficiently, even
without prior knowledge of k, G, and σ. This will be clarified later.
Figure 4.4 ROC curves ($P_d$ versus $P_{fa}$) for Architectures 1 and 2 with $\delta = 0.6$, $\rho = 0.1$ and $\sigma^2 = 0.05$. All the
nonzero coefficients of $x$ have amplitude equal to 1 and phase uniformly distributed between $-\pi$
and $\pi$. The solid (Architecture 2) and dashed (Architecture 1) lines are obtained from the
theoretical predictions using the SE equation. The dots are obtained by Monte Carlo (MC)
simulations using Ideal CAMP. The sensing matrix for MC simulations is i.i.d. Gaussian.
2. From a detection perspective, Architecture 2 outperforms Architecture 1. Suppose
that the Gaussianity of $w^t$ holds. Let $\tau_o$ be the optimal value of $\tau$ that leads to the
minimum $\sigma_*$. Then, the following theorem proves that the detection performance
of Architecture 2 is better than that of Architecture 1.
Theorem 4.3 Set the probability of false alarm to $\alpha$ for both Architecture 1
that uses $\tau_\alpha$ and Architecture 2 that uses $\tau_o$ in CAMP. If $P_{d,1}$ and $P_{d,2}$ are the
detection probabilities of the two schemes, then
$$P_{d,1} \le P_{d,2}.$$
Furthermore, the equality is satisfied at only one specific value, $\alpha = e^{-\tau_o^2}$.
For a more general version of this result the reader may refer to [54]. Theorem 4.3 is
proved in [2]. Figure 4.4 shows ROC curves for Architectures 1 and 2. The solid and
dashed lines are obtained from the analytical equations derived in appendix A of [2].
The theoretical ROC curves are confirmed by Monte Carlo simulations (dots). In the
simulations, we run the ideal CAMP algorithm given in Table 4.1 for several values of
Pfa ranging from 10^−1 to 10^−5. One interesting phenomenon observed in Figure
4.4 is that for Architecture 1, in the region around Pfa = 0.4 (the zoomed area), the
probability of detection Pd,1 decreases as Pfa increases. This behavior is
counterintuitive, but it can be explained as follows. Recall that in Architecture 1
the CAMP threshold τα varies with Pfa. Therefore σ∗ (and hence the CAMP
reconstruction SNR) also changes with Pfa and is not constant along the ROC curve. This
explains why Pd,1 reaches its maximum at around Pfa = 0.4 and then decreases again
as Pfa goes to 1 (the SNR is maximized at τ = τo and then decreases again).
In Architecture 2, instead, σ∗ is fixed to its minimum along the ROC curve (the SNR is
constant), and therefore Pd,2 increases with increasing FAP. From this figure it can also
be observed that, as predicted by Theorem 4.3, Pd,1 = Pd,2 at only one value
of the FAP (α = 0.22).
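As a small numerical check of the crossover point, the relation α = e^{−τo²} from Theorem 4.3 can be inverted to recover the normalized threshold implied by a given false alarm probability. The sketch below assumes that the thresholded quantity is the magnitude of zero-mean circular complex Gaussian noise with E|w|² = σ∗², so that Prob(|w| > τσ∗) = e^{−τ²}; the function names are illustrative and do not come from the chapter.

```python
import math


def fap_from_threshold(tau):
    """Probability that |w| exceeds tau * sigma when w is zero-mean circular
    complex Gaussian noise with E|w|^2 = sigma^2 (assumed noise model)."""
    return math.exp(-tau ** 2)


def threshold_from_fap(alpha):
    """Inverse relation: normalized threshold for a design FAP alpha."""
    return math.sqrt(-math.log(alpha))


# Crossover value quoted for Figure 4.4: alpha = 0.22.
tau_o = threshold_from_fap(0.22)
print(f"implied tau_o: {tau_o:.2f}")
print(f"round trip FAP: {fap_from_threshold(tau_o):.2f}")
```

Under this assumption the crossover at α = 0.22 corresponds to τo of roughly 1.23, and the same inverse relation gives the fixed threshold τα that Architecture 1 would use for any design α.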
4.7 Adaptive CAMP Algorithm
In the ideal CAMP algorithm, we made a few assumptions that are not met in real
applications.
(i) x is assumed to be known. This is required for the calculation of the standard
deviation of the noise.
(ii) ρ, σ, and G are assumed to be known for calculating the theoretical fixed point
solution.
In this section, we show how these assumptions can be removed. The main objective
here is to propose a practical, fully adaptive scheme that does not require any prior
information about the signal and can adapt to the changes in the signal and the noise
level. More specifically, the three main issues that we would like to address are: (i) how
to obtain a good estimate of σt without knowing x; (ii) how to obtain a good estimate of
τo for Architecture 2, efficiently and accurately without using the SE (which depends
on the unknown parameters ρ, σ, and G); (iii) how to replace κ in Architecture 2 with
an adaptive threshold that is able to maintain CFAR property in the multiple targets
scenario.
Several different schemes can be used to answer the first question. For the moment,
suppose that x = 0. If this assumption is true, then according to Theorem 4.2, x̃^t
is a Gaussian random vector with mean zero and variance σ_t². Hence, the following
expression gives a good estimate of σ_t:
\[ \hat{\sigma}_t = \sqrt{\frac{1}{\ln 2}}\;\mathrm{median}\bigl(|\tilde{x}^t|\bigr). \qquad (4.19) \]
However, in the presence of targets, i.e., x ≠ 0, this is a biased estimator. More
specifically, for large values of n, we have x̃^t = x + w^t, where the x_i are i.i.d. with
x_i ∼ εG(x) + (1 − ε)δ(x), ε = δρ ≪ 1, δ(x) denotes a point mass at zero, and the
elements of w^t behave like complex Gaussian random variables with mean zero and
standard deviation σ_t. The goal is to estimate the median u∗ of |w^t|. However, since we
only have access to x̃^t, we estimate the median of |w^t| as the û that satisfies
Prob(|x̃_i^t| > û) = 1/2. Note that since the median is robust to outliers (i.e., multiple
targets), we still expect (4.19) to offer a good estimate of σ_t. The following
proposition confirms this claim.
Proposition 2 [2] The error of the estimated median is bounded from above by
\[ \frac{|\hat{u} - u_*|}{\sigma_*} \le \frac{|\ln(1-\epsilon)|}{2\sqrt{\ln 2}}. \qquad (4.20) \]
Note that when the scene is very sparse, i.e., ε is small, |ln(1 − ε)| ≈ ε and, hence,
the error is proportional to the sparsity level. In such cases, the proposed estimator
provides a good estimate of σ_t. We should also emphasize that, when the number of
targets is large and the median estimator is not reliable, we can use the more recent and
more accurate estimates of σ_t proposed in [52]. The algorithm that uses the estimate in
(4.19) instead of σ_t will be referred to as CAMP or median CAMP in the rest of this
chapter.

Table 4.2 CAMP-based algorithms.
Algorithm        Inputs           Outputs
Ideal CAMP       A, y, x, σ, τ    x̂, x̃, σ∗
Median CAMP      A, y, τ          x̂, x̃, σ̂∗
Adaptive CAMP    A, y             x̂, x̃, σ̂∗, τ̂o
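The behavior of the median-based estimate in (4.19) can be illustrated with a short simulation. The sketch below is only a minimal illustration under stated assumptions: the noise is drawn as circular complex Gaussian with E|w|² = σ², the targets have unit amplitude and uniform phase, and the values of n, σ, and ε as well as all function names are hypothetical choices rather than the chapter's settings.

```python
import cmath
import math
import random
import statistics


def median_sigma(x_tilde):
    """Noise level estimate sqrt(1 / ln 2) * median(|x_tilde|), following (4.19)."""
    return statistics.median(abs(v) for v in x_tilde) / math.sqrt(math.log(2))


def complex_noise(n, sigma, rng):
    """n samples of circular complex Gaussian noise with E|w|^2 = sigma^2."""
    s = sigma / math.sqrt(2)
    return [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n)]


rng = random.Random(0)             # fixed seed, for repeatability only
n, sigma, eps = 20000, 0.3, 0.05   # illustrative values (assumptions)
w = complex_noise(n, sigma, rng)

# Noise-only case (x = 0): the estimate should sit close to sigma.
est_noise_only = median_sigma(w)

# Sparse scene: a fraction eps of the bins holds a unit-amplitude target
# with phase uniformly distributed between -pi and pi.
x = [0j] * n
for i in rng.sample(range(n), round(eps * n)):
    x[i] = cmath.rect(1.0, rng.uniform(-math.pi, math.pi))

est_with_targets = median_sigma([xi + wi for xi, wi in zip(x, w)])

print(f"true sigma:        {sigma}")
print(f"noise only:        {est_noise_only:.4f}")
print(f"with {round(eps * n)} targets: {est_with_targets:.4f}")
```

With targets present the estimate is pushed upward, since target bins almost always land above the median, but for a sparse scene the shift stays small relative to σ.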
The second question we raised at the start of this section concerns estimating
the optimal threshold τo in Architecture 2. Suppose that we know or can estimate
τmax such that τo < τmax. We discuss how this parameter can be set below. Define a
sequence of thresholds {τ_ℓ}, ℓ = 1, ..., L, such that τ_1 = τmax and τ_ℓ = τ_{ℓ−1} − δτ,
where δτ is a user-defined parameter that controls the step size. Starting from τmax, at
each new iteration ℓ, CAMP is initialized with x̂^0 = x̂_{ℓ−1} and z^0 = z_{ℓ−1}. Because the
solution of CAMP at the previous iteration ℓ − 1 is used as the initial value for the
current iteration, CAMP needs only a few iterations to converge to the solution,6 and
therefore the entire process is very fast. After L iterations, we have a matrix of
solutions X̂ = [x̂_1, x̂_2, ..., x̂_L] of size n × L, where each column contains the CAMP
solution for a given τ_ℓ. We also have L estimates {σ̂∗,ℓ}, ℓ = 1, ..., L. The optimum
estimated threshold τ̂o is chosen as the one that minimizes the estimated CAMP output
noise variance σ̂∗².
At the first iteration (ℓ = 1 and t = 1), τmax can be set as τmax = ‖A^H y‖_∞ / σ̂_0,
where σ̂_0 is an estimate of the standard deviation of the noise. In fact, if the CAMP
algorithm is initialized with x̂^0 = 0 and z^0 = y, then x̃ = A^H y, and any value of
τ larger than τmax will lead to the same estimate x̂^1 = 0.
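The warm-started threshold sweep described above can be sketched as follows. This is a structural outline only: `solve` is a hypothetical stand-in for one run of median CAMP at a fixed τ, assumed to return the triple (x̂, z, σ̂∗), and `toy_solve` is a made-up placeholder whose output noise level is convex in τ so that the search has something to minimize. None of these names come from the chapter.

```python
def tau_max_from_data(A, y, sigma0):
    """tau_max = ||A^H y||_inf / sigma0, with A stored as a list of m rows."""
    m, n = len(A), len(A[0])
    back_projection = [
        sum(A[i][j].conjugate() * y[i] for i in range(m)) for j in range(n)
    ]
    return max(abs(v) for v in back_projection) / sigma0


def sweep_thresholds(solve, tau_max, step, num_steps, x0, z0):
    """Run the solver on a decreasing grid of thresholds, warm-starting each run
    from the previous solution, and keep the threshold with the smallest
    estimated output noise level."""
    best_tau, best_sigma, best_x = None, float("inf"), None
    x, z = x0, z0
    for ell in range(num_steps):
        tau = tau_max - ell * step           # tau_1 = tau_max, then step down
        x, z, sigma_hat = solve(tau, x, z)   # warm start from the previous (x, z)
        if sigma_hat < best_sigma:
            best_tau, best_sigma, best_x = tau, sigma_hat, x
    return best_tau, best_sigma, best_x


def toy_solve(tau, x, z):
    """Placeholder solver (assumption): ignores the data and reports a noise
    level that is smallest at tau = 1.8."""
    return x, z, 0.26 + 0.05 * (tau - 1.8) ** 2


A = [[1, 0], [0, 2j]]   # tiny illustrative matrix, not a realistic sensing matrix
y = [3, 4]
tau_max = tau_max_from_data(A, y, sigma0=2.0)
tau_hat, sigma_hat, _ = sweep_thresholds(
    toy_solve, tau_max, step=0.1, num_steps=30, x0=[0, 0], z0=y
)
print(f"tau_max = {tau_max}, selected tau = {tau_hat:.2f}, sigma_hat = {sigma_hat:.3f}")
```

Passing the previous (x̂, z) pair into each call is what keeps the sweep cheap: each new threshold differs only slightly from the last, so a real solver would need only a few iterations per grid point.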
So far we have resolved two of the issues we raised at the beginning of this section,
i.e., estimating σ̂t and τ̂o . We will refer to the resulting algorithm as adaptive CAMP,
since both the noise variance σ̂t and the threshold τ̂o are adaptively estimated. To clarify
the differences between ideal, median, and adaptive CAMP, Table 4.2 shows the input
and output variables for each of the three algorithms. Please recall that ideal CAMP is
not an algorithm that can be used in practice, as it requires the knowledge of the true
vector x.
4.7.1 Adaptive CAMP CFAR Radar Detector

As discussed previously, if the noise variance σ2 is known, we can use the asymptotic
results to find the fixed threshold κ that achieves the desired FAP; see e.g., appendix B
in [1]. In practice, however, the noise statistics are not known in advance. In classical
6 Typically, 10 iterations are sufficient to converge to a solution.

Figure 4.5 Block diagram of the adaptive CAMP CFAR detector. ©[2013] IEEE. Reprinted, with
permission, from [2].
radar detectors, often a CFAR processor estimates the unknown background plus inter-
ference level. In Architecture 2, since the signal x̃ is modeled as the sum of targets plus
Gaussian noise (just as after a classical MF), this estimate can be directly input to a
conventional CFAR processor. A block diagram of the Adaptive CAMP CFAR detector
based on Architecture 2 is shown in Figure 4.5. Replacing the fixed threshold detector
with a CFAR detector in Architecture 2 provides similar results as in classical CFAR
without CS. All we need to determine is the input/output SNR relations of CAMP, so that
we can use the output SNR in the detector equation for the prediction of Pd . A method
for estimating the output SNR of CAMP for a given problem is provided in [2].
Please note that, since the sparse estimate x̂ contains many zeros, this vector cannot
be used as the input to a CFAR processor.
4.8 Simulation Results
In this section we investigate the performance of the proposed CS architectures using

Monte Carlo (MC) simulations and compare it with the theoretical results derived using
the SE. We compute ROC curves for the cases of fixed threshold and CA-CFAR detec-
tors. The proposed CS detection schemes are analyzed using both Gaussian sensing
matrices, for which SE applies, and partial Fourier matrices,7 which are of particular
interest in radar applications [55–57].
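A partial Fourier sensing matrix of the kind used in these simulations can be built by keeping a random subset of rows of the DFT matrix. The sketch below is a minimal pure-Python construction; the 1/√m scaling (which gives unit-norm columns), the small sizes, and the function name are assumptions made for illustration, since the chapter does not state the normalization here.

```python
import cmath
import math
import random


def partial_fourier(n, m, rng):
    """m x n partial Fourier matrix: m rows drawn at random, without
    replacement, from the n x n DFT matrix. The 1/sqrt(m) scaling is an
    assumption of this sketch."""
    rows = sorted(rng.sample(range(n), m))
    scale = 1.0 / math.sqrt(m)
    return [
        [scale * cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
        for r in rows
    ]


rng = random.Random(1)
n, m = 16, 8                 # small illustrative sizes, delta = m / n = 0.5
A = partial_fourier(n, m, rng)

# Each column has unit norm under the assumed scaling.
col_norm = math.sqrt(sum(abs(A[i][3]) ** 2 for i in range(m)))
# Distinct rows of a DFT matrix are orthogonal, so the retained rows are too.
row_inner = sum(a * b.conjugate() for a, b in zip(A[0], A[1]))

print(f"shape: {len(A)} x {len(A[0])}")
print(f"column norm: {col_norm:.6f}, |row inner product|: {abs(row_inner):.2e}")
```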
In the remainder of this chapter we will use as reference SNR the output SNR of an
(ideal) MF,8 so that the SNR given in the results will be independent of n or m. We define
the SNR at the input (SNRin) and output (SNR) of the MF and CAMP Architecture 2,
respectively, as
\[ \mathrm{SNR}_{\mathrm{in,MF}} = \frac{|x_i|^2}{n\sigma^2}, \qquad \mathrm{SNR}_{\mathrm{in,CS}} = \frac{|x_i|^2}{m\sigma^2}, \]
\[ \mathrm{SNR}_{\mathrm{MF}} = \frac{|x_i|^2}{\sigma^2}, \qquad \mathrm{SNR}_{\mathrm{CS}} = \frac{|x_i|^2}{\sigma_*^2}, \qquad (4.21) \]
where |x_i|² is the received power from a target at bin i. Given the output MF SNR,
the input SNR of both MF and CAMP, which depends through n and m on the specific
problem being investigated, can be easily derived using (4.21).
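The definitions in (4.21) can be checked with a few lines of arithmetic. The values used below (a² = 1, σ² = 0.05, n = 1000, δ = 0.6) are the ones quoted in the captions of Figures 4.9 and 4.10; the variable names and the `db` helper are illustrative.

```python
import math


def db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(ratio)


a2 = 1.0        # target power |x_i|^2
sigma2 = 0.05   # input noise variance
n = 1000        # number of Nyquist samples
delta = 0.6     # undersampling factor m / n
m = round(delta * n)

snr_mf = a2 / sigma2               # MF output SNR
snr_in_mf = a2 / (n * sigma2)      # MF input SNR
snr_in_cs = a2 / (m * sigma2)      # CS input SNR

print(f"m = {m}")
print(f"MF output SNR: {db(snr_mf):.2f} dB")
print(f"MF input SNR:  {db(snr_in_mf):.2f} dB")
print(f"CS input SNR:  {db(snr_in_cs):.2f} dB")
```

The MF output SNR evaluates to about 13 dB, which matches the value stated in those captions. The CS output SNR additionally requires σ∗², which depends on the recovery and is not computed here.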
7 Recall that an m × n partial Fourier matrix can be obtained from an n × n discrete Fourier transform
matrix by keeping only a random subset of m of the original n rows.
8 Recall that, in a multiple target scenario, the MF SNR is optimum and independent of the number of
targets as long as each target is exactly on a grid point and the matrix A is orthogonal. We assume these
(ideal) conditions are satisfied when computing the MF SNR.
[Figure 4.6 plots: (a) histogram of the real part of w; (b) histogram of the imaginary part of w. Each row has four panels, for (δ, ρ) = (0.7, 0.2), (0.2, 0.1), (0.9, 0.5), and (0.2, 0.7); axes: Density versus Data.]
Figure 4.6 Histograms (bars) of (a) the real and (b) imaginary parts of the noise signal w for
different combinations (δ,ρ) using CAMP with threshold τ = 1.8, σ = 0.1 and n = 4000. The
solid line shows a Gaussian distribution fitted to the histograms. The sensing matrix is partial
Fourier. ©[2013] IEEE. Reprinted, with permission, from [2].
4.8.1 Gaussianity of w Using Partial Fourier Matrices

The Gaussianity of the reconstructed noise vector wt for a partial Fourier sensing matrix
was not demonstrated theoretically in the previous sections. Therefore, we resorted to
Monte Carlo simulations to investigate it using two methodologies. First we studied
the empirical distribution of w at convergence for different combinations of δ and ρ.
The histograms were obtained using CAMP with a fixed threshold τ (not necessarily
optimal) and for a fixed value of σ = 0.1. A few examples of such histograms are
shown in Figure 4.6.
We further investigated the Gaussianity of w using both quantile plots and the
Kolmogorov–Smirnov (KS) test [58], for different values and combinations of n, δ, and
ρ. The results of these simulations are reported in [1]. All our simulations confirmed
that the Gaussianity of the noise vector is preserved for partial Fourier matrices as well.
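A Kolmogorov–Smirnov check of the kind mentioned above can be sketched in a few lines. The example below is not the procedure used in [1]: it tests synthetic samples against a zero-mean Gaussian with a known standard deviation (rather than a fitted one), uses the common large-sample 5% critical value of about 1.36/√n, and all names and values are illustrative assumptions.

```python
import math
import random


def ks_statistic_normal(samples, std):
    """Kolmogorov-Smirnov distance between the empirical CDF of the samples
    and the CDF of a zero-mean Gaussian with the given standard deviation."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 0.5 * (1.0 + math.erf(x / (std * math.sqrt(2.0))))
        # Compare the model CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


rng = random.Random(2)
n, std = 4000, 0.1
gaussian = [rng.gauss(0.0, std) for _ in range(n)]
shifted = [v + std for v in gaussian]   # deliberately mismatched: mean offset by one std

d_ok = ks_statistic_normal(gaussian, std)
d_bad = ks_statistic_normal(shifted, std)
critical = 1.36 / math.sqrt(n)          # approximate 5% level, large n, known parameters

print(f"critical value: {critical:.4f}")
print(f"Gaussian samples: D = {d_ok:.4f}")
print(f"shifted samples:  D = {d_bad:.4f}")
```

In practice the same statistic would be applied separately to the real and imaginary parts of the reconstructed noise vector, and a value of D well above the critical value would indicate a departure from Gaussianity.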
4.8.2 Accuracy of State Evolution

The accuracy of the SE was investigated by comparing the theoretical results obtained
from (4.16) with simulation results obtained using the ideal CAMP algorithm.
Additionally, we are interested in investigating how the behavior of σ∗ changes for
the case of a partial Fourier sensing matrix as compared to the case of Gaussian sensing
matrix, for which the theoretical results from SE apply. Figure 4.7 compares σ∗ obtained
from ideal CAMP for the case of complex Gaussian and partial Fourier sensing matrices
with the theoretical one from the SE. From this figure we observe that:
[Figure 4.7 plots: four panels (δ = 0.05, 0.1, 0.2, 0.5) showing σ∗ (0.25 to 0.3) versus τ; legend: Ideal CAMP, Gaussian; Ideal CAMP, Fourier; Theoretical SE.]
Figure 4.7 σ∗ versus τ using ideal CAMP for both complex Gaussian and partial Fourier sensing
matrices. The empirical curves are obtained for several values of δ by averaging over 100 MC
samples. σ² = 0.05, ρ = 0.05, n = 4000. The theoretical SE curve shows the analytical σ∗.
(i) SE correctly predicts the performance of CAMP for the Gaussian sensing matrix.
(ii) SE does not predict the performance of CAMP for partial Fourier matrices. How-
ever, for τ = τo , the value of σ∗ for Fourier and Gaussian matrices is very similar.
(iii) As δ → 0 the predictions of SE become more accurate for the partial Fourier
matrix. However, as δ → 1, i.e., as the number of measurements increases, the
columns of the partial Fourier matrix become deterministic and orthogonal, and
hence the true behavior deviates from the SE, which is derived assuming matrices
with i.i.d. entries.
(iv) For the partial Fourier matrix, the optimal threshold τo seems to be almost the
same for different values of δ. Interestingly, although the curves of σ∗ are dif-
ferent for different δ’s, for a fixed δ and for τ > τo the variation of the output
variance is much smaller in the partial Fourier case than in the Gaussian case.
This behavior will have an impact on the difference in performance between
Architecture 1 and Architecture 2 for the case of partial Fourier and Gaussian
matrices, as the SNR of Architecture 1 varies less along the ROC curves and it is
closer to the optimal SNR for the partial Fourier case.
4.8.3 Effects of the Median Estimator in CAMP

In this section, we investigate the performance of the proposed algorithms and archi-
tectures when the true σt is replaced with the estimated σ̂t from (4.19) for the case
x ≠ 0. Figure 4.8 shows the estimated output noise standard deviation for both ideal
[Figure 4.8 plots: four panels (δ = 0.05, 0.1, 0.2, 0.5) showing σ∗ (0.25 to 0.3) versus τ; legend: Median CAMP, Gaussian; Median CAMP, Fourier; Ideal CAMP, Gaussian; Ideal CAMP, Fourier.]
Figure 4.8 Output noise standard deviation versus τ for both ideal (σ∗) and median (σ̂∗) CAMP
for complex Gaussian and partial Fourier sensing matrices. The curves are obtained for several
values of δ by averaging over 100 MC realizations. σ² = 0.05, ρ = 0.05, n = 4000. ©[2013]
IEEE. Reprinted, with permission, from [2].
[Figure 4.9 plots: Pd versus Pfa (logarithmic axis, 10^−4 to 10^−1) for (a) Gaussian sensing matrix and (b) partial Fourier sensing matrix; legend: Median CAMP; Ideal CAMP.]
Figure 4.9 ROC curves for Architecture 1 using both ideal and median CAMP. n = 1000,
δ = 0.6, ρ = 0.1, a² = 1, and σ² = 0.05 (corresponding to an MF SNR of 13 dB).
and median CAMP. As expected, we observe that σ̂∗ deviates from the ideal CAMP case
because of the bias introduced by the estimator. Furthermore, the deviation diminishes
as ε = δρ decreases, as predicted by the upper bound provided in (4.20).
Overestimating σ∗ (i.e., σ̂∗ > σ∗) in Architecture 1 results in a loss of detection
performance. This is shown in Figure 4.9, where we plot the ROC for Architecture 1 using
both ideal and median CAMP. The loss of detection performance is explained by the fact
that the soft thresholding function in CAMP uses the parameter τα σ̂∗, which, for a fixed
τα, increases with σ̂∗. If the overall threshold increases, then the detection probability
decreases. In Figure 4.9, we also observe that Architecture 1 performs better when
the sensing matrix is partial Fourier. This has to do with the behavior of the noise
variance σ∗², and therefore of the SNR, versus the threshold, which changes along the ROC
curve. As can be seen from Figure 4.8, as δ increases the variance curve becomes flatter
in the partial Fourier case than in the Gaussian case. As a result,
when the false-alarm probability decreases, in the partial Fourier case the SNR along
the ROC curves for Architecture 1 deviates much less from the optimum SNR that is
achieved for τ = τo.
4.8.4 Performance of a Fully Adaptive CAMP CFAR Detector

In this section, we investigate the performance of the fully adaptive CAMP CFAR detector
using ROC curves, and compare its performance to that of the compressive matched filter
(CMF) [56,59], which is the filter matched to the subsampled waveform. In Architecture
2, the CA-CFAR processor is preceded by a square law (SL) detector (see Figure 4.5)
and has a CFAR window of length 20 with 4 guard cells. In the Monte Carlo simulations,
we consider the case of a signal consisting of multiple targets, and the detection
probability is estimated for each target separately. Since all targets are generated with
equal amplitude, the plots show the results for only one of the targets. Please note that,
although in the radar literature ROC plots are most commonly shown for the case of
a single target, in the CS case a single target represents an extremely sparse and
favorable scenario. With multiple targets, instead, we observe the effects of both the
reconstruction and the CFAR processor. However, even if there are multiple targets,
the results of the CFAR processor are independent of the number of actual targets
and depend exclusively on the CAMP output SNR, as long as the targets are not in
the CFAR window of one another. Therefore, in the MC simulations we never generate
targets at locations that fall within the CFAR window of another target. It is well known
that if there are interfering targets in the CFAR window of the cell under test (CUT),
the estimated noise variance will rise, leading to an increased threshold that can
potentially result in missing the target to be detected. This is a classical problem in
CFAR processing, and several alternative CFAR schemes have been proposed in the literature
to deal with this scenario, such as the ordered statistic (OS)-CFAR [32]. This and other
types of CFAR detectors can similarly be used in Architecture 2 instead of the CA-CFAR
processor.
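The CA-CFAR stage that follows the square-law detector can be sketched as below. This is a generic textbook cell-averaging CFAR, not the chapter's code. It assumes square-law noise samples that are exponentially distributed (homogeneous complex Gaussian noise), for which the threshold multiplier β = N(α^{−1/N} − 1) gives a design false alarm probability α with N reference cells. It also assumes that the window of 20 cells and the 4 guard cells are split evenly on the two sides of the CUT, and that edge cells without a full window are simply not tested.

```python
import random


def ca_cfar(power, alpha, window=20, guard=4):
    """Cell-averaging CFAR on square-law samples.

    window: total number of reference cells (half on each side of the CUT).
    guard:  total number of guard cells (half on each side), an assumption.
    Cells too close to the edges to have a full window are left undetected.
    """
    half_w, half_g = window // 2, guard // 2
    beta = window * (alpha ** (-1.0 / window) - 1.0)   # threshold multiplier
    hits = [False] * len(power)
    for cut in range(half_w + half_g, len(power) - half_w - half_g):
        lead = power[cut - half_g - half_w: cut - half_g]
        lag = power[cut + half_g + 1: cut + half_g + half_w + 1]
        noise_level = (sum(lead) + sum(lag)) / window   # local noise power estimate
        hits[cut] = power[cut] > beta * noise_level
    return hits


rng = random.Random(3)
# Unit-power complex Gaussian noise after a square-law detector is exponential.
power = [rng.expovariate(1.0) for _ in range(200)]
power[100] = 100.0          # one strong, isolated target (20 dB above the noise)

hits = ca_cfar(power, alpha=1e-3)
print("detected bins:", [i for i, h in enumerate(hits) if h])
```

In Architecture 2 the input `power` would be |x̃_i|² computed from the CAMP noisy estimate x̃. Replacing the mean of the reference cells with an order statistic would turn this into an OS-CFAR variant.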
In Figure 4.10, we show the ROC curves for Architecture 2 for the cases of: (a)
ideal CAMP with an ideal (fixed threshold) detector, (b) ideal CAMP followed by the
CA-CFAR detector, (c) adaptive CAMP with an ideal (fixed threshold) detector, and (d)
fully adaptive scheme consisting of adaptive CAMP followed by a CA-CFAR processor.
In the same figure the theoretical curve of a CA-CFAR processor with the same window
length is also shown. The SNR used in the analytical CA-CFAR Pd equation is set
to 11.55 dB for the Gaussian sensing matrix and 11.9 dB for the partial Fourier sensing
matrix.9 For the Gaussian sensing matrix the optimal threshold for Architecture 2 (using
ideal CAMP) is computed analytically using the SE. For the partial Fourier sensing
9 The SNR has been estimated during simulations.
[Figure 4.10 plots: Pd versus Pfa (logarithmic axis) for (a) Gaussian sensing matrix and (b) partial Fourier sensing matrix; legend: Ideal CAMP CA-CFAR; Adaptive CAMP FT; Adaptive CAMP CA-CFAR; Ideal CAMP FT; Theoretical CA-CFAR; CMF FT; CMF CA-CFAR.]
Figure 4.10 ROC curves for Architecture 2 with different combinations of CAMP algorithms and
detection schemes. Here n = 1000, δ = 0.6, ρ = 0.1, a² = 1, and σ² = 0.05 (corresponding to
an MF SNR of 13 dB). FT denotes the use of an (ideal) fixed threshold detector. ©[2013] IEEE.
matrix, the threshold in Ideal CAMP is derived from a plot like the ones shown in Figure
4.7 for the case δ = 0.6, and is equal to τo = 1.85.
As expected, adaptivity imposes extra losses in performance. One loss is due to the
use of adaptive instead of ideal CAMP, which is caused by the error in the estimated
τo . Another loss is caused by the CFAR processor and its estimate of the noise standard
deviation, and this is the well-known CFAR loss.
From Figure 4.10 the following observations can be made. First, adaptive CAMP
introduces almost no loss in the detection performance of Architecture 2. This can be
seen by observing that the performance of adaptive CAMP and that of ideal CAMP, combined
with the same detector (either fixed threshold or CA-CFAR), are almost indistinguishable.
The reason for this is that, although σ̂∗ estimated in adaptive CAMP is biased, the
value of τ̂o at which the minimum σ̂∗ occurs is very close to the true optimal τo (see
Figure 4.8) computed in ideal CAMP, resulting in an almost optimal SNR even in the
adaptive case. The main loss is instead introduced by the adaptive CFAR detector, as
can be seen by comparing the curves obtained using the fixed threshold detector against
the ones obtained using the CA-CFAR processor, with either ideal or adaptive CAMP.
This loss, however, is not introduced by CAMP; it is the well-known CFAR loss
[26]. Furthermore, if one compares the curve of adaptive CAMP plus CFAR with the
theoretical curve of a CA-CFAR processor (without CS) computed analytically using
the same parameters (SNR, CFAR window length, and number of guard cells), it can
be noted that the performance of the CA-CFAR detector appears to be independent
of whether the input to the detector is obtained by running CAMP or a
conventional MF.
Second, we observe that Architecture 2 significantly outperforms the CMF. This
should be expected, since the output of the MF using the subsampled waveform exhibits
severe target sidelobes (interference), which in turn lead to both an increase of
the false alarm rate and possibly the masking of weaker targets, thereby reducing their
detection probability.
[Figure 4.11 plots: −log10 Pfa versus −log10 α (both from 1 to 4) for panels (a) and (b); legend: Arch.1; Arch.2 + CA-CFAR.]
Figure 4.11 Estimated FAP versus design FAP α for Architectures 1 and 2. n = 1000, and
δ = 0.6. (a) Complex Gaussian sensing matrix; (b) Partial Fourier sensing matrix.
By comparing Figures 4.9 and 4.10 it can be seen that, in the fixed threshold case,
Architecture 2 always outperforms Architecture 1, as predicted by Theorem 4.3. Also
in the adaptive case, Architecture 2 followed by a CA-CFAR processor outperforms
Architecture 1 using median-based CAMP. However, the difference between the two
schemes can vary significantly with the system parameters (δ,ρ,σ), sensing matrix
type, and CFAR window length. For instance, for the value of δ used in these figures,
we observe that Architecture 1 performs much better in the Fourier case than in the
Gaussian sensing matrix case. Also, the loss in detection performance is significantly
reduced compared to the adaptive detector. This again depends on the behavior of σ∗2
versus τ. In general, to predict how the two architectures will perform one should
observe the behavior of the output noise variance as a function of the threshold τ. If
the variation of σ∗ versus τ is small, in the ideal detector case the ROC curves of the
two architectures will be almost identical, with Architecture 2 always slightly better.
In Figure 4.11 the estimated FAP is shown for both Architecture 1 (which is nonadap-
tive, and uses median-based CAMP) and Architecture 2, which uses adaptive CAMP in
combination with a CA-CFAR processor. The desired FAP α, on the x-axis, is used to
obtain the threshold multiplier β for the CFAR processor in Architecture 2 and to derive
the value of the fixed CAMP threshold τα in Architecture 1.
As expected, Figure 4.11 shows that, in homogeneous Gaussian noise, the proposed
architectures possess the CFAR property. In simulating the FAP for Figure 4.11, according
to hypothesis H0 (target absent) we generated a measurement vector y with standard
Gaussian distribution and x = 0. However, in practical scenarios, where the noise level
may change across range, or in the presence of one or multiple targets located anywhere
in the signal x, Architecture 1 cannot achieve CFAR. This is because the noise estimate
in median CAMP is not computed locally, as in a CFAR processor, but
is based on the whole received signal using the median, which is a biased estimator
if x ≠ 0. In Architecture 2, instead, separating the reconstruction from the detection
stage gives more flexibility; e.g., in a multiple interfering targets scenario, the CA
processor can be replaced with a more robust CFAR scheme, such as OS-CFAR.
4.9 Experimental Results
To further validate our theoretical findings, we performed a set of experimental
measurements using the software-defined LabRadOr experimental radar system at
Fraunhofer FHR, Germany. A set of two CS digital stepped frequency (SF) waveforms was
designed to perform the measurements, and five stationary corner reflectors were used
as targets. A description of the radar system and the transmitted waveform is provided
in the next two subsections, together with the results.
4.9.1 Radar System

LabRadOr is a software-defined pulsed radar, with maximum transmit power of 32 dBm
(and an attenuator of 1 dB step size) using separate transmit (TX) and receive (RX)
reflector antennas, with gain of 31.6 dB each. The digital waveform designed by the
user is transferred from the control computer to an FPGA, where a digital-to-analog
converter (DAC) converts the digital data to an analog signal. The analog waveform
is then transferred to an RF front-end for up conversion to the carrier frequency
fc = 8.9 GHz. At the receiver, after down conversion, the signal is returned to the
FPGA, where an analog-to-digital converter (ADC) samples the received analog signal
at 2 GHz sampling rate. The samples are then transferred to the control unit where they
are stored for further processing. Because of internal FPGA limitations, the maximum
number of samples per sweep that can be recorded is 1,024, thus limiting the receiver
record window length to 512 ns. The start time of the record window can be set by
the user within the pulse repetition interval (PRI), which is fixed and equal to 10 ms.
Since our objective is to perform SF measurements, but the maximum TX pulse length
is limited to 512 ns, we transmit one frequency per pulse, and later combine all the m
frequencies (thus m pulses) to obtain a single SF measurement. Hence, we assume that
the scene is stationary for at least m × PRI seconds. Also, since both the corners and
the radar are fixed, the target amplitudes can be modeled as Swerling Case 0 [60].10
4.9.2 Transmitted Waveform

A set of stepped frequency waveforms is used in the experiments. The TX signal consists
of a number of discrete frequencies fm , covering the band from 100 to 900 MHz, with
a range resolution of δR = 18.75 cm. In the Nyquist case (that represents unambiguous
mapping of ranges to phases over the whole bandwidth) we used n = 200 frequencies
over a bandwidth of 800 MHz, and each frequency is transmitted for 0.512 μs, thus
implying a bandwidth of Bf = 1.95 MHz. Sequential frequencies are separated by
Δf = 4 MHz, resulting in an unambiguous range of Run = 37.5 m.
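The waveform and timing figures quoted in this section follow from standard stepped-frequency relations, which the short calculation below reproduces. It assumes the rounded value c = 3 × 10^8 m/s for the speed of light, which is what the quoted 18.75 cm and 37.5 m imply; the variable names are illustrative.

```python
C = 3.0e8                 # speed of light in m/s (rounded value, assumption)

bandwidth = 800e6         # total swept bandwidth, 100 to 900 MHz
n_freq = 200              # number of frequencies in the Nyquist waveform
pulse_length = 0.512e-6   # duration of each transmitted frequency, in seconds
adc_rate = 2e9            # receiver sampling rate, in samples per second
max_samples = 1024        # maximum number of recorded samples per sweep

freq_step = bandwidth / n_freq            # spacing between adjacent frequencies
range_resolution = C / (2 * bandwidth)    # delta_R = c / (2 B)
unambiguous_range = C / (2 * freq_step)   # R_un = c / (2 delta_f)
pulse_bandwidth = 1 / pulse_length        # B_f, bandwidth of a single pulse
record_window = max_samples / adc_rate    # receiver record window length

print(f"frequency step:    {freq_step / 1e6:.2f} MHz")
print(f"range resolution:  {range_resolution * 100:.2f} cm")
print(f"unambiguous range: {unambiguous_range:.2f} m")
print(f"pulse bandwidth:   {pulse_bandwidth / 1e6:.2f} MHz")
print(f"record window:     {record_window * 1e9:.0f} ns")
```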
To obtain CS waveforms, a subset of m frequencies is chosen uniformly at random
from the Nyquist waveform (m < n). We considered the cases of m = 50 and 100,
10 Since we are interested in the detection problem from a single range measurement, we kept the targets
fixed and did not perform any Doppler measurements.
[Figure 4.12 plots: spectrograms of frequency (MHz, 0 to 1000) versus time (μs); panel (a) spans 0 to 100 μs and panel (b) spans 0 to 25 μs.]
Figure 4.12 Spectrogram of TX waveform. (a) Nyquist waveform with n = 200. (b) CS waveform
with m = 50 and δ = 0.25.
which correspond, for n = 200 and k = 5, to δ = 0.25, 0.5 and ρ = 0.1, 0.05, respectively. The
spectrograms of the Nyquist waveform and of one of the CS TX waveforms for the case
m = 50 (δ = 0.25) are shown in Figure 4.12.
As shown in [55,61], for stepped frequency waveforms the sensing matrix A is a
Fourier matrix (partial, in the CS case). Please note that during the measurements, the
same total power is transmitted in each burst, irrespective of the number of transmitted
frequencies. This was achieved by adjusting the per-frequency transmitted power so
that, when the number of measurements is reduced by a factor δ, the power
per transmitted frequency PT is 1/δ times higher than in the Nyquist waveform case,
and the total transmitted energy (PT × m/Bf) stays the same for all waveforms.
This was done to ensure that differences in detection performance across
undersampling factors are not caused by a reduction of transmitted power, but are solely
due to the quality of the recovery in the different subsampling regimes.
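The energy normalization described above, together with the undersampling and sparsity ratios of the two CS waveforms, can be verified numerically. In the sketch below the Nyquist per-frequency power is set to 1 in arbitrary units, which is an assumption made purely for illustration; δ = m/n and ρ = k/m follow the chapter's definitions.

```python
n, k = 200, 5                      # Nyquist frequencies and number of corner reflectors
pulse_bandwidth = 1 / 0.512e-6     # B_f in Hz, from the 0.512 us pulse length
p_nyquist = 1.0                    # per-frequency power of the Nyquist waveform (arbitrary units)

results = {}
for m in (50, 100, 200):           # two CS waveforms and the Nyquist case
    delta = m / n                  # undersampling factor
    rho = k / m                    # sparsity relative to the number of measurements
    p_t = p_nyquist / delta        # per-frequency power scaled up by 1 / delta
    energy = p_t * m / pulse_bandwidth
    results[m] = (delta, rho, p_t, energy)
    print(f"m = {m:3d}: delta = {delta:.2f}, rho = {rho:.3f}, "
          f"P_T = {p_t:.1f}, energy = {energy:.3e}")
```

The energy column is identical for all three waveforms, which is the property the power scaling is meant to guarantee.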
4.9.3 Performance of Adaptive CAMP with CFAR Detector

In this section, we investigate the performance of the proposed CAMP-based detection
schemes using the measured data. Figure 4.13 shows the signals reconstructed using the
proposed CAMP-based architectures in addition to the MF. For Architecture 1 with median
CAMP, τα was set using Pfa = 10^−4. For Architecture 2, the CAMP threshold τ̂o is
adaptively estimated at each measurement. The five corner reflectors are indicated as T1,
T2, T3, T4, and T5 in the figure. Note that since targets are not exactly on grid points,
there is a leakage of target power into several range bins, both for MF and for CS.11
This is the so-called straddling loss, which is always present in real measurements.
In Figure 4.14 we plot the estimated output noise standard deviation (σ̂∗ ) for the
experimental data, where the range profiles were reconstructed using median CAMP
for different values of the threshold τ. The curve is obtained by averaging over all 300
11 In ℓ1-norm minimization, as in classical MF, a clustering and interpolation step could be included after
detection to improve range estimation and reduce straddling losses; see, e.g., [62].
[Figure 4.13 plot: Amplitude [au] (logarithmic axis, 10^1 to 10^4) versus Range [m] (0 to 35); legend: CAMP Arch. 1; CAMP Arch. 2; MF; target peaks labeled T1 to T5.]
Figure 4.13 Estimated range profile using: CAMP Architectures 1 and 2, and MF. For the MF
n = 200; for CS m = 100 and δ = 0.5. The y-axis is in log scale. ©[2013] IEEE. Reprinted,
with permission, from [2].
[Figure 4.14 plot: Estimated σ∗ (800 to 1600) versus τ (1.5 to 2.5); legend: δ = 0.5; δ = 0.25.]
Figure 4.14 Estimated σ∗ versus τ using median CAMP.
measurements. Note that, for the same input noise variance, the output noise power
for δ = 0.25 is always higher than for δ = 0.5, which also implies that, for the same
received target power, the SNR decreases as δ decreases, as predicted by the SE. We also
see that the behavior of the estimated output noise standard deviation resembles the one
shown in Section 4.8.3 for the simulated data.
For obtaining the ROC and FAP curves, since the SNR is very high for all targets (in
all cases above 20 dB), to evaluate the performance of the detectors at medium SNR
values, we added white Gaussian noise to the raw frequency data samples, resulting in
an equivalent MF output SNR of 17.2, 16.6, 14, 10.2 and 26 dB respectively, from the
closest corner to the farthest one.
For estimating the ROC curves, we adopted the following procedure. The FAP is
estimated by averaging the detections on all other cells, excluding the true targets’ cells
[Figure 4.15 plots: Pd versus Pfa (logarithmic axis, 10^−3 to 10^0) for (a) δ = 0.25 and (b) δ = 0.5; curves labeled T1, T3, and T4; legend: Arch. 2 CA-CFAR; Arch. 2 FT; Arch. 1.]
Figure 4.15 ROC curves for Architecture 1 (a) and Architecture 2 (b) using both fixed threshold
(FT) and CA-CFAR detectors. The curves correspond to targets T1, T3, and T4 in Figure 4.13.
Adapted from [2], ©[2013] IEEE.
plus four guard cells (because of straddling). The detections instead are obtained for
each target separatly by counting the detections at the location of the corresponding
target highest peak. Figure 4.15 shows the ROC curves for three of the five targets
(T1, T3, and T4), having different SNRs, for δ = 0.5 and 0.25. The ROC curves are
estimated for each target using both Architecture 1 and Architecture 2. In Architecture
2, CAMP recovery is followed by either a fixed threshold (FT) detector or a CA-CFAR
processor.12 Please note that, since in Architecture 1 the estimated sparse signal x̂ can
never contain more than m out of n nonzero coefficients, FAPs higher than δ cannot be
estimated. Again, we emphasize that in CAMP the reconstruction SNR is the ratio of
the target power to the system plus reconstruction noise power (σ∗²). Therefore, while
the total transmitted power remains fixed for δ = 0.5 and 0.25, the reconstruction SNR
for each target depends on the architecture used in addition to the compression factor δ.
As observed in the simulated results, when the number of measurements is reduced the
reconstruction noise variance σ∗² increases, and the CAMP SNR decreases. Since a loss in
SNR translates directly into a loss in detection probability, for a given FAP, CAMP will
perform better for larger δ.
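The estimation procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `estimate_roc_point` and the input layout (one list of 0/1 decisions per noise realization) are hypothetical, and the guard cells are assumed to be split evenly on both sides of each target peak.

```python
def estimate_roc_point(detection_maps, target_cells, n_guard=4):
    """Estimate one ROC point (FAP, Pd per target) from detection maps.

    detection_maps: list of trials, each a list of 0/1 decisions per range cell.
    target_cells:   index of the highest peak of each true target.
    n_guard:        guard cells excluded around each target (assumption:
                    split evenly on both sides of the target cell).
    """
    n_cells = len(detection_maps[0])
    half = n_guard // 2
    excluded = set()
    for t in target_cells:
        excluded.update(range(max(0, t - half), min(n_cells, t + half + 1)))
    noise_cells = [i for i in range(n_cells) if i not in excluded]

    n_trials = len(detection_maps)
    # FAP: average detection rate over all cells that hold no target.
    false_alarms = sum(trial[i] for trial in detection_maps for i in noise_cells)
    fap = false_alarms / (n_trials * len(noise_cells))
    # Pd: detection rate at the peak location of each target, per target.
    pd = {t: sum(trial[t] for trial in detection_maps) / n_trials
          for t in target_cells}
    return fap, pd
```

Sweeping the detector threshold and repeating this computation for each threshold value would trace out one ROC curve per target.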
In agreement with our theoretical findings, from Figure 4.15 we observe that the
detection probability of Architecture 2 with the fixed threshold detector is always higher
than the one of Architecture 1. Additionally, Architecture 2 followed by a CA-CFAR
processor exhibits a detection performance loss compared to the fixed threshold case.
This is again the CA-CFAR loss.
From the same figure we observe that the two proposed architectures perform very
similarly at very low or very high SNRs. However, at low FAPs and high Pd , which
is the most relevant case in practical situations, Architecture 2 always outperforms
Architecture 1. Also note that Architecture 1 is designed in a way that it is very similar
to an OS-CFAR detector, in which the CFAR window consists of the whole signal, i.e.,
12 For the CA-CFAR processor we used the same parameters as in the simulations, i.e., four guard cells and
a CFAR window of length 20.
[Figure 4.16 plot: −log10 Pfa versus −log10 α, both axes from 0 to 3; panels (a) fixed threshold detector and (b) CA-CFAR detector; curves for δ = 0.5 and δ = 0.25.]
Figure 4.16 Estimated FAP versus design FAP α for CAMP Architecture 2 using fixed threshold
detector (a), and CA-CFAR detector (b). δ = 0.5 (dashed line) and δ = 0.25 (solid line).
2Nw = m, and it also includes the CUT. Because in these measurements the number of targets is very small,
the bias introduced by the median estimate used in Architecture 1 is also very small and
therefore the overall noise estimate of median CAMP is better than the noise estimate
of the CA-CFAR detector that is using only 20 bins. However, a serious disadvantage
of Architecture 1 is that, since the entire signal is used in the noise estimation and the
threshold τα is fixed, this scheme inherently cannot adapt to local variations of the noise
level. This makes Architecture 1 unsuitable for many radar applications. In Architecture
2 instead, one has the flexibility to choose both the most appropriate CFAR processor
and CFAR window length, depending on the specific scenario. This of course will also
depend on the target distribution in range. In fact, if for example multiple targets are
present in the same CFAR window, then Architecture 2 will have a significant loss in
performance, unless more clever CFAR schemes are used. Instead Architecture 1 is
insensitive to the targets’ locations, and therefore would have the same performance.
As we only performed measurements with targets, we are unable to evaluate the
CFAR property of Architecture 1. However, for CAMP Architecture 2, we can demon-
strate that our model x̃ = x + σ∗ w is correct by estimating the FAP from the recon-
structed noisy signal x̃ by excluding the range bins corresponding to the target locations
plus four guard cells. If our model is correct, and the noise in the signal x̃ is Gaussian,
then the estimated FAP should correspond to the design FAP used to set the detector
threshold. This should be true for both the fixed threshold and the CFAR detector. This
is demonstrated in Figure 4.16, where the estimated Pfa is plotted versus the design FAP
α for CAMP Architecture 2 using both the CA-CFAR and the fixed threshold detector.
From this figure we observe that, as expected, the estimated FAP matches the design
one, confirming that our model is correct.
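The idea behind this consistency check can be illustrated with a small Monte Carlo sketch. It does not reproduce the authors' processing chain: it assumes noise-only cells whose power is exponentially distributed with unit mean (a square-law detector applied to complex Gaussian noise), a CFAR window of ten reference cells per side, and the four guard cells split as two per side. The function names are hypothetical.

```python
import math
import random

def ca_cfar_multiplier(n_ref, pfa):
    """Threshold multiplier for square-law CA-CFAR in exponential noise."""
    return n_ref * (pfa ** (-1.0 / n_ref) - 1.0)

def ca_cfar(power, n_ref_side, n_guard_side, pfa):
    """Return 0/1 decisions; cells without a full window are left as None."""
    n_ref = 2 * n_ref_side
    scale = ca_cfar_multiplier(n_ref, pfa)
    reach = n_ref_side + n_guard_side
    decisions = [None] * len(power)
    for i in range(reach, len(power) - reach):
        lead = power[i - reach:i - n_guard_side]
        lag = power[i + n_guard_side + 1:i + reach + 1]
        noise_est = (sum(lead) + sum(lag)) / n_ref  # cell-averaging estimate
        decisions[i] = 1 if power[i] > scale * noise_est else 0
    return decisions

def estimate_fap(pfa, n_cells=100_000, seed=0):
    """Estimated FAP of a fixed-threshold and a CA-CFAR detector on noise only."""
    rng = random.Random(seed)
    power = [rng.expovariate(1.0) for _ in range(n_cells)]
    # Fixed threshold for unit-mean exponential power: Pr{p > tau} = exp(-tau).
    fixed = sum(p > -math.log(pfa) for p in power) / n_cells
    cfar = [d for d in ca_cfar(power, 10, 2, pfa) if d is not None]
    return fixed, sum(cfar) / len(cfar)
```

If the noise model holds, both estimates returned by `estimate_fap(alpha)` should lie close to the design value `alpha`, which is the behavior the figure is meant to confirm for the measured data.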
4.10 Conclusions
In this chapter, we studied the problem of CS radar detection. The nonlinearity of target
reconstruction algorithms in CS makes the calculation of false alarm and detection
probabilities data-dependent and complicated. We showed how the recent advances

in the asymptotic analysis of CS recovery algorithms enable us to convert CS radar
detection to a problem that is similar to classical radar detection. This key step then
enabled us to design detection architectures based on compressive measurements and
combine them with conventional CFAR processing techniques for CS radar detection.
We presented extensive simulation and experimental studies to show the efficacy of our
theoretical results in practice.
References
[1] L. Anitori, “Compressive sensing and fast simulations: Applications to radar detection,”
PhD dissertation, Technical University of Delft, 2013.
[2] L. Anitori, A. Maleki, M. Otten, R. Baraniuk, and P. Hoogeboom, “Design and analysis of
compressive sensing radar detectors,” IEEE Trans. Signal Process., vol. 61, no. 4, pp. 813–
827, Feb. 2013.
Trans. Signal Process., vol. 57, no. 6, pp. 2275–2284, Jun. 2009.
[4] L. Anitori, M. Otten, and P. Hoogeboom, “Compressive sensing for high resolution radar
imaging,” in Proc. IEEE Asia-Pacific Microwave Conf. (APMC), 2010.
[5] S. M. Song, W. M. Kim, D. Park, and Y. Kim, “Estimation theoretic approach for radar pulse
compression processing and its optimal codes,” Electronics Letters, vol. 36, no. 3, pp. 250–
253, Feb. 2000.
[6] S. D. Blunt and K. Gerlach, “Adaptive pulse compression via MMSE estimation,” IEEE
Trans. Aerosp. Electron. Syst., vol. 42, no. 2, pp. 572–583, Apr. 2006.
[7] L. C. Potter, E. Ertin, J. T. Parker, and M. Cetin, “Sparsity and compressed sensing in radar
imaging,” Proc. IEEE, vol. 98, no. 6, pp. 1006–1020, Jun. 2010.
[8] M. I. Skolnik, Radar Handbook. McGraw-Hill, 1970.
[10] L. R. Varshney and D. Thomas, “Sidelobe reduction for matched filter range processing,” in
Proc. IEEE Radar Conf., 2003.
[11] J. V. DiFranco and W. L. Rubin, Radar Detection. Artech House, 1980.
[12] S. M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory. Prentice-
Hall, 1998.
[13] H. L. V. Trees, Detection, Estimation and Modulation Theory: Part III. John Wiley & Sons,
2001.
[14] P. Stoica, J. Li, and M. Xue, “Transmit codes and receive filters for radar,” IEEE Signal
Process. Mag., vol. 25, no. 6, pp. 94–109, Nov. 2008.
[15] F. F. J. Kretschmer and K. Gerlach, “Low sidelobe radar waveforms derived from orthogonal
matrices,” IEEE Trans. Aerosp. Electron. Syst., vol. 27, no. 1, pp. 92–102, Jan. 1991.
[16] R. L. Frank, “Polyphase codes with good nonperiodic correlation properties,” IEEE Trans.
Inf. Theory, vol. 9, no. 1, pp. 43–45, 1963.
[17] A. Divito, A. Farina, G. Fedele, G. Galati, and F. Studer, “Synthesis and evaluation of phase
codes for pulse compression radar,” Rivista Tecnica Selenia, vol. 9, no. 2, pp. 12–24, 1985.
[18] Y. I. Abramovich and M. B. Sverdlik, “Synthesis of a filter which maximizes the signal-
to-noise ratio under additional quadratic constraints,” Radio Eng. Electron. Phys., vol. 15,
pp. 1977–1984, Nov. 1970.
[19] M. H. Ackroyd and F. Ghani, “Optimum mismatched filters for sidelobe suppression,” IEEE
Trans. Aerosp. Electron. Syst., vol. 9, no. 2, pp. 214–218, Mar. 1973.
[20] S. Zoraster, “Minimum peak range sidelobe filters for binary phase-coded waveforms,” IEEE
Trans. Aerosp. Electron. Syst., vol. 16, no. 1, pp. 112–115, Jan. 1980.
[21] A. Zejak, E. Zentner, and P. Rapajic, “Doppler optimised mismatched filters,” IET Electron-
ics Letters, vol. 27, no. 7, pp. 558–560, Mar. 1991.
[22] C. Candan, “On the design of mismatched filters with an adjustable matched filtering loss,”
in Proc. IEEE Radar Conf., 2010.
[23] B. Steenson, “Detection performance of a mean-level threshold,” IEEE Trans. Aerosp.
Electron. Syst., vol. 4, no. 4, pp. 529–534, Jul. 1968.
[24] H. M. Finn and R. S. Johnson, “Adaptive detection mode with threshold control as a function
of spatially sampled clutter-level estimates,” RCA Review, vol. 29, pp. 414–464, Sept. 1968.
[25] G. M. Dillard and C. E. Antoniak, “A practical distribution-free detection procedure for
multiple-range-bin radar,” IEEE Trans. Aerosp. Electron. Syst., vol. 6, no. 5, pp. 629–635,
Sept. 1970.
[26] P. P. Gandhi and S. Kassam, “Analysis of CFAR processors in homogeneous background,”
IEEE Trans. Aerosp. Electron. Syst., vol. 24, no. 4, pp. 427–445, Jul. 1988.
[27] A. D. Vito and G. Moretti, “Probability of false alarm in CA-CFAR device downstream from
linear-law detector,” IET Electronics Letters, vol. 25, no. 25, pp. 1692–1693, Dec. 1989.
[28] R. S. Raghavan, “Analysis of CA-CFAR processors for linear-law detection,” IEEE Trans.
Aerosp. Electron. Syst., vol. 28, no. 3, pp. 661–665, Jul. 1992.
[29] G. B. Goldstein, “False alarm regulation in log-normal and Weibull clutter,” IEEE Trans.
Aerosp. Electron. Syst., vol. 9, no. 1, pp. 84–92, Jan. 1973.
[30] E. Conte, M. Lops, and A. M. Tulino, “Hybrid procedure for CFAR in non-Gaussian clutter,”
IEEE Proc. Radar, Sonar, and Navig., vol. 144, no. 6, pp. 361–369, Dec. 1997.
[31] R. Ravid and N. Levanon, “Maximum-likelihood for Weibull background,” IEEE Proc. Part
F (London), vol. 139, no. 3, pp. 256–264, Jun. 1992.
[32] H. Rohling, “Radar CFAR thresholding in clutter and multiple target situations,” IEEE
Trans. Aerosp. Electron. Syst., vol. 19, no. 4, pp. 608–621, Jul. 1983.
[33] M. A. Khalighi and M. H. Bastani, “Adaptive CFAR processor for nonhomogeneous
environments,” IEEE Trans. Aerosp. Electron. Syst., vol. 36, no. 3, pp. 889–897, Jul. 2000.
[34] M. Sekine, T. Musha, Y. Tomita, and T. Irabu, “Suppression of Weibull-distributed clutters
using a cell-averaging LOG/CFAR receiver,” IEEE Trans. Aerosp. Electron. Syst., vol. 14,
no. 5, pp. 823–826, Sept. 1978.
[35] R. Nitzberg, “Constant-false-alarm-rate signal processors for several types of interference,”
IEEE Trans. Aerosp. Electron. Syst., vol. 8, no. 1, pp. 27–34, Jan. 1972.
[36] S. R. Babu and R. Srinivasan, “Analysis of envelope detected mean level CFAR processors
using importance sampling,” in Proc. IEEE Radar Conf., 2000.
[37] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruc-
tion from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2,
pp. 489–509, Feb. 2006.
[38] E. Candès, J. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate
measurements,” Comm. Pure Appl. Math., vol. 59, no. 8, pp. 1207–1223, Aug. 2006.
[39] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–
1306, Apr. 2006.
[40] R. Tibshirani, “Regression shrinkage and selection via the LASSO,” J. Roy. Stat. Soc., Series
B, vol. 58, no. 1, pp. 267–288, 1996.
[41] S. Chen, D. Donoho, and M. Saunders, “Atomic decomposition by basis pursuit,” SIAM J.
on Sci. Computing, vol. 20, no. 1, pp. 33–61, 1998.
[42] E. Candès and T. Tao, “Near optimal signal recovery from random projections: Universal
encoding strategies?” IEEE Trans. Inf. Theory, vol. 52, no. 12, pp. 5406–5425, Dec. 2006.
[43] A. Maleki and D. L. Donoho, “Optimally tuned iterative thresholding algorithm for
compressed sensing,” IEEE J. Sel. Topics Sig. Proc., vol. 4, no. 2, pp. 330–341, Apr. 2010.
[44] A. Maleki, L. Anitori, Y. Zai, and R. Baraniuk, “Asymptotic analysis of complex LASSO
via complex approximate message passing (CAMP),” IEEE Trans. Inf. Theory, vol. 59, no.
7, pp. 4290–4308, July 2013.
[45] D. L. Donoho, A. Maleki, and A. Montanari, “Message passing algorithms for compressed
sensing,” Proc. Natl. Acad. Sci., vol. 106, no. 45, pp. 18914–18919, 2009.
[46] D. L. Donoho, A. Maleki, and A. Montanari, “Message passing algorithms for compressed
sensing: I. Motivation and construction,” in IEEE Proc. Inform. Theory Work. (ITW), 2010.
[47] A. Maleki and A. Montanari, “Analysis of approximate message passing algorithm,” in Proc.
IEEE Conf. Inform. Science and Systems (CISS), 2010.
[48] D. L. Donoho, A. Maleki, and A. Montanari, “The noise-sensitivity phase transition in
compressed sensing,” IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6920–6941, Oct. 2011.
[49] A. Maleki, “Approximate message passing algorithm for compressed sensing,” PhD disser-
tation, Stanford University, 2010.
[50] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with
applications to compressed sensing,” IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 764–785,
Feb. 2011.
[51] C. A. Metzler, A. Maleki, and R. G. Baraniuk, “From denoising to compressed sensing,”
IEEE Transactions on Information Theory, vol. 62, no. 9, pp. 5117–5144, 2016.
[52] A. Mousavi, A. Maleki, R. G. Baraniuk et al., “Consistent parameter estimation for lasso
and approximate message passing,” Ann. Statist., vol. 45, no. 6, pp. 2427–2454, 2017.
[53] M. Bayati and A. Montanari, “The LASSO risk for Gaussian matrices,” IEEE Trans. Inf.
Theory, vol. 58, no. 4, pp. 1997–2017, Apr. 2012.
[54] S. Wang, H. Weng, and A. Maleki, “Which bridge estimator is optimal for variable
selection?” arXiv preprint arXiv:1705.08617, 2017.
[55] J. H. G. Ender, “On compressive sensing applied to radar,” J. Signal Process., vol. 90, no. 5,
pp. 1402–1414, May 2010.
[56] L. Anitori, A. Maleki, W. van Rossum, R. Baraniuk, and M. Otten, “Compressive CFAR
radar detection,” in Proc. IEEE Radar Conf., 2012.
[57] S. Shah, Y. Yu, and A. Petropulu, “Step-frequency radar with compressive sampling
SFR-CS,” in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP), 2010.
[58] F. J. Massey, “The Kolmogorov–Smirnov test for goodness of fit,” J. Am Statist. Assoc.,
vol. 46, no. 253, pp. 68–78, 1951.
[59] M. A. Davenport, M. B. Wakin, and R. Baraniuk, “The compressive matched filter,” Rice
Univ., ECE Dept., Tech. Rep. TREE-0610, Nov. 2006.
[60] D. P. Meyer and H. A. Mayer, Radar Target Detection: Handbook of Theory and Practice.
Academic Press Inc., 1973.
[61] L. Anitori, M. Otten, and P. Hoogeboom, “Detection performance of compressive sensing
applied to radar,” in Proc. IEEE Radar Conf., 2011.
[62] D. Beker, W. van Rossum, S. Jacobs et al., “On pre-whitening and accuracy in DOA estimation
by sparse signal processing on beamformed data,” in Proc. Int. Work. on Compressed
Sensing on Radar, Sonar, and Remote Sensing (CoSeRa), 2016.
5 Sparsity-Based Methods for CFAR Target Detection in STAP Random Arrays
Haley H. Kim and Alexander M. Haimovich
5.1 Introduction
Ground moving target indicator (GMTI) radar is an airborne radar mounted on an air-
craft that detects the presence of targets on the ground. One of the main challenges faced
by GMTI radars is the detection of slow-moving targets in the presence of ground clut-
ter interference. Space–time adaptive processing (STAP) implementation with antenna
arrays has been a classical approach to clutter cancellation in airborne radar [1]. One of
the challenges with STAP is that the minimum detectable velocity (MDV) of targets is
a function of the baseline of the antenna array: the larger the baseline (i.e., the narrower
the beam), the lower the MDV. Unfortunately, increasing the baseline of a uniform linear
array (ULA) entails a commensurate increase in the number of elements.
Instead of using a large ULA and localizing targets by beamforming [2], one
may consider a smaller ULA, but use more sophisticated localization algorithms,
such as Capon’s method [3], MUSIC [4], or ESPRIT [5]. All three methods are
capable of resolving targets within the Rayleigh resolution limit, whereas conventional
beamforming cannot. MUSIC and ESPRIT, however, require knowledge of the
number of targets. This information is rarely known to a radar and must be obtained
by other means, such as using the Akaike information criterion (AIC) or the minimum
description length (MDL) [6,7]. Unfortunately, methods such as the AIC or the MDL
do not allow one to control the false alarm rate, a basic requirement in radar. In addition,
all three methods require a large number of snapshots, which usually are not available
in STAP applications.
An alternative approach to increasing the resolution of a radar without using
a large number of sensors is to use a large but sparsely populated array. In a sparse
array, the sensors are placed nonuniformly across a large aperture with interelement
spacing greater than half a wavelength; the nonuniform placement avoids grating lobes. There are typically
two methods used to determine the sensor positions of a sparse array. The first method
involves solving an optimization problem to determine the positions of the sensors such
that the resulting beampattern meets some specifications [8,9]. Another method is to
simply decide the sensor positions randomly. In random arrays [10,11], sensors are
randomly placed across a large array aperture. Since the resolution of the radar depends
mostly on the size of the aperture [10], a radar utilizing a sparse array may achieve a high
angular resolution with significantly fewer sensors than a ULA. Unfortunately, sparse
arrays do not come without drawbacks. Due to the spatial under-sampling, the array
beampattern suffers from high sidelobes. During the beamforming stages of STAP, these
high sidelobes may cause a significant increase in false alarms [12]. It was shown in [13]
that sparse arrays for which sensor locations are determined either by optimization or
randomly yield similar peak sidelobe levels. In this chapter, we restrict our attention
to random arrays. Moreover, one may envision applications in which the array elements
cannot be controlled, for example, an array constituted of unmanned aerial vehicles
(UAV) [14].
In [15], Carin demonstrates that measurements from random arrays are consistent
with projection measurements that can be utilized by compressive sensing (CS) [16]. This
suggests that the user may reap the full benefits of a large random array without worrying
that the high sidelobes unnecessarily increase the false alarm rate. The goal of CS is to
recover the signal of interest x, given the received data vector y and a linear model
y = Ax + e, where A is a measurement matrix and e is an interference vector. If the
signal x is known to be sparse (i.e., contains K nonzero elements, where K is much
smaller than the number of entries in x), the K-sparse solution (a solution with at most
K nonzero entries) may be found by solving the nonconvex optimization problem
\[
\min_{\mathbf{x}} \; \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \quad \text{subject to} \quad \|\mathbf{x}\|_0 \le K, \tag{5.1}
\]
where $\|\mathbf{x}\|_0$ counts the number of nonzero elements in x.
The optimization problem in (5.1) is nonconvex, and therefore only approximate
solutions can be obtained in practice. One approach to obtaining an approximate solution is to
first convert the optimization problem (5.1) into the following optimization problem:
\[
\min_{\mathbf{x}} \; \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda \|\mathbf{x}\|_0 . \tag{5.2}
\]
To transform (5.2) into a convex optimization problem, the nonconvex term $\|\mathbf{x}\|_0$ is
replaced by the convex term $\|\mathbf{x}\|_1$. Here, λ is a regularization parameter that controls the
sparsity of the solution x. For a specific choice of λ, numerous algorithms in the literature,
such as [16–21], are capable of solving the resulting convex problem in polynomial time. These algorithms are
often referred to as basis pursuit (BP). In [22,23] the authors use this approach to solve
the sparse localization problem. Nevertheless, without the ability to control false alarms,
this approach does not lend itself to radar applications. The authors in [24] argue that
a constant false alarm rate (CFAR) radar may be obtained by properly designing the
regularization parameter λ. However, in [25], the authors point out that the output noise
distribution is unknown and unpredictable, which makes BP unsuitable for designing
CFAR radars.
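For reference, the convex problem obtained from (5.2) by the substitution described above can be written out explicitly. The display below only restates that substitution; it is left unnumbered because the chapter does not assign it an equation number.

```latex
\[
\min_{\mathbf{x}} \; \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda \|\mathbf{x}\|_1 ,
\qquad
\|\mathbf{x}\|_1 = \sum_{i} |x_i| .
\]
```

Larger values of λ penalize the $\ell_1$ term more heavily and therefore favor sparser solutions.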
Another approach to solving (5.1) is to use matching pursuit (MP) algorithms
[26–30]. MP belongs to the class of greedy algorithms, which search iteratively, one
by one, for components of the unknown vector x. Components of x detected by MP
iterations are removed from subsequent iterations to reduce interference to components
of x yet to be detected. In this sense, MP implements a form of successive interference
cancellation (SIC). Although the MP approach generally has weaker guarantees than BP,
it has been shown empirically that it often performs similarly, and in some applications
it outperforms BP [31]. The most substantial advantage of MP algorithms over BP

algorithms is their lower computational complexity [32]. In fact, when applied to
the radar problem, MP algorithms have a computational complexity comparable to
that of a beamformer [33]. A large body of literature exists on compressive sensing
applications to radar, but the literature on applying MP to CFAR radar is scarce, with
some exceptions, e.g., [29,34]. In particular, [34] does not account for colored Gaussian
noise and unknown interference covariance matrix.
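The greedy search and cancellation steps of MP can be made concrete with a minimal sketch of plain matching pursuit. This is the textbook algorithm, not the MP-CFAR or MBMP-CFAR detectors proposed in this chapter: it has no detection threshold and no whitening, and the function name and the column-list representation of A are assumptions made for illustration.

```python
def matching_pursuit(A_cols, y, n_iter):
    """Plain matching pursuit.

    A_cols: list of (nonzero) columns of A, each a list of complex numbers.
    y:      measurement vector (list of complex numbers).
    Returns (x, residual), where x holds the estimated coefficients.
    """
    def inner(a, b):
        # Inner product a^H b.
        return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

    x = [0j] * len(A_cols)
    residual = [complex(v) for v in y]
    for _ in range(n_iter):
        # Correlate the residual with every column, normalized by column norm.
        scores = [abs(inner(a, residual)) / abs(inner(a, a)) ** 0.5
                  for a in A_cols]
        best = max(range(len(A_cols)), key=scores.__getitem__)
        a = A_cols[best]
        coef = inner(a, residual) / inner(a, a)
        x[best] += coef
        # Cancel the detected component from the residual (the SIC step).
        residual = [r - coef * ai for r, ai in zip(residual, a)]
    return x, residual
```

Each iteration costs one pass of correlations against the columns of A, which is the same kind of operation a beamformer performs on a grid of steering vectors.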
In this chapter, we extend the work in [34] and propose new detection algorithms
for airborne radar, which combine the strengths of random arrays with the ability of
sparsity-based algorithms to handle under-sampling effects. We propose two sparsity-
based CFAR detection algorithms, referred to as MP-CFAR and multibranch MP-CFAR
(MBMP-CFAR), respectively. MP-CFAR consists of a target localization stage followed
by a target detection stage. MBMP-CFAR generalizes MP-CFAR by maintaining mul-
tiple sets of candidate targets. In addition, we present an analysis of the performance of
the new sparsity-based radar. In the analysis, the covariance matrix of the noise is not
assumed to be known. The main results can be summarized as follows:
1. Show that the number of elements of a random array required to maintain a certain
level of peak sidelobes scales with the logarithm of the array aperture, in contrast
with a ULA, where the number of elements scales linearly with the array aperture.
2. Formulate the problem of sparse target detection given space–time observations
from random arrays. The observations are obtained in the presence of Gaussian
colored noise of unknown covariance matrix, but for which secondary data is
available for its estimation.
3. Develop a CFAR detector for detecting targets by random arrays in unknown
colored noise. The detector cancels previously detected targets from the obser-
vations to reduce interference between targets. The detector explicitly accounts
for the number of spatial resolution cells, the number of array elements, and the
number of training samples.
4. Develop the performance analysis for the new sparsity-based radar detector,
including expressions for the probability of false alarm and the probability of
detection.
5.2 STAP Radar Concepts
In this section, we introduce the STAP radar signal model and discuss properties of
random arrays in STAP radar. In particular, we discuss the average sidelobe and, more
importantly, the average peak sidelobe levels exhibited by a random array. We also
discuss the clutter rank of the random array in STAP.
5.2.1 Signal Model

Consider a radar system mounted on an aircraft, in which Na elements collect
returns of a transmitted signal consisting of an Np -pulse coherent waveform with
[Figure 5.1 diagram labels: Np pulses; aperture Z.]
Figure 5.1 STAP random array radar system model.
pulse-repetition-interval Tr . The radar operating carrier wavelength is λ, and the

airborne platform velocity is vp , where the velocity vector is assumed aligned with
the array axis. The Na receive sensor locations z1,z2,. . .,zNa are assumed to be chosen
randomly within an aperture of length Z, where the sensor locations and the aperture
length are expressed in units of the wavelength λ. For concreteness, it is assumed that
the positions of receive elements are drawn from a uniform distribution. An example of
an array is shown in Figure 5.1.
Let u = sin θ denote the spatial frequency associated with the azimuth angle θ measured
with respect to the normal to the array. The Na × 1 array response vector c(u) is
defined as
\[
\mathbf{c}(u) = \frac{1}{\sqrt{N_a}} \left[ e^{j2\pi z_1 u}, e^{j2\pi z_2 u}, \ldots, e^{j2\pi z_{N_a} u} \right]^T . \tag{5.3}
\]
Since applying the vector $\mathbf{c}^*(u)$ steers the array to spatial frequency u, c(u) is also
known as a steering vector. The pattern of a random array β(ω) is a stochastic process
defined as the response of an array steered to spatial frequency (u − ω) to a target at
spatial frequency u. The array pattern is given by
\[
\beta(\omega) = \left| \mathbf{c}^H(u - \omega)\, \mathbf{c}(u) \right|^2 . \tag{5.4}
\]
The mainbeam is defined as the array pattern in the region |ω| ≤ 1/Z, while the sidelobe
region is |ω| > 1/Z. The peak sidelobe is defined as $\mu = \max_{|\omega| > 1/Z} \beta(\omega)$.
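The definitions (5.3) and (5.4) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the peak sidelobe is approximated by a grid search over ω in [−1, 1], with the grid size an arbitrary choice.

```python
import cmath
import math
import random

def steering_vector(z, u):
    """Array response c(u) of (5.3); z holds sensor positions in wavelengths."""
    scale = 1.0 / math.sqrt(len(z))
    return [scale * cmath.exp(2j * math.pi * zn * u) for zn in z]

def array_pattern(z, omega, u=0.0):
    """Pattern beta(omega) = |c^H(u - omega) c(u)|^2 of (5.4)."""
    c_steer = steering_vector(z, u - omega)
    c_target = steering_vector(z, u)
    b = sum(cs.conjugate() * ct for cs, ct in zip(c_steer, c_target))
    return abs(b) ** 2

def peak_sidelobe(z, aperture, n_grid=2001):
    """Grid-search approximation of mu = max over |omega| > 1/Z of beta."""
    omegas = [-1.0 + 2.0 * k / (n_grid - 1) for k in range(n_grid)]
    return max(array_pattern(z, w) for w in omegas if abs(w) > 1.0 / aperture)

# Example: Na = 15 sensors drawn uniformly over an aperture of Z = 20 wavelengths.
rng = random.Random(1)
Z = 20
z = [rng.uniform(0.0, Z) for _ in range(15)]
mu = peak_sidelobe(z, Z)
```

Because the product in (5.4) depends on the sensor positions only through the phases $e^{j2\pi z_n \omega}$, the pattern does not depend on the target spatial frequency u, which is why `u` defaults to zero.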
The Doppler shift caused by a target moving at velocity vt relative to the normal
to the array is fd = 2vt /λ. The normalized Doppler frequency v is the Doppler shift
fd normalized to the sampling frequency 1/Tr , where Tr is the pulse repetition interval,
v = fd Tr. The Np × 1 temporal steering vector g(v) of a target with normalized Doppler
frequency v is given by
\[
\mathbf{g}(v) = \frac{1}{N_p} \left[ 1, e^{j2\pi v}, \ldots, e^{j2\pi (N_p - 1) v} \right]^T . \tag{5.5}
\]
For notational convenience, let N = Na Np; then the N × 1 space–time steering vector
of a target with spatial frequency u and Doppler v is given by
\[
\mathbf{a}(u,v) = \mathbf{g}(v) \otimes \mathbf{c}(u), \tag{5.6}
\]
where ⊗ represents the Kronecker product. The N × 1 baseband signal y received at the
array from a target with steering vector a and complex amplitude x is given by
\[
\mathbf{y} = \mathbf{a} x + \mathbf{e}, \tag{5.7}
\]
where $\mathbf{e} = \mathbf{e}_c + \mathbf{e}_w$ is the interference vector consisting of the ground clutter
contributions $\mathbf{e}_c$ and complex-valued white Gaussian noise $\mathbf{e}_w$. We treat ground clutter and
thermal noise as uncorrelated processes, and therefore the N × N interference and noise
covariance matrix is given by
\[
\mathbf{R} = E\left[ (\mathbf{e}_c + \mathbf{e}_w)(\mathbf{e}_c + \mathbf{e}_w)^H \right] = \mathbf{R}_c + \mathbf{R}_w . \tag{5.8}
\]
Here $\mathbf{R}_w$ is the covariance matrix of the thermal noise given by $\mathbf{R}_w = \sigma^2 \mathbf{I}$, where $\sigma^2$ is
the power of thermal noise. A typical model for the clutter covariance matrix $\mathbf{R}_c$ [35] is
\[
\mathbf{R}_c = \int_{-1}^{1} s(u)\, \mathbf{a}(u, \xi u)\, \mathbf{a}^H(u, \xi u)\, du, \tag{5.9}
\]
where $\xi = 4 v_p T_r / \lambda$ and s(u) is the power of a clutter patch at spatial frequency u and
normalized Doppler frequency ξu.
The signal model for K targets is given by
y = Ax + e. (5.10)
Here, A is the N × G measurement matrix whose columns are steering vectors associated
with a grid of possible target locations on the angle-Doppler map, and x is a G × 1
vector of complex target amplitudes. The vector x contains only K ≪ G nonzeros. In
later sections, we apply optimization algorithms that operate on a grid. To this end, we
discretize the angle-Doppler map into $G = \tilde{G}^2$ grid points, where $\tilde{G}$ is the number
of grid points in each of the two domains. The G grid points serve as resolution cells.
Typically the dimensionality of the signal space N is much smaller than the number of
resolution cells, N ≪ G. The G × 1 vector of target gains x is assumed to be sparse, in
the sense that it has K ≪ G nonzero entries.
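A minimal sketch of how the measurement matrix in (5.10) can be assembled from (5.3), (5.5), and (5.6) is given below. The function names, the column ordering (angle index running fastest), and the list-of-columns representation are assumptions made for illustration; the temporal scaling follows (5.5) as printed.

```python
import cmath
import math

def spatial_steering(z, u):
    """c(u) of (5.3); z holds sensor positions in wavelengths."""
    s = 1.0 / math.sqrt(len(z))
    return [s * cmath.exp(2j * math.pi * zn * u) for zn in z]

def temporal_steering(n_pulses, v):
    """g(v) of (5.5), with the 1/Np scaling as printed there."""
    return [cmath.exp(2j * math.pi * p * v) / n_pulses for p in range(n_pulses)]

def space_time_steering(z, n_pulses, u, v):
    """a(u, v) = g(v) kron c(u) of (5.6), a vector of length N = Na * Np."""
    c = spatial_steering(z, u)
    g = temporal_steering(n_pulses, v)
    return [gp * cn for gp in g for cn in c]

def measurement_matrix(z, n_pulses, u_grid, v_grid):
    """Columns of A: one space-time steering vector per (u, v) grid point."""
    return [space_time_steering(z, n_pulses, u, v)
            for v in v_grid for u in u_grid]

def synthesize(A_cols, x):
    """Noise-free y = A x for a sparse amplitude vector x."""
    n = len(A_cols[0])
    return [sum(col[i] * xi for col, xi in zip(A_cols, x) if xi != 0)
            for i in range(n)]
```

With these helpers, a scene with K targets corresponds to a vector `x` with K nonzero entries, and adding an interference vector to `synthesize(A_cols, x)` gives a realization of (5.10).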
In STAP, the covariance matrix R is typically unknown, but can be estimated from
secondary data. The secondary data is assumed to consist of independent identically
distributed vectors with a covariance matrix common with the cell under test. Let L be
the number of secondary data vectors and let q(l) denote the lth secondary data vector.
The maximum likelihood estimate (MLE) of the covariance matrix is the sample covariance matrix
\[
\hat{\mathbf{R}} = \frac{1}{L} \sum_{l=1}^{L} \mathbf{q}(l)\, \mathbf{q}(l)^H . \tag{5.11}
\]
In subsequent sections of this chapter, we will make use of the inverse of the sample
covariance matrix. In order to ensure that $\hat{\mathbf{R}}^{-1}$ exists, we make the assumption that
L > N.
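As a small illustration of (5.11), the sketch below forms the sample covariance matrix from a list of snapshots. The helper names are hypothetical, and the white-noise generator merely stands in for secondary data, which in practice would also contain clutter.

```python
import random

def sample_covariance(snapshots):
    """R_hat = (1/L) * sum_l q(l) q(l)^H for a list of L length-N snapshots."""
    n_snap = len(snapshots)
    dim = len(snapshots[0])
    R = [[0j] * dim for _ in range(dim)]
    for q in snapshots:
        for i in range(dim):
            for k in range(dim):
                R[i][k] += q[i] * q[k].conjugate()
    return [[val / n_snap for val in row] for row in R]

def complex_white_noise(dim, sigma2, rng):
    """One snapshot of circular complex Gaussian noise with power sigma2."""
    s = (sigma2 / 2.0) ** 0.5
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(dim)]
```

Since each term q(l)q(l)^H has rank one, the sum has rank at most L, so the estimate cannot be inverted when fewer than N snapshots are available.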
5.2.2 Properties of Random Arrays

In random arrays, antenna elements are placed at random between the end points of an
array. Since the goal is to obtain a thinned array, the average spacing between antenna
elements is larger than a half-wavelength. Thus, the term random arrays refers to arrays
that are thinned relative to a filled ULA.
Average sidelobe levels of a random array

Note that the beam pattern of a filled ULA with aperture Z and uniform illumination is
given by [36]
\[
\beta_{ULA}(\omega) = \left( \frac{\sin(\pi Z \omega)}{Z \sin(\pi \omega)} \right)^2 .
\]
From this expression, it is seen that the main beam is the region |ω| ≤ 1/Z, while the
sidelobes are |ω| > 1/Z. The number of sidelobes in the visible region |ω| < 1 is
2 (Z − 2) . Of interest are relatively large arrays, in which case the number of sidelobes
may be approximated by 2Z.
Given an array of Na elements placed at random over an aperture Z, it has been
shown that the shape of the mainbeam β (ω), |ω| ≤ 1/Z, follows that of a filled ULA
with little variation between instantiations of array elements. In Figure 5.2, we show an
example that demonstrates the width of the mainbeam of a random array compared to
the mainbeam of ULAs. In the figure, it is seen that the small ULA with 20 elements
(which corresponds to an array aperture of 10λ) has a wider mainbeam compared to the
random array and the large ULA both with aperture sizes of 15λ. The figure also shows
that the random array and the large ULA have the same mainbeam width. Thus with
significantly fewer elements, a random array provides the advantage of a narrow and
stable mainbeam of a filled array. While there is no impact on the mainbeam, random
arrays have higher sidelobes than filled arrays. By the central limit theorem, for a
sufficiently large number of elements Na and a fixed value ω,
\[
b(\omega) = \mathbf{c}^H(u - \omega)\, \mathbf{c}(u) = \frac{1}{N_a} \sum_{n=1}^{N_a} e^{j2\pi z_n \omega}
\]
is a complex-valued Gaussian random variable with mean
\[
\phi(\omega) = \frac{1}{N_a} \sum_{n=1}^{N_a} E\left[ e^{j2\pi z_n \omega} \right] = E\left[ e^{j2\pi z \omega} \right],
\]
[Figure 5.2 plot: beampattern (dB) versus spatial frequency over −0.25 to 0.25; legend: ULA – 10, ULA – 15, RA – 15.]
Figure 5.2 Beampattern of a small ULA with 20 elements (10λ array), a random array with
an array aperture of Z = 15λ with N = 20 elements, and a large ULA with 30 elements
(15λ array).
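and variance
\[
\mathrm{var}\left[ b(\omega) \right] = E\left[ |b(\omega)|^2 \right] - |\phi(\omega)|^2 .
\]
It is shown in [10] that in the sidelobe region
\[
E\left[ \left( \mathrm{Re}\, b(\omega) \right)^2 \right] \approx E\left[ \left( \mathrm{Im}\, b(\omega) \right)^2 \right] \approx \frac{1}{2 N_a} .
\]
Therefore the mean level of the beam pattern sidelobes is
\[
E\left[ |b(\omega)|^2 \right] = E\left[ \beta(\omega) \right] \approx \frac{1}{N_a} .
\]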
Thus, the sidelobes of a random array are dominated by the term 1/Na rather than the
sidelobes of the associated filled array.
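The 1/Na sidelobe level can be checked with a short Monte Carlo sketch, given here under the chapter's assumption of uniformly distributed sensor positions. The function name and the trial count are arbitrary choices. In the usage below, ω is chosen so that Zω is an integer, which makes the mean φ(ω) vanish for positions uniform on [0, Z].

```python
import cmath
import math
import random

def mean_sidelobe_level(n_elements, aperture, omega, n_trials, seed=0):
    """Average of |b(omega)|^2 over random draws of the sensor positions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Sensor positions (in wavelengths) drawn uniformly over the aperture.
        z = [rng.uniform(0.0, aperture) for _ in range(n_elements)]
        b = sum(cmath.exp(2j * math.pi * zn * omega) for zn in z) / n_elements
        total += abs(b) ** 2
    return total / n_trials

# Na = 15 elements over Z = 20 wavelengths, evaluated at omega = 0.3 (Z * omega = 6).
estimate = mean_sidelobe_level(15, 20.0, 0.3, 4000)
predicted = 1.0 / 15
```

The estimate should agree with the prediction 1/Na up to Monte Carlo error, regardless of how large the aperture is.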
Peak sidelobe level of a random array

Next, we are interested in the statistics of the peak sidelobe
\[
\mu = \max_{|\omega| > 1/Z} \beta(\omega) .
\]
Viewed as a function of ω, the array pattern β (ω) is a stochastic process. In the sidelobe
region, the stochastic process is approximately ergodic, meaning that statistical averages
may be gleaned from averages across the spatial frequency variable ω [10]. Further-
more, values of the stochastic process β (ω) can be approximated as independent when
the values of the spatial frequency ω are separated by a sidelobe or more [10].
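As previously discussed, the number of sidelobes is approximately 2Z, where Z
is the aperture size of the random array. To find the CDF of the peak sidelobe, let
$\tilde{\beta}(\omega) \triangleq 2 N_a \beta(\omega) = 2 N_a |b(\omega)|^2$ and $\tilde{\mu} \triangleq 2 N_a \mu$. Since $b(\omega) \sim \mathcal{CN}(0, 1/N_a)$, it
follows that $\tilde{\beta}(\omega)$ is a chi-square random variable with 2 degrees of freedom. Recall
that the sidelobes are approximated to be independent from each other [10]. It is easy to
verify that the cumulative distribution function (CDF) of $\tilde{\beta}(\omega)$ is given by
\[
F_{\tilde{\beta}}(t) = 1 - e^{-t/2} . \tag{5.12}
\]
It follows that the CDF of the peak sidelobe variable $\tilde{\mu}$ is
\[
F_{\tilde{\mu}}(t) = \Pr\left\{ \tilde{\beta}(\omega_1) \le t, \ldots, \tilde{\beta}(\omega_{2Z}) \le t \right\} = \left[ F_{\tilde{\beta}}(t) \right]^{2Z} . \tag{5.13}
\]
Using a known relation for nonnegative random variables,
\[
E\left[ \tilde{\mu} \right] = \int_0^{\infty} \left( 1 - F_{\tilde{\mu}}(t) \right) dt . \tag{5.14}
\]
Substituting (5.12) and (5.13) in (5.14),
\[
E\left[ \tilde{\mu} \right] = \int_0^{\infty} \left[ 1 - \left( 1 - e^{-t/2} \right)^{2Z} \right] dt . \tag{5.15}
\]
The integration in (5.15) is solved in [37], where it is shown
\[
\int_0^{\infty} \left[ 1 - \left( 1 - e^{-t/2} \right)^{2Z} \right] dt = 2 \sum_{k=1}^{2Z} \frac{1}{k} . \tag{5.16}
\]
For large Z, the sum $\sum_{k=1}^{2Z} \frac{1}{k}$ asymptotically approaches $\ln 2Z + \gamma_E$, where $\gamma_E \approx 0.577$
is Euler’s constant [37]. Since $\ln 2Z \gg \gamma_E$, we approximate $\sum_{k=1}^{2Z} \frac{1}{k} \approx \ln 2Z$.
Substituting this back into (5.15), $E[\tilde{\mu}] \approx 2 \ln 2Z$. Finally, recalling that the peak sidelobe μ
is related to the random variable $\tilde{\mu}$ as $\mu = \tilde{\mu} / (2 N_a)$, the mean peak sidelobe is given by
\[
E\left[ \mu \right] = \frac{\ln(2Z)}{N_a} . \tag{5.17}
\]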
It is observed that the mean peak sidelobe is larger than the mean sidelobe by the
factor ln(2Z). To illustrate (5.17), Figure 5.3 plots the beampattern of a random
array with $N_a = 15$ elements filling an aperture of size 20λ, together with the computed
average sidelobe level and the average peak sidelobe level. Also, to maintain a fixed
mean peak sidelobe level, the number of elements of the random array has to scale with
the logarithm of the aperture length. This is in contrast with a filled ULA, in which the
number of elements scales linearly with the aperture length Z.
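The relation between the exact sum in (5.16) and the logarithmic approximation (5.17) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `mean_peak_sidelobe` and its argument names are assumptions made for illustration, and the values Z = 20 and N_a = 15 are taken from the Figure 5.3 example.

```python
import math

def mean_peak_sidelobe(aperture_z, num_elements):
    """Evaluate E[mu] for aperture Z (in wavelengths) and N_a elements.

    'exact' uses the harmonic sum of (5.16) scaled by 1/(2*N_a);
    'approx' uses the logarithmic approximation (5.17);
    'mean_sidelobe' is 1/N_a, the mean of |b(w)|^2 for b ~ CN(0, 1/N_a).
    """
    n_sidelobes = round(2 * aperture_z)  # approximately 2Z sidelobes
    harmonic = sum(1.0 / k for k in range(1, n_sidelobes + 1))
    return {
        "exact": 2.0 * harmonic / (2.0 * num_elements),
        "approx": math.log(n_sidelobes) / num_elements,
        "mean_sidelobe": 1.0 / num_elements,
    }

levels = mean_peak_sidelobe(aperture_z=20, num_elements=15)
for name, value in levels.items():
    print(f"{name}: {value:.4f} ({10 * math.log10(value):.2f} dB)")
```

The gap between the `exact` and `approx` entries reflects the Euler constant term that (5.17) neglects, so the approximation is somewhat optimistic for moderate Z.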
Another point of view that demonstrates that the number of necessary elements
in a random array scales with ln Z rather than Z is to compute the number of
elements for which the peak sidelobe $\mu$ is lower than a level $\eta$ with probability
$\alpha$, $\alpha = \Pr\{\mu \le \eta\} = F_{\mu}(\eta)$. The CDF of the peak sidelobe
$F_{\tilde{\mu}}(t)$ can be computed from (5.13) and (5.12). Recalling the relation
$\mu = \tilde{\mu}/2N_a$, we have
$$\alpha = F_{\tilde{\mu}}(2N_a\eta) = \left(1 - e^{-N_a\eta}\right)^{2Z}. \qquad (5.18)$$
[Figure 5.3: beampattern (dB) versus spatial frequency; curves: Random array – 15, Average sidelobe level, Average peak sidelobe level.]
Figure 5.3 Beampattern of a random array using $N_a = 15$ elements filling a large aperture of
$Z = 20\lambda$. The computed average sidelobe level and average peak sidelobe level are also shown.
Taking ln of both sides of (5.18), and noting that the expected result is such that
$N_a\eta \gg 1$, we approximate the ln function with the first term in its Taylor expansion
$$\ln\left(1 - e^{-N_a\eta}\right) \approx -e^{-N_a\eta}. \qquad (5.19)$$
Note that this approximation holds because $\eta$ is reasonably higher than the mean
value. Using (5.19) and after a little algebra, we obtain
$$N_a = \frac{1}{\eta}\left[\ln 2Z - \ln\ln\alpha^{-1}\right],$$
which links the number of elements to the confidence level that the sidelobes do
not exceed a set value. A similar result without proof has been presented in [38].
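As a quick numerical check of the expression for $N_a$, the sketch below evaluates it and substitutes the result back into (5.18). The function names and the example values ($\eta = 0.5$, $\alpha = 0.9$) are assumptions chosen for illustration only.

```python
import math

def elements_for_peak_sidelobe(aperture_z, eta, alpha):
    """N_a = (ln 2Z - ln ln(1/alpha)) / eta, from (5.18) and (5.19)."""
    return (math.log(2 * aperture_z) - math.log(math.log(1.0 / alpha))) / eta

def confidence(aperture_z, eta, num_elements):
    """alpha = (1 - exp(-N_a * eta))^(2Z), the exact relation (5.18)."""
    return (1.0 - math.exp(-num_elements * eta)) ** (2 * aperture_z)

for z in (20, 40, 80):
    n_a = elements_for_peak_sidelobe(z, eta=0.5, alpha=0.9)
    print(f"Z = {z}: N_a = {n_a:.2f}, alpha from (5.18) = {confidence(z, 0.5, n_a):.4f}")
```

Each doubling of the aperture adds only ln(2)/η elements in this sketch, which is the logarithmic scaling described above.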
5.2.3 Clutter Response of Random Arrays

STAP relies on the fact that the rank of the clutter covariance matrix $R_c$ (often referred
to as clutter rank) is much lower than the dimensionality of the signal space. As a result,
whitening of the clutter interference does not result in significant loss of target
signal-to-noise ratio (SNR). In a filled ULA, the clutter map (defined as $a^H(u,v) R_c a(u,v)$,
with u and v sweeping through their domains |u| < 1, |v| < 1) forms a diagonal ridge
above the uv plane. The width of the ridge along the spatial frequency u axis equals
the beamwidth of the array. Because a random array spreads its elements over a larger
aperture, its beamwidth is narrower. Thus the clutter ridge of a random array is expected
to be narrower than the clutter ridge of a filled ULA with the same number of elements. This
is illustrated in Figure 5.4.
The panel on the left of Figure 5.4 shows the clutter map of a ULA with N = 10
elements, while the panel on the right shows the clutter map of a random array of the
same number of elements (10) spread over 15λ (rather than the 5λ ULA aperture). Both
clutter maps are generated with β = 1. It is noticed that the clutter ridge of the
random array is narrower, which leads to a lower MDV. Note that the clutter map of
the sparse array also exhibits multiple, spurious clutter ridges due to higher sidelobes of
the beampattern.
[Figure 5.4: two panels, each showing normalized Doppler versus spatial frequency, with both axes spanning –0.5 to 0.5.]
Figure 5.4 (Left figure): Clutter map using a ULA with N = 10 elements, P = 20 pulses, and
β = 1. (Right figure): Clutter map using a random array with N = 10 elements, P = 20 pulses,
and β = 1. The elements of the sparse random array are spread across an array of size 15λ.
The clutter rank of a filled ULA can be computed from Brennan's rule
$$r_c = \operatorname{rank}(R_c) = N_a + (N_p - 1)\xi, \qquad (5.20)$$
where recall that $\xi = 4v_pT_r/\lambda$. Now, given a random array with aperture size Z, let
$N_a^{\text{full}}$ represent the number of sensors in a filled ULA configuration. From [39], the
clutter rank of a random array is
$$r_c \approx N_a^{\text{full}} + (N_p - 1)\xi. \qquad (5.21)$$
It is noticed that the clutter rank of a random array depends on the aperture size Z,
since $N_a^{\text{full}} = 2Z$. This means the random array will require the same number of
degrees of freedom to suppress the clutter as a large ULA. However, the number of
degrees of freedom available to the random array is less than that of the large ULA.
Therefore, fewer degrees of freedom are left to supply gain for the target than for the
filled ULA.
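The degrees-of-freedom argument can be made concrete with a small calculation based on (5.20) and (5.21). This is a sketch under stated assumptions: the function names are hypothetical, the space-time dimensionality is taken to be $N_a N_p$, and the example values ($N_a = 10$, $N_p = 20$, $\xi = 1$, Z = 15) are borrowed from the Figure 5.4 setup, assuming $\xi = 1$ there.

```python
def brennan_rank(num_elements, num_pulses, xi):
    """Clutter rank of a filled ULA, (5.20): N_a + (N_p - 1) * xi."""
    return num_elements + (num_pulses - 1) * xi

def random_array_rank(aperture_z, num_pulses, xi):
    """Approximate clutter rank of a random array, (5.21), with N_a^full = 2Z."""
    n_full = 2 * aperture_z
    return n_full + (num_pulses - 1) * xi

num_elements, num_pulses, xi = 10, 20, 1
dims = num_elements * num_pulses  # assumed space-time dimensionality N_a * N_p

ula_rank = brennan_rank(num_elements, num_pulses, xi)
ra_rank = random_array_rank(aperture_z=15, num_pulses=num_pulses, xi=xi)

print(f"small ULA: rank {ula_rank}, remaining DoF {dims - ula_rank}")
print(f"random array over 15 wavelengths: rank {ra_rank}, remaining DoF {dims - ra_rank}")
```

With the same number of elements and pulses, the random array spends more of its degrees of freedom on clutter suppression, which matches the qualitative statement above.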
The MDV depends on the width of the clutter ridge and therefore on the aperture size
of the array [40]. To illustrate this, Figure 5.5 shows the
signal-to-interference-plus-noise ratio (SINR) vs. normalized Doppler at u = 0 for three
array configurations: a small ULA with 10 elements (Z = 5λ), a large ULA with 30 elements
(Z = 15λ), and a random array with 10 elements randomly positioned across an array of
size Z = 15λ.
The SNR is defined as $|x|^2/\sigma^2$ and the clutter-to-noise ratio (CNR) is defined as
$s(u)/\sigma^2$, where s(u) is the power of the clutter at spatial frequency u and Doppler ξu.
For simplicity, it is assumed that the clutter power is constant for all clutter patches and
is given by $\sigma_c^2$. Then the CNR is also given by $\sigma_c^2/\sigma^2$. The SINR is defined as
$\text{SINR}_i = a_i^H R^{-1} a_i$. From Figure 5.5, we see that both the large ULA and the random
array achieve a higher SINR at lower Doppler than the small ULA. For example, at
normalized Doppler = 0.02, the small ULA achieves an SINR of about 4 dB, whereas the
random array achieves about 9 dB. This disparity points to a lower MDV.
[Figure 5.5: SINR (dB) versus normalized Doppler from –0.1 to 0.1; curves: ULA – 5, ULA – 15, RA – 15.]
Figure 5.5 SINR vs. normalized Doppler for three array configurations: a small ULA with 10
elements (Z = 5λ), a large ULA with 30 elements (Z = 15λ), and a random array with 10
elements randomly positioned across an array of size Z = 15λ. The SNR is 10 dB, the CNR
is 30 dB, and the spatial frequency u is set to 0.
5.3 STAP Detection Problem
In this section, we introduce the detection problem and summarize a popular detection
algorithm, the adaptive beamformer (also known as the adaptive matched filter in [41]).
We also analyze the performance of the adaptive beamformer applied to a random array
to motivate the need for an alternative algorithm.
5.3.1 Review of Adaptive Beamformer

The goal of GMTI radar is to determine the number of targets present and their locations
in the angle-Doppler domain. One common approach to this problem is to divide the
angle-Doppler map into G resolution cells and perform G detection tests, one for each
of the G grid points. The number of targets is determined by counting the number of
cells that pass the detection test, and the locations of the targets are determined by the
cells that pass the test. The binary hypothesis test for any of the resolution cells based
on the model (5.7) is given by
$$H_0: x = 0$$
$$H_1: x \neq 0.$$
To recap, we are posing the problem of testing a STAP resolution cell for the presence
of a target of unknown amplitude observed in the presence of Gaussian colored noise
with unknown covariance matrix, when secondary data is available for estimating the
covariance matrix. This problem has been solved by Kelly in [42]. Kelly was able to
show that the probability of false alarm for the GLRT detector does not depend on
the interference covariance matrix [42]. In fact, it was shown that the probability of false
alarm depends on the dimensionality of the signal N, the number of training samples
used to estimate the interference covariance matrix R, and the threshold parameter γ.
Since the probability of false alarm does not depend on any unknown parameters (γ is
a design parameter chosen to achieve a desired false alarm probability), the detector is
described as a CFAR detector.
A simpler approach is suggested in [41], where the likelihood of the observation is
maximized only over the unknown amplitude (separately for each hypothesis). In this
approach, the covariance matrix is assumed known throughout the derivation of the test
statistic. Since the covariance matrix is in fact unknown, it is later replaced by the
sample covariance matrix of the secondary data in the final expression of the test
statistic. While this procedure is ad hoc, it is argued in [41] that the resulting test
statistic differs from the GLRT statistic in [42] only by a term that vanishes when the
set of secondary data is large.
The test for deciding $H_1$ for a resolution cell defined by the steering vector a is given
by [41]
$$T = \frac{|a^H \hat{R}^{-1} y|^2}{a^H \hat{R}^{-1} a} \ge \gamma. \qquad (5.22)$$
It is noted that the test statistic is essentially a beamformer $a^H$ applied to whitened
observations $\hat{R}^{-1} y$ and normalized by the product $a^H \hat{R}^{-1} a$; hence we refer to this
approach as adaptive beamforming (ABF).
An alternative form of the test statistic that is used for performance evaluation is
found by observing that, according to Robey et al. [41], the test (5.22) for deciding $H_1$
may be expressed as a ratio of two independent random variables
$$T = \frac{|\zeta|^2}{h\psi} \ge \gamma. \qquad (5.23)$$
It is shown in [41] that $\zeta$ is distributed $\mathcal{CN}(0,1)$ when a target is not present and
$\mathcal{CN}(h\rho, 1)$ when a target with steering vector $a_t$ is present. The effect of estimating
the covariance matrix (the fact that R and $\hat{R}$ are not the same) is captured in the loss
factor h discussed in (5.25). The target's SNR is given by
$$\rho = \frac{|x\, a^H R^{-1} a_t|^2}{a^H R^{-1} a}. \qquad (5.24)$$
The test statistic (5.23) tests for the presence of a target by applying a steering vector
a, but the actual steering vector of the target is $a_t$. The denominator of (5.23), $\psi$, is a
chi-squared random variable with 2(L + 1 − N) degrees of freedom. Since the factor
(L + 1 − N) appears in several expressions in the sequel, for notational brevity, let
$M \triangleq (L + 1 - N)$. The factor h, first proposed in [43], is a loss factor $0 \le h \le 1$
that captures the effect of estimating the covariance matrix from the secondary data. It
is shown in [44] that in the absence of a target, the probability density function (PDF)
of the loss factor is the beta PDF
$$p(h) = p_\beta(h; M+1, N-1) = \frac{(N+M-1)!}{M!\,(N-2)!}\, h^M (1-h)^{N-2}. \qquad (5.25)$$
When a target is present, it was shown in [45] that the PDF of the loss factor is given by
$$p(h) = e^{-C\sin^2(\theta)} \sum_{m=0}^{M+1} \binom{M+1}{m} \frac{(N+M-1)!}{(N+M-1+m)!}\, C^m\, p_\beta(h; M+1, N-1+m), \qquad (5.26)$$
where the term C is defined as
$$C = a_t^H R^{-1} a_t \left(1 - \frac{|a^H R^{-1} a_t|^2}{(a^H R^{-1} a)(a_t^H R^{-1} a_t)}\right). \qquad (5.27)$$
We are now interested in computing the probability of false alarm, which is given by
$\Pr\{T \ge \gamma\}$ under $H_0$. It is more convenient, however, to compute
the probability $\Pr\{T' \ge h\gamma\}$, where $T' = |\zeta|^2/\psi$. The ratio $T'$ is a ratio of two
independent chi-square random variables. Normalizing the numerator and the denominator by the
respective degrees of freedom and adjusting the threshold accordingly yields
$$T' = \frac{|\zeta|^2/2}{\psi/2M} \ge hM\gamma. \qquad (5.28)$$
To proceed, we temporarily fix the loss factor h and find the probability of false
alarm conditioned on h. In other words, we are interested in obtaining an expression
for $\Pr\{T' \ge hM\gamma \mid h\}$. When no target is present, $T'$ follows the F distribution with
parameters 2 and 2M, denoted $F(2, 2M)$. The probability of false alarm conditioned
on h is simply given by
$$\Pr\{T' \ge hM\gamma \mid h\} = 1 - \mathcal{F}_{(2,2M)}(hM\gamma \mid h), \qquad (5.29)$$
where $\mathcal{F}_{(\cdot,\cdot)}$ denotes the CDF of the F distribution $F(\cdot,\cdot)$. To obtain the probability of
false alarm we integrate (5.29) over the random variable h
$$P_{FA} = \int_0^1 \left[1 - \mathcal{F}_{(2,2M)}(hM\gamma \mid h)\right] p(h)\, dh, \qquad (5.30)$$
where p(h) is given by (5.25). Note that the probability of false alarm depends only on
the threshold γ, the dimensionality of the signal N, and the term M, which depends on the
number of secondary samples used to obtain the estimate of the covariance matrix $\hat{R}$.
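The integral (5.30) is straightforward to evaluate numerically. The sketch below is an illustration rather than the authors' implementation. It relies on one fact not stated in the text: for an F distribution with 2 and 2M degrees of freedom, the tail probability has the closed form $1 - \mathcal{F}_{(2,2M)}(x) = (1 + x/M)^{-M}$, so the integrand becomes $(1 + h\gamma)^{-M} p(h)$. The function names, the midpoint-rule integration, and the example values N = 8, L = 16 are assumptions.

```python
import math

def beta_pdf(h, a, b):
    """Beta PDF p_beta(h; a, b), evaluated in the log domain for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(h) + (b - 1) * math.log(1 - h))

def pfa_abf(gamma, n_dim, n_train, steps=20000):
    """Approximate (5.30) with the midpoint rule on (0, 1).

    n_dim is the signal dimensionality N (N >= 2), n_train is the number of
    secondary samples L, and M = L + 1 - N as defined in the text.
    """
    m = n_train + 1 - n_dim
    total = 0.0
    for i in range(steps):
        h = (i + 0.5) / steps                # midpoints avoid h = 0 and h = 1
        tail = (1.0 + h * gamma) ** (-m)     # 1 - F_(2,2M)(h*M*gamma), closed form
        total += tail * beta_pdf(h, m + 1, n_dim - 1)  # p(h) from (5.25)
    return total / steps

for gamma in (5.0, 10.0, 20.0):
    print(f"gamma = {gamma:5.1f}: P_FA = {pfa_abf(gamma, n_dim=8, n_train=16):.3e}")
```

Because the result depends only on γ, N, and M, the same routine can in principle be inverted (for example by bisection on γ) to set a threshold for a desired false alarm probability.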
We now turn our attention to the probability of detection. Again, for convenience
we seek to compute the probability $\Pr\{T' \ge h\gamma\}$ instead of $\Pr\{T \ge \gamma\}$. Similar to
the probability of false alarm, we begin by fixing the random variable h and find the
probability of detection conditioned on h. When a target is present, the test statistic
follows the noncentral F distribution with parameters 2 and 2M, and noncentrality
parameter $h\rho$, denoted $F(2, 2M, h\rho)$. The conditional probability of detection of a
target with SNR $\rho$ (5.24) is given by
$$\Pr\{T' \ge hM\gamma \mid h\} = 1 - \mathcal{F}_{(2,2M,h\rho)}(hM\gamma \mid h). \qquad (5.31)$$
To obtain the probability of detection we integrate (5.31) over the random variable h
$$P_D = \int_0^1 \left[1 - \mathcal{F}_{(2,2M,h\rho)}(hM\gamma \mid h)\right] p(h)\, dh, \qquad (5.32)$$
where p(h) is given by (5.26).
5.4 Compressive Sensing CFAR Detection
Detection by ABF is agnostic to the possible presence of multiple targets, which
increases the number of false alarms seen by the radar [13]. In contrast, the model (5.10)
accounts for multiple targets. As explained previously, the number of rows of A, N, is
much smaller than the number of columns G. The problem of recovering x given y and
A is then underdetermined, and hence does not have a unique solution. Instead, inspired
by compressive sensing techniques, we solve the following optimization problem
$$\min_x \|y - Ax\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le K, \qquad (5.33)$$
where $\|x\|_0$ denotes the number of nonzero elements of x. As discussed in Section
5.1, problems involving the zero norm are generally nonconvex, and their solution,
implemented by an exhaustive search among all combinations of nonzero indices of x,
requires exponential complexity [31]. Matching pursuit (MP) is an algorithm of practical
complexity whose solution approximates the solution to (5.33). However, MP is not
directly applicable to the radar problem for two reasons: (1) it does not take into account
the presence of clutter, and (2) in radar, the number of targets K is not known a priori.
Proposed solutions to address these problems, as well as various enhancements, are
presented in this section.
In radar, clutter contributions are typically much stronger than the unknown targets
and, if not suppressed, may severely interfere with target detection. A whitening
operation is applied to the observed data and to the measurement matrix A. Specifically, let
$z = \hat{R}^{-1/2} y$ and $B = \hat{R}^{-1/2} A$; then optimization (5.33) becomes
$$\min_x \|z - Bx\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le K. \qquad (5.34)$$
Unfortunately, to solve (5.34) one requires the knowledge of the number of targets K,
which of course is unknown a priori. To implement a CFAR radar that exploits target
sparsity, we propose a two-stage MP-CFAR detection algorithm. Candidate targets are
localized in the first phase; in the second phase, they are tested for detection. A detected
target is then canceled from the data. The cancellation of detected targets from the data
is intended to remove mutual interference between targets and thus address one of the
flaws of detection by ABF. A block diagram of the MP-CFAR algorithm is shown in
Figure 5.6.
[Figure 5.6: block diagram; recoverable labels: "Initialize", "If test passes: set k = k + 1", "If test fails", "Output".]
Figure 5.6 Block diagram of the MP-CFAR algorithm.
5.4.1 MP-CFAR
Stage 1: MP Localization
The first pass of the MP localization algorithm uses whitened data $z = \hat{R}^{-1/2} y$ and
whitened steering vectors $b_j = \hat{R}^{-1/2} a_j$, j = 1, . . ., G. The first candidate target is
localized by the index $m_1$ of the vector $b_j$ that has the largest data projection,
$$m_1 = \arg\max_j \frac{|b_j^H z|^2}{b_j^H b_j} \qquad (5.35)$$
for j = 1, . . ., G. The index $m_1$ localizes the target in the angle-Doppler domain. This
information is subsequently used by the detection stage, as described in relation with
Stage 2 later in this section.
Next, we describe the localization of the k-th candidate target, given that k − 1 targets
have already been localized and passed the detection test. The observed and whitened
data z is processed to cancel the contribution of targets detected previously. Let a matrix
B be formed with the columns $b_j$. Let $S_{k-1}$ be the set of indices of columns of B
associated with detected targets, and let $B_{S_{k-1}}$ be the matrix formed by the columns
indexed by $S_{k-1}$. The projection matrix orthogonal to the detected targets is given by
$$P^\perp_{B_{S_{k-1}}} = I - B_{S_{k-1}} \left(B_{S_{k-1}}^H B_{S_{k-1}}\right)^{-1} B_{S_{k-1}}^H.$$
Similarly, steering vectors orthogonal to the detected targets are formed as follows:
$w_j = P^\perp_{B_{S_{k-1}}} b_j$, for all $j \notin S_{k-1}$. The k-th target is localized according to
$$m_k = \arg\max_j \frac{|w_j^H z|^2}{w_j^H w_j}. \qquad (5.36)$$
This process continues until a candidate target fails the detection test.
Stage 2: Detection
We now derive a CFAR detector that is applied to candidate targets localized in Stage 1.
The first candidate target is detected according to (5.22), rewritten here for convenience:
$$T = \frac{|a_{m_1}^H \hat{R}^{-1} y|^2}{a_{m_1}^H \hat{R}^{-1} a_{m_1}} \ge \gamma, \qquad (5.37)$$
where $m_1$ is the index found in Stage 1. Note that the test (5.37) may also be expressed
in terms of the whitened steering vectors $b_{m_1} = \hat{R}^{-1/2} a_{m_1}$,
$$T = \frac{|b_{m_1}^H z|^2}{b_{m_1}^H b_{m_1}} \ge \gamma. \qquad (5.38)$$
Next we describe the detection of candidate target k, given that k − 1 targets have
already been localized and passed the detection test. The signal model is given by the
expression
$$z = b_{m_k} x_{m_k} + B_{S_{k-1}} x_{S_{k-1}} + n = B_{S_k} x_{S_k} + n, \qquad (5.39)$$
where $m_k$ is the index of the resolution cell of the k-th candidate target found in
Stage 1 (5.36), $S_k$ is formed by adding $m_k$ to the set $S_{k-1}$, $S_k = S_{k-1} \cup \{m_k\}$, the
matrix $B_{S_k} = [b_{m_k}, B_{S_{k-1}}]$ is the matrix formed by columns with indices in $S_k$,
$x_{S_k} = [x_{m_k}, x_{S_{k-1}}^T]^T$, and $n = \hat{R}^{-1/2} e$. This signal model leads to the following
detection test:
$$H_0: x_{m_k} = 0$$
$$H_1: x_{m_k} \neq 0.$$
Here, the following problem is posed: detect a target located at a specified whitened
steering vector $b_{m_k}$ and having unknown amplitude, observed in the presence of
interference and noise. The interference is of unknown gain $x_{S_{k-1}}$, but belongs to a known
subspace $B_{S_{k-1}}$. The noise is Gaussian colored noise for which the covariance matrix is
unknown, but secondary data is available for its estimation.
To develop the test statistic for the detection problem, we start by expressing the
likelihoods of the observations under the two hypotheses. As in the discussion leading
to (5.22), the detector is a generalized likelihood ratio detector only in the sense that the
likelihood under $H_1$ is maximized over the unknown target amplitude. To simplify the
detector, as in [41], it is assumed that the PDFs of the test statistic under each hypothesis
are based on the true covariance matrix. It is noted that the subsequent analysis relies
on the properties of the estimated covariance matrix. Thus, $z = \hat{R}^{-1/2} y$ is modeled as
having a covariance matrix equal to the identity matrix. It follows that under $H_0$, the
likelihood is
$$p(z \mid H_0) = \frac{1}{\pi^N} e^{-\left(z - B_{S_{k-1}} x_{S_{k-1}}\right)^H \left(z - B_{S_{k-1}} x_{S_{k-1}}\right)},$$
while under $H_1$ the likelihood is
$$p(z \mid H_1) = \frac{1}{\pi^N} e^{-\left(z - B_{S_k} x_{S_k}\right)^H \left(z - B_{S_k} x_{S_k}\right)}.$$
The GLRT for deciding $H_1$ is given by
$$T = \ln \frac{\max_{x_{S_k}} p(z \mid x_{S_k})}{\max_{x_{S_{k-1}}} p(z \mid x_{S_{k-1}})} \ge \gamma. \qquad (5.40)$$
To obtain a more convenient form of the test, we note that under hypothesis $H_0$, the
MLE of the gain vector $x_{S_{k-1}}$ is found from
$$\hat{x}_{S_{k-1}} = \arg\min_{x_{S_{k-1}}} \|z - B_{S_{k-1}} x_{S_{k-1}}\|_2^2. \qquad (5.41)$$
Minimizing (5.41) with respect to the vector of complex gains $x_{S_{k-1}}$ yields
$$\hat{x}_{S_{k-1}} = \left(B_{S_{k-1}}^H B_{S_{k-1}}\right)^{-1} B_{S_{k-1}}^H z. \qquad (5.42)$$
Similarly,
$$\hat{x}_{S_k} = \left(B_{S_k}^H B_{S_k}\right)^{-1} B_{S_k}^H z. \qquad (5.43)$$
Inserting (5.42) and (5.43) into (5.40),
$$T = \|z - B_{S_{k-1}} \hat{x}_{S_{k-1}}\|_2^2 - \|z - B_{S_k} \hat{x}_{S_k}\|_2^2 = z^H \left(P_{B_{S_k}} - P_{B_{S_{k-1}}}\right) z, \qquad (5.44)$$
where $P_B = B\left(B^H B\right)^{-1} B^H$ is a projection matrix that projects onto the subspace
spanned by B. Note that the decision statistic is a difference between two quadratic
forms, where the quadratic form $z^H P_{B_{S_{k-1}}} z$ is an interference term that is canceled.
The test statistic (5.44) may be further simplified, which will be useful to obtain
expressions for performance evaluation of the MP-CFAR detector. We make use of the
following result from [46,47]. Let D and E be two subspaces, and let $P_D$ and $P_{[D,E]}$
be projection matrices that project onto the subspaces spanned by the matrices D and
[D,E], respectively. Let $F = P_D^\perp E$; then the difference between projection matrices
$P_{[D,E]} - P_D$ is given by [47]
$$P_{[D,E]} - P_D = P_F. \qquad (5.45)$$
Now, identify $D = B_{S_{k-1}}$ and $E = b_{m_k}$ (whitened steering vector). Then
$F = P_D^\perp E = P^\perp_{B_{S_{k-1}}} b_{m_k}$ is a vector, and let $f_k \triangleq P^\perp_{B_{S_{k-1}}} b_{m_k}$.
Note that $f_1 = b_{m_1}$. Since by design $f_{k-1}$ is already orthogonal to all previous vectors
$f_1, \ldots, f_{k-2}$, we have the following recurrence relation
$$f_k = P^\perp_{f_{k-1}} b_{m_k}. \qquad (5.46)$$
From this expression, $f_k$ is the projection of the whitened steering vector $b_{m_k}$ orthogonal
to the previous k − 1 targets. We use this vector to remove the potential interference that
the other k − 1 targets contribute through the sidelobes. Applying (5.45), we obtain
$$P_{[B_{S_{k-1}}, b_{m_k}]} - P_{B_{S_{k-1}}} = P_{f_k}, \qquad (5.47)$$
where $P_{f_k} = f_k f_k^H / f_k^H f_k$. Noting that $B_{S_k} = [B_{S_{k-1}}, b_{m_k}]$ and substituting (5.47)
into (5.44), the test for deciding $H_1$ on the detection of the k-th target can be expressed as
$$T = \frac{|f_k^H z|^2}{f_k^H f_k} \ge \gamma. \qquad (5.48)$$
For k = 1, $f_1 = b_{m_1}$, and (5.48) reverts to (5.38), as it should.

Algorithm 1 CFAR-MP
1: Input: $y$, $A$, $\hat{R}$, $\gamma$.
2: Initialize: $S_0 = \emptyset$, $r = \hat{R}^{-1/2} y$, $B = \hat{R}^{-1/2} A$, $W = B$, $k = 1$.
3: Find: Search for the index l that maximizes the metric $\max_j |w_j^H r|^2 / w_j^H w_j$.
4: Update set of targets: $S_k = S_{k-1} \cup \{l\}$.
5: Check: If $T_{s_i} \ge \gamma$ (test statistic to decide if $x_{s_i}$ is nonzero) for all $s_i \in S_k$, continue.
Otherwise output $S_{k-1}$ as solution and terminate.
6: Generate: $P^\perp_{B_{S_k}} = I - B_{S_k} \left(B_{S_k}^H B_{S_k}\right)^{-1} B_{S_k}^H$.
7: Remove found targets: $W = P^\perp_{B_{S_k}} B$.
8: Renormalize: If $\|w_i\|_2 = 0$, set $w_i = 0$.
9: Set $k = k + 1$ and return to step 3.
The test statistic (5.48) is applied to every candidate target included in the set $S_k$. If
any of the k tests fails to exceed the threshold γ, the algorithm terminates and outputs
the set $S_{k-1}$, the set of k − 1 target locations. Otherwise, MP-CFAR increments the
number of targets k by one and reruns MP with the new value of k. The pseudocode for
the MP-CFAR algorithm is listed in Algorithm 1.
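The two-stage loop can be sketched in a few lines of code. The following is a simplified illustration, not the authors' implementation: it works directly on already whitened inputs z and $b_j$, it tests only the newest candidate with (5.48) rather than re-testing every member of $S_k$ as in step 5 of Algorithm 1, and it replaces the explicit projection matrix with Gram-Schmidt orthogonalization. The function names, the tolerance `tol`, and the toy Fourier-type dictionary are assumptions.

```python
import cmath
import math

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def project_out(v, basis):
    """Return v minus its components along an orthonormal basis."""
    for q in basis:
        c = inner(q, v)
        v = [x - c * y for x, y in zip(v, q)]
    return v

def mp_cfar(z, B, gamma, tol=1e-12):
    """Greedy localization (5.36) followed by the threshold test (5.48).

    z is the whitened data and B a list of whitened steering vectors b_j.
    Returns the indices of the cells that passed the test, in detection order.
    """
    detected, basis = [], []
    while len(detected) < len(z):
        best = None  # (statistic, cell index, projected steering vector)
        for j, b in enumerate(B):
            if j in detected:
                continue
            w = project_out(b, basis)        # w_j = P-perp b_j
            energy = inner(w, w).real
            if energy <= tol:                # skip vectors inside the detected subspace
                continue
            stat = abs(inner(w, z)) ** 2 / energy
            if best is None or stat > best[0]:
                best = (stat, j, w)
        if best is None or best[0] < gamma:  # candidate fails the detection test
            break
        _, j, w = best
        detected.append(j)
        norm = math.sqrt(inner(w, w).real)
        basis.append([x / norm for x in w])  # extend the orthonormal basis
    return detected

# Toy example: 16 unit-norm Fourier-type steering vectors of length 8 and a
# noise-free observation containing two targets (amplitudes 5 and 3).
n_dim, n_cells = 8, 16
B = [[cmath.exp(2j * math.pi * j * n / n_cells) / math.sqrt(n_dim)
      for n in range(n_dim)] for j in range(n_cells)]
z = [5 * p + 3 * q for p, q in zip(B[3], B[11])]
print(mp_cfar(z, B, gamma=4.0))
```

In this toy case the maximized metric of (5.36) for the newest candidate coincides with the statistic (5.48), which is why a single comparison against γ suffices inside the loop.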
5.4.2 Performance of the MP-CFAR Detector

In this section, we develop analytical expressions for the probability of false alarm
and probability of detection of the MP-CFAR detector for some simple cases. We will
consider the case when no target is present and when a single target is present in the
field of view. To obtain an expression for the probability of false alarm when no target
is present in the field of view, we manipulate the test statistic (5.48) to express it in
the form (5.28). By assumption, no target has been detected yet, hence the test is for
target index k = 1. For k = 1, and based on notation developed previously, the vector
$f_1 = b_{m_1} = \hat{R}^{-1/2} a_{m_1}$ and $z = \hat{R}^{-1/2} y$. Now recall that $m_1$ is the index obtained
from (5.35). It follows that (5.48) may be written
$$T = \max_j \frac{|a_j^H \hat{R}^{-1} y|^2}{a_j^H \hat{R}^{-1} a_j}. \qquad (5.49)$$
Other than the max operator, the test statistic in (5.49) is of the form (5.22), hence it can
be reduced to the form (5.28),
$$T = \max_j T_j, \qquad (5.50)$$
where
$$T_j = \frac{|\zeta_j|^2/2}{h\psi_j/2M}. \qquad (5.51)$$
The probability of false alarm is given by
$$P_{FA} = 1 - \Pr\left\{\max_j T_j \le hM\gamma\right\}.$$
Using the assumption that the random variables $T_j$ (5.51) are independent and
identically distributed,
$$P_{FA} = 1 - \left(\Pr\{T_j \le hM\gamma\}\right)^G.$$
As discussed in relation with (5.29), $T_j$ follows an F distribution with CDF $\mathcal{F}_{(2,2M)}$,
from which the expression for the probability of false alarm is approximately
$$P_{FA} = 1 - \left[\int_0^1 \mathcal{F}_{(2,2M)}(hM\gamma \mid h)\, p(h)\, dh\right]^G. \qquad (5.52)$$
The probability of detection of the first target is given by the same expression as for
the ABF (5.32). Note that the target is assumed to be recovered in
the first pass of the MP-CFAR detector. If this is the case, the probability of detection
follows that of (5.32). Otherwise, the probability of detection will decrease due to the
orthogonal projection. When a target is present, the probability of false alarm includes
the event that a target is detected at the incorrect resolution cell. That would increase the
$P_{FA}$ in (5.52). An algorithm that mitigates this type of false alarm is presented next.
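Since p(h) integrates to one, the bracketed integral in (5.52) equals one minus the single-cell false alarm probability of (5.30), so (5.52) can be read as $P_{FA} = 1 - (1 - P_{FA}^{\text{cell}})^G$. The sketch below evaluates this relation and its inverse. The function names are assumptions, and the independence of the G statistics is the approximation already made in the text.

```python
def overall_pfa(pfa_cell, num_cells):
    """P_FA over G cells, treating the per-cell statistics as independent."""
    return 1.0 - (1.0 - pfa_cell) ** num_cells

def cell_pfa_for_target(pfa_total, num_cells):
    """Per-cell false alarm probability needed to reach an overall P_FA."""
    return 1.0 - (1.0 - pfa_total) ** (1.0 / num_cells)

num_cells = 625  # G = (2Z + 1)^2 for Z = 12, as used in Section 5.5
print(f"overall P_FA for a per-cell P_FA of 1e-6: {overall_pfa(1e-6, num_cells):.3e}")
print(f"per-cell P_FA for an overall P_FA of 1e-3: {cell_pfa_for_target(1e-3, num_cells):.3e}")
```

For small per-cell probabilities the overall value is close to G times the per-cell value, which indicates how much the threshold γ must be raised relative to single-cell ABF to keep the same overall false alarm rate.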
5.4.3 MBMP-CFAR
The MP-CFAR algorithm localizes the first target according to (5.35), namely, it finds
the column of the whitened measurement matrix with the largest projection on the
whitened data z. A false alarm (localizing the target in the wrong resolution cell)
increases the chance of further false alarms downstream, since according to (5.36),
localizing subsequent candidate targets depends on the location of the first target (5.35).
A more robust approach is to hedge bets by finding multiple candidates for the loca-
tion of the first target. Each such candidate target serves as seed to the localization and
detection of subsequent targets. When the process is completed, a metric is used to select
the set of targets that provides the best fit to the data. This algorithm, which generalizes
MP-CFAR, is referred to as MBMP-CFAR.
We introduce some notation that facilitates the presentation of MBMP-CFAR.
A localization solution is referred to as a branch. The set $D = \{d_1, d_2, \ldots, d_k\}$ contains
the number of branches per target. A path is a sequence of branches specified by
their index numbers, for example, the path $(i_1, i_2, \ldots, i_k)$, $1 \le i_1 \le d_1, \ldots, 1 \le i_k \le d_k$.
A localization solution $1 \le m_k^{(i_1,i_2,\ldots,i_k)} \le G$, where G is the number of
resolution cells (see [5.10]), consists of a path and the index number of the resolution
cell. The set $S_k^{(i_1,\ldots,i_k)} = \{m_1^{(i_1)}, m_2^{(i_1,i_2)}, \ldots, m_k^{(i_1,\ldots,i_k)}\}$ contains the localization
solution associated with path $(i_1, i_2, \ldots, i_k)$. For k candidate targets, MBMP maintains
$d_1 \times d_2 \times \cdots \times d_k$ such sets. The matrix $B_{S_k}$ was defined to consist of the whitened
steering vectors $b_j$ indexed by $S_k$. Similarly, we define the matrix $B_{S_k^{(i_1,\ldots,i_k)}}$ to consist
of whitened steering vectors indexed by the set $S_k^{(i_1,\ldots,i_k)}$. The vector $f_k^{(i_1,\ldots,i_k)}$ is defined
analogously to (5.46):
$$f_k^{(i_1,\ldots,i_k)} = P^\perp_{B_{S_k^{(i_1,\ldots,i_k)} \setminus m_k}} b_{m_k}. \qquad (5.53)$$
The inputs to MBMP-CFAR are the whitened measurement vector $z = \hat{R}^{-1/2} y$,
whitened steering vectors $b_j = \hat{R}^{-1/2} a_j$, j = 1, . . ., G, and a set of positive integers
$D = \{d_1, d_2, \ldots, d_G\}$. Similar to MP-CFAR, the MBMP-CFAR algorithm proceeds in
two stages.
MBMP generates a tree in which each branch is associated with a candidate target.
At a given level in the tree, the first branch represents the strongest target (according
to [5.54]), the second branch represents the second-strongest target, and so on. The
number of branches at a given level is user-selected. The children of a given branch are
generated by projecting the received data away from the ancestor candidate target(s)
(according to [5.57]), and again ranking a user selected number of candidate targets
according to their strength. All children from a given target have been orthogonally
projected to remove the effect of the ancestor.
Stage 1: MBMP Localization

To localize the candidates for the first target, the algorithm finds the $d_1$ indices
$m_1^{(1)}, m_1^{(2)}, \ldots, m_1^{(d_1)}$ that produce the $d_1$ largest projections of steering vectors $b_j$ on the
data z. Specifically, the resolution cell index that localizes the first branch of the first
target is found from
$$m_1^{(1)} = \arg\max_j \frac{|b_j^H z|^2}{b_j^H b_j}. \qquad (5.54)$$
The ith branch of the first target, $1 \le i \le d_1$, is found from
$$m_1^{(i)} = \arg\max_{j \notin \{m_1, \ldots, m_{i-1}\}} \frac{|b_j^H z|^2}{b_j^H b_j}. \qquad (5.55)$$
To generate the $d_1 d_2$ branches associated with the second target, define the modified
steering vectors $w_j^{(i)} = P^\perp_{b_{m_1^{(i)}}} b_j$, for $1 \le i \le d_1$. The orthogonal projection prevents
interference from a target at $m_1^{(i)}$. The resolution cell index associated with the first
branch of the second target is given by
$$m_2^{(1,1)} = \arg\max_j \frac{|w_j^{(1)H} z|^2}{w_j^{(1)H} w_j^{(1)}}, \qquad (5.56)$$
whereas the index of branch $i_2$, $1 \le i_2 \le d_2$, of the second target, given the path $(i_1, i_2)$, is
$$m_2^{(i_1,i_2)} = \arg\max_{j \notin \{m_1, \ldots, m_{i_2-1}\}} \frac{|w_j^{(i_1)H} z|^2}{w_j^{(i_1)H} w_j^{(i_1)}}. \qquad (5.57)$$
Generalizing to k targets and the path $(i_1, i_2, \ldots, i_k)$, define the vector
$w_j^{(i_1,\ldots,i_k)} = P^\perp_{B_{S_k^{(i_1,\ldots,i_k)}}} b_j$. The index associated with the k-th target is given by
$$m_k^{(i_1,\ldots,i_k)} = \arg\max_{j \notin \{m_1, \ldots, m_{i_k-1}\}} \frac{|w_j^{(i_1,\ldots,i_{k-1})H} z|^2}{w_j^{(i_1,\ldots,i_{k-1})H} w_j^{(i_1,\ldots,i_{k-1})}}. \qquad (5.58)$$
[Figure 5.7: tree rooted at $S_0$ with children $S_1^{(1)} = \{m_1^{(1)}\}$ and $S_1^{(2)} = \{m_1^{(2)}\}$; the children of $S_1^{(1)}$ are $S_2^{(1,1)} = \{m_1^{(1)}, m_2^{(1,1)}\}$ and $S_2^{(1,2)} = \{m_1^{(1)}, m_2^{(1,2)}\}$; the children of $S_1^{(2)}$ are $S_2^{(2,1)} = \{m_1^{(2)}, m_2^{(2,1)}\}$ and $S_2^{(2,2)} = \{m_1^{(2)}, m_2^{(2,2)}\}$.]
Figure 5.7 Graph of MBMP algorithm for a branch vector $d = [2, 2]^T$.
An example of MBMP localization with D = {2, 2, 1, . . .} is illustrated in Figure 5.7.
From the figure it is seen that MBMP localization starts with the empty set corresponding
to no targets detected. The algorithm then searches for the $d_1 = 2$ steering vectors that
generate the largest projections on the data. The algorithm then performs a detection
test on the $d_1$ resolution cells (the detection test is detailed next in Stage 2: Detection).
If the detection test passes, MBMP-CFAR searches for $d_2$ resolution cells using (5.57)
with $w_j^{(1)} = P^\perp_{b_{m_1^{(1)}}} b_j$. The $d_2$ resolution cells form two paths that stem from $m_1^{(1)}$ (see
Figure 5.7). This is repeated using $m_1^{(2)}$ to create $d_1 d_2$ paths.
The process continues until the detection test fails, as explained in relation with
Stage 2.
Stage 2: Detection
The MBMP localization processing yielded $d_1$ candidate locations for the first target.
The largest score relative to the objective function $|b_j^H z|^2 / b_j^H b_j$ is obtained by the
steering vector index $m_1^{(1)}$, because
$$\max_j \frac{|b_j^H z|^2}{b_j^H b_j} \ge \max_{j \notin \{m_1, \ldots, m_{i-1}\}} \frac{|b_j^H z|^2}{b_j^H b_j}$$
(see [5.54] and [5.55]). Note that the choice of $m_1^{(1)}$ also minimizes the residual of
the objective function $\|z - Bx\|_2^2$ (see [5.34]), $m_1^{(1)} = \arg\min_j \|P^\perp_{b_j} z\|_2^2$. The test to
determine whether a target is present in the resolution cell $m_1^{(1)}$ is given by (5.48)
$$T = \frac{|b_{m_1^{(1)}}^H z|^2}{b_{m_1^{(1)}}^H b_{m_1^{(1)}}} \ge \gamma. \qquad (5.59)$$
If the test (5.59) is met, $d_1$ target sets are updated as follows: $S_1^{(i_1)} = \{m_1^{(i_1)}\}$,
$1 \le i_1 \le d_1$. If the test fails, then MBMP-CFAR declares that no targets
exist, and the algorithm terminates.
To test for the detection of the k-th target, assume that k − 1 targets have been detected.
The residual along the path $(i_1, i_2, \ldots, i_k)$ is computed from
$$\mathcal{R}^{(i_1,\ldots,i_k)} = \left\| P^\perp_{B_{S_k^{(i_1,\ldots,i_k)}}} z \right\|_2^2. \qquad (5.60)$$
The path that yields the lowest residual is given by
$$(i_1, i_2, \ldots, i_k) = \arg\min_{(j_1,\ldots,j_k)} \mathcal{R}^{(j_1,\ldots,j_k)}. \qquad (5.61)$$
The test to determine whether a target is present in the resolution cell $m_k^{(i_1,\ldots,i_k)}$ is given
by (see definition of vectors f in [5.53])
$$T = \frac{|f_k^{(i_1,\ldots,i_k)H} z|^2}{f_k^{(i_1,\ldots,i_k)H} f_k^{(i_1,\ldots,i_k)}} \ge \gamma. \qquad (5.62)$$
All k resolution cells $m_1^{(i_1)}, m_2^{(i_1,i_2)}, \ldots, m_k^{(i_1,\ldots,i_k)}$ are tested. In other words, the
test (5.62) is performed for the vectors $f_1^{(i_1)}, \ldots, f_k^{(i_1,\ldots,i_k)}$.
If all k tests (5.62) are met, $d_1 \times \cdots \times d_k$ target sets are updated as follows:
$S_k^{(i_1,\ldots,i_k)} = \{m_1^{(i_1)}, m_2^{(i_1,i_2)}, \ldots, m_k^{(i_1,\ldots,i_k)}\}$, $1 \le i_1 \le d_1, \ldots, 1 \le i_k \le d_k$. The MBMP-CFAR
algorithm proceeds to the localization and detection of the (k + 1)-th target. If the detection
test fails, MBMP-CFAR outputs as solution the path
$$(i_1, i_2, \ldots, i_{k-1}) = \arg\min_{(j_1,\ldots,j_{k-1})} \mathcal{R}^{(j_1,\ldots,j_{k-1})}.$$
To illustrate MBMP-CFAR, we return to the example in the previous subsection.
MBMP-CFAR first searches for $d_1 = 2$ steering vectors as described in the previous
subsection. It then performs the detection test (5.59). If the detection test fails, the
algorithm declares that no target exists and terminates. If the detection test passes, the
algorithm forms two paths that stem from the empty set (see Figure 5.7). MBMP-CFAR then
searches for $d_1 d_2$ paths as described in the previous subsection and then searches for the path
that minimizes the residual of the objective function using (5.61). Note that the path that
minimizes the residual of the objective function using two targets does not necessarily
stem from the branch that started with the largest statistic using a single target; this
allows the algorithm to move away from that branch. It then tests the resolution cells obtained
from (5.61) using (5.62). The algorithm terminates if either of the two resolution cells
fails the detection test; otherwise the process continues until a detection test fails.
Intuitively, the MBMP-CFAR algorithm generalizes the MP-CFAR by allowing the
consideration of resolution cells that do not maximize the metric in (5.36). Note that
the first iteration of MBMP-CFAR produces the same localization solution as the
MP-CFAR and also performs the same detection test. Hence when no targets exist, the
performance of both MBMP-CFAR and MP-CFAR is the same. Also note that the path that
corresponds to the set of resolution cells $(m_1^{(1)}, \ldots, m_k^{(1,\ldots,1)})$ is the MP-CFAR solution.
However, MBMP-CFAR does not always test the set $(m_1^{(1)}, \ldots, m_k^{(1,\ldots,1)})$, because it may
not minimize the residual (5.60). For example, using Figure 5.7, in the first iteration
MBMP-CFAR will always test $m_1^{(1)}$; however, in the next iteration MBMP-CFAR will
test either the set of resolution cells $S_2^{(1,1)}$ or the set of resolution cells $S_2^{(2,1)}$. Note
that the algorithm will never test the paths $S_2^{(1,2)}$ or $S_2^{(2,2)}$ because they present higher
objective functions than $S_2^{(1,1)}$ and $S_2^{(2,1)}$, respectively.
5.5 Numerical Results
In this section, we present numerical results on the MP-CFAR and MBMP-CFAR algorithms
and compare them with ABF. Unless stated otherwise, in the figures presented in
this section the aperture of the random arrays is $12\lambda$ ($Z = 12$, where $Z$ is expressed
in units of wavelength). The number of elements in the random array is $N_a = 16$;
thus the mean spacing between elements of the random array is $12\lambda/16$. The number
of coherent pulses used by all arrays is $N_p = 25$. The SNR, defined as $|x|^2/\sigma^2$, is
set to SNR = 15.5 dB unless stated otherwise. The CNR is set to 30 dB. With these
parameters, the SINR of the random array, defined as $\mathrm{SINR}_i = \mathbf{a}_i^H \mathbf{R}^{-1} \mathbf{a}_i$, was
observed to be roughly 15 dB. The number of training samples used to estimate the covariance
matrix for the random array is $L = 2N$. Reduced-rank methods may be applied to
reduce the size of the training set [48], but this is not the emphasis of this work. The
number of resolution cells on the angle-Doppler map is given by $G = (2Z + 1)^2 = 625$.
A random realization of a random array is generated and remains fixed throughout the
Monte Carlo simulations for all figures unless otherwise stated. Let $S_t$ be the true set of
resolution cells that contain targets, and let $\hat{S}$ be the set of resolution cells found by a
detector to have targets. A false alarm event occurs if $\hat{S} \setminus S_t \neq \emptyset$, and a detection event
occurs if $\hat{S} \cap S_t \neq \emptyset$. The threshold $\gamma$ was computed by selecting a desired false alarm
probability and then using the equations in [49] to find the corresponding value of $\gamma$.
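The two event definitions map directly onto set operations. The following is a minimal sketch of how empirical false alarm and detection rates could be tallied over Monte Carlo trials; the function names and the toy trial data are hypothetical and are not taken from the chapter's simulations.

```python
def false_alarm(detected, truth):
    """True if some declared cell holds no target (S_hat minus S_t is non-empty)."""
    return len(set(detected) - set(truth)) > 0

def detection(detected, truth):
    """True if some declared cell holds a target (S_hat and S_t intersect)."""
    return len(set(detected) & set(truth)) > 0

def empirical_rates(trials):
    """trials: iterable of (detected, truth) pairs. Returns (P_F, P_D) estimates."""
    trials = list(trials)
    n_fa = sum(false_alarm(d, t) for d, t in trials)
    n_det = sum(detection(d, t) for d, t in trials)
    return n_fa / len(trials), n_det / len(trials)

# Toy trials (assumed data); cells are (angle index, Doppler index) pairs.
truth = {(5, 0)}
trials = [
    ({(5, 0)}, truth),          # detection, no false alarm
    ({(5, 0), (3, 1)}, truth),  # detection and false alarm
    (set(), truth),             # miss
    ({(2, 2)}, truth),          # false alarm only
]
p_f, p_d = empirical_rates(trials)
```

Note that a single trial can count as both a detection and a false alarm under these definitions, since the two events are tested independently.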
The probabilities of false alarm of the MP-CFAR and ABF detectors are studied in
Figure 5.8, which plots the empirical probability of false alarm against the SINR of
a target present with the angle-Doppler pair $(5/Z, 0)$. The detection threshold for the
ABF detector is set using (5.30), such that $P_{FA} = 10^{-3}$. Applying (5.52), the detection
threshold for the MP-CFAR detector is also set for $P_{FA} = 10^{-3}$. In this figure, the arrays
compared are a $12\lambda$ ULA and a $12\lambda$ random array. The random array has $N_a = 16$
sensors; the resolution cells for this experiment were spaced apart by 1/12λ. For each
curve (excluding the line $P_{FA} = 10^{-3}$), the results of $10^4$ Monte Carlo experiments
were averaged, and the ABF tested every resolution cell on the
angle-Doppler map. The probability of false alarm of a true CFAR detector should not
change as a function of the SNR of a target present somewhere in the search area. It is
observed from the figure that the $12\lambda$ ULA ABF and the random array MP-CFAR
detectors have probabilities of false alarm that change little as a function of the
SNR of a target. More specifically, at low SNR the MP-CFAR experiences a probability
of false alarm of about $2 \times 10^{-3}$ instead of $P_{FA} = 10^{-3}$. This slight increase in the
probability of false alarm occurs because at low SNR, the probability of correct recovery
(the probability that MP-CFAR recovers the correct resolution cell to test) is less than
one. As the SNR of the target increases, the probability of correct recovery increases,
and the false alarm probability of MP-CFAR decreases to $P_F = 10^{-3}$ as intended. The
ABF using a $12\lambda$ ULA experiences a slight increase in the probability
of false alarm as the SNR of the interfering target increases. This is because, although the
peak sidelobe of a ULA is relatively small (roughly −13 dB), it is not zero and therefore
will ultimately affect the probability of false alarm. In contrast, a random array using
ABF cannot cope with energy leaked by high sidelobes, and as the strength of the target
increases, the probability of false alarm increases.
Figure 5.8 Probability of false alarm vs. SINR of a target for the ABF with a random array,
MP-CFAR with a random array, and the ABF with a large ULA.
Figure 5.9 shows the receiver operating characteristic (ROC) curves of the
ABF using a $12\lambda$ ULA, the ABF using a random array, and MP-CFAR using a random
array, with a single target in the field of view. The target again has the angle-Doppler pair
$(5/Z, 0)$. From the figure, the large ULA using ABF performs well, as expected. Since
the ULA does not exhibit large sidelobes, the target does not significantly increase
the probability of false alarm. In contrast, the ABF with the random array
performs considerably worse. The random array has large sidelobes, and since the ABF
does not account for the large sidelobes, the radar experiences a high false alarm rate.
The MP-CFAR with a random array, on the other hand, performs similarly to the ABF
with a large ULA. The MP-CFAR, unlike the ABF, accounts for detected targets and
removes them before detecting more targets. Note that the MP-CFAR performs
similarly to the ABF with the large ULA using about three-quarters of the number of elements
of the large ULA. This demonstrates the savings without loss of performance
that are gained by random arrays and the proposed MP-CFAR detector.
Figure 5.9 ROC curve for a single target for the ABF with a random array, MP-CFAR with a
random array, and the ABF with a large ULA. Parameters SNR = 15.5 dB and CNR = 30 dB.
Figure 5.10 Probability of false alarm vs. the sparsity ratio of an array for the ABF with a random
array, the MP-CFAR with a random array, and the MBMP-CFAR with a random array with
branch vectors D = [3,1,1,. . .]. Parameters: SNR = 15.5 dB and CNR = 30 dB.
Figure 5.10 shows the probability of false alarm vs. the sparsity ratio of a
random array. The sparsity ratio is defined as sparsity ratio $= N_a / N_a^{\mathrm{full}}$, where $N_a^{\mathrm{full}}$
is the number of elements required to fill a ULA for an array of the same size. There are
two targets in the field of view, with the angle-Doppler pairs $(5/Z, 0)$ and $(-5/Z, 0)$. The
threshold parameter $\gamma$ is set so that the desired probability of false alarm is $P_F = 10^{-3}$
for all methods. As expected, the false alarm rate decreases as the sparsity ratio increases
for all three curves, since using more elements in a random array decreases the average
sidelobe and average peak sidelobe levels. Although the false alarm rate for the ABF
decreases as the sparsity ratio increases, it is still much larger than the intended
$10^{-3}$. In contrast, MP-CFAR experiences significantly fewer false alarms at
all sparsity ratios. In addition, at sparsity ratio 0.9, the MP-CFAR detector experiences
a false alarm rate of about $10^{-3}$, as intended. MBMP-CFAR further decreases the false
alarms at low sparsity ratios and, as expected, also experiences a false alarm rate of about
$10^{-3}$, as intended. MBMP-CFAR experiences lower false alarm rates at lower sparsity
ratios, which can be attributed to the additional resolution cells that the algorithm tests.
Figure 5.11 Probability of false alarm vs. the number of targets in the field of view for the ABF
with a random array, the MP-CFAR with a random array, and the MBMP-CFAR with a random
array with branch vectors D = [3,1,1,. . .]. Parameters: SNR = 15.5 dB and CNR = 30 dB.
Figure 5.11 shows the probability of false alarm vs. the number of targets for
the ABF, MP-CFAR, and the MBMP-CFAR with branch vector D = [6,1,1,. . .] using
a random array. The targets have the following angle-Doppler pairs: $s_1 = (5/Z, 0)$,
$s_2 = (5/Z, 1/Z)$, $s_3 = (-4/Z, 2/Z)$, $s_4 = (-3/Z, 2/Z)$, $s_5 = (-6/Z, -1/Z)$,
$s_6 = (6/Z, -1/Z)$. For $K = 1$, the target with angle-Doppler pair $s_1$ is placed on the
map; for $K = 2$, the targets $s_1, s_2$ are on the map, etc. Again, the threshold parameter
$\gamma$ is set so that the desired probability of false alarm is $P_F = 10^{-3}$ for all methods.
From the figure it is seen that the ABF performs significantly worse than the MP-CFAR
and the MBMP-CFAR as the number of targets increases. The false alarm rate for the
MP-CFAR, on the other hand, increases slightly from $P_F = 10^{-3}$ for $K = 0$ to about
$P_F = 1.4 \times 10^{-3}$ for $K = 3$. For $K = 4$, targets $s_3$ and $s_4$ are placed in
adjacent resolution cells in angle. This causes significant interference between the two
targets, and the false alarm rate increases to about $P_F = 7 \times 10^{-3}$. The false alarm rate
for MP-CFAR does not change significantly for $K \ge 4$. The remaining targets $s_5$ and
$s_6$ are far apart from each other and do not significantly interfere with each other or the
other four targets, and hence do not drastically impact the false alarm rate. MBMP-CFAR
performs identically to the MP-CFAR for $K \le 3$. It also experiences an increase in the
false alarm rate for $K = 4$, from $P_F = 1.4 \times 10^{-3}$ to $P_F = 4 \times 10^{-3}$. Note that the false
alarm rate for the MBMP-CFAR at $K \ge 4$ is about $4 \times 10^{-3}$ instead of $7 \times 10^{-3}$. This
decrease in the false alarm rate stems from the increase in the probability of correct
recovery in the MBMP-CFAR algorithm.
Figure 5.12 ROC curve for two closely spaced targets for the ABF with a random array,
MP-CFAR with a random array, and the MBMP-CFAR with a random array. Parameters
SNR = 15.5 dB and CNR = 30 dB.
Figure 5.12 shows the ROC curves for the ABF, the MP-CFAR, and the MBMP-CFAR
with branch vector D = [3,1,1,. . .], using a random array, for two closely spaced
targets. The two targets are placed in adjacent resolution cells, and
the angle-Doppler pairs of the targets are $(-5/Z, 0)$ and $(-4/Z, 0)$. From the figure, the
ABF performs poorly because it cannot cope with the high sidelobes of a random array,
and both sparsity-based radars significantly outperform the ABF. Comparing the two
sparsity-based CFAR algorithms, the MBMP-CFAR algorithm sees a slight performance
gain compared to the MP-CFAR. This increase in performance is due to the increase in
the probability of correct recovery that the MBMP-CFAR algorithm enjoys by testing
more resolution cells.
5.6 Summary
In this chapter we propose using a random array with the MP-CFAR and MBMP-CFAR
algorithms to solve the target detection problem in a STAP setting. The random array
is a large undersampled array that achieves high resolution due to the large aperture at
the cost of high sidelobes. Although conventional beamforming cannot cope with the
high sidelobes introduced by the random array, the proposed sparsity-based algorithms
can cope with the high sidelobes, allowing one to enjoy the high resolution of the
random array without the consequences of the high sidelobes. This was achieved using
the proposed algorithms by iteratively detecting targets one by one and removing their
contributions from the data. Numerical simulations show that the proposed algorithms
outperform beamforming methods when a random array is employed.
We show in simulations that both MP-CFAR and MBMP-CFAR outperform the popular
beamformer when a random array is employed. In particular, we show that the
beamformer experiences significantly more false alarms than the proposed
methods and is not compatible with a random array. In contrast, the MP-CFAR and
MBMP-CFAR algorithms are shown to cope with high sidelobes and are
compatible with a random array.
References
[1] J. Ward, “Space-time adaptive processing for airborne radar,” DTIC document, Tech. Rep.,
1994.
[2] B. D. Van Veen and K. M. Buckley, “Beamforming: A versatile approach to spatial filtering,”
IEEE ASSP Magazine, vol. 5, no. 2, pp. 4–24, 1988.
[3] J. Capon, “High-resolution frequency-wavenumber spectrum analysis,” Proceedings of the
IEEE, vol. 57, no. 8, pp. 1408–1418, 1969.
[4] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Transactions
on Antennas and Propagation, vol. 34, no. 3, pp. 276–280, 1986.
[5] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance
techniques,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7,
pp. 984–995, 1989.
[6] J. Rissanen, “A universal prior for integers and estimation by minimum description length,”
The Annals of Statistics, pp. 416–431, 1983.
[7] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE
Transactions on Acoustics, Speech, and Signal Processing, vol. 33, no. 2, pp. 387–392,
1985.
[8] M. G. Bray, D. H. Werner, D. W. Boeringer, and D. W. Machuga, “Optimization of thinned
aperiodic linear phased arrays using genetic algorithms to reduce grating lobes during
scanning,” IEEE Transactions on Antennas and Propagation, vol. 50, no. 12, pp. 1732–
1742, 2002.
[9] L. Cen, Z. L. Yu, W. Ser, and W. Cen, “Linear aperiodic array synthesis using an improved
genetic algorithm,” IEEE Transactions on Antennas and Propagation, vol. 60, no. 2,
pp. 895–902, 2012.
[10] Y. Lo, “A mathematical theory of antenna arrays with randomly spaced elements,” IEEE
Transactions on Antennas and Propagation, vol. 12, no. 3, pp. 257–268, 1964.
[11] B. D. Steinberg, Principles of Aperture and Array System Design: Including Random and
Adaptive Arrays. Wiley Interscience, vol. 1, 1976, p. 374.
[12] F. Athley, C. Engdahl, and P. Sunnergren, “On radar detection and direction finding using
sparse arrays,” IEEE Transactions on Aerospace and Electronic Systems, vol. 43, no. 4,
pp. 1319–1333, 2007.
[13] B. D. Steinberg, “Comparison between the peak sidelobe of the random array and algo-
rithmically designed aperiodic arrays,” IEEE Transactions on Antennas and Propagation,
vol. 21, no. 3, pp. 366–370, 1973.
[14] S. Tonetti, M. Hehn, S. Lupashin, and R. D’Andrea, “Distributed control of antenna array
with formation of UAVs,” IFAC Proceedings Volumes, vol. 44, no. 1, pp. 7848–7853, 2011.
[15] L. Carin, “On the relationship between compressive sensing and random sensor arrays,”
IEEE Antennas and Propagation Magazine, vol. 51, no. 5, pp. 72–81, 2009.
[16] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52,
no. 4, pp. 1289–1306, 2006.
[17] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal
Statistical Society, Series B (Methodological), pp. 267–288, 1996.
[18] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,”
SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
[19] J. M. Bioucas-Dias and M. A. Figueiredo, “A new twist: Two-step iterative shrink-
age/thresholding algorithms for image restoration,” IEEE Transactions on Image Process-
ing, vol. 16, no. 12, pp. 2992–3004, 2007.
[20] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed
sensing,” Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18914–
18919, 2009.
[21] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear
inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
[22] D. Malioutov, M. Çetin, and A. S. Willsky, “A sparse signal reconstruction perspective for
source localization with sensor arrays,” IEEE Transactions on Signal Processing, vol. 53,
no. 8, pp. 3010–3022, 2005.
[23] I. W. Selesnick, S. U. Pillai, K. Y. Li, and B. Himed, “Angle-Doppler processing using sparse
regularization,” in 2010 IEEE International Conference on Acoustics, Speech and Signal
Processing, 2010, pp. 2750–2753.
[24] L. Anitori, A. Maleki, M. Otten, R. G. Baraniuk, and P. Hoogeboom, “Design and analysis
of compressed sensing radar detectors,” IEEE Transactions on Signal Processing, vol. 61,
no. 4, pp. 813–827, 2013.
[25] N. A. Goodman and L. C. Potter, “Pitfalls and possibilities of radar compressive sensing,”
Applied Optics, vol. 54, no. 8, pp. C1–C13, 2015.
[26] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal
matching pursuit,” IEEE Transactions on Information Theory, vol. 53, no. 12, pp. 4655–
4666, 2007.
[27] W. Dai and O. Milenkovic, “Subspace pursuit for compressive sensing: Closing the gap
between performance and complexity,” DTIC document, Tech. Rep., 2008.
[28] D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and
inaccurate samples,” Applied and Computational Harmonic Analysis, vol. 26, no. 3,
pp. 301–321, 2009.
[29] M. Rossi, A. M. Haimovich, and Y. C. Eldar, “Compressive sensing with unknown
parameters,” in 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals,
Systems and Computers (ASILOMAR), 2012, pp. 436–440.
[30] M. E. Davies and Y. C. Eldar, “Rank awareness in joint sparse recovery,” IEEE Transactions
on Information Theory, vol. 58, no. 2, pp. 1135–1146, 2012.
[31] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal
and Image Processing. Springer, 2010.
[32] J. A. Tropp and S. J. Wright, “Computational methods for sparse solution of linear inverse
problems,” Proceedings of the IEEE, vol. 98, no. 6, pp. 948–958, 2010.
[33] H. H. Kim, M. A. Govoni, and A. M. Haimovich, “Cost analysis of compressive sensing for
MIMO STAP random arrays,” in 2015 IEEE Radar Conference (RadarCon), 2015, pp. 0980–
0985.
[34] M. Rossi, A. M. Haimovich, and Y. C. Eldar, “Spatial compressive sensing in MIMO radar
with random arrays,” in 46th Annual Conference on Information Sciences and Systems
(CISS), 2012, pp. 1–6.
[35] J. Ward, “Space-time adaptive processing for airborne radar,” in IEE Colloquium on Space-
Time Adaptive Processing (Ref. No. 1998/241), 1998, pp. 2/1–2/6.
[36] H. L. Van Trees, Detection, Estimation, and Modulation Theory: Optimum Array
Processing. John Wiley & Sons, 2004.
[37] B. Eisenberg, “On the expectation of the maximum of IID geometric random variables,”
Statistics & Probability Letters, vol. 78, no. 2, pp. 135–143, 2008.
[38] B. D. Steinberg, “The peak sidelobe of the phased array having randomly located elements,”
IEEE Transactions on Antennas and Propagation, vol. 20, no. 2, pp. 129–136, 1972.
[39] J. Ward, “Space-time adaptive processing with sparse antenna arrays,” in Conference Record
of the Thirty-Second Asilomar Conference on Signals, Systems and Computers, 1998,
pp. 1537–1541.
[40] R. Klemm, Principles of Space-Time Adaptive Processing. IET, 2002, no. 159.
[41] F. C. Robey, D. R. Fuhrmann, E. J. Kelly, and R. Nitzberg, “A CFAR adaptive matched filter
detector,” IEEE Transactions on Aerospace and Electronic Systems, vol. 28, no. 1, pp. 208–
216, 1992.
[42] E. J. Kelly, “An adaptive detection algorithm,” IEEE Transactions on Aerospace and
Electronic Systems, no. 2, pp. 115–127, 1986.
[43] I. S. Reed, J. D. Mallett, and L. E. Brennan, “Rapid convergence rate in adaptive arrays,”
IEEE Transactions on Aerospace and Electronic Systems, no. 6, pp. 853–863, 1974.
[44] E. Kelly, “Performance of an adaptive detection algorithm; rejection of unwanted signals,”
[45] D. M. Boroson, “Sample size considerations for adaptive arrays,” IEEE Transactions on
Aerospace and Electronic Systems, no. 4, pp. 446–451, 1980.
[46] R. T. Behrens and L. L. Scharf, “Signal processing applications of oblique projection
operators,” IEEE Transactions on Signal Processing, vol. 42, no. 6, pp. 1413–1424, 1994.
[47] L. L. Scharf and B. Friedlander, “Matched subspace detectors,” IEEE Transactions on Signal
[48] J. R. Guerci, J. S. Goldstein, and I. S. Reed, “Optimal and adaptive reduced-rank STAP,”
[49] J. Liu, H. Li, and B. Himed, “Threshold setting for adaptive matched filter and adaptive
coherence estimator,” IEEE Signal Processing Letters, vol. 22, no. 1, pp. 11–15, 2015.
6 Fast and Robust Sparsity-Based
STAP Methods for Nonhomogeneous
Clutter
Xiaopeng Yang, Yuze Sun, Xuchen Wu, Teng Long, and Tanpan K. Sarkar
6.1 Introduction
Space–time adaptive processing (STAP) can effectively suppress clutter and achieve
better moving target detection than traditional nonadaptive methods [1].
However, in a practical nonhomogeneous environment, the clutter characteristics
change rapidly, making it difficult to collect a sufficient number of independent
and identically distributed (i.i.d.) training samples for effective estimation of the
clutter covariance matrix. Meanwhile, inverting the high-dimensional covariance
matrix is computationally prohibitive [2–4]. Therefore it is
important to develop STAP methods that offer robust clutter suppression and fast
computation.
In the past three decades, many suboptimal STAP algorithms have been developed
[4–8]. The reduced-dimension STAP algorithms, such as the pulse repetition interval
(PRI)-staggered algorithm [4], the extended factored algorithm (EFA) [5,6], and the
joint-domain localized (JDL) algorithm [7], can reduce the requirement of training
samples. However, the covariance matrix inversion in computing the adaptive weight
vector is still a time-consuming task for clutter suppression. On the other hand, the
reduced-rank STAP algorithms [9–13], such as principal components (PCs) [9–11],
conjugate gradient (CG) [12], and projection approximation subspace tracking
(PAST) [13,14], can also reduce the requirement of training samples. However, because
the performance of reduced-rank STAP algorithms is sensitive to variations in the clutter
environment, an inappropriate rank selection may lead to a severe loss in clutter
suppression performance. In the past decade, sparsity-based STAP methods that
significantly reduce the requirement of training samples have been investigated [15–25]. It has
been shown that the intrinsic sparsity of the clutter spatial-temporal power spectrum
and of the space–time adaptive weight vectors can be exploited for clutter suppression
[20,23]. However, conventional sparse recovery methods cannot achieve the desired
sparse recovery accuracy and convergence speed in a practical environment. Therefore,
it is necessary to study fast and robust sparsity-based STAP methods for
practical complex clutter environments.
In this chapter, some sparsity-based STAP methods are developed by exploiting the
intrinsic sparsity of the clutter spatial-temporal power spectrum and the space–time
adaptive weight vectors. Firstly, the signal model of airborne phased array radar is
introduced, and then the intrinsic sparsity of STAP is analyzed according to the clutter
spatial-temporal power spectrum and the space–time adaptive weight vectors. Secondly,
based on the sparsity of the clutter spatial-temporal power spectrum, a robust and fast
iterative sparse recovery method for STAP is proposed, which can not only alleviate the
effect of noise and dictionary mismatch but also reduce the computational complexity
by recursive inverse matrix calculation. Afterwards, based on the sparsity of space–time
adaptive weight vectors, a fast STAP method based on PAST with sparse constraint is
proposed, which can provide a more robust and stable estimation of clutter subspace
for a small set of training samples. Based on the simulated and the actual airborne
phased array radar data, it is verified that the proposed methods can provide better per-
formance with small training sample support in a practical complex nonhomogeneous
environment.
6.2 Signal Models
A side-looking antenna array configuration of an airborne phased array radar is
considered; the corresponding geometry is shown in Figure 6.1. The $N$-element uniform
linear antenna array is aligned with the velocity direction of the platform, and the
interelement spacing is $d_A$. During each coherent processing interval (CPI), $M$ identical pulses
are transmitted with a pulse repetition frequency (PRF) of $f_r$. The operating wavelength
is $\lambda$, the platform flies at height $h$ with velocity $v_a$, and $L$ fast-time samples
are collected to cover the detection region in each pulse repetition interval (PRI). The
received signal of each CPI is stored as an $N \times M \times L$ data cube, as shown in
Figure 6.2. Each slice of the data cube corresponding to a fast-time sample is an $N \times M$
matrix, which is stacked as an $NM \times 1$ vector according to the channel order.
Figure 6.1 Geometry of a side-looking antenna array configuration.
Figure 6.2 One CPI data-cube of airborne phased array radar.
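The stacking step can be illustrated with a short NumPy sketch. The array sizes and the helper name `snapshot` are hypothetical, and the column-major ordering (all channels of pulse 0, then all channels of pulse 1, and so on) is an assumption chosen so that the result matches the Kronecker structure of a temporal vector times a spatial vector.

```python
import numpy as np

N, M, L = 4, 3, 5  # channels, pulses, fast-time samples (illustrative sizes)
rng = np.random.default_rng(0)
cube = rng.standard_normal((N, M, L)) + 1j * rng.standard_normal((N, M, L))

def snapshot(cube, l):
    """Stack the N x M slice at fast-time index l into an NM-vector.

    Column-major order places the N channel samples of pulse 0 first, then
    those of pulse 1, and so on, so entry m*N + n holds channel n of pulse m.
    """
    return cube[:, :, l].reshape(-1, order="F")

x = snapshot(cube, l=2)  # space-time snapshot of one range cell
```

If the opposite ordering is preferred, the same slice can be flattened with `order="C"`, provided the steering vectors are built with the matching Kronecker order.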
Radar target detection is a binary hypothesis problem, where hypothesis $H_1$
corresponds to target presence and hypothesis $H_0$ corresponds to target absence:
$$H_0: \mathbf{x} = \mathbf{x}_c + \mathbf{x}_n, \qquad H_1: \mathbf{x} = \mathbf{x}_c + \mathbf{x}_t + \mathbf{x}_n, \tag{6.1}$$
where $\mathbf{x}$ is composed of the clutter $\mathbf{x}_c$, the received target echo $\mathbf{x}_t$, and the zero-mean
complex Gaussian, spatially and temporally white noise $\mathbf{x}_n$.
According to the geometry shown in Figure 6.1, the target can be modeled as a strong
scattering point whose azimuth angle is $\theta_t$, whose elevation angle is $\phi_t$, and whose relative
velocity is $v_t$. Therefore the target echo $\mathbf{x}_t \in \mathbb{C}^{NM \times 1}$ can be given as
$$\mathbf{x}_t = \tilde{\xi}_t\, \mathbf{v}(\omega_t, \vartheta_t) = \tilde{\xi}_t \left( \mathbf{b}(\omega_t) \otimes \mathbf{a}(\vartheta_t) \right), \tag{6.2}$$
where $\tilde{\xi}_t$ denotes the target complex amplitude, the spatial frequency is
$\vartheta_t = d_A \cos(\theta_t) \sin(\phi_t)/\lambda$, the Doppler frequency is $\omega_t = 2 v_t \cos(\theta_t) \sin(\phi_t)/(\lambda f_r)$,
$\mathbf{b}(\omega_t) = [1, \exp(j2\pi\omega_t), \ldots, \exp(j(M-1)2\pi\omega_t)]^T \in \mathbb{C}^{M \times 1}$ is the target temporal steering
vector, and $\mathbf{a}(\vartheta_t) = [1, \exp(j2\pi\vartheta_t), \ldots, \exp(j(N-1)2\pi\vartheta_t)]^T \in \mathbb{C}^{N \times 1}$ is the target
spatial steering vector. Meanwhile, the clutter at each range cell is the superposition
of $N_c$ independent clutter patches, which are distributed in azimuth with angle interval
$\Delta\phi = 2\pi/N_c$. Each clutter patch can be denoted by the azimuth angle $\theta$ and elevation
angle $\phi$ according to each range cell. The spatial frequency $\vartheta_{c,i}$ and the normalized
Doppler frequency $\omega_{c,i}$ of the $i$th clutter patch of the CUT are respectively expressed as
$$\vartheta_{c,i} = \frac{d_A}{\lambda} \cos(\theta_{l,i}) \sin(\phi_l), \qquad \omega_{c,i} = \frac{2 v_a}{\lambda f_r} \cos(\theta_{l,i}) \sin(\phi_l). \tag{6.3}$$
Therefore, the space–time steering vector of the $i$th clutter patch is expressed as
$$\mathbf{v}(\omega_{c,i}, \vartheta_{c,i}) = \mathbf{b}(\omega_{c,i}) \otimes \mathbf{a}(\vartheta_{c,i}), \tag{6.4}$$
where $\mathbf{b}(\omega_{c,i}) = [1, \exp(j2\pi\omega_{c,i}), \ldots, \exp(j(M-1)2\pi\omega_{c,i})]^T \in \mathbb{C}^{M \times 1}$ is the temporal
steering vector and $\mathbf{a}(\vartheta_{c,i}) = [1, \exp(j2\pi\vartheta_{c,i}), \ldots, \exp(j(N-1)2\pi\vartheta_{c,i})]^T \in \mathbb{C}^{N \times 1}$
is the spatial steering vector. Thus, based on Melvin’s model in [4],
the space–time clutter snapshot $\mathbf{x}_c \in \mathbb{C}^{NM \times 1}$ can be expressed as
$$\mathbf{x}_c = \sum_{i=1}^{N_c} \tilde{\xi}_i\, \mathbf{v}(\omega_{c,i}, \vartheta_{c,i}), \tag{6.5}$$
where $\tilde{\xi}_i$ denotes the random complex amplitude corresponding to the $i$th clutter patch.
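The steering vectors and the clutter sum above can be sketched numerically as follows. The function names, the unit-variance complex Gaussian patch amplitudes, and the parameter `beta` (standing for $2v_a/(d_A f_r)$, so that $\omega_{c,i} = \beta\,\vartheta_{c,i}$ follows from (6.3)) are assumptions made for illustration only.

```python
import numpy as np

def steering(freq, n):
    """[1, exp(j*2*pi*freq), ..., exp(j*(n-1)*2*pi*freq)]."""
    return np.exp(2j * np.pi * freq * np.arange(n))

def space_time_steering(omega, vartheta, M, N):
    """v(omega, vartheta) = b(omega) kron a(vartheta), as in (6.2) and (6.4)."""
    return np.kron(steering(omega, M), steering(vartheta, N))

def clutter_snapshot(N, M, Nc, d_over_lambda, beta, elevation, rng):
    """Sum of Nc clutter patches as in (6.5), with omega = beta * vartheta."""
    azimuth = 2 * np.pi * np.arange(Nc) / Nc             # patches spread over azimuth
    vartheta = d_over_lambda * np.cos(azimuth) * np.sin(elevation)
    omega = beta * vartheta
    # Assumed amplitude model: i.i.d. unit-variance complex Gaussian.
    amp = (rng.standard_normal(Nc) + 1j * rng.standard_normal(Nc)) / np.sqrt(2)
    V = np.stack([space_time_steering(w, s, M, N)
                  for w, s in zip(omega, vartheta)], axis=1)
    return V @ amp

v = space_time_steering(0.2, 0.1, M=3, N=4)
rng = np.random.default_rng(1)
x_c = clutter_snapshot(N=4, M=3, Nc=90, d_over_lambda=0.5, beta=1.0,
                       elevation=np.pi / 2, rng=rng)
```

Entry `m*N + n` of `v` equals $\exp(j2\pi(m\omega + n\vartheta))$, which is the ordering implied by placing the temporal vector first in the Kronecker product.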
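The conventional STAP weight vector $\mathbf{w} \in \mathbb{C}^{NM \times 1}$ is derived with the minimum
noise variance (MNV) principle [1], which is expressed as the following constrained power
minimization problem
$$\min_{\mathbf{w}} J(\mathbf{w}) = E\left[ \left\| \mathbf{w}^H \mathbf{x} \right\|_2^2 \right] \quad \text{s.t.} \quad \mathbf{w}^H \mathbf{v}(\omega_t, \vartheta_t) = 1, \tag{6.6}$$
where $\mathbf{v}(\omega_t, \vartheta_t)$ is the steering vector of the target. Then, by using the method of Lagrange
multipliers, the optimal adaptive weight vector is obtained as
$$\mathbf{w} = \frac{\mathbf{R}^{-1} \mathbf{v}(\omega_t, \vartheta_t)}{\mathbf{v}^H(\omega_t, \vartheta_t)\, \mathbf{R}^{-1} \mathbf{v}(\omega_t, \vartheta_t)}. \tag{6.7}$$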
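For completeness, the Lagrange multiplier step can be written out as below. This is the standard textbook derivation rather than a reproduction of the authors' own working; $\mu$ is an auxiliary multiplier introduced here, $\mathbf{v}$ abbreviates $\mathbf{v}(\omega_t, \vartheta_t)$, and $\mathbf{R}$ is assumed Hermitian and invertible.

```latex
\begin{aligned}
J(\mathbf{w}) &= E\left[\left|\mathbf{w}^H\mathbf{x}\right|^2\right]
              = \mathbf{w}^H\mathbf{R}\,\mathbf{w},
              \qquad \mathbf{R} = E\left[\mathbf{x}\mathbf{x}^H\right],\\
\mathcal{L}(\mathbf{w},\mu) &= \mathbf{w}^H\mathbf{R}\,\mathbf{w}
              + \mu\left(1-\mathbf{w}^H\mathbf{v}\right)
              + \mu^{*}\left(1-\mathbf{v}^H\mathbf{w}\right),\\
\frac{\partial\mathcal{L}}{\partial\mathbf{w}^{*}} &= \mathbf{R}\,\mathbf{w}-\mu\,\mathbf{v}=\mathbf{0}
              \;\Longrightarrow\; \mathbf{w}=\mu\,\mathbf{R}^{-1}\mathbf{v},\\
\mathbf{v}^H\mathbf{w}=1 &\;\Longrightarrow\; \mu=\frac{1}{\mathbf{v}^H\mathbf{R}^{-1}\mathbf{v}}
              \;\Longrightarrow\; \mathbf{w}=\frac{\mathbf{R}^{-1}\mathbf{v}}{\mathbf{v}^H\mathbf{R}^{-1}\mathbf{v}}.
\end{aligned}
```

Because $\mathbf{R}$ is Hermitian, the quadratic form $\mathbf{v}^H\mathbf{R}^{-1}\mathbf{v}$ is real, so the multiplier $\mu$ is real as well.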
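It is well known [1,2,4] that the space–time covariance matrix
$\mathbf{R} \in \mathbb{C}^{NM \times NM}$ is usually estimated from $K$ i.i.d. training samples around the $l$th range
cell under test (CUT), i.e.,
$$\tilde{\mathbf{R}} = E\left[ \mathbf{x}\mathbf{x}^H \right] \approx \frac{1}{K-1} \sum_{k=1,\,k \neq l}^{K-1} \mathbf{x}_k \mathbf{x}_k^H = \tilde{\mathbf{R}}_c + \tilde{\mathbf{R}}_n. \tag{6.8}$$
In (6.8), $\tilde{\mathbf{R}}_c$ denotes the clutter covariance matrix, the noise covariance matrix is
$\tilde{\mathbf{R}}_n = \sigma^2 \mathbf{I}$, the noise power is $\sigma^2$, and $\mathbf{I}$ is the $NM \times NM$ identity matrix.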
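A minimal numerical sketch of the sample covariance estimate and the resulting adaptive weight is given below. The helper names and the white-noise training data are hypothetical, and the estimate is normalized by the number of snapshots actually averaged, which may differ from the index convention in (6.8) by one.

```python
import numpy as np

def sample_covariance(X, cut):
    """Average x_k x_k^H over the columns of X, skipping the cell under test."""
    keep = [k for k in range(X.shape[1]) if k != cut]
    Xk = X[:, keep]
    return Xk @ Xk.conj().T / len(keep)

def adaptive_weight(R, v):
    """w = R^{-1} v / (v^H R^{-1} v), as in (6.7), using a linear solve."""
    Rinv_v = np.linalg.solve(R, v)
    return Rinv_v / (v.conj() @ Rinv_v)

# Toy data (assumed): white complex Gaussian snapshots, NM = 6 DoFs, 40 range cells.
rng = np.random.default_rng(0)
NM, K = 6, 40
X = (rng.standard_normal((NM, K)) + 1j * rng.standard_normal((NM, K))) / np.sqrt(2)
R_hat = sample_covariance(X, cut=20)
v = np.exp(2j * np.pi * 0.1 * np.arange(NM))  # stand-in steering vector
w = adaptive_weight(R_hat, v)
```

Solving the linear system instead of forming the explicit inverse is a common numerical choice; the distortionless constraint $\mathbf{w}^H\mathbf{v} = 1$ holds either way.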
6.3 Sparsity Principle Analysis of STAP
6.3.1 Sparsity of the Clutter Spatial-Temporal Spectrum

It is well known that the eigenvalue decomposition (EVD) of the space–time covariance
matrix $\mathbf{R}$ is obtained as [26]
$$\mathbf{R} = \mathbf{U}_c \boldsymbol{\Lambda}_c \mathbf{U}_c^H + \sigma^2 \mathbf{U}_n \mathbf{U}_n^H, \tag{6.9}$$
where $\mathbf{U}_c$ denotes the clutter subspace spanned by the principal eigenvectors,
$\boldsymbol{\Lambda}_c = \mathrm{diag}(\varsigma_1, \ldots, \varsigma_P)$ consists of the $P$ principal eigenvalues of $\mathbf{R}$, and $\mathbf{U}_n$ denotes the noise
subspace. The clutter covariance matrix can be represented by only $P$ principal eigenvalues
instead of $N_c$ clutter space–time steering vectors. The clutter is sparse with respect to
the system degrees of freedom (DoFs). In order to further demonstrate the sparsity of
clutter, the space–time correlation coefficient is given as
$$\mathrm{cor}\left( \mathbf{v}(\omega_p, \vartheta_p), \mathbf{v}(\omega_q, \vartheta_q) \right) = \frac{\left| \mathbf{v}(\omega_p, \vartheta_p)^H \mathbf{v}(\omega_q, \vartheta_q) \right|}{\left\| \mathbf{v}(\omega_p, \vartheta_p) \right\| \left\| \mathbf{v}(\omega_q, \vartheta_q) \right\|}, \tag{6.10}$$
which describes the degree of correlation between different space–time steering vectors.
The correlation results corresponding to three different spatial angles are shown in
Figure 6.3. It is found that a space–time steering vector is highly correlated with the
vectors that are spatially adjacent to it. This means that the clutter is highly correlated in
spatial angle, and the space–time steering vectors near the clutter ridge can stand in for all
the vectors in the spatial-temporal plane when approximating the clutter [27]. Therefore the
received space–time data show high sparsity in the angle-Doppler domain [19,20]. As
shown in Figure 6.4, the major components of the clutter spectrum are distributed near the
ridge. Thus, compared with the whole spatial-temporal plane, the complex amplitude of the
received spectrum is rather small in most areas, so the received space–time data are sparse
with respect to the whole angle-Doppler domain.
Figure 6.3 Correlations between space–time steering vectors with spatial angle π/2, π/3, and π/6.
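The correlation coefficient in (6.10) is straightforward to evaluate numerically. The sketch below uses hypothetical helper names and arbitrary frequencies on a clutter ridge with unit slope; it is only meant to show that nearby steering vectors are strongly correlated while distant ones are not.

```python
import numpy as np

def st_vector(omega, vartheta, M, N):
    """Space-time steering vector b(omega) kron a(vartheta)."""
    b = np.exp(2j * np.pi * omega * np.arange(M))
    a = np.exp(2j * np.pi * vartheta * np.arange(N))
    return np.kron(b, a)

def correlation(vp, vq):
    """|vp^H vq| / (||vp|| * ||vq||), as in (6.10)."""
    return np.abs(np.vdot(vp, vq)) / (np.linalg.norm(vp) * np.linalg.norm(vq))

M = N = 8
ref = st_vector(0.10, 0.10, M, N)    # reference point on the ridge (assumed slope 1)
near = st_vector(0.11, 0.11, M, N)   # adjacent point on the ridge
far = st_vector(0.40, 0.40, M, N)    # distant point on the ridge

cor_near = correlation(ref, near)
cor_far = correlation(ref, far)
```

For these values the adjacent vector is almost collinear with the reference, whereas the distant one is close to orthogonal.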
Based on this property, the homogeneous clutter can be well estimated by the sparse
recovery approaches. The spatial-temporal plane is discretized into a grid with Ns spatial
bins and N$ d Doppler
% bins, where each grid point is associated with a space–time steering
vector v fd,i ,fs,j . Therefore the space–time snapshot of the lth range cell can be
given as
xl = xc + xn = ϒ + xn, (6.11)
0.5 0
0.4 –5
0.3
–10
0.2
Normalized Doppler
–15
0.1
0 –20
–0.1
–25
–0.2
–30
–0.3
–0.4 –35
–0.5 –40
–0.5 –0.25 0 0.25 0.5
Normalized angle
Figure 6.4 Sparse distribution of spatial-temporal spectrum in angle-Doppler domain.
Figure 6.5 The procedure of clutter suppression based on sparsity of spatial-temporal spectrum.

where ϒ = γ̃1,1, γ̃1,2,. . ., γ̃Ns ,Nd ∈ CNs Nd ×1 is the complex amplitude of the spectral
distribution. The space–time overcomplete dictionary matrix ∈ CN M×Ns Nd is given
as the collection of all space–time steering vectors, i.e.,
$ % $ % $ %
= v fd,1,fs,1 ,. . .,v fd,i ,fs,j ,. . .,v fd,Nd ,fs,Ns . (6.12)
Therefore, when the complex amplitudes of the spectral distribution and the corresponding
space–time steering vectors are estimated effectively by exploiting this sparsity, the
space–time covariance matrix $\mathbf{R}$ can be well reconstructed for clutter suppression.
As $N_s N_d$ is much larger than the number of system DoFs $NM$, the space–time dictionary is
overcomplete and highly correlated, so that (6.11) is underdetermined. However, based on the
theory of sparse recovery [28,29], this ill-posed equation can be solved effectively with a
limited number of training samples. The sparse recovery of the clutter spectrum can be
formulated as the following $L_1$-norm optimization
$$\hat{\boldsymbol{\Upsilon}} = \arg\min_{\boldsymbol{\Upsilon}} \|\boldsymbol{\Upsilon}\|_1 \quad \text{s.t.} \quad \|\mathbf{x}_l - \boldsymbol{\Phi}\boldsymbol{\Upsilon}\|_2 \le \varepsilon, \tag{6.13}$$
where the $L_1$-norm promotes the sparsity of the complex amplitude $\boldsymbol{\Upsilon}$, and the $L_2$-norm
constraint bounds the estimation error by $\varepsilon$. Equation (6.13) can also be given as
$$\hat{\boldsymbol{\Upsilon}} = \arg\min_{\boldsymbol{\Upsilon}} \|\mathbf{x}_l - \boldsymbol{\Phi}\boldsymbol{\Upsilon}\|_2 + \lambda_\gamma \|\boldsymbol{\Upsilon}\|_1, \tag{6.14}$$
where $\lambda_\gamma$ is the regularization parameter. In order to obtain a better spectrum
distribution of homogeneous clutter, the adjacent training data can be utilized in the same
processing. Afterwards, the covariance matrix $\mathbf{R}$ can be reconstructed as
$$\mathbf{R} = \frac{1}{P}\sum_{p=1}^{P}\sum_{i=1}^{N_d}\sum_{j=1}^{N_s} |\tilde{\gamma}_{p,i,j}|^2\, \mathbf{v}_p(f_{d,i}, f_{s,j})\, \mathbf{v}_p^H(f_{d,i}, f_{s,j}) + \sigma^2 \mathbf{I}, \tag{6.15}$$
where $P$ is the number of training data, and $\mathbf{v}_p(f_{d,i}, f_{s,j})$ is the space–time steering vector
in the overcomplete dictionary corresponding to the recovery result of each training data.
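The reconstruction in (6.15) amounts to a weighted sum of rank-one outer products. The following is a minimal sketch of that sum, assuming for simplicity that the same dictionary is used for every training snapshot and that the recovered amplitudes are stored column by column; the function name, the placeholder dictionary, and the toy amplitudes are illustrative and not taken from the chapter.

```python
import numpy as np

def reconstruct_covariance(Phi, amplitudes, noise_power):
    """Covariance of the form (6.15) with a common dictionary for all snapshots.

    Phi:         (NM, G) dictionary of steering vectors.
    amplitudes:  (G, P) recovered complex amplitudes, one column per snapshot.
    noise_power: scalar sigma^2 added on the diagonal.
    """
    power = np.mean(np.abs(amplitudes) ** 2, axis=1)   # average over the P snapshots
    R = (Phi * power) @ Phi.conj().T                   # Phi diag(power) Phi^H
    return R + noise_power * np.eye(Phi.shape[0])

rng = np.random.default_rng(0)
Phi = np.exp(2j * np.pi * rng.random((6, 20)))         # placeholder dictionary
gamma = np.zeros((20, 3), dtype=complex)               # P = 3 sparse amplitude vectors
gamma[[2, 7], :] = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
R = reconstruct_covariance(Phi, gamma, noise_power=0.1)
```

Because only the few nonzero amplitudes contribute, the clutter part of the reconstructed matrix is low rank, and the diagonal loading by the noise power keeps it invertible.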
6.3.2 Sparsity of Space–Time Adaptive Weight Vectors

As mentioned in the previous section, the clutter subspace is spanned by the eigenvectors
corresponding to the $P$ largest eigenvalues, so clutter suppression is a rank-deficient
problem.
Figure 6.6 The relationship between the full space–time dimension and clutter subspace.
According to Brennan's rule [3], the rank of the clutter subspace is much smaller
than the number of DoFs. In other words, the available length of the adaptive weight vector,
determined by the number of antenna channels and slow-time pulses, is much larger than
the length required for clutter suppression, which implies a sparsity of the adaptive weight
vector that can be exploited. Sparse least mean square (LMS)-type and recursive least
square (RLS)-type algorithms applied to system identification are studied in [30,31] and
yield a performance improvement for sparse systems. Several novel STAP methods with a
sparse constraint have also recently been proposed in [23,24]. Among these methods, the
concept of the $L_1$-norm sample matrix inversion (SMI) method is reviewed in this section.
By adding a sparse constraint to the MNV cost function, the problem is described as the
following constrained optimization problem

$$\min_{\mathbf{w}} J_1(\mathbf{w}) = E\left[\left\|\mathbf{w}^H \mathbf{x}_l\right\|_2^2\right] + 2\kappa\,\Theta(\mathbf{w}) \quad \text{s.t.} \quad \mathbf{w}^H \mathbf{v}_t(\omega_t, \vartheta_t) = 1, \tag{6.16}$$
where $\Theta(\mathbf{w})$ is a term to characterize the sparsity of the weight vector, and $\kappa$ is a
positive scalar that provides a trade-off between the sparsity and the output power. A
larger value of $\kappa$ implies that more components will be shrunk to zero. Because of the
convexity of the $L_1$-norm constraint, it is common practice to approximate the sparse
constraint as $\Theta(\mathbf{w}) = \|\mathbf{w}\|_1$ [24]. Therefore the adaptive weight vector $\mathbf{w}$ deduced by
Yang et al. [23] is
$$\mathbf{w} = \frac{(\hat{\mathbf{R}} + \kappa\boldsymbol{\Lambda})^{-1}\mathbf{v}_t(\omega_t, \vartheta_t)}{\mathbf{v}_t^H(\omega_t, \vartheta_t)(\hat{\mathbf{R}} + \kappa\boldsymbol{\Lambda})^{-1}\mathbf{v}_t(\omega_t, \vartheta_t)}, \tag{6.17}$$
where $\boldsymbol{\Lambda} = \mathrm{diag}\left(\frac{1}{|w_1|+\varepsilon}, \frac{1}{|w_2|+\varepsilon}, \ldots, \frac{1}{|w_{NM}|+\varepsilon}\right)$, $\varepsilon$ is a small positive constant, and $w_i$,
$i = 1, 2, \ldots, NM$, are the entries of the adaptive weight vector $\mathbf{w}$. Compared with the
adaptive weight vector obtained in (6.7), the sparse constraint yields
an additional term $\kappa\boldsymbol{\Lambda}$ in the inversion of the estimated clutter covariance matrix $\hat{\mathbf{R}}$.
However, the adaptive weight vector in (6.17) is not a closed-form solution, since $\boldsymbol{\Lambda}$ is
a function of $\mathbf{w}$. Some iterative methods, such as $L_1$-norm RLS and $L_1$-norm recursive
SMI methods, can be used to compute the adaptive weight vector [23,24].
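Since the diagonal reweighting matrix in (6.17) depends on the weight vector itself, one simple way to see the structure of the solution is a plain fixed-point iteration: compute the weights, rebuild the diagonal matrix, and repeat. The sketch below does exactly that. It is not the $L_1$-norm RLS or recursive SMI method of [23,24]; the starting point (the unregularized minimum-variance solution), the number of iterations, and all names are assumptions made for illustration.

```python
import numpy as np

def l1_smi_weights(R_hat, v_t, kappa, eps=1e-3, n_iter=20):
    """Fixed-point evaluation of a weight vector of the form (6.17)."""
    u = np.linalg.solve(R_hat, v_t)
    w = u / (v_t.conj() @ u)                      # start without the sparse term
    for _ in range(n_iter):
        lam = np.diag(1.0 / (np.abs(w) + eps))    # diagonal reweighting built from w
        u = np.linalg.solve(R_hat + kappa * lam, v_t)
        w = u / (v_t.conj() @ u)                  # enforce w^H v_t = 1
    return w

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 32)) + 1j * rng.standard_normal((8, 32))
R_hat = A @ A.conj().T / 32                       # toy sample covariance
v_t = np.exp(2j * np.pi * 0.2 * np.arange(8))     # toy target steering vector
w = l1_smi_weights(R_hat, v_t, kappa=0.05)
```

Each pass costs one linear solve, so this form is only practical for small problems; the recursive methods cited in the text avoid repeated full inversions.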
6.4 Fast and Robust Sparsity-Based STAP Methods
6.4.1 Robust and Fast Iterative Sparse Recovery Methods for STAP
In the past decade, by exploiting the intrinsic sparsity of the clutter in the angle-Doppler
domain, some sparse-based STAP methods have been proposed to achieve the sparse
recovery of the clutter spatial-temporal spectrum with limited training data [17–22].
A global matched filter (GMF) [17] was first applied to STAP, demonstrating
that both target and clutter can be identified from a single snapshot without prior
estimation of the clutter covariance matrix. Then, by assuming knowledge of the
clutter ridge in the spatial-temporal plane, targets were estimated as the sparse solution in
the angle-Doppler domain outside this clutter ridge [18]. In these methods, the sparse
recovery of the clutter spatial-temporal spectrum is usually formulated as a regularized
optimization problem, which can be solved by convex optimization [19]. However, the
computational complexity of such an operation becomes prohibitive when the
dimension of the convex optimization problem is large, which makes STAP implementation
very difficult. A series of fast approximation algorithms have therefore been
proposed [19–22]. The FOcal Underdetermined System Solver (FOCUSS) is a
typical fast approximation algorithm, which uses weighted $L_2$-norm minimization
to recursively obtain an approximate estimate of the clutter profiles [19].
However, because its performance is heavily affected by mismatch of the overcomplete
dictionary, and its regularization parameter cannot be adjusted adaptively
according to the environment, the performance of FOCUSS degrades in practical
applications.
In this section, by exploiting the intrinsic sparsity of the clutter in the angle-Doppler
domain, a robust and fast iterative sparse recovery method for STAP is given [21]. In
the proposed method, the sparse recovery of the clutter spatial-temporal spectrum and
the calibration of the space–time overcomplete dictionary are executed iteratively. The
robust solution of sparse recovery is derived by regularized processing and calculated
recursively based on the block Hermitian matrix property; afterwards the mismatch of
the space–time overcomplete dictionary is calibrated by minimizing the cost function.
The proposed method is verified based on simulated and actual airborne phased array
radar data.
In an actual clutter environment, a clutter component may be located
between two grid points rather than exactly on a grid point of the dictionary, so the mismatch
between the space–time overcomplete dictionary and the actual clutter distribution cannot be avoided.
When the mismatch is considered, the space–time snapshot of the $l$th range cell
becomes
$$\mathbf{x}_l = \mathbf{x}_c + \mathbf{x}_n = \sum_{i=1}^{N_d}\sum_{j=1}^{N_s} \tilde{\gamma}_{i,j}\,\mathbf{v}(f_{d,i}, f_{s,j}) + \mathbf{x}_n = \tilde{\boldsymbol{\Phi}}\boldsymbol{\Upsilon}_l + \mathbf{x}_n, \tag{6.18}$$
where $\tilde{\boldsymbol{\Phi}} = \boldsymbol{\Phi} + \boldsymbol{\Delta}$ denotes the actual overcomplete dictionary and $\boldsymbol{\Delta}$ is the mismatch
matrix. Therefore, a robust and fast iterative sparse recovery method for STAP in practical
environments is proposed, and its main procedures are described in the following.
A. Method Formulation
a. Sparse Recovery Processing
The effect of additive noise is not considered in the basic sparse recovery method [19].
However, additive noise is inevitable in practical environments and increases
the recovery error. Therefore, regularized processing is employed to reduce the effect
of additive noise on the sparse recovery. In this method, the fast approximation FOCUSS
method is utilized for clutter sparse recovery. The sparse recovery problem in (6.14) can
be converted to
$$\min_{\boldsymbol{\Upsilon}_l} J(\boldsymbol{\Upsilon}_l) = \sum_{i=1}^{N_s N_d} |\gamma_{l,i}|^p \quad \text{s.t.} \quad \|\mathbf{x}_l - \boldsymbol{\Phi}\boldsymbol{\Upsilon}_l\|_2 \le \varepsilon, \tag{6.19}$$
and the cost function is then given by the Lagrange multiplier method as
$$L(\boldsymbol{\Upsilon}_l) = J(\boldsymbol{\Upsilon}_l) + \alpha\|\mathbf{x}_l - \boldsymbol{\Phi}\boldsymbol{\Upsilon}_l\|_2^2, \tag{6.20}$$
where $\alpha$ denotes the Lagrange multiplier, which is matched to the noise level. Taking the
gradient with respect to $\boldsymbol{\Upsilon}_l$, we get
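$$\nabla L(\boldsymbol{\Upsilon}_l) = |p|\,\boldsymbol{\Pi}(\boldsymbol{\Upsilon}_l)\boldsymbol{\Upsilon}_l + \alpha\left(\boldsymbol{\Phi}^H\boldsymbol{\Phi}\boldsymbol{\Upsilon}_l - \boldsymbol{\Phi}^H\mathbf{x}_l\right), \tag{6.21}$$
where $\boldsymbol{\Pi}(\boldsymbol{\Upsilon}_l) = \mathrm{diag}\left(|\gamma_{l,1}|^{p-2}, \ldots, |\gamma_{l,N_sN_d}|^{p-2}\right)$. The approximate Hessian matrix
can then be obtained as
$$\nabla^2 L(\boldsymbol{\Upsilon}_l) = |p|\,\boldsymbol{\Pi}(\boldsymbol{\Upsilon}_l) + \alpha\boldsymbol{\Phi}^H\boldsymbol{\Phi}. \tag{6.22}$$
Afterwards, by applying the quasi-Newton method, we can get
$$\boldsymbol{\Upsilon}_l^{(k+1)} = \boldsymbol{\Upsilon}_l^{(k)} - \left[\nabla^2 L\left(\boldsymbol{\Upsilon}_l^{(k)}\right)\right]^{-1}\cdot\nabla L\left(\boldsymbol{\Upsilon}_l^{(k)}\right). \tag{6.23}$$
Then, by substituting (6.21) and (6.22) into (6.23), we can get
$$\boldsymbol{\Upsilon}_l^{(k+1)} = \left[\delta\,\boldsymbol{\Pi}\left(\boldsymbol{\Upsilon}_l^{(k)}\right) + \boldsymbol{\Phi}^H\boldsymbol{\Phi}\right]^{-1}\boldsymbol{\Phi}^H\mathbf{x}_l, \tag{6.24}$$
where $\delta = 1/\alpha$. By defining $\left(\mathbf{W}^{(k)}\right)^{-2} = \boldsymbol{\Pi}\left(\boldsymbol{\Upsilon}_l^{(k)}\right)$ and $\boldsymbol{\Phi}^{(k)} = \boldsymbol{\Phi}\mathbf{W}^{(k)}$, where
$\mathbf{W}^{(k)} = \mathrm{diag}\left(|\gamma_{l,1}^{(k)}|^{1-p/2}, \ldots, |\gamma_{l,N_sN_d}^{(k)}|^{1-p/2}\right)$ is the diagonal weighting matrix at the
$k$th iteration, the solution at the $k$th iteration can be derived as
$$\boldsymbol{\Upsilon}_l^{(k+1)} = \mathbf{W}^{(k)}\left(\boldsymbol{\Phi}^{(k)}\right)^H\left[\delta\mathbf{I} + \boldsymbol{\Phi}^{(k)}\left(\boldsymbol{\Phi}^{(k)}\right)^H\right]^{-1}\mathbf{x}_l. \tag{6.25}$$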
It is seen from (6.25) that a matrix inversion is still needed to calculate the
complex amplitude in every iteration, which significantly slows the iterative processing.
Although the adaptive subspace selection [19] can be applied to reduce the dimension
of the complex amplitude in the iterations, the direct inverse calculation still cannot be
avoided. However, it can be shown that the
matrix $\mathbf{T} = \delta\mathbf{I} + \boldsymbol{\Phi}^{(k)}\left(\boldsymbol{\Phi}^{(k)}\right)^H$ is a Hermitian matrix, therefore the matrix inversion can be
calculated recursively based on the block Hermitian matrix property [33,34].
The low-resolution estimate based on the Fourier spectrum is employed as the initial
value of $\boldsymbol{\Upsilon}_l$, i.e., $\boldsymbol{\Upsilon}_l^{(0)} = \boldsymbol{\Phi}^H\mathbf{x}_l$, and then the calculation is executed iteratively
as in (6.25). During the iterations, the prominent components in $\boldsymbol{\Upsilon}_l$ are gradually reinforced,
while the remaining small components are suppressed until they become close to zero.
Finally, when the absolute difference of $\boldsymbol{\Upsilon}_l^{(k)}$ between successive iterations is smaller than the convergence threshold,
the sparse recovery result is obtained. From (6.25), it can be found that when the noise
level is reduced to 0, i.e., $\delta \to 0$, the proposed method degenerates to the FOCUSS
method. Because regularized processing is applied in the iteration, the proposed
method can effectively improve the recovery performance under noise.
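The iteration in (6.25), with the Fourier-type initial value and a relative-change stopping rule, can be sketched in a few lines. The example below is a simplified illustration only: it uses a direct linear solve instead of the recursive block Hermitian inversion of [33,34], a random dictionary instead of space–time steering vectors, and default values for $p$, $\delta$, and the tolerance that are assumptions rather than values from the chapter.

```python
import numpy as np

def regularized_focuss(Phi, x, p=1.0, delta=1e-2, n_iter=50, tol=1e-6):
    """Regularized FOCUSS-type iteration following the structure of (6.25)."""
    ups = Phi.conj().T @ x                          # low-resolution initial estimate
    n_rows = Phi.shape[0]
    for _ in range(n_iter):
        w = np.abs(ups) ** (1.0 - p / 2.0)          # diagonal of the weighting matrix
        Phi_k = Phi * w                             # weighted dictionary Phi W
        T = delta * np.eye(n_rows) + Phi_k @ Phi_k.conj().T
        ups_new = w * (Phi_k.conj().T @ np.linalg.solve(T, x))
        done = np.linalg.norm(ups_new - ups) <= tol * np.linalg.norm(ups_new)
        ups = ups_new
        if done:
            break
    return ups

rng = np.random.default_rng(2)
Phi = rng.standard_normal((20, 50)) + 1j * rng.standard_normal((20, 50))
Phi /= np.linalg.norm(Phi, axis=0)                  # unit-norm columns
truth = np.zeros(50, dtype=complex)
truth[[5, 31]] = [2.0, 1.5j]                        # two active grid points
x = Phi @ truth + 1e-3 * (rng.standard_normal(20) + 1j * rng.standard_normal(20))
ups = regularized_focuss(Phi, x)
```

Setting `delta` close to zero reduces the update to the basic FOCUSS step, which mirrors the limiting behavior described in the text.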
b. Mismatch Calibration Processing

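The mismatch between the space–time overcomplete dictionary and the actual clutter
distribution is ignored in conventional sparsity-based STAP methods, which degrades
the performance of clutter suppression. Therefore, mismatch calibration processing
is investigated in this section. After $\boldsymbol{\Upsilon}_l^{(k)}$ is obtained at the $k$th iteration, the estimate
of $\boldsymbol{\Delta}$ can be obtained by
$$\boldsymbol{\Delta}^{(k)} = \arg\min_{\boldsymbol{\Delta}^{(k)}} J\left(\boldsymbol{\Delta}^{(k)}\right) = \left\|\boldsymbol{\Delta}^{(k)}\right\|_2^2 + \left\|\mathbf{x}_l - \tilde{\boldsymbol{\Phi}}^{(k)}\boldsymbol{\Upsilon}_l^{(k)}\right\|_2^2. \tag{6.26}$$
By defining $\mathbf{e}^{(k)} = \mathbf{x}_l - \boldsymbol{\Phi}\boldsymbol{\Upsilon}_l^{(k)}$ and $\mathbf{y}^{(k)} = \boldsymbol{\Upsilon}_l^{(k)}$, the cost function of (6.26) can be
given as
$$J\left(\boldsymbol{\Delta}^{(k)}\right) = \left\|\boldsymbol{\Delta}^{(k)}\right\|_2^2 + \left\|\mathbf{e}^{(k)} - \boldsymbol{\Delta}^{(k)}\mathbf{y}^{(k)}\right\|_2^2. \tag{6.27}$$
Then the mismatch can be calculated by solving the gradient equation
$$\frac{\partial J\left(\boldsymbol{\Delta}^{(k)}\right)}{\partial\boldsymbol{\Delta}^{(k)}} = \boldsymbol{\Delta}^{(k)}\left(\mathbf{I} + \mathbf{y}^{(k)}\left(\mathbf{y}^{(k)}\right)^H\right) - \mathbf{e}^{(k)}\left(\mathbf{y}^{(k)}\right)^H = 0. \tag{6.28}$$
Therefore the estimate of $\boldsymbol{\Delta}$ can be obtained as
$$\boldsymbol{\Delta}^{(k)} = \mathbf{e}^{(k)}\left(\mathbf{y}^{(k)}\right)^H\left(\mathbf{y}^{(k)}\left(\mathbf{y}^{(k)}\right)^H + \mathbf{I}\right)^{-1}. \tag{6.29}$$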
Afterwards, the space–time overcomplete dictionary is calibrated at the $k$th iteration
by $\tilde{\boldsymbol{\Phi}}^{(k)} = \boldsymbol{\Phi} + \boldsymbol{\Delta}^{(k)}$. In this way, the mismatch of the space–time overcomplete
dictionary is calibrated gradually by minimizing the corresponding cost function,
so that the mismatch between the dictionary and the actual clutter distribution is
reduced effectively. When the convergence condition
$$\frac{\left\|\boldsymbol{\Upsilon}_l^{(k+1)} - \boldsymbol{\Upsilon}_l^{(k)}\right\|}{\left\|\boldsymbol{\Upsilon}_l^{(k+1)}\right\|} \le \xi \tag{6.30}$$
is satisfied, the iteration stops and the sparse recovery is achieved. Then the reconstructed
space–time covariance matrix $\mathbf{R}$ and the adaptive weight vector can be calculated
correspondingly.
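The calibration step can be illustrated with a small numerical sketch. It assumes that the penalty on the mismatch matrix in (6.27) is a squared Frobenius norm, so that the matrix added inside the inverse of (6.29) is the identity; under that assumption the inverse has a closed rank-one form, $(\mathbf{y}\mathbf{y}^H + \mathbf{I})^{-1} = \mathbf{I} - \mathbf{y}\mathbf{y}^H/(1 + \mathbf{y}^H\mathbf{y})$, and no matrix inversion is needed. The function names and the random test data are hypothetical.

```python
import numpy as np

def mismatch_update(Phi, x, ups):
    """Mismatch estimate e y^H (y y^H + I)^{-1}, evaluated in closed form."""
    e = x - Phi @ ups                                # residual of the current fit
    return np.outer(e, ups.conj()) / (1.0 + np.vdot(ups, ups).real)

def has_converged(ups_new, ups_old, xi=1e-3):
    """Relative-change stopping rule in the spirit of (6.30)."""
    return np.linalg.norm(ups_new - ups_old) <= xi * np.linalg.norm(ups_new)

rng = np.random.default_rng(3)
Phi = rng.standard_normal((6, 12)) + 1j * rng.standard_normal((6, 12))
ups = rng.standard_normal(12) + 1j * rng.standard_normal(12)
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
Delta = mismatch_update(Phi, x, ups)
Phi_cal = Phi + Delta                                # calibrated dictionary
```

In a full implementation this update would alternate with the sparse recovery step until the stopping rule is met.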
B. Method Verification
In this section, simulated data and two actual measured airborne phased array radar
data sets (MCARM [34] and one other data set) are used to verify the clutter suppression
performance of the proposed method, which is compared with the SMI, EFA [3], STAP using
CVX [19], and FOCUSS [19] methods.
a. Simulated Data
The simulation parameters are listed in Table 6.1, where $\rho_s$ and $\rho_d$ are set to 4 in the
simulations, and all the results are averaged over 500 Monte Carlo runs.

Table 6.1 Simulation parameters.
Parameter                               Value
Number of spatial elements              8
Number of temporal pulses in a CPI      8
Radar frequency                         450 MHz
Pulse repetition frequency              1200 Hz
Platform velocity                       200 m/s
Height of platform                      12 km
Main beam look direction                Side-looking
Clutter-to-noise ratio (CNR)            40 dB
Target normalized Doppler frequency     0.15
SNR                                     5 dB

The output SINRs versus the number of snapshots for the SMI, EFA, STAP using
CVX, FOCUSS, and proposed methods are shown in Figure 6.7. The sparsity-based
STAP methods obtain desirable SINR performance with small training sample support
and exhibit much faster convergence than conventional STAP methods. Meanwhile, the
proposed method provides better performance than the other sparsity-based methods with
the same number of training samples, because both the calibration of the space–time
overcomplete dictionary and the regularization in the sparse recovery processing are applied.

Figure 6.7 Output SINRs versus number of snapshots (curves: OPT, SMI, EFA, FOCUSS, STAP using CVX, proposed method).
The output SINRs with four training samples are shown in Figure 6.8. The SMI and EFA
methods cannot obtain desirable SINR with such minimal training sample support, owing to
the insufficient estimation of the clutter covariance matrix, so these two methods cannot
provide desirable target detection performance in practical clutter environments. In
contrast, the sparsity-based STAP methods provide much better performance because the
clutter distribution is well estimated, so that the clutter covariance matrix can be
well reconstructed. Moreover, the proposed method obtains better sparse recovery
performance than conventional sparsity-based STAP methods, because of the calibration
of the space–time overcomplete dictionary and the regularization processing. The proposed
method effectively improves the SINR performance, especially in the low Doppler
frequency region where the target is located.

Figure 6.8 Output SINRs of SMI, EFA, STAP using CVX, FOCUSS, and our proposed method (axes: normalized Doppler versus SINR [dB]).
Next, the range detection results with four training samples are
shown in Figure 6.9. In accordance with the results
in Figure 6.8, the conventional SMI and EFA methods cannot suppress the
clutter effectively with small training sample support, which leads to undesirable
target detection performance in practical clutter environments. However, the sparsity-based
STAP methods obtain desirable range detection performance, and moreover
the proposed method produces a larger difference in output power between the
tested range cell and adjacent range cells than the FOCUSS and CVX methods, which is
very useful for target detection in practical complex nonhomogeneous environments.
b. MCARM Data
The MCARM data [34] is used to verify the STAP methods in this section. The array was
an L-band phased array antenna using 22 elements arranged in a 2 × 11 configuration.
The PRF of the radar is 1,984 Hz, 128 pulses are contained in one CPI, the platform
velocity is 100 m/s, and the height of the platform is 3,078 m. Here $\rho_s$ and $\rho_d$
are both set to 6, and the data from 12 pulses and 8 elements of MCARM are used. The target
is located at the 299th range cell with a normalized Doppler frequency of −0.15, and 4 range
cells around the 299th range cell are selected as the training samples.
The output SINRs are shown in Figure 6.10. As in the previous simulated results, the
SMI and EFA methods cannot obtain desirable SINR with minimal training sample
support. Moreover, the proposed method obtains better SINR performance in both the
main-lobe and side-lobe regions than conventional sparsity-based STAP methods.
The range detection results are shown in Figure 6.11.
As in the previous simulated results, the SMI and EFA methods cannot
detect the target effectively, while the sparsity-based STAP methods provide desirable
detection. The proposed method also obtains a larger difference in output
power between the tested range cell and adjacent range cells than the FOCUSS method
and STAP using CVX, so that the target can be detected correctly.

Figure 6.9 Range detections of SMI, EFA, STAP using CVX, FOCUSS, and our proposed method (axes: range cells versus output power [dB]).
Figure 6.10 Output SINRs of SMI, STAP using CVX, FOCUSS, and our proposed method based on MCARM data (axes: normalized Doppler versus SINR [dB]).
Figure 6.11 … method based on MCARM data (axes: range cells versus output power [dB]).
c. Actual Measured Airborne Radar Data

The actual measured airborne radar data is also used to compare the proposed method
with the SMI, EFA, CVX, and FOCUSS methods in this section. The
measured data consists of 16 spatial channels and 128 temporal pulses
in a CPI. As before, $\rho_s$ and $\rho_d$ are both set to 6, and 100 range snapshots of the first 8
channels and the first 12 pulses are used. A strong target is located in the 231st range
cell with a normalized Doppler of about 0.07, and 4 range cells around the 231st
range cell are selected as the training samples.
The output SINRs are shown in Figure 6.12. As in
the previous results based on simulated data and MCARM data,
the SMI and EFA methods cannot obtain desirable SINR, while the sparsity-based
STAP methods provide better performance with limited training samples, and the
proposed method obtains better performance, especially in the main-lobe region.
The range detection results are shown in Figure 6.13. Again as in the previous
results based on simulated data and MCARM data, the SMI and EFA
methods cannot detect the target effectively, while the sparsity-based STAP methods
provide desirable detection, and the proposed method obtains a greater disparity in
output power between the tested range cell and adjacent range cells than the FOCUSS and
CVX methods.

Figure 6.12 Output SINRs of SMI, STAP using CVX, FOCUSS, and our proposed method based on actual measured airborne radar data (axes: normalized Doppler versus output SINR [dB]).
Figure 6.13 … method based on actual measured airborne radar data (axes: range cells versus output power [dB]).
6.4.2 Fast PAST Methods with Sparse Constraint for STAP

A. Method Formulation
It is well known that the clutter covariance matrix has a low-rank property, which means
that fewer DoFs are required than those offered by the system; thus a high degree of
sparsity can be exploited in the adaptive weight vector. Based on this property, several
sparsity-constrained STAP methods have been proposed [23,24]. In these approaches,
by imposing a sparse constraint on the conventional STAP cost function, the adaptive
weight vector is altered to yield a performance improvement compared with STAP
methods that do not apply a sparse constraint. However, the $L_1$-norm RLS and $L_1$-norm SMI
methods do not fully utilize the low-rank property, which means that extra training
samples are needed to perform STAP [23]. On the other hand, the $L_1$-norm conventional
conjugate gradient ($L_1$-norm CCG) method requires multiple iterations for each input
data and thus leads to increased computational complexity [24].
In order to improve the clutter suppression capability with minimal training sample
support, by further exploiting the low-rank property of the clutter covariance matrix,
a fast STAP method based on PAST with a sparse constraint is given [25]. In the proposed
method, the sparse constraint is imposed on the cost function of PAST, and
the adaptive weight vector is then derived iteratively by using the RLS method and the
matrix inversion lemma to update the autocorrelation matrix and cross-correlation terms.
Because of the sparse constraint in PAST, the proposed method provides a more robust
estimate of the clutter subspace. Therefore the clutter suppression performance can
be significantly improved when only a small number of training samples is used. Through
simulated results and two sets of actual airborne phased array radar data, it is verified
that the proposed method achieves nearly the same performance at lower computational
complexity compared with existing sparsity-constrained STAP methods, and provides a
performance improvement compared with conventional STAP methods without a sparse
constraint.
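According to the eigencanceller [9,13], the corresponding weight vector can be
obtained by
$$\mathbf{w}_{EVD} = \frac{\left(\mathbf{I} - \hat{\mathbf{U}}_c\hat{\mathbf{U}}_c^H\right)\mathbf{v}_t(\omega_t, \vartheta_t)}{\mathbf{v}_t^H(\omega_t, \vartheta_t)\left(\mathbf{I} - \hat{\mathbf{U}}_c\hat{\mathbf{U}}_c^H\right)\mathbf{v}_t(\omega_t, \vartheta_t)}. \tag{6.31}$$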
By reducing the required number of sample support from $2NK$ to $2P$, the EVD-based
method can improve the convergence of clutter suppression. However, because
of the direct EVD processing, the computational complexity of the EVD-based method
remains $O((NM)^3)$, which is impractical for real-time processing. To remedy this problem,
the PAST technique is employed to reduce the computational complexity of clutter
subspace acquisition [35,36]. Owing to the approximation in the iteration, the PAST
method may suffer performance loss compared with the EVD method, especially when
the environment is nonhomogeneous. In order to improve the clutter suppression performance
when only small training sample support is available, we propose the fast
STAP method based on PAST with sparse constraint in this section.
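For reference, the eigencanceller weight of (6.31) can be written directly with an eigendecomposition. The sketch below is a minimal illustration on a synthetic covariance matrix; the function name, the toy rank-3 clutter model, and the steering vector are assumptions made for the example.

```python
import numpy as np

def eigencanceller_weights(R_hat, v_t, rank):
    """Weight vector of the form (6.31) from the `rank` principal eigenvectors."""
    _, eigvecs = np.linalg.eigh(R_hat)                 # eigenvalues in ascending order
    U_c = eigvecs[:, -rank:]                           # estimated clutter subspace
    proj = np.eye(R_hat.shape[0]) - U_c @ U_c.conj().T # projector onto its complement
    num = proj @ v_t
    return num / (v_t.conj() @ num)                    # normalize so that w^H v_t = 1

rng = np.random.default_rng(4)
U = np.linalg.qr(rng.standard_normal((12, 3)) + 1j * rng.standard_normal((12, 3)))[0]
R_hat = 100.0 * U @ U.conj().T + np.eye(12)            # rank-3 clutter plus unit noise
v_t = np.exp(2j * np.pi * 0.15 * np.arange(12))
w_evd = eigencanceller_weights(R_hat, v_t, rank=3)
```

The full eigendecomposition in the first line of the function is the cubic-cost step that the subspace tracking approach discussed next is meant to avoid.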
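The PAST method [35] is derived by minimizing the mean square error between the
space–time received data and its projection onto the clutter subspace, which is denoted by
$$\begin{aligned} J_2(\hat{\mathbf{W}}_c) &= E\left[\left\|\mathbf{x}_l - \hat{\mathbf{W}}_c\hat{\mathbf{W}}_c^H\mathbf{x}_l\right\|_2^2\right] \\ &= \mathrm{tr}\left(\hat{\mathbf{R}}\right) - 2\,\mathrm{tr}\left(\hat{\mathbf{W}}_c^H\hat{\mathbf{R}}\hat{\mathbf{W}}_c\right) + \mathrm{tr}\left(\hat{\mathbf{W}}_c^H\hat{\mathbf{R}}\hat{\mathbf{W}}_c\cdot\hat{\mathbf{W}}_c^H\hat{\mathbf{W}}_c\right), \end{aligned} \tag{6.32}$$
where the $NM \times P$ matrix $\hat{\mathbf{W}}_c$ spans the same clutter subspace as $\mathbf{U}_c$, and $\mathrm{tr}(\hat{\mathbf{R}})$ denotes the
trace of the covariance matrix $\hat{\mathbf{R}}$. The cost function $J_2(\hat{\mathbf{W}}_c)$ attains its global minimum
only when the columns of $\hat{\mathbf{W}}_c$ span the principal eigenvectors of $\hat{\mathbf{R}}$; all other stationary
points of $J_2(\hat{\mathbf{W}}_c)$ are saddle points. Therefore, similar to the $L_1$-norm SMI method, by
imposing the sparse constraint on $J_2(\hat{\mathbf{W}}_c)$, the cost function of the proposed method is
described as
$$\begin{aligned} J_3(\hat{\mathbf{W}}_c) &= E\left[\left\|\mathbf{x}_l - \hat{\mathbf{W}}_c\hat{\mathbf{W}}_c^H\mathbf{x}_l\right\|_2^2\right] + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c\right) \\ &= \mathrm{tr}\left(\hat{\mathbf{R}}\right) - 2\,\mathrm{tr}\left(\hat{\mathbf{W}}_c^H\hat{\mathbf{R}}\hat{\mathbf{W}}_c\right) \\ &\quad + \mathrm{tr}\left(\hat{\mathbf{W}}_c^H\hat{\mathbf{R}}\hat{\mathbf{W}}_c\cdot\hat{\mathbf{W}}_c^H\hat{\mathbf{W}}_c\right) + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c\right), \end{aligned} \tag{6.33}$$
where $\Theta(\cdot)$ denotes the sparsity term introduced in (6.16).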
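The cost function in (6.33) can be represented in the following exponentially weighted
form
$$\begin{aligned} J_3(\hat{\mathbf{W}}_c(k)) &= \sum_{i=1}^{k} \rho^{k-i}\left\|\mathbf{x}_i - \hat{\mathbf{W}}_c(k)\hat{\mathbf{W}}_c^H(k)\mathbf{x}_i\right\|_2^2 + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c(k)\right) \\ &= \mathrm{tr}\left(\hat{\mathbf{R}}(k)\right) - 2\,\mathrm{tr}\left(\hat{\mathbf{W}}_c^H(k)\hat{\mathbf{R}}(k)\hat{\mathbf{W}}_c(k)\right) \\ &\quad + \mathrm{tr}\left(\hat{\mathbf{W}}_c^H(k)\hat{\mathbf{R}}(k)\hat{\mathbf{W}}_c(k)\cdot\hat{\mathbf{W}}_c^H(k)\hat{\mathbf{W}}_c(k)\right) + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c(k)\right), \end{aligned} \tag{6.34}$$
where $\rho$ is a forgetting factor that ensures that data in the distant past are downweighted,
which provides a trade-off between the tracking capability and the estimation error
when the system operates in a nonstationary environment. In addition [11,37],
$\hat{\mathbf{R}}(k) = \sum_{i=1}^{k}\rho^{k-i}\mathbf{x}_i\mathbf{x}_i^H = \rho\hat{\mathbf{R}}(k-1) + \mathbf{x}_k\mathbf{x}_k^H$ denotes the exponentially weighted covariance
matrix. As $J_3(\hat{\mathbf{W}}_c(k))$ is a fourth-order function of the elements of $\hat{\mathbf{W}}_c(k)$, iterative
processing is necessary to minimize it. The core idea of the PAST method
is to employ $\hat{\mathbf{W}}_c^H(i-1)\mathbf{x}_i$ to approximate the projection $\hat{\mathbf{W}}_c^H(k)\mathbf{x}_i$ for $1 \le i \le k$.
Therefore, the cost function in (6.34) can be rewritten as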

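$$\begin{aligned} J_4(\hat{\mathbf{W}}_c(k)) &= \sum_{i=1}^{k} \rho^{k-i}\left\|\mathbf{x}_i - \hat{\mathbf{W}}_c(k)\hat{\mathbf{W}}_c^H(i-1)\mathbf{x}_i\right\|_2^2 + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c(k)\right) \\ &= \sum_{i=1}^{k} \rho^{k-i}\left\|\mathbf{x}_i - \hat{\mathbf{W}}_c(k)\mathbf{y}_i\right\|_2^2 + 2\kappa\,\Theta\left(\hat{\mathbf{W}}_c(k)\right), \end{aligned} \tag{6.35}$$
where $\mathbf{y}_i = \hat{\mathbf{W}}_c^H(i-1)\mathbf{x}_i$. The projection approximation changes the error performance
surface of $J_3(\hat{\mathbf{W}}_c)$. For stationary or slowly time-varying signals, the difference between
$\hat{\mathbf{W}}_c^H(i-1)\mathbf{x}_i$ and $\hat{\mathbf{W}}_c^H(k)\mathbf{x}_i$ is small, and as the number of iterations increases, the
effect of the past input data gradually becomes insignificant. As a result, $J_4(\hat{\mathbf{W}}_c(k))$
can effectively approximate $J_3(\hat{\mathbf{W}}_c)$ and a desirable estimate of the clutter subspace
can be obtained by minimizing $J_4(\hat{\mathbf{W}}_c(k))$.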
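Note that $J_4(\hat{\mathbf{W}}_c(k))$ is a second-order function of the estimated clutter subspace
$\hat{\mathbf{W}}_c(k)$. Therefore, based on the approximation in [23], the sub-gradient of the sparse
constraint is given as $\nabla\Theta(\hat{\mathbf{W}}_c(k)) = \mathrm{sign}(\hat{\mathbf{W}}_c(k))$, and the sub-gradient of (6.35) with respect to $\hat{\mathbf{W}}_c(k)$ is
given as
$$\nabla J_{\hat{\mathbf{W}}_c(k)} = -\mathbf{r}_{xy}(k) + \hat{\mathbf{W}}_c(k)\mathbf{R}_{yy}(k) + \kappa\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k)\right), \tag{6.36}$$
where $\mathrm{sign}(\cdot)$ is a component-wise sign function, which is defined as [22,23]
$$\mathrm{sign}(x) = \begin{cases} x/|x| & \text{for } x \ne 0 \\ 0 & \text{for } x = 0, \end{cases}$$
and $\mathbf{r}_{xy}(k)$ and $\mathbf{R}_{yy}(k)$ are given as
$$\begin{aligned} \mathbf{r}_{xy}(k) &= \sum_{i=1}^{k}\rho^{k-i}\mathbf{x}_i\mathbf{y}_i^H = \rho\,\mathbf{r}_{xy}(k-1) + \mathbf{x}_k\mathbf{y}_k^H \\ \mathbf{R}_{yy}(k) &= \sum_{i=1}^{k}\rho^{k-i}\mathbf{y}_i\mathbf{y}_i^H = \rho\,\mathbf{R}_{yy}(k-1) + \mathbf{y}_k\mathbf{y}_k^H. \end{aligned} \tag{6.37}$$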
By equating the above gradient terms to zero, the clutter subspace can be estimated as
$$\hat{\mathbf{W}}_c(k) = \left(\mathbf{r}_{xy}(k) - \kappa\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k)\right)\right)\mathbf{R}_{yy}^{-1}(k). \tag{6.38}$$
Using (6.38) directly is computationally expensive, but the computational
complexity can be effectively reduced by employing the RLS method. Let
$$\mathbf{T}(k) = \mathbf{r}_{xy}(k) - \kappa\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k)\right). \tag{6.39}$$
Then, substituting (6.37) into the above equation, $\mathbf{T}(k)$ can be described by the
following recursive equation
$$\mathbf{T}(k) = \rho\mathbf{T}(k-1) + \mathbf{x}_k\mathbf{y}_k^H - \left[\kappa\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k)\right) - \rho\kappa\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k-1)\right)\right]. \tag{6.40}$$
Because the instantaneous error of the weight vector changes slowly in each time step,
the sign of the weights does not change rapidly. As such, $\mathbf{T}(k)$ can be approximated by
$$\mathbf{T}(k) \approx \rho\mathbf{T}(k-1) + \mathbf{x}_k\mathbf{y}_k^H + \kappa(\rho - 1)\,\mathrm{sign}\left(\hat{\mathbf{W}}_c(k-1)\right). \tag{6.41}$$
Therefore, the clutter subspace can be recursively estimated as
$$\hat{\mathbf{W}}_c(k) = \mathbf{T}(k)\mathbf{R}_{yy}^{-1}(k). \tag{6.42}$$
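Then by defining $\mathbf{P}(k) = \mathbf{R}_{yy}^{-1}(k)$ and using the matrix inversion lemma, we can get
$$\mathbf{P}(k) = \frac{1}{\rho}\left[\mathbf{P}(k-1) - \frac{\mathbf{P}(k-1)\mathbf{y}_k\mathbf{y}_k^H\mathbf{P}(k-1)}{\rho + \mathbf{y}_k^H\mathbf{P}(k-1)\mathbf{y}_k}\right], \tag{6.43}$$
which can be rewritten as
$$\mathbf{P}(k) = \rho^{-1}\left[\mathbf{P}(k-1) - \mathbf{g}(k)\mathbf{h}^H(k)\right], \tag{6.44}$$
where $\mathbf{h}(k) = \mathbf{P}(k-1)\mathbf{y}_k$ and $\mathbf{g}(k) = \dfrac{\mathbf{h}(k)}{\rho + \mathbf{y}_k^H\mathbf{h}(k)}$. By substituting (6.41) and (6.44) into
(6.42), the iterative relationship between $\hat{\mathbf{W}}_c(k)$ and $\hat{\mathbf{W}}_c(k-1)$ can be obtained as
$$\begin{aligned} \hat{\mathbf{W}}_c(k) &= \hat{\mathbf{W}}_c(k-1) + \left[\mathbf{x}_k - \hat{\mathbf{W}}_c(k-1)\mathbf{y}_k\right]\mathbf{g}^H(k) \\ &\quad + \kappa\left(1 - \frac{1}{\rho}\right)\mathrm{sign}\left(\hat{\mathbf{W}}_c(k-1)\right)\left[\mathbf{P}(k-1) - \mathbf{g}(k)\mathbf{h}^H(k)\right]. \end{aligned} \tag{6.45}$$
After estimating the clutter subspace $\hat{\mathbf{W}}_c(k)$ with the proposed method, the space–time
adaptive weight vector can be obtained as
$$\mathbf{w}_{SC\text{-}PAST} = \frac{\left(\mathbf{I} - \hat{\mathbf{W}}_c\hat{\mathbf{W}}_c^H\right)\mathbf{v}_t(\omega_t, \vartheta_t)}{\mathbf{v}_t^H(\omega_t, \vartheta_t)\left(\mathbf{I} - \hat{\mathbf{W}}_c\hat{\mathbf{W}}_c^H\right)\mathbf{v}_t(\omega_t, \vartheta_t)}. \tag{6.46}$$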
Compared with the solution of the conventional PAST method [35], there is an additional
term in (6.38) because of the sparse constraint. It has been proved in [38] that
the mean squared error (MSE) of the $L_1$-norm constrained RLS method is lower than that
of RLS, which means that the $L_1$-norm constrained RLS method offers a significant
performance improvement over the conventional RLS method. Since the RLS method is
employed in the proposed method for estimating the clutter subspace, a more
robust estimate of the clutter subspace can be obtained because of the sparse
constraint.
The following three observations are in order. First, the initial value of $\mathbf{P}(k)$
must be a Hermitian positive-definite matrix, and the initial value of $\hat{\mathbf{W}}_c(k)$ should
be composed of $P$ orthogonal vectors. Therefore, without loss of generality, $\mathbf{P}(0)$ is set
to the $P \times P$ identity matrix and the columns of $\hat{\mathbf{W}}_c(0)$ are set to the $P$ leading unit
vectors of the $NM \times NM$ identity matrix. Second, in order to maintain the Hermitian
symmetry, an operation denoted by the operator Tri{·} is employed in the processing,
which indicates that only the upper (or lower) triangular part of $\mathbf{P}(k)$ is retained, and its
Hermitian transposed version is copied to the opposite triangular part. Third, as PAST
may not converge to an orthonormal basis [35], an operation denoted by the operator
Orth{·} is employed in each iteration to guarantee the orthogonality between the vectors
in $\hat{\mathbf{W}}_c(k)$.
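The recursion in (6.44) and (6.45), together with the three observations above and the weight vector in (6.46), can be put together in a short sketch. This is an illustrative implementation under stated assumptions, not the authors' code: Orth{·} is realized here by a QR factorization with a sign correction, Tri{·} by mirroring the strict upper triangle and keeping a real diagonal, and the synthetic rank-3 clutter data, the function names, and the parameter values are all hypothetical.

```python
import numpy as np

def complex_sign(W):
    """Component-wise sign: x/|x| for nonzero entries, 0 otherwise."""
    mag = np.abs(W)
    out = np.zeros_like(W)
    np.divide(W, mag, out=out, where=mag > 0)
    return out

def l1_past(snapshots, rank, kappa=0.0, rho=0.97):
    """Sparse-constrained PAST recursion following (6.44) and (6.45)."""
    nm = snapshots.shape[0]
    W = np.eye(nm, rank, dtype=complex)               # leading unit vectors
    P = np.eye(rank, dtype=complex)                   # P(0) = I
    for x in snapshots.T:
        y = W.conj().T @ x
        h = P @ y
        g = h / (rho + np.vdot(y, h).real)
        P_br = P - np.outer(g, h.conj())              # bracket term of (6.44)
        W = (W + np.outer(x - W @ y, g.conj())
             + kappa * (1.0 - 1.0 / rho) * complex_sign(W) @ P_br)
        P = P_br / rho
        upper = np.triu(P, 1)                         # Tri{.}: enforce Hermitian symmetry
        P = upper + upper.conj().T + np.diag(np.diag(P).real)
        Q, R = np.linalg.qr(W)                        # Orth{.}: assumed QR-based
        d = np.diag(R)
        W = Q * (d / np.abs(d))                       # keep column orientation stable
    return W

def subspace_weights(W, v_t):
    """Weight vector of the form (6.46)."""
    num = v_t - W @ (W.conj().T @ v_t)
    return num / np.vdot(v_t, num)

rng = np.random.default_rng(5)
nm, rank, n_snap = 16, 3, 300
U = np.linalg.qr(rng.standard_normal((nm, rank)) + 1j * rng.standard_normal((nm, rank)))[0]
clutter = U @ (10.0 * (rng.standard_normal((rank, n_snap)) + 1j * rng.standard_normal((rank, n_snap))))
noise = 0.1 * (rng.standard_normal((nm, n_snap)) + 1j * rng.standard_normal((nm, n_snap)))
W = l1_past(clutter + noise, rank, kappa=0.0)         # kappa = 0 gives plain PAST
v_t = np.exp(2j * np.pi * 0.2 * np.arange(nm))
w = subspace_weights(W, v_t)
```

Each update touches only $NM \times P$ and $P \times P$ quantities, which is where the saving relative to a full eigendecomposition comes from; the appropriate value of `kappa` depends on the scaling of the data and has to be chosen accordingly.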
B. Method Verification
In this section, simulated data and two actual measured airborne phased array radar
data sets (MCARM [34] and one other data set) are used to verify the clutter suppression
performance of the proposed $L_1$-norm PAST method. The performance is compared
with that of the SMI, EVD, PAST, $L_1$-norm CCG [23,24], $L_1$-norm modified
conjugate gradient ($L_1$-norm MCG) [23,24], and shrinkage operator STAP [11]
methods.
a. Simulated Data
The parameters of the simulated data are listed in Table 6.2. In the simulations, the
SMI method is implemented with a loading factor of 10 dB above the noise power, the
forgetting factor $\rho$ is set to 0.97, and zero vectors are used to initialize the adaptive
weight vector in the $L_1$-norm CCG and $L_1$-norm MCG methods. A scaled $NM \times NM$
identity matrix $\varepsilon\mathbf{I}$ is used as the initialization matrix for the PAST, $L_1$-norm PAST,
$L_1$-norm CCG, and $L_1$-norm MCG methods, where $\varepsilon = 1 \times 10^{-3}$.
All results are averaged over 500 Monte Carlo runs.

Table 6.2 Simulation parameters.
Parameter                               Value
Number of spatial elements              8
Number of temporal pulses in a CPI      8
Radar frequency                         450 MHz
Pulse repetition frequency              1200 Hz
Platform velocity                       200 m/s
Height of platform                      12 km
Channel spacing                         λ/2
Main beam look direction                Side-looking
Target normalized Doppler frequency     0.2
Clutter-to-noise ratio (CNR)            40 dB

Figure 6.14 IF performance of EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm MCG, and shrinkage operator STAP methods versus the number of training samples (two panels; axes: number of samples versus improvement factor [dB]).
The improvement factor (IF) performance of the proposed method with respect to
the number of training samples is compared with that of the EVD, PAST,
$L_1$-norm CCG, $L_1$-norm MCG, and shrinkage operator STAP methods. It is
assumed that the selected rank is 15, and the value of the sparse constraint parameter is set to
$\kappa = 1000$. The results are shown in Figure 6.14. It is clear that the SMI method has
the worst convergence performance, whereas the proposed $L_1$-norm PAST converges
with the lowest number of training samples and outperforms the conventional PAST
method. On the other hand, the EVD, $L_1$-norm CCG, and shrinkage operator STAP
methods achieve comparable performance. In addition, compared with the $L_1$-norm
PAST, $L_1$-norm CCG, and shrinkage operator STAP methods, the $L_1$-norm
MCG method suffers from significant performance loss when the training sample
support is small.
The IF performance of different methods with respect to the target Doppler frequency
using small training sample support is shown in Figure 6.15. It is assumed that the
selected rank is 15, the number of training samples is 40, and the sparse constraint
parameter is $\kappa = 1000$. Similar to Figure 6.14, the SMI method yields the worst IF
performance, while the EVD, $L_1$-norm CCG, and shrinkage operator STAP methods
perform the best. The proposed $L_1$-norm PAST obtains better IF performance than the
conventional PAST method. On the other hand, the $L_1$-norm MCG method suffers from
some performance loss in this small training sample support case when
compared with the $L_1$-norm PAST and $L_1$-norm CCG methods.

Figure 6.15 IF performance of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm MCG, and shrinkage operator STAP methods versus the target Doppler frequency (two panels; axes: normalized Doppler versus improvement factor [dB]).
Figure 6.16 IF performance of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm MCG, and shrinkage operator STAP methods based on 200 training samples of MCARM data (two panels).
b. MCARM Data
In the verifications, the data from 12 pulses and 8 elements of MCARM are used, and the
target is located at the 299th range cell with a normalized Doppler frequency of −0.15.
The rank is selected as 19, the SMI method is implemented with a loading factor of 10
dB above the noise power, the value of the sparse constraint parameter is chosen to be
$\kappa = 1000$, and the initialization matrix for the PAST, $L_1$-norm PAST, $L_1$-norm CCG,
and $L_1$-norm MCG methods is chosen to be $10^{-3}\mathbf{I}$.
First, 201 training samples from range cells number 200 to number 400 are used,
and the IF results and range detection are shown in Figures 6.16 and 6.17. It is
seen that, when a sufficient number of training samples is available, all the STAP
methods achieve acceptable IF performance and the target can be detected correctly.
Next, we use only 21 training samples from range cells number 260 to 280, and the
resulting IF and range detection results are depicted in Figures 6.18 and 6.19. It is
evident that, as the number of training samples becomes smaller, the SMI and L1-norm
MCG methods no longer perform properly, whereas the EVD, PAST, L1-norm
PAST, L1-norm CCG, and shrinkage operator STAP methods still provide satisfactory
performance.
Figure 6.17 Range detection of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG,
L1-norm MCG, and shrinkage operator STAP methods based on 200 training samples of
MCARM data.
Figure 6.18 IF performances of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG,
L1-norm MCG, and shrinkage operator STAP methods based on 20 training samples of
MCARM data.
c. Actual Measured Airborne Radar Data

The actual measured airborne radar data are collected using 16 spatial channels and
consist of a CPI with 128 temporal pulses. We use the first 8 channels and the first 16
pulses for processing. A strong target is located in the 231st range cell with a normalized
Doppler frequency of about 0.07, and the clutter rank is selected as 18. In addition, the
SMI method is implemented with a loading factor of 10 dB above the noise power, the
value of the sparse constraint parameter is selected as κ = 1000, and 10^{-3} I is used
as the initialization matrix for the PAST, L1-norm PAST, L1-norm CCG, and L1-norm
MCG methods.
Figure 6.19 Range detection of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm
MCG, and shrinkage operator STAP methods based on 20 training samples of MCARM data.
Figure 6.20 IF performances of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm
MCG, and shrinkage operator STAP methods based on 300 training samples of actual measured
airborne radar data.
We first use 301 training samples from range cells number 120 to 400, and the
resulting IF and range detection results are shown in Figures 6.20 and 6.21. Similar
to the results based on the MCARM data, since the number of training samples is sufficient
in this case, all the STAP methods achieve acceptable IF performance and the target can
be detected correctly. We then reduce the number of training samples to 21, collected
between range cells number 120 and 140, and the corresponding IF and range detection
results are shown in Figures 6.22 and 6.23. Again, similar to the MCARM data case,
since the number of training samples is insufficient in this case, the SMI and the L1-norm
MCG methods do not offer satisfactory performance, whereas the EVD, PAST,
L1-norm PAST, L1-norm CCG, and shrinkage operator STAP methods still function
well.
[Figure 6.21: plot of output power [dB] versus range cells.]
Figure 6.22 IF performances of SMI, EVD, PAST, L1-norm PAST, L1-norm CCG, L1-norm ...
[Figure 6.23: plot of output power [dB] versus range cells.]
6.5 Conclusions
In this chapter, fast and robust sparsity-based STAP methods for practical environments
have been developed by exploiting the intrinsic sparsity of the clutter spatial-temporal
power spectrum and the space–time adaptive weight vectors. First, the signal model of
the received space–time data for airborne phased array radar is introduced, and the intrinsic
sparsity of STAP is analyzed according to the clutter spatial-temporal power spectrum
and the space–time adaptive weight vectors. Second, based on the sparsity of the clutter
spatial-temporal power spectrum, a robust and fast iterative sparse recovery method for
STAP is introduced, which not only alleviates the effect of noise and dictionary
mismatch but also reduces computational complexity through recursive inverse matrix
calculation. Third, based on the sparsity of the space–time adaptive weight vectors, a
fast STAP method based on PAST with a sparse constraint is introduced, which provides
a more robust and stable estimate of the clutter subspace when only a small set of
training samples is available. Based on the simulated and the actual airborne phased
array radar data, it is verified that the proposed methods can provide better performance
with minimal training sample support in practical complex nonhomogeneous
environments.
References
[1] J. R. Guerci, Space-Time Adaptive Processing for Radar. Artech House, 2003.
[2] J. Ward, “Space-time adaptive processing for airborne radar,” Technical Report 1015, MIT
Lincoln Laboratory, Dec. 1994.
[3] W. Zhang, Z. He, and J. Li, “A method for finding best channels in beam-space post-Doppler
reduced-dimension STAP,” IEEE Trans. Aerosp. Electron. Syst., vol. 50, pp. 254–264, 2013.
[4] W. L. Melvin, “Space-time adaptive radar performance in heterogeneous clutter,” IEEE
Trans. on Aerospace and Electronic Systems, vol. 36, pp. 621–633, 2000.
[5] Y. L. Wang, Y. N. Peng, and Z. Bao, “Space-time adaptive processing for airborne radar with
various array orientation,” IET Radar, Sonar Navigation, vol. 144, pp. 330–340, 1997.
[6] W. Zhang, Z. He, and J. Li, “A method for finding best channels in beam-space post-Doppler
reduced-dimension STAP,” IEEE Trans. on Aerospace and Electronic Systems, vol. 50, pp.
254–264, 2013.
[7] H. Wang and L. Cai, “On adaptive spatial-temporal processing for airborne surveillance
radar systems,” IEEE Trans. on Aerospace and Electronic Systems, vol. 30, pp. 660–670,
1994.
[8] A. K. Shackelford, K. Gerlach, B. D. Blunt, “Partially adaptive STAP using the FRACTA
algorithm,” IEEE Trans. on Aerospace and Electronic Systems, vol. 45, pp. 58–69, 2009.
[9] T. Long, Y. Liu, X. Yang, and Y. Sun, “Improved eigenanalysis canceler based on data-
independent clutter subspace estimation for space-time adaptive processing,” Science China:
Information Sciences, vol. 56, no. 10, pp. 1–10, 2013.
[10] R. Fa and R. C. de Lamare, “Reduced-rank STAP algorithms using joint iterative optimiza-
tion of filters,” IEEE Trans. on Aerospace and Electronic Systems, vol. 47, pp. 1668–1684,
2011.
[11] S. Sen, “Low-rank matrix decomposition and spatio-temporal sparse recovery for STAP
radar,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 8, pp. 1510–1523,
2015.
[12] Y. L. Wang, W. J. Liu, and W. C. Xie, “Reduced-rank space-time adaptive detector for
airborne radar,” Science China: Information Sciences, vol. 57, no. 8, pp. 1–11, 2014.
[13] W. J. Liu, W. C. Xie, R. F. Li, Z. T. Wang, and Y. L. Wang, “Adaptive detectors in the Krylov
subspace,” Science China: Information Sciences, vol. 57, no. 10, pp. 1–11, 2014.
[14] P. Parker, A. L. Swindlehurst, “Space-time autoregressive filtering for matched subspace
STAP,” IEEE Trans. on Aerospace and Electronic Systems, vol. 39, no. 2, pp. 510–520,
2003.
Trans. on Signal Processing, vol. 57, no. 6, pp. 2275–2284, 2009.
[16] K. R. Varshney, M. Cetin, J. W. Fisher, and A. S. Willsky, “Sparse representation in
structured dictionaries with application to synthetic apertureradar,” IEEE Trans. on Signal
[17] S. Maria and J. J. Fuchs, “Application of the global matched filter toSTAP data an efficient
algorithmic approach,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. Toulouse,
France, May 2006, pp. 14–19.
[18] I. W. Selesnick, S. U. Pillai, K. Y. Li, and B. Himed, “Angle-Doppler processing using sparse
regularization,” in Proc. IEEE ICASSP, Dallas, TX, Mar. 2010, pp. 2750–2753.
[19] K. Sun, H. Zhang, G. Li, H. Meng, and X. Wang, “A novel STAP algorithm using sparse
recovery technique,” in Proc. IGARSS, Cape Town, South Africa, July 2009, pp. 336–339.
[20] Z. Yang, X. Li, H. Wang, and L. Nie, “Sparsity-based space-time adaptive processing using
complex-valued homotopy technique for airborne radar,” IET Signal Processing, vol. 8, no.
5, pp. 552–564, 2014.
[21] X. Yang, Y. Sun, T. Zeng, and T. Long, “Robust and fast iterative sparse recovery method
for space-time adaptive processing,” Science China: Information Sciences, vol. 59, no. 6,
pp. 1–13, 2016.
[22] Q. Wu, Y. D. Zhang, M. G. Amin, and B. Himed, “Space-time adaptive processing and
motion parameter estimation in multi-static passive radar exploiting Bayesian compressive
sensing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 2, pp. 944–
957, 2016.
[23] Z. Yang, R. C. de Lamare, and X. Li, “L1-regularized STAP algorithms with a generalized
sidelobe canceller architecture for airborne radar,” IEEE Trans. on Signal Processing, vol.
60, no. 2, pp. 674–686, 2012.
[24] Z. Yang, R. C. de Lamare, and X. Li, “Sparsity-aware space-time adaptive processing
algorithms with L1-norm regularisation for airborne radar,” IET Signal Processing, vol. 6,
no. 5, pp. 413–423, 2012.
[25] X. Yang, Y. Sun, T. Zeng, and T. Long, “Fast STAP method based on PAST with sparse
constraint for airborne phased array radar,” IEEE Trans. On Signal Processing, vol. 64, no.
17, pp. 4550–4561, 2016.
[26] A. Haimovich, “The eigencanceller: Adaptive radar by eigenanalysis methods,” IEEE Trans.
on Aerospace and Electronic Systems, vol. 32, no. 2, pp. 532–542, 1996.
[27] M. E. Tipping, “Sparse Bayesian shrinkage and selection learning andthe relevance vector
machine,” Journal of Machine Learning Research, vol. 1, no. 9, pp. 211–244, 2001.
[28] D. L. Donoho, M. Elad, V. N. Temlyakov, “Stable recovery of sparse overcomplete

representations in the presence of noise,” IEEE Trans. on Information Theory, vol. 5, no.
1, pp. 6–18, 2006.
[29] Q. Q. Jia, R. B. Wu, “Space time adaptive parameter estimation of moving target based on
compressed sensing,” Journal of Electronics and Information Technology, vol. 35, no. 11,
pp. 2714–2720, 2013.
[30] J. Jin, Y. Gu, and S. Mei, “A stochastic gradient approach on compressive sensing signal
reconstruction based on adaptive filtering framework,” IEEE Journal of Selected Topics in
Signal Processing, vol. 4, no. 2, pp. 409–420, 2010.
[31] E. M. Eksioglu, “RLS adaptive filtering with sparsity regularization,” in Proc. 10th Int. Conf.
Inf. Sci., Signal Process. Appl., 2010, pp. 550–553.
[32] X. Yang, Y. Liu, T. Long, “Pulse-order recursive method for inverse covariance matrix com-
putation applied to space-time adaptive processing,” Science China: Information Sciences,
vol. 56, no. 4, pp. 1–12, 2013.
[33] X. Yang, Y. Sun, Y. Liu, T. Zeng, T. Long, “Fast inverse covariance matrix computation
based on element-order recursive method for space-time adaptive processing,” Science
China: Information Sciences, vol. 58, no. 2, pp. 1–14, 2015.
[34] B. N. S. Babu, J. A. Torres, and W. L. Melvin, “Processing and evaluation of multichannel
airborne radar measurements (MCARM) measured data,” in Proc. IEEE Int. Symp. Phased
Array Systems and Tech., Boston, MA, Oct. 1996, pp. 395–399.
[35] R. Badeau, B. David, and G. Richard, “Fast approximated power iteration subspace
tracking,” IEEE Trans. on Signal Processing., vol. 53, no. 8, pp. 2931–2941, 2008.
[36] X. Yang, Y. Liu, Y. Sun, and T. Long, “Improved PRI-staggered space-time adaptive process-
ing algorithm based on projection approximation subspace tracking subspace technique,”
IET Radar Sonar Navigation, vol. 8, no. 5, pp. 449–456, 2014.
[37] B. Yang, “Projection approximation subspace tracking,” IEEE Trans. on Signal Processing,
vol. 43, no. 1, pp. 95–107, 1995.
[38] B. Babadi, N. Kalouptsidis, V. Tarokh, “SPARLS: The sparse RLS algorithm,” IEEE
Trans.on Signal Processing, vol. 58, no. 8, pp. 4013–4025, 2010.
7 Super-Resolution Radar Imaging
via Convex Optimization
Reinhard Heckel
A radar system emits probing signals and records the reflections. Estimating the relative
angles, delays, and Doppler shifts from the received signals allows one to determine the
locations and velocities of objects. However, due to practical constraints, the probing
signals have finite bandwidth B, the received signals are observed over a finite
time interval of length T only, and a radar typically has only one or a few transmit
and receive antennas. These constraints fundamentally limit the resolution up to which
objects can be distinguished. Specifically, a radar cannot distinguish objects with delay
and Doppler shifts much closer than 1/B and 1/T, respectively, and a radar system with
N_T transmit and N_R receive antennas cannot distinguish objects with angles closer than
1/(N_T N_R). As a consequence, the delay, Doppler, and angular resolution of standard
radars is proportional to 1/B, 1/T, and 1/(N_T N_R), respectively. In this chapter, we show
that the continuous angle-delay-Doppler triplets and the corresponding attenuation factors can
be resolved at much finer resolution, using ideas from compressive sensing. Specifically,
provided the angle-delay-Doppler triplets are separated either by factors proportional to
1/(N_T N_R − 1) in angle, 1/B in delay, or 1/T in Doppler direction, they can be recovered
at a significantly smaller scale or higher resolution.
7.1 Introduction
A traditional single-input single-output (SISO) pulse-Doppler radar system transmits
a probing signal and receives the reflections from objects with a single antenna. By
estimating the induced delays and Doppler shifts, the radar system can determine the
relative distances and velocities of the objects. However, as with any imaging system,
physics imposes a limit on how well objects can be resolved. The resolution of a radar
system is determined by the bandwidth B of the probing signals and the time interval
T over which the responses are observed. Specifically, the delay and Doppler resolution
is proportional to 1/B and 1/T, meaning that objects closer than that are essentially
impossible to distinguish under noise. Since neither B nor T can be made arbitrarily
large due to physical limitations, these two constraints fundamentally limit the resolution
of a SISO radar system. In contrast to SISO radar systems, multiple-input, multiple-output
(MIMO) radar systems [1,2] use multiple antennas to transmit probing signals
simultaneously and record the reflections from the objects with multiple receive antennas.
A MIMO radar can thereby, in principle, resolve the relative angles in addition to
the relative distances and velocities of objects with a single measurement. However, the
angular resolution of a MIMO radar is 1/(N_T N_R), where N_T and N_R are the numbers of
transmit and receive antennas, and is thus again limited by a physical constraint, namely
the number of antennas (see Section 7.6 for a detailed argument on the resolution).
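To give a feel for these limits, the following minimal Python sketch evaluates the natural delay and Doppler resolution, 1/B and 1/T, and converts them to range and radial-velocity resolution. The bandwidth, observation time, and carrier frequency are hypothetical values chosen only for illustration, and the conversions c/(2B) and λ/(2T) are the standard two-way propagation relations for a monostatic radar, which is an assumption not stated in this chapter.

```python
C = 299_792_458.0  # speed of light in m/s


def natural_resolution(bandwidth_hz, obs_time_s, carrier_hz):
    """Return (delay, Doppler, range, velocity) resolution of a monostatic radar."""
    delay_res = 1.0 / bandwidth_hz            # seconds, proportional to 1/B
    doppler_res = 1.0 / obs_time_s            # hertz, proportional to 1/T
    range_res = C * delay_res / 2.0           # metres, two-way propagation
    wavelength = C / carrier_hz               # metres
    velocity_res = wavelength * doppler_res / 2.0   # metres per second
    return delay_res, doppler_res, range_res, velocity_res


# Hypothetical system parameters (not taken from the chapter).
delay_res, doppler_res, range_res, velocity_res = natural_resolution(
    bandwidth_hz=10e6, obs_time_s=10e-3, carrier_hz=10e9
)
print(f"delay resolution:    {delay_res:.3e} s")
print(f"Doppler resolution:  {doppler_res:.1f} Hz")
print(f"range resolution:    {range_res:.2f} m")
print(f"velocity resolution: {velocity_res:.3f} m/s")
```

Under these assumed parameters, two objects separated by much less than the printed range and velocity values fall within one resolution cell of a standard pulse-Doppler radar.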
Even though objects that are simultaneously much closer than 1/(N_T N_R) in angle,
1/B in delay, and 1/T in Doppler direction are impossible to distinguish for real-world
radar systems in general, it is still possible to determine the locations of the objects with
a much higher degree of accuracy than the resolution limit of (1/(N_T N_R), 1/B, 1/T). In
this chapter, we discuss signal recovery techniques for building super-resolution radar
systems that can achieve localization accuracy below the resolution limit. In more detail,
we study the problem of recovering the continuous delays and Doppler shifts in a SISO
radar system, and the problem of recovering the angles, delays, and Doppler shifts in a
MIMO radar system, in both cases from the responses to known and suitably selected
probing signals. As we see later, those problems – termed the super-resolution radar
and super-resolution MIMO radar problems – amount to recovering a signal that is
sparse in a continuous dictionary from linear measurements, and can thus be viewed
as a generalization of the traditional compressive sensing problem.
If the objects may be assumed to lie on a sufficiently coarse grid, compressed
sensing-based [3] approaches provably recover the delay-Doppler pairs for SISO radar
systems [4–6], and the angle-delay-Doppler triplets for MIMO radar systems [7,8].
However, to establish those results, the aforementioned papers assume that angles,
delays, and Doppler shifts lie on a sufficiently coarse grid, specifically a grid with
spacing 1/(N_T N_R), 1/B, and 1/T in angle, delay, and Doppler direction, respectively
(see Section 7.2.3). Since N_T, N_R, B, and T are physical problem parameters, they
cannot in general be made arbitrarily large in order to make the grid finer. In fact,
the coarseness of the grid is required for the measurement matrix to be incoherent;
therefore, the aforementioned results cannot straightforwardly be extended to a grid
with significantly finer spacing. In some special cases, however, off-the-grid recovery
is possible with standard spectral estimation techniques. For example, for a single
input antenna and either known and constant delays (see Section 7.2.4), or known and
constant Doppler shifts, the super-resolution radar problem reduces to a standard 2D
line spectral estimation problem [8, sec. 5]. For these special cases, the object locations
can be recovered off the grid with standard spectral estimation techniques such as
Prony’s method, MUSIC, and ESPRIT [9]. In general, however, the super-resolution
radar problems cannot be reduced to the classical line spectral estimation problem.
Therefore, traditional spectral estimation techniques are not directly applicable.
Recently, an alternative, convex optimization based approach to solving the classical
line spectral estimation problem has been proposed that is much more generally applicable
than traditional line spectral estimation techniques. Specifically, the paper [10] shows that
the frequency parameters, which are the unknowns in the line spectral estimation problem,
can be perfectly recovered by solving a convex total-variation norm minimization
program, provided they are sufficiently separated. Related convex programs have been
studied for compressive sensing off the grid [11], denoising [12], signal recovery from
short-time Fourier measurements [13], the SISO and MIMO super-resolution radar
problems [14,15], and the generalized line spectral estimation problem [16]. The focus
of this chapter is on explaining how this convex optimization based approach enables
high resolution in radar. In particular, we discuss the results in [14,15], showing that
a convex program recovers the continuous angles, delays, and Doppler shifts perfectly,
provided that they are sufficiently separated. Furthermore, we show that a simple convex
ℓ1-minimization program recovers the angles and delay-Doppler shifts on an arbitrarily
fine grid, again provided they are sufficiently separated. Finally, we provide numerical
results demonstrating robustness to noise.
The remainder of this chapter is organized as follows. In the first part we consider
the SISO radar model. In more detail, Section 7.2 contains the radar model and formal
problem statement, in Sections 7.3 and 7.3.2 we present the convex optimization based
recovery approach and corresponding performance guarantees, and in Section 7.4 we
show that ℓ1-minimization recovers the locations on an arbitrarily fine grid. In Section
7.4.2, we provide numerical results demonstrating that the approach is robust to noise
and in Section 7.5.2 we outline the proof of the main technical statements. In the second
part, Section 7.6, we explain how the results for SISO radar can be extended to the
MIMO case. We conclude in Section 7.7 with a discussion on challenges in applying
those ideas in practice, and current and future research directions.
7.2 Signal Model and Problem Statement
A radar system with a single transmit and single receive antenna is typically modeled
as a linear system. The response y recorded at the receive antenna, to a probing signal,
x, emitted at the transmit antenna, is a weighted superposition of delayed and Doppler-
shifted versions of the probing signal x:

    y(t) = \iint s(\tau,\nu)\, x(t-\tau)\, e^{i2\pi\nu t}\, d\nu\, d\tau.    (7.1)
Here, s denotes the spreading function, which describes the scene being sensed, and τ
and ν are the delays and Doppler shifts. Often, the moving objects are modeled by point
scatterers. Mathematically, this means that the spreading function specializes to
    s(\tau,\nu) = \sum_{j=1}^{S} b_j\, \delta(\tau-\bar{\tau}_j)\, \delta(\nu-\bar{\nu}_j).
Here, b_j is the (complex-valued) attenuation factor associated with the delay-Doppler
pair (τ̄_j, ν̄_j). With the spreading function above, the input–output relation (7.1) reduces to
    y(t) = \sum_{j=1}^{S} b_j\, x(t-\bar{\tau}_j)\, e^{i2\pi\bar{\nu}_j t}.    (7.2)
Thus, the received signal is a superposition of the reflections of the probing signal by
the point scatterers. The relative distances and velocities of the S-many objects can be
trivially obtained from the delay-Doppler pairs ( τ̄j , ν̄j ). In order to locate the objects,
we therefore need to estimate the delay-Doppler pairs and the corresponding attenuation
factors bj from a single input–output measurement, i.e., from the response y to a known
and suitably selected probing signal x. As we will see later, the particular choice of the
probing signal is crucial for good localization performance.
7.2.1 Band- and Time-Limitation and Resolution

The probing signal x can be controlled by the system engineer and is known. However,
due to practical and technological constraints, it must be band-limited and approxi-
mately time-limited. Also, again due to practical constraints, we can only observe the
response y over a finite time interval. For concreteness, we assume that
i. we observe the response y over an interval of length T and that
ii. x has bandwidth B and is approximately supported on a time interval of length
proportional to T .
The time- and band-limitation determines the “natural” resolution of the system, which
is the accuracy up to which the delay-Doppler pairs can be identified. A standard
pulse-Doppler radar that samples the received signal at its Nyquist rate and performs digital
matched filtering estimates the parameters up to accuracy 1/B and 1/T in the delay (τ)
and Doppler (ν) directions, respectively, and therefore only identifies the delay-Doppler
pairs up to the natural resolution.
From the input–output relation (7.2), it is evident that band and approximate time
limitation of the input signal x implies that the response y is band- and approximately
time-limited as well – provided that the delay-Doppler pairs are compactly supported.
In radar, due to path loss and finite velocity of the objects in the scene this is indeed the
case [17]. Throughout, we will therefore assume that the delay-Doppler pairs (τ̄j , ν̄j )
lie in the region
    [-T/2,\, T/2] \times [-B/2,\, B/2].
This is not a restrictive assumption as this region can have area BT ≫ 1, which is
typically very large. In fact, for many applications, it is reasonable to assume that the
delay-Doppler pairs lie in a region of area significantly smaller than one [18–20], an
assumption often referred to as the linear system being “underspread”. We do not make
or require this assumption here.
By the 2WT-Theorem [21], band and approximate time limitation of the response
y implies that y is essentially characterized by on the order of BT coefficients. We
therefore sample y in the interval [−T/2, T/2] with sample spacing 1/B, so as to collect
L := BT samples, denoted by y_p := y(p/B) (for simplicity we assume in the following that
L = BT is an odd integer). As detailed in [14, sec. 5], those samples are given by
    y_p = \sum_{j=1}^{S} b_j\, [F_{\nu_j} T_{\tau_j} x]_p, \quad p = -N,\dots,N, \quad N := \frac{L-1}{2},    (7.3)
where
    [T_\tau x]_p := \frac{1}{L} \sum_{k=-N}^{N} \left[ \left( \sum_{\ell=-N}^{N} x_\ell\, e^{-i2\pi \ell k/L} \right) e^{-i2\pi k\tau} \right] e^{i2\pi pk/L}    (7.4)
and
    [F_\nu x]_p := x_p\, e^{i2\pi p\nu}.
Here, we defined the time shifts τ_j := τ̄_j/T and frequency shifts ν_j := ν̄_j/B. To
avoid ambiguity, from here onwards we refer to (τ̄_j, ν̄_j) as a delay-Doppler pair and
to (τ_j, ν_j) as a time–frequency shift. From (τ̄_j, ν̄_j) ∈ [−T/2, T/2] × [−B/2, B/2] we
have (τ_j, ν_j) ∈ [−1/2, 1/2]^2. Since T_τ x and F_ν x are 1-periodic in τ and ν, we assume
in the remainder of the chapter without loss of generality that (τ_j, ν_j) ∈ [0,1]^2. The
operators T_τ and F_ν have an interesting interpretation as fractional time and frequency
shift operators in C^L. In fact, if the parameters τ and ν lie on a (1/L, 1/L) grid, the
operators F_ν and T_τ reduce to the “natural” time and frequency shift operators in C^L, i.e.,
    [T_\tau x]_p = x_{p-\tau L} \quad \text{and} \quad [F_\nu x]_p = x_p\, e^{i2\pi p\nu}.
The definition of a time shift in (7.4) as taking the Fourier transform, modulating the
frequency, and taking the inverse Fourier transform is a very natural definition of a
continuous time shift τ_j ∈ [0,1] of a discrete vector x = [x_0, ..., x_{L−1}]^T.
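The operators T_τ and F_ν can be made concrete with a short numerical sketch. The Python code below is a minimal illustration, assuming NumPy and storing x_p for p = −N, ..., N in an array whose position i holds x_{i−N}; the function names time_shift and freq_shift are hypothetical. It implements (7.4) literally (transform, modulate, inverse transform) and checks that a shift on the 1/L grid reduces to a cyclic shift of the samples.

```python
import numpy as np


def time_shift(x, tau):
    """Fractional time shift T_tau of (7.4); x[i] stores x_p with p = i - N."""
    L = len(x)
    N = (L - 1) // 2
    idx = np.arange(-N, N + 1)
    dft = np.exp(-2j * np.pi * np.outer(idx, idx) / L)     # entry (k, l): e^{-i2pi lk/L}
    spectrum = dft @ x                                      # Fourier transform of x
    spectrum = spectrum * np.exp(-2j * np.pi * idx * tau)   # modulate frequency k
    return dft.conj().T @ spectrum / L                      # inverse transform


def freq_shift(x, nu):
    """Frequency shift F_nu: multiply x_p by e^{i2pi p nu}."""
    L = len(x)
    N = (L - 1) // 2
    return x * np.exp(2j * np.pi * np.arange(-N, N + 1) * nu)


L = 11                                    # odd, as assumed in the text
rng = np.random.default_rng(0)
x = rng.standard_normal(L) + 1j * rng.standard_normal(L)

on_grid = time_shift(x, 3 / L)                       # tau on the 1/L grid
off_grid = freq_shift(time_shift(x, 0.137), 0.42)    # arbitrary off-grid shift

# On the grid, T_tau is the cyclic shift x_{p - tau L}.
print(np.allclose(on_grid, np.roll(x, 3)))
```

For τ off the grid, the same routine interpolates between the cyclic shifts, which is the sense in which T_τ acts as a fractional shift.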
Finally, note that to obtain the input–output relation (7.3) (see [14, sec. 5]) from (7.2),
a periodic sinc function is approximated with a finite sum of sinc functions (this is where
partial periodization of x becomes relevant). Thus, if we take the probing signal to be
essentially time-limited, then equality in (7.3) does not hold exactly. However, in [14,
sec. 5] it is shown that for a random probing signal, as considered in this chapter, the
incurred relative ℓ2-error decays as 1/√L and is therefore negligible for large L. It
is confirmed numerically in the same paper that the approximation error made in this
process is negligible. Moreover, if we took x to be T-periodic, then the input–output
relation (7.3) would become exact, but at the cost of the probing signal x not being
time-limited.
7.2.2 Formal Problem Statement

From the discussion in the previous section we conclude that identification of the objects,
under the constraints that the probing signal x is band-limited and the response y to the
probing signal is observed over a finite time interval, reduces to the estimation of the
triplets {(b_j, τ_j, ν_j)}_{j=1}^{S} from the samples {y_p}_{p=−N}^{N}. Thus, in this chapter,
we consider the problem of recovering those triplets from the samples {y_p}_{p=−N}^{N} in (7.3).
We call this the super-resolution radar problem, as recovering the exact time–frequency shifts
{(τ_j, ν_j)}_{j=1}^{S} “breaks” the natural resolution limit of (1/B, 1/T) achieved by a standard
pulse-Doppler radar.
Alternatively, one can view the super-resolution radar problem as that of recovering
a signal that is S-sparse in the continuous dictionary of time–frequency shifts of an
L-periodic sequence x_ℓ. In order to see this, and to better understand the super-resolution
radar problem, it is instructive to consider two special cases.
7.2.3 Time–Frequency Shifts on a Grid

Suppose the delay-Doppler pairs (τ̄_j, ν̄_j) lie on a (1/B, 1/T)-grid. As a consequence, the
time–frequency shifts (τ_j, ν_j) lie on a (1/L, 1/L)-grid, which in turn implies that τ_j L and
ν_j L are integers in {0, ..., L − 1}. Thus, the super-resolution radar problem reduces to a
sparse signal recovery problem with a Gabor measurement matrix. To see this, note that
under the aforementioned assumption, the input–output relation (7.3) reduces to
    y_p = \sum_{j=1}^{S} b_j\, x_{p-\tau_j L}\, e^{i2\pi (\nu_j L) p/L}, \quad p = -N,\dots,N.
Writing this equation in vector-matrix form gives
    y = G_x b.
Here, the vector y contains as entries the samples y_p, G_x ∈ C^{L×L^2} is the Gabor matrix
with window x, defined by
    [G_x]_{p,(k,\ell)} := x_{p-\ell}\, e^{i2\pi kp/L}, \quad k,\ell,p = -N,\dots,N,    (7.5)
and b ∈ C^{L^2} is a sparse vector with the j-th nonzero entry given by b_j and indexed by
(τ_j L, ν_j L).
Thus, recovery of the triplets {(b_j, τ_j, ν_j)}_{j=1}^{S} amounts to recovering the S-sparse
vector b from the measurement vector y. A by now standard recovery approach
is to solve a convex ℓ1-norm minimization program. From [22, thm. 5.1] we know
that, provided the x_ℓ are i.i.d. sub-Gaussian random variables, and provided that
S ≤ cL/(log L)^4 for a sufficiently small numerical constant c, with high probability, all
S-sparse vectors b can be recovered from y via ℓ1-minimization. Note that the result
[22, thm. 5.1] only applies to the Gabor matrix G_x and therefore does not apply to the
super-resolution problem where the “columns” F_ν T_τ x are highly correlated. In fact, the
two problems are conceptually very different: [22, thm. 5.1] shows that the columns of
the Gabor matrix G_x are nearly orthogonal, while the “columns” F_ν T_τ x are extremely
correlated for two pairs of time–frequency shifts that are close.
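The on-grid model can be checked numerically. The following minimal Python sketch, assuming NumPy, builds the Gabor matrix of (7.5) for a small random window and verifies that y = G_x b agrees with the direct sum of on-grid time–frequency shifts. The helper names gabor_matrix and column_index, the column ordering, and the chosen support and coefficients are illustrative assumptions; the sketch only forms the measurements and does not run an ℓ1 solver.

```python
import numpy as np


def gabor_matrix(x):
    """Gabor matrix of (7.5): column (k, l) has entries x_{p-l} e^{i2pi kp/L}."""
    L = len(x)
    N = (L - 1) // 2
    p = np.arange(-N, N + 1)
    cols = [np.roll(x, l) * np.exp(2j * np.pi * k * p / L)   # np.roll gives x_{p-l}
            for k in range(-N, N + 1)                        # frequency-shift index
            for l in range(-N, N + 1)]                       # time-shift index
    return np.stack(cols, axis=1)                            # shape (L, L**2)


def column_index(k, l, N):
    """Position of column (k, l) under the ordering used in gabor_matrix."""
    return (k + N) * (2 * N + 1) + (l + N)


L, N = 11, 5
rng = np.random.default_rng(1)
x = rng.standard_normal(L) + 1j * rng.standard_normal(L)
G = gabor_matrix(x)

# Hypothetical scene: two on-grid scatterers with (k, l) = (nu_j L, tau_j L).
support = [(2, -3), (-4, 1)]
coeffs = [1.0 + 0.5j, -0.7j]
b = np.zeros(L * L, dtype=complex)
for (k, l), bj in zip(support, coeffs):
    b[column_index(k, l, N)] = bj
y = G @ b

# Direct evaluation of the on-grid input-output relation, using L-periodicity of x.
p = np.arange(-N, N + 1)
y_direct = np.zeros(L, dtype=complex)
for (k, l), bj in zip(support, coeffs):
    y_direct += bj * x[(p - l + N) % L] * np.exp(2j * np.pi * k * p / L)

print(G.shape, np.allclose(y, y_direct))
```

Recovering b from y would then be a standard sparse recovery problem with L measurements and L^2 unknowns.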
7.2.4 Only Time or Only Frequency Shifts

Next, suppose we only have time shifts or only frequency shifts. In both cases, recovery of
the unknowns {(b_j, τ_j)} and {(b_j, ν_j)}, respectively, is equivalent to the recovery of a
weighted superposition of spikes from low-frequency samples. Specifically, if we only
have frequency shifts, and therefore τ_j = 0 for all j, the input–output relation (7.3)
reduces to
    y_p = x_p \sum_{j=1}^{S} b_j\, e^{i2\pi p\nu_j}, \quad p = -N,\dots,N.    (7.6)
Note that the samples {y_p} above are samples of a mixture of S complex sinusoids, and
estimation of the coefficients {(b_j, ν_j)} corresponds to determining the magnitudes and
the frequency components of these sinusoids. Estimating the coefficients {(b_j, ν_j)} from
the samples {y_p} is known as a line spectral estimation problem and can be solved using
classical approaches, such as Prony’s method [23, ch. 2], as well as convex programming
based approaches [10]. An analogous situation arises when there are only time shifts
(ν_j = 0 for all j), as taking the discrete Fourier transform of y_p yields a relation of
the same form as (7.6).
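As an illustration of the frequency-shift-only case, the sketch below applies a basic least-squares variant of Prony's method to noise-free samples generated from (7.6). It is a minimal example, assuming NumPy, a known number of sinusoids S, and a probing signal with no zero entries so that y_p can be divided by x_p; the function name prony and all numerical values are hypothetical, and the code is not the convex approach of [10].

```python
import numpy as np


def prony(h, S):
    """Estimate S frequencies in [0, 1) from samples h[n] = sum_j a_j z_j^n."""
    M = len(h)
    # Linear prediction: h[n+S] + c[S-1] h[n+S-1] + ... + c[0] h[n] = 0.
    A = np.array([h[n:n + S] for n in range(M - S)])
    c, *_ = np.linalg.lstsq(A, -h[S:], rcond=None)
    # The roots of z^S + c[S-1] z^{S-1} + ... + c[0] are z_j = e^{i2pi nu_j}.
    roots = np.roots(np.concatenate(([1.0], c[::-1])))
    return np.sort(np.mod(np.angle(roots) / (2 * np.pi), 1.0))


L, N, S = 21, 10, 3
p = np.arange(-N, N + 1)
rng = np.random.default_rng(2)
x = rng.standard_normal(L)                       # random probing signal

nu_true = np.array([0.12, 0.37, 0.81])           # hypothetical frequency shifts
b_true = np.array([1.0, 0.5 - 0.5j, -0.8j])      # hypothetical attenuation factors
y = x * (np.exp(2j * np.pi * np.outer(p, nu_true)) @ b_true)   # samples as in (7.6)

h = y / x                                        # mixture of S complex sinusoids
nu_est = prony(h, S)

# With the frequencies fixed, the attenuation factors follow from least squares.
V = np.exp(2j * np.pi * np.outer(p, nu_est))
b_est, *_ = np.linalg.lstsq(V, h, rcond=None)

print(nu_est)
print(b_est)
```

In the presence of noise this basic variant is known to be sensitive, which is one motivation for the subspace and convex methods cited above.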
7.3 Atomic Norm Minimization and Associated Performance Guarantees
We next present a convex optimization based recovery algorithm. Even though the
corresponding convex program can be solved in polynomial time, standard solvers are
currently computationally very expensive, limiting the practical applicability. However,
in Section 7.4 we will discuss a very closely related convex program that has significantly
better computational efficiency, at the cost of incurring a small gridding error that
is due to a discretization step. Since the results and intuition for both approaches are
nearly the same, we start by discussing the continuous case here.
7.3.1 Atomic Norm Minimization

We first define for convenience the vector $\mathbf{r}_j := [\tau_j, \nu_j]$, and write the input–output
relation (7.3) in matrix-vector form:
\[
\mathbf{y} = \mathbf{G}_x \mathbf{F}^H \mathbf{z}, \qquad \mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\mathbf{r}_j). \tag{7.7}
\]
Here, $\mathbf{F}^H \in \mathbb{C}^{L^2 \times L^2}$ is the (inverse) 2D discrete Fourier transform matrix with the entry
in the $(k,\ell)$-th row and $(r,q)$-th column given by $[\mathbf{F}^H]_{(k,\ell),(r,q)} := \frac{1}{L^2} e^{i2\pi \frac{qk + r\ell}{L}}$, and the
entries of the vector $\mathbf{f}$ are given by $[\mathbf{f}(\mathbf{r})]_{(r,q)} := e^{-i2\pi(r\tau + q\nu)}$, $k,\ell,q,r = -N,\ldots,N$
(here, and in the following, we use for convenience a two- or three-dimensional index
to refer to entries of vectors and matrices). Moreover, $\mathbf{G}_x \in \mathbb{C}^{L \times L^2}$ is the Gabor matrix
defined in (7.5).
The significance of the representation in (7.7) is that recovery of the unknowns
$\{(b_j, \mathbf{r}_j)\}$ from $\mathbf{z}$ is a 2D line spectral estimation problem that can be solved with
standard spectral estimation techniques such as Prony’s method [9]. Therefore, we only
need to recover $\mathbf{z} \in \mathbb{C}^{L^2}$ from $\mathbf{y} \in \mathbb{C}^{L}$. To do so, we use that $\mathbf{z}$ is a sparse linear
combination of atoms in the set $\mathcal{A} := \{\mathbf{f}(\mathbf{r}), \mathbf{r} \in [0,1]^2\}$. A regularizer that promotes
such a sparse linear combination is the atomic norm induced by these signals [24],
defined as

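\[
\|\mathbf{z}\|_{\mathcal{A}} := \inf_{b_k \in \mathbb{C},\, \mathbf{r}_k \in [0,1]^2} \Big\{ \sum_k |b_k| : \mathbf{z} = \sum_k b_k \mathbf{f}(\mathbf{r}_k) \Big\}.
\]
We estimate $\mathbf{z}$ by solving the basis pursuit type atomic norm minimization problem
\[
\mathrm{AN}(\mathbf{y}): \quad \underset{\tilde{\mathbf{z}}}{\text{minimize}} \ \|\tilde{\mathbf{z}}\|_{\mathcal{A}} \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\tilde{\mathbf{z}}, \tag{7.8}
\]
where $\mathbf{A} := \mathbf{G}_x \mathbf{F}^H$ is the measurement matrix from (7.7).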
To summarize, we estimate the attenuation factors $b_k$ and time–frequency shifts $\mathbf{r}_k$ from
$\mathbf{y}$ by
i. solving AN(y) in order to obtain $\mathbf{z}$,
ii. estimating the $\mathbf{r}_k$ from $\mathbf{z}$ by solving the corresponding 2D line spectral estimation
problem, and
iii. solving the linear system of equations $\mathbf{y} = \sum_{k=0}^{S-1} b_k \mathbf{A}\mathbf{f}(\mathbf{r}_k)$ for the $b_k$.
We remark that the rk may be obtained more directly from a solution to the dual of (7.8)
[14, sec. 6]; see also [12, sec. 3.1], [10, sec. 4], [11, sec. 2.2] for details on this approach
as it is applied to related problems.
Since computation of the atomic norm involves taking the infimum over infinitely
many parameters, finding a solution to AN(y) may appear to be daunting. For the
one-dimensional case (i.e., only time or frequency shifts), the atomic norm can be
characterized in terms of linear matrix inequalities [11, prop. 2.1]. This characterization
is based on the Vandermonde decomposition lemma for Toeplitz matrices, and allows
us to formulate the atomic norm minimization program as a semidefinite program that
can be solved in polynomial time. While this lemma generalizes to higher dimensions
[25, thm. 1], it fundamentally comes with a rank constraint that appears to prohibit a
straightforward characterization of the atomic norm in terms of linear matrix
inequalities. Nevertheless, based on [25, thm. 1], one can obtain a semidefinite programming
relaxation of AN(y), which can be solved in polynomial time. Similarly, a solution of
the dual of AN(y) can be found with a semidefinite programming relaxation. Since the
computational complexity of the corresponding semidefinite programs is quite large,
we will not dive into the details of those semidefinite programming relaxations. As
mentioned before, we instead show in Section 7.4 that the parameters $\{\mathbf{r}_j\}$ can be
recovered on an arbitrarily fine grid via $\ell_1$-minimization. While this leads to a gridding
error, the grid may be chosen sufficiently fine for the gridding error to be negligible
compared to the error induced by additive noise, and in practice, there is always some
additive noise.
7.3.2 Recovery Guarantees for Atomic Norm Minimization

We consider a random probing signal by taking the samples of the probing signal $\mathbf{x}$
in (7.3) to be i.i.d. Gaussian random variables. More generally, the result presented in
Theorem 7.2 below continues to hold if we choose the samples as sub-Gaussian
random variables, for example as random signs. Note that the probing signal
can be chosen by the radar engineer, so choosing the coefficients at random is not
problematic and can be done in practice. Theorem 7.2 shows that, with high probability,
the triplets $\{(b_j, \tau_j, \nu_j)\}$ can be recovered perfectly from the samples by solving a
convex program, provided that the number of time–frequency shifts is sufficiently smaller
than the number of measurements, and provided that the following minimum separation
condition holds:
definition 7.1 (Minimum separation condition) We say the time–frequency shifts
$(\tau_j, \nu_j) \in [0,1]^2$, $j = 1,\ldots,S$, satisfy the minimum separation condition if
\[
\max(|\tau_j - \tau_{j'}|, |\nu_j - \nu_{j'}|) \ge \frac{2.38}{N}, \quad \text{for all } j \ne j', \tag{7.9}
\]
where $|\tau_j - \tau_{j'}|$ is the wraparound distance on the unit circle. For example,
$|3/4 - 1/2| = 1/4$ but $|5/6 - 1/6| = 1/3 \ne 2/3$.
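The condition in Definition 7.1 is easy to check numerically. The sketch below, in Python with NumPy, computes the wraparound distance and tests (7.9) for a list of shifts; the helper names `wraparound_distance` and `satisfies_min_separation` are hypothetical and only serve this illustration.

```python
import numpy as np

def wraparound_distance(a, b):
    """Distance between a and b on the unit circle (both taken modulo 1)."""
    d = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) % 1.0
    return np.minimum(d, 1.0 - d)

def satisfies_min_separation(shifts, N, constant=2.38):
    """Check (7.9): max(|tau_j - tau_j'|, |nu_j - nu_j'|) >= constant / N for all j != j'."""
    shifts = np.asarray(shifts, dtype=float)   # shape (S, 2), rows are (tau_j, nu_j)
    S = shifts.shape[0]
    for j in range(S):
        for jp in range(j + 1, S):
            d_tau = wraparound_distance(shifts[j, 0], shifts[jp, 0])
            d_nu = wraparound_distance(shifts[j, 1], shifts[jp, 1])
            if max(d_tau, d_nu) < constant / N:
                return False
    return True

# Two shifts with identical delay but well-separated Doppler satisfy the condition.
ok = satisfies_min_separation([(0.2, 0.5), (0.2, 0.6)], N=100)
# Two shifts that are close in both coordinates do not.
too_close = satisfies_min_separation([(0.2, 0.5), (0.21, 0.51)], N=100)
```

The first example illustrates that separation in only one of the two coordinates is enough for the maximum in (7.9) to exceed the threshold.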
Note that the time–frequency shifts need not be separated in both time and frequency;
for example, the minimum separation condition can hold even when $\tau_j = \tau_{j'}$ for some
$j \ne j'$. The main result on recovery via atomic norm minimization from the paper [14]
is stated next.
theorem 7.2 Assume that the samples of the probing signal $\mathbf{x} \in \mathbb{C}^L$ are i.i.d.
$\mathcal{N}(0,1/L)$ random variables. Consider a signal where the signs of the attenuation
factors $\{b_j\}_{j=1}^{S}$ are i.i.d. uniform on $\{-1,1\}$ or the complex unit disc, and suppose
that the time–frequency shifts $\{(\tau_j, \nu_j)\}_{j=1}^{S}$ obey the minimum separation condition.
Furthermore, choose $\delta > 0$ and assume that the number of nonzero attenuation factors,
$S$, and the number of measurements, $L$, obey
\[
S \le c\, \frac{L}{\log^3(L/\delta)},
\]
where $c$ is a numerical constant. Then, with probability at least $1 - \delta$, $\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\mathbf{r}_j)$
is the unique minimizer of AN(y), $\mathbf{y} = \mathbf{G}_x \mathbf{F}^H \mathbf{z}$.
This result is essentially optimal in terms of the allowed sparsity level, as the number
S of unknowns can be linear – up to a logarithmic-factor – in the number of obser-
vations L. Even when we are given the time–frequency shifts (τj ,νj ), we can only
hope to recover the corresponding attenuation factors bj by solving the linear system of
equations in (7.3), provided that S ≤ L.
Since the complex-valued coefficients bj in the radar model describe the attenuation
factors, it is natural to assume that the phases of different bj are independent from
each other and are uniformly distributed on the unit circle of the complex plane. Indeed,
standard models for wireless communication channels and radar [26], assume the coeffi-
cients {bj } to be complex Gaussian distributed. Nevertheless, we believe that the random
sign assumption is not necessary for Theorem 7.2 to hold. In Section 7.7, we discuss
a closely related problem, which does not require the random sign assumption, and
thus provides a basis for the claim of the random sign assumption not being necessary.
Finally, we would like to point out that Theorem 7.2 continues to hold for sub-Gaussian
sequences x , for example random signs.
7.3.3 Necessity of Minimum Separation

Theorem 7.2 imposes a minimum-separation condition, and indeed some form of sepa-
ration between the time–frequency shifts is necessary for stable recovery. To be specific,
we consider the simpler problem of line spectral estimation (see Section 7.2.4) that is
obtained from our setup by setting τj = 0 for all j . Clearly, any condition necessary
for the line spectral estimation problem is also necessary for the super-resolution radar
problem.
If there are $S'$ frequencies $\{\nu_k\}_{k=1}^{S'}$ in an interval of length smaller than $\frac{2S'}{L}$, and $S'$
is large, then in the presence of even a tiny amount of noise, stable recovery of the
attenuation factors and time–frequency shifts even from
\[
\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\mathbf{r}_j),
\]
where we set, with a slight abuse of notation, $[\mathbf{f}(\nu_j)]_\ell = e^{i2\pi \ell \nu_j}$, is not possible (see [27,
thm. 1.1] and [10, sec. 1.7]). Condition (7.9) allows us to have $0.4\,S'$ time–frequency
shifts in an interval of length $\frac{2S'}{L}$, which is optimal up to the constant 0.4.
This argument illustrates that for stable recovery, it is relevant whether a number
of frequencies, say $S$ many, cluster together in a small interval of size smaller than
$S/L$. To illustrate this point further, consider again the simpler problem of line
spectral estimation, i.e., recovery of the frequencies from $\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\nu_j)$, with
$[\mathbf{f}(\nu_j)]_\ell = e^{i2\pi \ell \nu_j}$. Consider the following (Vandermonde) matrix parameterized by $S$
and $\epsilon$:
\[
\mathbf{V} = [\mathbf{f}(0), \mathbf{f}((1-\epsilon)/L), \ldots, \mathbf{f}((2S-1)(1-\epsilon)/L)].
\]
We next state a theorem, which provides a lower bound on the condition number of $\mathbf{V}$.
The lower bound implies that there are signals with $S$ many frequencies in an interval
smaller than $S/L$ that are indistinguishable even under a tiny amount of additive noise.
theorem 7.3 [28, thm. 1.3] Fix some $\epsilon \in (0,1)$, let $K = \frac{1}{1-\epsilon} L$, and let
$S = O(\log(L/(1-\epsilon)))$. Then the matrix $\mathbf{V}$ has condition number at least $e^{O(S)}$.
The theorem implies that there exists a vector $\mathbf{b}$ with unit norm that obeys
$\|\mathbf{V}\mathbf{b}\|_2 \le e^{-O(S)}$. As a consequence,
\[
\sum_{\ell=-N}^{N} \Big| \sum_{j\ \mathrm{odd}} b_j e^{i2\pi \ell \nu_j} + \sum_{j\ \mathrm{even}} b_j e^{i2\pi \ell \nu_j} \Big|^2 = \|\mathbf{V}\mathbf{b}\|_2^2 \le e^{-O(S)},
\]
which means that there are two sets of $S$ many point sources each, with separation
$\frac{2(1-\epsilon)}{L}$, but telling them apart requires an exponentially small additive error. To obtain
intuition on the constants involved in the statement, we plot the condition number of $\mathbf{V}$
for different values of $S$ and $\epsilon$ in Figure 7.1.
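The ill-conditioning described above can be explored numerically. The following Python/NumPy sketch builds a Vandermonde matrix with frequencies spaced by (1 − eps)/L and reports the inverse condition number; the function name `vandermonde_condition_number`, the parameter name `eps` for the separation parameter, the row range 0,…,L−1, and the chosen parameter values are assumptions for illustration, so the numbers need not match Figure 7.1 exactly.

```python
import numpy as np

def vandermonde_condition_number(L, num_cols, eps):
    """Condition number of the L x num_cols matrix with entries exp(i 2 pi p q (1 - eps) / L)."""
    p = np.arange(L)[:, None]            # row index (a shift of p does not change singular values)
    q = np.arange(num_cols)[None, :]     # column q has frequency q * (1 - eps) / L
    V = np.exp(2j * np.pi * p * q * (1.0 - eps) / L)
    return np.linalg.cond(V)

L = 200
for S in (2, 4, 8, 16, 32):
    inv_kappa = [1.0 / vandermonde_condition_number(L, S, eps) for eps in (0.0, 0.2, 0.4)]
    print(S, np.round(inv_kappa, 4))
```

For `eps = 0` the columns are orthogonal DFT vectors, so the condition number equals one. Shrinking the spacing or adding columns can only worsen the conditioning, which is the qualitative behavior the theorem quantifies.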
While those two arguments show that some form of separation between the time–
frequency shifts is necessary, the exact form of separation required in (7.9) may not
be necessary for stable recovery and less restrictive conditions may suffice. Indeed, in
the simpler problem of line spectral estimation, Donoho [27] showed that stable
super-resolution is possible via an exhaustive search algorithm, even when condition (7.9) is
violated locally, as long as every interval of the $\nu$-axis of length $\frac{2S'}{L}$ contains fewer than
$S'/2$ frequencies and $S'$ is small (in practice, think of $S' \lesssim 10$).

Figure 7.1 Inverse of the condition number $\kappa$ of the matrix $\mathbf{V}$ with entries
$[\mathbf{V}]_{pq} = e^{-i2\pi pq(1-\epsilon)/L}$, $L = 200$, and $q = 1,\ldots,S$, for different values of the number of
sources $S$ as a function of the separation between frequencies of $(1-\epsilon)/L$. (Plot: $1/\kappa$ versus
$1-\epsilon$, with curves for $S = 2, 4, 8, 16, 32$.)
7.3.4 Implications for the Detection Accuracy of Radar Systems

Translated to the continuous setup, Theorem 7.2 implies that with high probability, the
triplets $(b_j, \bar\tau_j, \bar\nu_j)$ can be identified perfectly provided that
\[
|\bar\tau_j - \bar\tau_{j'}| \ge \frac{4.77}{B} \quad \text{or} \quad |\bar\nu_j - \bar\nu_{j'}| \ge \frac{4.77}{T}, \quad \text{for all } j \ne j', \tag{7.10}
\]
and provided that $S \le c\,BT/\log^3(BT)$. Since we can exactly identify the delay-Doppler
pairs $(\bar\tau_j, \bar\nu_j)$, as opposed to only localizing them on a grid, this result offers a
significant improvement in resolution over conventional radar techniques. Specifically, with
a standard pulse-Doppler radar, which samples the received signal and performs digital
matched filtering in order to detect the objects, the delay-Doppler shifts $(\bar\tau_j, \bar\nu_j)$ can only
be estimated up to an uncertainty of about $(1/B, 1/T)$.
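A short worked step, offered as a plausible reading rather than a statement from [14], shows how the constant 4.77 in (7.10) relates to the constant 2.38 in (7.9). It assumes the normalization $\tau_j = \bar\tau_j/T$ and $\nu_j = \bar\nu_j/B$, which is consistent with (7.12), together with $L = BT = 2N+1$:

```latex
|\tau_j - \tau_{j'}| \ge \frac{2.38}{N}
\;\Longleftrightarrow\;
|\bar\tau_j - \bar\tau_{j'}| \ge \frac{2.38\,T}{N}
= \frac{4.76\,T}{L-1}
= \frac{4.76}{B}\cdot\frac{L}{L-1},
\qquad
\frac{4.76\,L}{L-1} \le 4.77 \ \text{ for } L \ge 477 .
```

Under these assumptions, a delay separation of $4.77/B$ implies the normalized condition whenever $L \ge 477$. The same computation with $T$ and $B$ exchanged gives $4.77/T$ for the Doppler shifts.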
We hasten to add that in the radar literature, the term super-resolution is often used
for the ability to resolve objects that are very close – even closer than the Rayleigh res-
olution limit [29] that is proportional to 1/B and 1/T for delay and Doppler resolution,
respectively. The norm minimization approach discussed here permits identification of
each object with a precision that is much higher than 1/B and 1/T as long as the other
objects are not too close. Specifically, other objects should be separated by a constant
multiple of the Rayleigh resolution limit as formalized by the minimum separation
condition (7.10). Recall, however, that any method that attempts to recover objects
closer than the resolution limit can only succeed if there are very few objects below that
limit, since resolving many objects that are all below the resolution limit is in general
impossible, as discussed previously in Section 7.3.3.
Finally, recall that the approach discussed here allows the delay-Doppler pairs $(\bar\tau_j, \bar\nu_j)$
to lie in $[-T/2, T/2] \times [-B/2, B/2]$, so the delay-Doppler pairs can lie in a rectangle
of area $L = BT \gg 1$. The ability to handle a potentially large region in which
delay-Doppler pairs can lie is important in radar applications, since we might need to resolve
objects with large relative distances and relative velocities.
7.3.5 Can Standard Nonparametric Methods Yield Similar Performance?

We finally note that standard nonparametric estimation methods such as the MUSIC
algorithm can in general not be applied directly to the super-resolution radar
problem. The reason is that MUSIC relies on multiple measurements (often referred to as
snapshots) [9, sec. 6.3], whereas we assume only a single measurement $\{y_p\}_{p=-N}^{N}$
to be available. In our context, multiple measurements would amount to carrying out
multiple, independent input–output measurements. However, by choosing the probing
signal $\mathbf{x}$ in (7.3) to be periodic, a single measurement can be transformed into multiple
measurements, and in that case algorithms such as MUSIC may be applied.
This approach, discussed in detail in [14, appendix H],
requires the frequencies $\{\nu_j\}$ to be distinct, the time-shifts $\{\tau_j\}$ to lie in a significantly
smaller range than $[0,1]$, and $S < \sqrt{L}$, as opposed to the much milder condition
$S < L/\log^3(L/\delta)$ required by the convex program. In addition, applying MUSIC
in that way is (significantly) more sensitive to noise than the convex programming-
based approach discussed in this chapter. If multiple measurements are available, for
example by observing distinct paths of a signal by an array of antennas, the situation
might be different. For that case, subspace methods have been studied for delay-Doppler
estimation in [30].
7.4 Super-Resolution Radar on a Fine Grid
A practical approach to estimating the triplets {(bj ,τj ,νj )} from the received signal y in
the input–output relation (7.3) is to suppose that the time–frequency shifts lie on a fine
grid, and solve the problem on that grid. In general this leads to a gridding error, which,
however, is minimized as the grid grows finer [31]. We next discuss the corresponding
(discrete) sparse signal recovery problem.
Suppose the time–frequency shifts lie on a fine grid with spacing $(1/K, 1/K)$, where
$K$ is an integer obeying $K \ge L$. Let $\mathbf{b} \in \mathbb{C}^{K^2}$ be the signal with each entry $b_{m,n}$
corresponding to one of the grid points, with nonzeros equal to the attenuation factors
$b_j$ for the time–frequency shifts $(\tau_j, \nu_j)$, $j = 1,\ldots,S$. See Figure 7.2 for an illustration.
With this assumption, the input–output relation (7.3) becomes:
\[
y_p = \sum_{m,n=0}^{K-1} b_{m,n} [\mathcal{F}_{m/K} \mathcal{T}_{n/K} \mathbf{x}]_p, \qquad p = -N,\ldots,N.
\]
Writing this relation in matrix-vector form yields
\[
\mathbf{y} = \mathbf{R}\mathbf{b},
\]
where, as before, $\mathbf{y}$ is the vector containing as entries the values $y_p$, and $\mathbf{R} \in \mathbb{C}^{L \times K^2}$
is the matrix with $(m,n)$-th column given by $\mathcal{F}_{m/K} \mathcal{T}_{n/K} \mathbf{x}$. The matrix $\mathbf{R}$ contains as
columns “fractional” time–frequency shifts of the sequence $\mathbf{x}$. If $K = L$, $\mathbf{R}$ contains as
columns only “whole” time–frequency shifts of $\mathbf{x}$ and $\mathbf{R}$ is equal to the Gabor matrix $\mathbf{G}_x$,
defined by (7.5). In this sense, $K = L$ is the natural grid (see Section 7.2.3) and the ratio
SRF := K/L can be interpreted as a super-resolution factor. The super-resolution factor
determines by how much the $(1/K, 1/K)$ grid is finer than the original $(1/L, 1/L)$ grid.

Figure 7.2 Time–frequency shifts that lie on a grid: $(1/L, 1/L)$ is the “natural” grid, and
$(1/K, 1/K)$ is the fine grid. Each dot corresponds to a potential nonzero, and the larger dots
correspond to the actual nonzeros $\{b_j\}$.
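The structure of the matrix R can be made concrete with a small Python/NumPy sketch. The shift operators are defined in Section 7.2, outside this passage, so the forms used below ($[\mathcal{F}_\nu \mathbf{x}]_p = x_p e^{i2\pi p\nu}$ and a fractional, periodic time shift applied in the DFT domain) are assumptions, as are the function names `time_shift`, `freq_shift`, and `grid_matrix` and the toy dimensions.

```python
import numpy as np

def time_shift(x, tau):
    """Assumed operator: [T_tau x]_p = (1/L) sum_k X_k e^{-i 2 pi k tau} e^{i 2 pi p k / L}."""
    L = x.size
    N = (L - 1) // 2
    k = np.arange(-N, N + 1)
    dft = np.exp(-2j * np.pi * np.outer(k, k) / L)   # X_k = sum_l x_l e^{-i 2 pi l k / L}
    X = dft @ x
    return (dft.conj() @ (X * np.exp(-2j * np.pi * k * tau))) / L

def freq_shift(x, nu):
    """Assumed operator: [F_nu x]_p = x_p e^{i 2 pi p nu}, p = -N..N."""
    N = (x.size - 1) // 2
    p = np.arange(-N, N + 1)
    return x * np.exp(2j * np.pi * p * nu)

def grid_matrix(x, K):
    """L x K^2 matrix whose column (m, n), stored at index m * K + n, is F_{m/K} T_{n/K} x."""
    cols = [freq_shift(time_shift(x, n / K), m / K) for m in range(K) for n in range(K)]
    return np.stack(cols, axis=1)

N = 3
L = 2 * N + 1
rng = np.random.default_rng(1)
x = rng.normal(scale=np.sqrt(1.0 / L), size=L)       # i.i.d. N(0, 1/L) probing samples
K = 2 * L                                            # super-resolution factor SRF = 2
R = grid_matrix(x, K)

b = np.zeros(K * K, dtype=complex)
b[3 * K + 5] = 1.0                                   # one object at grid point (m, n) = (3, 5)
b[10 * K + 2] = -0.5j                                # another at (m, n) = (10, 2)
y = R @ b                                            # noiseless measurement y = R b
```

With these assumed operators, a shift by a whole multiple of 1/L is a cyclic shift of the samples, which matches the remark that K = L yields only “whole” time–frequency shifts. The explicit matrix is practical only for small toy sizes.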
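A standard approach to the recovery of the sparse signal $\mathbf{b}$ from the underdetermined
linear system of equations $\mathbf{y} = \mathbf{R}\mathbf{b}$ is to solve the following convex program:
\[
\mathrm{L1}(\mathbf{y}): \quad \underset{\tilde{\mathbf{b}}}{\text{minimize}} \ \|\tilde{\mathbf{b}}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{R}\tilde{\mathbf{b}}. \tag{7.11}
\]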
The following theorem is the main result from [14] for recovery on the fine grid.
theorem 7.4 Assume that the samples of the probing signal $\mathbf{x}$ are i.i.d. $\mathcal{N}(0,1/L)$
random variables, with $L = 2N + 1 \ge 1024$. Consider a signal $\mathbf{b}$
supported on $\mathcal{S} \subseteq \{0,\ldots,K-1\}^2$, and suppose that it satisfies the minimum separation
condition
\[
\min_{(m,n),(m',n') \in \mathcal{S}:\ (m,n) \ne (m',n')} \frac{1}{K} \max(|m - m'|, |n - n'|) \ge \frac{2.38}{N}.
\]
Moreover, suppose that the nonzeros of $\mathbf{b}$ are i.i.d. uniform on $\{-1,1\}$ or the complex
unit disk. Choose $\delta > 0$, let $\mathbf{y} = \mathbf{R}\mathbf{b}$ be the measurement corresponding to $\mathbf{b}$, and
suppose that the number of nonzeros $S$ of $\mathbf{b}$ is sufficiently smaller than the number of
samples $L$, namely
\[
S \le c\, \frac{L}{\log^3(L/\delta)},
\]
where $c$ is a numerical constant. Then, with probability at least $1 - \delta$, $\mathbf{b}$ is the unique
minimizer of L1(y), $\mathbf{y} = \mathbf{R}\mathbf{b}$.
Note that Theorem 7.4 does not impose any restriction on K, in particular it can be
arbitrarily large. The proof of Theorem 7.4, discussed in Section 7.5.2, is closely linked
to that of Theorem 7.2.
7.4.1 Implementation Details

The matrix R has dimension L × K 2 , thus as the grid becomes finer, i.e., K becomes
larger, the complexity of solving L1(y) increases. The complexity of solving L1(y)
can be managed as follows. First, the complexity of first-order convex optimization
algorithms (such as TFOCS [32]) for solving L1(y) is dominated by multiplications
with the matrices R and RH . Due to the structure of R, those multiplications can be
done very efficiently by utilizing the fast Fourier transform. Second, in practice we have
$(\bar\tau_j, \bar\nu_j) \in [0, \tau_{\max}] \times [0, \nu_{\max}]$, which means that
\[
(\tau_j, \nu_j) \in \Big[0, \frac{\tau_{\max}}{T}\Big] \times \Big[0, \frac{\nu_{\max}}{B}\Big]. \tag{7.12}
\]
It is therefore sufficient to consider the restriction of $\mathbf{R}$ to the
$\frac{\tau_{\max}\nu_{\max} K^2}{BT} = \tau_{\max}\nu_{\max} L \cdot \mathrm{SRF}^2$ many columns corresponding to the
time–frequency shifts $(\tau_j, \nu_j)$ satisfying (7.12). Since typically $\tau_{\max}\nu_{\max} \ll BT = L$,
this results in a significant reduction
of the problem size.
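As a quick illustration of the size reduction, the following Python sketch evaluates the column count for one set of values. The numbers are borrowed, purely for illustration, from the synthetic experiment of Section 7.4.2 (L = 201 and delay-Doppler pairs in [0,2] × [0,2]), the super-resolution factor 10 is an arbitrary choice, and the function names are hypothetical.

```python
import math

def restricted_column_count(tau_max, nu_max, L, srf):
    """Number of grid columns kept under (7.12): tau_max * nu_max * L * SRF^2, rounded up."""
    return math.ceil(tau_max * nu_max * L * srf ** 2)

def full_column_count(L, srf):
    """Number of columns K^2 of the unrestricted matrix R, with K = SRF * L."""
    return (srf * L) ** 2

L, srf = 201, 10
kept = restricted_column_count(tau_max=2, nu_max=2, L=L, srf=srf)
total = full_column_count(L, srf)
print(kept, total, kept / total)
```

In this example the restriction keeps a fraction $\tau_{\max}\nu_{\max}/L = 4/201$ of the columns, roughly two percent.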
7.4.2 Numerical Results and Robustness

We next discuss numerical results that show that the convex optimization-based
super-resolution approach is robust to noise. Consider the following modification of $\ell_1$-norm
minimization, which accounts for noise and the gridding error:
\[
\text{L1-ERR}: \quad \underset{\tilde{\mathbf{b}}}{\text{minimize}} \ \|\tilde{\mathbf{b}}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{R}\tilde{\mathbf{b}}\|_2^2 \le \delta.
\]
The parameter $\delta$ is chosen on the order of the noise variance or the expected
gridding error.
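For readers who want to experiment, the sketch below solves a closely related problem, the penalized (Lagrangian) form $\tfrac12\|\mathbf{y} - \mathbf{R}\tilde{\mathbf{b}}\|_2^2 + \lambda\|\tilde{\mathbf{b}}\|_1$, with plain proximal gradient descent (ISTA) in Python/NumPy. This is a swapped-in technique, not the constrained program L1-ERR itself and not the SPGL1 or TFOCS solvers mentioned in this chapter. The random complex matrix standing in for R, the value of `lam`, and all function names are assumptions for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(v)
    scale = np.maximum(1.0 - t / np.maximum(mag, 1e-30), 0.0)
    return scale * v

def objective(R, y, b, lam):
    """Penalized objective 0.5 * ||y - R b||_2^2 + lam * ||b||_1."""
    return 0.5 * np.linalg.norm(y - R @ b) ** 2 + lam * np.sum(np.abs(b))

def ista(R, y, lam, num_iters=500):
    """Proximal gradient descent with step 1 / ||R||^2 (spectral norm)."""
    step = 1.0 / np.linalg.norm(R, 2) ** 2
    b = np.zeros(R.shape[1], dtype=complex)
    for _ in range(num_iters):
        grad = R.conj().T @ (R @ b - y)          # gradient of the quadratic term
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Stand-in problem: a random complex matrix instead of the structured matrix R.
rng = np.random.default_rng(2)
R = (rng.normal(size=(40, 120)) + 1j * rng.normal(size=(40, 120))) / np.sqrt(80)
b_true = np.zeros(120, dtype=complex)
b_true[[7, 50, 99]] = [1.0, -1.0j, 0.7]
noise = 0.01 * (rng.normal(size=40) + 1j * rng.normal(size=40))
y = R @ b_true + noise

lam = 0.02
b_hat = ista(R, y, lam)
```

The penalty weight `lam` plays a role comparable to $\delta$ in L1-ERR: larger values tolerate a larger residual in exchange for a sparser estimate. Each iteration needs only one multiplication with R and one with its conjugate transpose, which are the operations Section 7.4.1 proposes to accelerate with the fast Fourier transform.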
The paper [14] considered a synthetic problem with $L = 201$, where each problem
instance is generated by drawing $S = 10$ time–frequency shifts $(\tau_j, \nu_j)$ uniformly at
random from the interval $[0, 2/\sqrt{201}]^2$. This amounts to drawing the corresponding
delay-Doppler pairs $(\bar\tau_j, \bar\nu_j)$ from the interval $[0,2] \times [0,2]$. The attenuation factors $b_j$
corresponding to the time–frequency shifts were drawn uniformly at random from the
complex unit disc, independently across $j$. Measurements were then obtained according
to the input–output relation (7.3).
Figure 7.3 depicts the average resolution error versus the super-resolution factor
SRF = K/L. The resolution error is defined as the average over $j = 1,\ldots,S$ of
$L\sqrt{(\hat\tau_j - \tau_j)^2 + (\hat\nu_j - \nu_j)^2}$, where the $(\hat\tau_j, \hat\nu_j)$ are the estimates of the time–frequency
shifts extracted from a solution of L1-ERR, obtained with the SPGL1 solver [33].
Note that the resolution attained at SRF = 1 corresponds to the resolution attained
by matched filtering and by the compressive sensing radar architecture [4] that was
discussed in Section 7.2.3.
Figure 7.3 Super-resolution radar uniformly provides better resolution error than standard
or CS radar. The plot shows the resolution error for the recovery of $S = 10$ time–frequency shifts
from the observation $\mathbf{y}$ with and without additive Gaussian noise $\mathbf{n}$ of a certain signal-to-noise
ratio $\mathrm{SNR} := \|\mathbf{y}\|_2^2/\|\mathbf{n}\|_2^2$. The resolution error is defined as the average over
$L\sqrt{(\hat\tau_j - \tau_j)^2 + (\hat\nu_j - \nu_j)^2}$, where $(\tau_j, \nu_j)$ are the original time–frequency shifts, and the
$(\hat\tau_j, \hat\nu_j)$ are the time–frequency shifts on the grid, obtained by solving L1-ERR, for different
super-resolution factors. The different super-resolution factors are illustrated below the plot.
(Plot: $\ell_2$-resolution error versus super-resolution factor from 1 to 15, with curves for 10 dB SNR,
30 dB SNR, and the noiseless case; the standard/CS radar operating point is marked.)
As mentioned before, two error sources are incurred by this approach.
The first is the gridding error obtained by assuming that the points lie on a fine grid
with grid constant $(1/K, 1/K)$, which decays in $K$. The second is the additive noise
error, which does not depend on $K$. The figure shows that for SRF larger than 1, the resolution is
significantly improved using the super-resolution radar approach. Moreover we see that
for small super-resolution factors SRF, the gridding error dominates, while for large
values of SRF, the additive noise error dominates. In this experiment, the gridding error
approximately decays as 1/SRF. The experiment demonstrates that in practice solving
the super-resolution radar problem on a fine grid is essentially as good as solving it
on the continuum – provided the super-resolution factor is chosen sufficiently large, so
that the gridding error becomes negligible relative to the error due to additive noise.
7.5 Proof Outline
In this section, we discuss the proofs of Theorems 7.2 and 7.4 from [14], which are
closely linked. Specifically, both Theorems 7.2 and 7.4 follow from the existence of
certain dual certificates, and the certificate for proving Theorem 7.4 is obtained directly
from the certificate constructed to prove Theorem 7.2.
7.5.1 Proof of Theorem 7.2

Theorem 7.2 is proven by constructing an appropriate dual certificate; the existence of
this certificate guarantees that the solution to AN(y) is z, as formalized by Proposition 1,
as we will see shortly. Proposition 1 is a consequence of strong duality, and is well-
known for the discrete setting from the compressed sensing literature [3]. The proof is
standard, see for example [11, proof of prop. 2.4]. For convenience, in this section, we
set A = Gx FH , so that y = Az.

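proposition 1 Let $\mathbf{y} = \mathbf{A}\mathbf{z}$ with $\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\mathbf{r}_j)$. Suppose there exists a function,
called dual certificate, of the form $Q(\mathbf{r}) = \langle \mathbf{q}, \mathbf{A}\mathbf{f}(\mathbf{r}) \rangle$, parameterized by the complex
coefficients $\mathbf{q} \in \mathbb{C}^L$, such that
\[
Q(\mathbf{r}_j) = \mathrm{sign}(b_j) \ \text{for all } j, \quad \text{and} \quad |Q(\mathbf{r})| < 1 \ \text{for all } \mathbf{r} \in [0,1]^2 \setminus \{\mathbf{r}_1,\ldots,\mathbf{r}_S\}. \tag{7.13}
\]
Moreover, suppose that the vectors $\{\mathbf{A}\mathbf{f}(\mathbf{r}_j)\}_{j=1}^{S}$ are linearly independent. Then $\mathbf{z}$ is the
unique minimizer of AN(y).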
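A condensed argument showing that Proposition 1 is true follows. Suppose for
contradiction that there exists another optimal solution $\mathbf{z}' = \sum_j b'_j \mathbf{f}(\mathbf{r}'_j)$ with
$\|\mathbf{z}'\|_{\mathcal{A}} = \sum_j |b'_j|$ and $\{\mathbf{r}'_j\} \ne \{\mathbf{r}_j\}$. First, suppose that $\{\mathbf{r}'_j\} \subseteq \{\mathbf{r}_j\}$. Then, linear independence of
the vectors $\{\mathbf{A}\mathbf{f}(\mathbf{r}_j)\}$ together with $\mathbf{A}\mathbf{z}' = \mathbf{A}\mathbf{z}$ contradicts $\mathbf{z}' \ne \mathbf{z}$. Next, suppose that $\{\mathbf{r}'_j\} \not\subseteq \{\mathbf{r}_j\}$. We then
have that
\[
\|\mathbf{z}'\|_{\mathcal{A}} - \|\mathbf{z}\|_{\mathcal{A}} = \sum_j |b'_j| - \sum_j |b_j|
\overset{(i)}{>} \mathrm{Re} \sum_j Q^*(\mathbf{r}'_j) b'_j - \mathrm{Re} \sum_j Q^*(\mathbf{r}_j) b_j
\overset{(ii)}{=} \mathrm{Re} \sum_j \mathbf{q}^H \mathbf{A}\mathbf{f}(\mathbf{r}'_j) b'_j - \mathrm{Re} \sum_j \mathbf{q}^H \mathbf{A}\mathbf{f}(\mathbf{r}_j) b_j
\overset{(iii)}{=} 0.
\]
Here, inequality (i) follows from the dual polynomial interpolating the sign pattern,
and from $|Q(\mathbf{r}'_j)| < 1$ for at least one $\mathbf{r}'_j$, which in turn follows from $\{\mathbf{r}'_j\} \not\subseteq \{\mathbf{r}_j\}$ and
$|Q(\mathbf{r})| < 1$ for all $\mathbf{r} \notin \{\mathbf{r}_j\}$, by assumption. Equality (ii) follows from the definition of
the dual polynomial, and equality (iii) follows from $\mathbf{A}\mathbf{z}' = \mathbf{A}\mathbf{z}$, by assumption. This
contradicts that $\mathbf{z}'$ is an optimal solution.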
We now turn to the construction of a dual certificate obeying the conditions of
Proposition 1, which concludes the proof of Theorem 7.2. The construction of the dual
certificate $Q$ is inspired by the construction of related certificates in [10,11]. First, recall that
the entries of $\mathbf{f}(\mathbf{r})$, $\mathbf{r} = [\tau, \nu]$, are given by $[\mathbf{f}(\mathbf{r})]_{(k,p)} = e^{-i2\pi(k\tau + p\nu)}$. From
\[
Q(\mathbf{r}) = \langle \mathbf{q}, \mathbf{A}\mathbf{f}(\mathbf{r}) \rangle = \langle \mathbf{A}^H\mathbf{q}, \mathbf{f}(\mathbf{r}) \rangle,
\]
it is seen that $Q$ is a two-dimensional trigonometric polynomial in the variables $\tau$ and
$\nu$ with coefficient vector $\mathbf{A}^H\mathbf{q}$. To build the certificate, we therefore need to construct a
two-dimensional trigonometric polynomial that satisfies the interpolation and
boundedness condition (7.13), and whose coefficients are constrained to be of the form $\mathbf{A}^H\mathbf{q}$. Since
the matrix $\mathbf{A}$ is a function of the random probing signal $\mathbf{x}$, $Q$ is a random trigonometric
polynomial. We construct $Q$ explicitly.
It is instructive to first consider the construction of a deterministic two-dimensional
trigonometric polynomial
\[
\bar Q(\mathbf{r}) = \langle \bar{\mathbf{q}}, \mathbf{f}(\mathbf{r}) \rangle,
\]
with unconstrained, deterministic coefficients $\bar{\mathbf{q}} \in \mathbb{C}^{L^2}$, that satisfies the interpolation
and boundedness conditions (7.13), but whose coefficient vector $\bar{\mathbf{q}}$ is not constrained
to be of the form $\mathbf{A}^H\mathbf{q}$. Such a construction has been established by Candès
and Fernandez-Granda [10, prop. 2.1, prop. C.1] for the one- and two-dimensional case,
provided that the parameters $\{\mathbf{r}_k\}$ obey the minimum separation condition in Definition 7.1.
To construct the polynomial $\bar Q$, [10] interpolate the signs $\{\mathrm{sign}(b_j)\}$ with a fast-decaying
kernel
\[
\bar G(\mathbf{r}) := F(\tau)F(\nu),
\]
and slightly adapt this interpolation near the parameters $\{\mathbf{r}_j\}$ with the partial derivatives
$\bar G^{(n_1,n_2)}(\mathbf{r}) := \frac{\partial^{n_1}}{\partial \tau^{n_1}} \frac{\partial^{n_2}}{\partial \nu^{n_2}} \bar G(\mathbf{r})$ to ensure that local maxima are achieved at
the $\mathbf{r}_j$:
\[
\bar Q(\mathbf{r}) = \sum_{j=1}^{S} \Big[ \bar\alpha_j \bar G(\mathbf{r} - \mathbf{r}_j) + \bar\alpha_{1j} \bar G^{(1,0)}(\mathbf{r} - \mathbf{r}_j) + \bar\alpha_{2j} \bar G^{(0,1)}(\mathbf{r} - \mathbf{r}_j) \Big]. \tag{7.14}
\]
Here, $F$ is the squared Fejér kernel, which is a particular trigonometric polynomial with
coefficients $g_j$, i.e.,
\[
F(t) = \sum_{j=-N}^{N} g_j e^{i2\pi t j}.
\]
Shifted versions of the polynomial $F$ (i.e., $F(t - t_0)$ for some $t_0 \in \mathbb{R}$) and the derivatives
of the polynomial $F$ are also one-dimensional trigonometric polynomials of degree $N$.
Therefore $\bar G$, its partial derivatives, and shifted versions thereof are two-dimensional
trigonometric polynomials of the form $\langle \bar{\mathbf{q}}, \mathbf{f}(\mathbf{r}) \rangle$. The construction of $\bar Q$ is concluded by
showing that the coefficients $\bar\alpha_j$, $\bar\alpha_{1j}$, $\bar\alpha_{2j}$ can be chosen such that $\bar Q$ reaches global
maxima at the parameters $\{\mathbf{r}_j\}$.
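The construction of $Q$ in the paper [14] for proving Theorem 7.2 follows a similar
program. Specifically, the polynomial $Q$ is constructed such that it interpolates the signs
$\mathrm{sign}(b_j)$ at $\mathbf{r}_j$ with the functions $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j) = \langle \mathbf{A}\mathbf{g}_{\mathbf{n}}(\mathbf{r}_j), \mathbf{A}\mathbf{f}(\mathbf{r}) \rangle$. Here, $\mathbf{g}_{\mathbf{n}}(\mathbf{r})$, $\mathbf{n} = (n_1, n_2)$,
is the vector with $(k,p)$-th coefficient given by
\[
g_k g_p (i2\pi k)^{n_1} (i2\pi p)^{n_2} e^{-i2\pi(\tau k + \nu p)},
\]
where the $\{g_j\}$ are the coefficients of the squared Fejér kernel $F$, just defined. With this
definition, we have
\[
\mathbb{E}[G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j)] = \bar G^{\mathbf{n}}(\mathbf{r} - \mathbf{r}_j).
\]
This follows from
\[
\mathbb{E}[\mathbf{A}^H \mathbf{A}] = \mathbf{F}\, \mathbb{E}[\mathbf{G}_x^H \mathbf{G}_x]\, \mathbf{F}^H = \mathbf{I},
\]
where we used $\mathbb{E}[\mathbf{G}_x^H \mathbf{G}_x] = \mathbf{I}$, shown at the end of this section. Moreover, $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_k)$
concentrates around $\bar G^{\mathbf{n}}(\mathbf{r} - \mathbf{r}_k)$.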
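Now, $Q$ is constructed by interpolating the signs $\mathrm{sign}(b_j)$ at $\mathbf{r}_j$ with $G_{(0,0)}(\mathbf{r}, \mathbf{r}_j)$,
$j = 1,\ldots,S$, and slightly adapting this interpolation near the points $\{\mathbf{r}_j\}$ with linear
combinations of the functions $G_{(1,0)}(\mathbf{r}, \mathbf{r}_j)$ and $G_{(0,1)}(\mathbf{r}, \mathbf{r}_j)$, in order to ensure that local
maxima of $Q$ are achieved exactly at the $\mathbf{r}_j$. Specifically, we set
\[
Q(\mathbf{r}) = \sum_{j=1}^{S} \Big[ \alpha_j G_{(0,0)}(\mathbf{r}, \mathbf{r}_j) + \alpha_{1j} G_{(1,0)}(\mathbf{r}, \mathbf{r}_j) + \alpha_{2j} G_{(0,1)}(\mathbf{r}, \mathbf{r}_j) \Big]. \tag{7.15}
\]
Note that $Q(\mathbf{r})$ is a linear combination of the functions $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j)$, and by definition
of $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j)$ it is of the form $\langle \mathbf{A}^H\mathbf{q}, \mathbf{f}(\mathbf{r}) \rangle$ for some $\mathbf{q}$, as desired. The proof is concluded by
showing that, with high probability, there exists a choice of coefficients $\alpha_j$, $\alpha_{1j}$, and
$\alpha_{2j}$ such that $Q$ reaches global maxima at the $\mathbf{r}_j$ and $Q(\mathbf{r}_j) = \mathrm{sign}(b_j)$ for all $j$. For this
argument to work, the particular choice of $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j)$ is crucial; the main ingredients
are that $G_{\mathbf{n}}(\mathbf{r}, \mathbf{r}_j)$ concentrates around $\bar G^{\mathbf{n}}(\mathbf{r} - \mathbf{r}_j)$, and certain
properties of the deterministic functions $\bar G$ and $\bar Q$.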

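Proof of $\mathbb{E}[\mathbf{G}_x^H \mathbf{G}_x] = \mathbf{I}$:
By definition of the Gabor matrix in (7.5), the entry in the $(k,\ell)$-th row and $(k',\ell')$-th
column of $\mathbf{G}_x^H \mathbf{G}_x$ is given by
\[
[\mathbf{G}_x^H \mathbf{G}_x]_{(k,\ell),(k',\ell')} = \sum_{p=-N}^{N} x^*_{p-\ell}\, x_{p-\ell'}\, e^{-i2\pi \frac{kp}{L}} e^{i2\pi \frac{k'p}{L}}.
\]
Noting that $\mathbb{E}[x_\ell] = 0$ and using the independence of the entries of $\mathbf{x}$, we conclude that
$\mathbb{E}[[\mathbf{G}_x^H \mathbf{G}_x]_{(k,\ell),(k',\ell')}] = 0$ for $\ell \ne \ell'$. For
$\ell = \ell'$, using the fact that $\mathbb{E}[x^*_{p-\ell} x_{p-\ell}] = 1/L$, we arrive at
\[
\mathbb{E}[[\mathbf{G}_x^H \mathbf{G}_x]_{(k,\ell),(k',\ell)}] = \frac{1}{L} \sum_{p=-N}^{N} e^{i2\pi \frac{(k'-k)p}{L}}.
\]
The latter is equal to 1 for $k = k'$ and 0 otherwise. This concludes the proof of
$\mathbb{E}[\mathbf{G}_x^H \mathbf{G}_x] = \mathbf{I}$.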
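The identity can be sanity-checked by simulation. The Python/NumPy sketch below forms the Gabor matrix from the entries implied by the proof, namely $x_{p-\ell}\, e^{i2\pi kp/L}$ in row $p$ and column $(k,\ell)$, and averages its Gram matrix over random draws of x. Treating the index $p - \ell$ modulo L (a periodic probing sequence) is an assumption here, since (7.5) is stated outside this passage; the function names, the size L = 5, and the number of trials are likewise illustrative.

```python
import numpy as np

def gabor_matrix(x):
    """L x L^2 matrix with entry x_{p-l} e^{i 2 pi k p / L} in row p and column (k, l)."""
    L = x.size
    N = (L - 1) // 2
    p = np.arange(-N, N + 1)
    cols = []
    for k in range(-N, N + 1):
        for l in range(-N, N + 1):
            # np.roll(x, l) holds x_{p-l} at the position of index p (indices modulo L).
            cols.append(np.roll(x, l) * np.exp(2j * np.pi * k * p / L))
    return np.stack(cols, axis=1)

def average_gram(L, num_trials, seed=0):
    """Monte Carlo estimate of E[G_x^H G_x] for i.i.d. N(0, 1/L) samples."""
    rng = np.random.default_rng(seed)
    acc = np.zeros((L * L, L * L), dtype=complex)
    for _ in range(num_trials):
        x = rng.normal(scale=np.sqrt(1.0 / L), size=L)
        G = gabor_matrix(x)
        acc += G.conj().T @ G
    return acc / num_trials

gram = average_gram(L=5, num_trials=2000)
max_deviation = np.max(np.abs(gram - np.eye(25)))
print(max_deviation)
```

The printed deviation shrinks roughly like one over the square root of the number of trials, as expected for an empirical average.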
7.5.2 Proof of Theorem 7.4

The following proposition, which is standard in the compressed sensing literature (see,
e.g., [3]), states that the existence of a dual polynomial guarantees that L1(y) recovers $\mathbf{b}$
from the measurement $\mathbf{y} = \mathbf{R}\mathbf{b}$. The proposition is the discrete analogue of Proposition 1
from earlier in the chapter.
proposition 2 Let $\mathbf{y} = \mathbf{R}\mathbf{b}$ and let $\mathcal{S}$ denote the support (i.e., the set of indices of the nonzero
elements) of $\mathbf{b}$. Assume that the columns of $\mathbf{R}$ corresponding to $\mathcal{S}$ are linearly independent.
If there exists a vector $\mathbf{v}$ in the row space of $\mathbf{R}$ with
\[
\mathbf{v}_{\mathcal{S}} = \mathrm{sign}(\mathbf{b}_{\mathcal{S}}) \quad \text{and} \quad \|\mathbf{v}_{\mathcal{S}^c}\|_\infty < 1, \tag{7.16}
\]
then $\mathbf{b}$ is the unique minimizer of L1(y). Here, $\mathbf{v}_{\mathcal{S}}$ is the vector consisting of the entries
of $\mathbf{v}$ indexed by $\mathcal{S}$, and likewise $\mathbf{v}_{\mathcal{S}^c}$ consists of the entries not indexed by $\mathcal{S}$, i.e., the
entries indexed by the complement of $\mathcal{S}$.
Theorem 7.4 now follows directly from the existence of the polynomial $Q$ that was
constructed in the previous section. To see this, define $\mathbf{v}$ as $[\mathbf{v}]_{(m,n)} = Q([m/K, n/K])$
and note that $\mathbf{v}$ satisfies (7.16), since $Q([m/K, n/K]) = \mathrm{sign}(b_{(m,n)})$ for $(m,n) \in \mathcal{S}$ and
$|Q([m/K, n/K])| < 1$ for $(m,n) \notin \mathcal{S}$.
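The two conditions in (7.16) can be checked numerically for a given candidate vector. The Python/NumPy sketch below builds a least-squares candidate of the form $\mathbf{v} = \mathbf{R}^H\mathbf{q}$ that interpolates the signs on the support by construction and then tests the bound off the support. This candidate is a generic device and not the certificate constructed in [14]. Reading "row space" as vectors of the form $\mathbf{R}^H\mathbf{q}$, the random stand-in for R, and the function names are all assumptions.

```python
import numpy as np

def complex_sign(v):
    """Entrywise sign v / |v| of nonzero (complex) entries."""
    return v / np.abs(v)

def pre_certificate(R, b):
    """Least-squares candidate v = R^H q with v_S = sign(b_S) on the support S of b."""
    support = np.flatnonzero(b)
    R_S = R[:, support]
    q = R_S @ np.linalg.solve(R_S.conj().T @ R_S, complex_sign(b[support]))
    return R.conj().T @ q, support

def certifies(v, b, support, tol=1e-9):
    """Check both conditions of (7.16) for the candidate v."""
    off_support = np.setdiff1d(np.arange(b.size), support)
    interpolates = np.allclose(v[support], complex_sign(b[support]), atol=tol)
    bounded = np.max(np.abs(v[off_support])) < 1.0
    return bool(interpolates and bounded)

rng = np.random.default_rng(3)
R = rng.normal(size=(40, 120)) / np.sqrt(40)     # stand-in for the matrix R
b = np.zeros(120, dtype=complex)
b[[4, 60, 101]] = [1.0, -2.0, 1.0j]
v, support = pre_certificate(R, b)
print(certifies(v, b, support))
```

If the check succeeds, Proposition 2 applies to this instance. If it fails, nothing can be concluded, since a different vector in the row space might still satisfy (7.16).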
7.6 MIMO Radar
In this section, we discuss super-resolution imaging in the context of MIMO radar.

A MIMO radar uses multiple transmit antennas to send – typically orthogonal – probing
signals simultaneously, and records the reflections from the objects with multiple receive
antennas. As shown in this section, a MIMO radar can thereby, in principle, resolve the
relative angles in addition to the relative distances and velocities of objects with a single
measurement.
To illustrate the principle of a MIMO radar, first consider a radar system with a
single transmit and multiple receive antennas, and consider an object that lies in the
far field of the radar, so that the reflections of the object that arrive at the receive
antennas are essentially parallel, as illustrated in Figure 7.4. To reach two adjacent
receive antennas, the reflection from the object travels distances that differ by
$d_R \sin(\theta)$. Thus, from an estimate of the angle of arrival we can determine the
relative position of an object. The angular resolution that can be achieved, as well as the
number of objects that can be distinguished, increases linearly in the number of receive
antennas.
As a consequence, to double the angular resolution or the number of
distinguishable objects, a MISO radar needs to double its number of receive
antennas. As we discuss next, using multiple transmit antennas in addition to multiple
receive antennas can give a much larger angular resolution from far fewer antennas.
Specifically, by arranging $N_T$ transmit and $N_R$ receive antennas in a particular way
(see Figure 7.5), a MIMO radar can obtain the same resolution obtained by a MISO
(or SIMO) radar with $N_T N_R$ uniformly spaced receive (or transmit) antennas. This is
often called a MIMO virtual array. In this section, we discuss a MIMO radar model,
and show that the fundamental limit for resolving the angle-delay-Doppler triplets is
$(1/(N_T N_R), 1/B, 1/T)$. We furthermore show that this limit can be overcome in the
sense that triplets can be resolved on a much finer grid, provided they are sufficiently
separated.

Figure 7.4 Principle of MISO radar: transmit antennas are marked by × and receive antennas
by a second marker. The transmit antenna sends a probing signal, and the reflections of the
object are received by three receive antennas. Estimating the relative delays of the probing
signal allows estimating the angle of the object relative to the antenna array, in addition to
distance and velocity. (Diagram labels: object, angle $\theta$, receive spacing $d_R$.)

Figure 7.5 Principle of MIMO radar: transmit antennas are marked by × and receive antennas
by a second marker. Throughout, we assume the spacing of the $N_T$ transmit and $N_R$ receive
antennas to be $d_T = \frac{c}{2f_c}$ and $d_R = \frac{cN_T}{2f_c}$, where $f_c$ is the carrier frequency. (Diagram labels:
reflection from object 1, antenna indices $r = 0$, $j = 0$, angle $\theta$, spacings $d_T$ and $d_R$.)
7.6.1 MIMO Signal Model and Problem Statement

We consider a MIMO radar with $N_T$ transmit and $N_R$ receive antennas that are colocated
and lie in a plane along with $S$ objects; see Figure 7.5 for an illustration. The technical
results presented in this section extend to the more general setup where objects lie in
three-dimensional space and the transmit and receive antennas lie in a two-dimensional
plane. We consider the simpler two-dimensional setup since the generalization to three
dimensions is straightforward. As in the previous section, we assume that the objects
are located in the far field of the array. As a consequence, propagating waves appear
planar and the angles between the object and each antenna are (approximately) the
same. We let the transmit and receive antennas be uniformly spaced with spacings
$d_T = \frac{1}{2f_c}$ and $d_R = \frac{N_T}{2f_c}$, respectively, where $f_c$ is the carrier frequency of the probing
signals. This spacing yields a uniformly spaced virtual array with NT NR antennas, and
thus maximizes the number of virtual antennas achievable with NT transmit and NR
receive antennas [34,35]. As explained in Section 7.6.2, the (baseband) signal yr (t) at
continuous time t received by antenna r = 0,. . .,NR − 1, consists of the superposition
of the reflections from the objects of the transmitted probing signals xj (t),j = 0,. . .,
NT − 1, and is given by

\[
y_r(t) = \sum_{k=1}^{S} b_k\, e^{i2\pi r N_T \beta_k} \sum_{j=0}^{N_T-1} e^{i2\pi j \beta_k}\, x_j(t - \bar\tau_k)\, e^{i2\pi \bar\nu_k t}. \tag{7.17}
\]
Here, bk ∈ C is the attenuation factor, βk ∈ [0,1] the angle or azimuth parameter,

and τ̄k and ν̄k are the delay and Doppler shift, all associated with the k-th object.
The parameters βk , τ̄k , ν̄k determine the angle (β = − sin(θ)/2; see Figure 7.5), distance,
and velocity of the k-th object relative to the radar. Locating the object therefore
amounts to estimating the continuous parameters bk ,βk , τ̄k , ν̄k from the responses
yr ,r = 0,. . .,NR − 1, to known and suitably selected probing signals xj .
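
The two phase factors in (7.17) combine to $e^{i2\pi(j + rN_T)\beta_k}$, which is why the chosen spacings produce a uniformly spaced virtual array. The following minimal sketch, with hypothetical helper names and illustrative array sizes that are not taken from the text, checks that the index $j + rN_T$ visits every integer from 0 to $N_T N_R - 1$ exactly once.

```python
import numpy as np

def virtual_array_indices(n_t: int, n_r: int) -> np.ndarray:
    """Return the virtual-element index j + n_t * r for every (r, j) pair."""
    j = np.arange(n_t)  # transmit antenna index
    r = np.arange(n_r)  # receive antenna index
    return (j[None, :] + n_t * r[:, None]).ravel()

def steering_phases(beta: float, n_t: int, n_r: int) -> np.ndarray:
    """Phase factors exp(i 2 pi (j + n_t r) beta) as they appear in (7.17)."""
    return np.exp(2j * np.pi * virtual_array_indices(n_t, n_r) * beta)

# Illustrative sizes (assumptions for this sketch only).
n_t, n_r = 3, 4
idx = virtual_array_indices(n_t, n_r)
phases = steering_phases(0.1, n_t, n_r)
print(sorted(idx.tolist()))
```

The phases therefore coincide with the steering vector of a uniform linear array with $N_T N_R$ elements evaluated at the angle parameter $\beta$.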
As discussed in Section 7.2, due to practical constraints, the probing signals must be
band-limited and approximately time-limited, and the responses to the probing signals
can only be observed over a finite time interval. We make the same assumption on
the probing signals as well as on the received signals as we made in Section 7.2; in
particular, we assume that the received signals yr are observed over an interval of length
T and that the probing signals xj have bandwidth B and are approximately supported on
a time interval proportional to T . As explained in Section 7.2, it follows from the band
limitation of the probing signals xj and the limited observation time of the received
signals yr that the received signals are characterized by the samples

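\[
[\mathbf{y}_r]_p = \sum_{k=1}^{S} b_k\, e^{i2\pi r N_T \beta_k} \sum_{j=0}^{N_T-1} e^{i2\pi j \beta_k}\, [\mathcal{F}_{\nu_k} \mathcal{T}_{\tau_k} \mathbf{x}_j]_p. \tag{7.18}
\]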
Here, the vector yr contains the samples of the received signal yr (t) taken at spacing 1/B (i.e.,
[yr ]p = yr (p/B) for p/B ∈ [−T /2,T /2]), and xj is the vector containing
the samples of the probing signal ([xj ]p := xj (p/B)).
We have reduced the problem of identifying the locations of the objects under the con-
straints that the probing signals xj are band-limited and the responses yr are observed
over a finite time interval only, to the estimation of the parameters bk ∈ C, (βk ,τk ,νk ) ∈
[0,1]^3, k = 1,. . .,S, from the samples [yr ]p, r = 0,. . .,NR − 1, p = −N,. . .,N , in the
input–output relation (7.18). We call this the super-resolution MIMO radar problem.
7.6.2 Derivation of the MIMO Input–Output Relation

In this section, we derive the MIMO input–output relation (7.17). Towards this goal, we
first consider a single object. The j -th antenna transmits the signal xj (t)ei2πfc t , where
fc is the carrier frequency. This signal propagates to the object, which we assume to be
a point scatterer, gets reflected, and propagates back to the r-th receiver. From Figure
7.5, we see that the corresponding delay is, as a function of the angle between antennas
and the object, θ, distance to the object, d, and speed of light, c, given by
\[
\tilde\tau := \frac{2d}{c} + \frac{\sin(\theta)(d_T j + d_R r)}{c} = \bar\tau - \beta\,\frac{2(d_T j + d_R r)}{c}.
\]
For the second equality, we defined the angle parameter $\beta := -\sin(\theta)/2$ and the delay
$\bar\tau := \frac{2d}{c}$. Taking the Doppler shift into account, the reflection of the j -th probing signal
received by the r-th receive antenna is given by
\[
\tilde b\, x_j(t - \tilde\tau)\, e^{i2\pi(f_c + \bar\nu)(t - \tilde\tau)}. \tag{7.19}
\]
Here, $\tilde b \in \mathbb{C}$ is the attenuation factor associated with the object, and $\bar\nu := \frac{2v}{c} f_c$ is the
Doppler shift, which is a function of the relative velocity, $v$, of the object. By choosing
the antenna spacing as $d_T = \frac{c}{2f_c}$ and $d_R = \frac{cN_T}{2f_c}$, the reflection of the j -th probing signal
received by the r-th receive antenna in (7.19) becomes
\[
\tilde b\, x_j(t - \tilde\tau)\, e^{i2\pi(f_c + \bar\nu)(t - \bar\tau)}\, e^{i2\pi(f_c + \bar\nu)\beta \frac{j + rN_T}{f_c}} \approx \tilde b\, x_j(t - \bar\tau)\, e^{i2\pi(f_c + \bar\nu)(t - \bar\tau)}\, e^{i2\pi\beta(j + rN_T)}.
\]
Here, the approximation follows from the Doppler shift $\bar\nu$ being much smaller than the
carrier frequency $f_c$, so that $\frac{f_c + \bar\nu}{f_c} \approx 1$, and from $\tilde\tau \approx \bar\tau$. It follows that the reflection of the
j -th probing signal received by the r-th receive antenna, after demodulation, is
\[
b\, x_j(t - \bar\tau)\, e^{i2\pi\bar\nu t}\, e^{i2\pi\beta(j + rN_T)},
\]
where we defined $b = \tilde b\, e^{-i2\pi\bar\nu\bar\tau}$. Next, consider S objects with parameters $(b_k, \beta_k, \bar\tau_k, \bar\nu_k)$.
Since, for S objects, the (demodulated) signal yr received by antenna r consists
of the superposition of the reflections of the probing signals xj ,j = 0,. . .,NT − 1,
transmitted by the transmit antennas, we obtain the input–output relation in (7.17) simply
by summing over the reflections given by $b_k\, x_j(t - \bar\tau_k)\, e^{i2\pi\bar\nu_k t}\, e^{i2\pi\beta_k(j + rN_T)}$.
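
To get a feeling for how mild the approximations in this derivation are, the following sketch plugs in illustrative numbers. The carrier frequency, velocity, bandwidth, and array size below are assumptions chosen for the example, not values from the text. Reading $\tilde\tau \approx \bar\tau$ as "the delay offset across the array is small compared with the sample spacing 1/B" is likewise one plausible interpretation rather than a statement from the derivation.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_shift(v: float, f_c: float) -> float:
    """Doppler shift nu_bar = 2 v f_c / c for relative velocity v."""
    return 2.0 * v * f_c / C

# Illustrative values (assumptions for this sketch only).
f_c = 10e9       # carrier frequency in Hz
v = 300.0        # relative velocity in m/s
B = 100e6        # bandwidth of the probing signals in Hz
n_t, n_r = 3, 3  # number of transmit and receive antennas

nu_bar = doppler_shift(v, f_c)
ratio = (f_c + nu_bar) / f_c

# tau_tilde - tau_bar = -beta (j + r n_t) / f_c, and |beta| <= 1/2
# because beta = -sin(theta) / 2.
max_delay_offset = 0.5 * (n_t * n_r - 1) / f_c

print(f"Doppler shift: {nu_bar:.1f} Hz")
print(f"(f_c + nu_bar) / f_c - 1 = {ratio - 1.0:.2e}")
print(f"max |tau_tilde - tau_bar| in units of 1/B: {max_delay_offset * B:.2e}")
```

With these numbers the ratio differs from 1 by a few parts per million, and the delay offset across the array is a small fraction of one sample spacing.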
7.6.3 MIMO Atomic Norm Minimization

Recall that our goal is to recover the unknown parameters (bk ,βk ,τk ,νk ) from the mea-
surements {yr }. Toward this goal, we proceed analogously as in Section 7.2.1. We
start by defining for convenience the vector r := [β,τ,ν], and write the input–output
relation (7.18) in matrix-vector form:

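\[
\mathbf{y} = \mathbf{A}\mathbf{z}, \qquad \mathbf{z} = \sum_{k=1}^{S} b_k \mathbf{f}(\mathbf{r}_k). \tag{7.20}
\]
Here $\mathbf{y}$ is obtained by stacking the vectors $\mathbf{y}_0^T,\ldots,\mathbf{y}_{N_R-1}^T$, and the vector $\mathbf{f}(\mathbf{r}) \in \mathbb{C}^{L^2 N_T N_R}$
has entries
\[
[\mathbf{f}(\mathbf{r})]_{(v,k,p)} = e^{i2\pi(v\beta + k\tau + p\nu)}, \quad v = 0,\ldots,N_T N_R - 1, \quad k,p = -N,\ldots,N.
\]
As before, we use for convenience a three-dimensional index to refer to entries
of a vector. Finally, the matrix $\mathbf{A} \in \mathbb{C}^{N_R L \times N_R N_T L^2}$ is defined as follows. The expression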
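\[
w_{r,p} := e^{i2\pi r N_T \beta} \sum_{j=0}^{N_T-1} e^{i2\pi j \beta}\, [\mathcal{F}_{\nu} \mathcal{T}_{\tau} \mathbf{x}_j]_p,
\]
in (7.18) can be written as
\[
w_{r,p} = \sum_{j=0}^{N_T-1} \sum_{k=-N}^{N} a_{p,k,j}\, e^{i2\pi(k\tau + p\nu + (j + N_T r)\beta)},
\]
with
\[
a_{p,k,j} = \frac{1}{L} \sum_{\ell=-N}^{N} [\mathbf{x}_j]_\ell\, e^{i2\pi(\ell - p)\frac{k}{L}}.
\]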
Let $\mathbf{f}_{p,j} \in \mathbb{C}^{L}$ be the vector with kth entry $[\mathbf{f}_{p,j}]_k = a_{p,k,j}$, $k = -N,\ldots,N$,
and let $\mathbf{A}_j \in \mathbb{C}^{L \times L^2}$ be the block-diagonal matrix with $\mathbf{f}_{p,j}^T$ as its pth diagonal block,
$p = -N,\ldots,N$. With this notation, $\mathbf{A}$ is defined as the block-diagonal matrix with the
matrix $[\mathbf{A}_0,\ldots,\mathbf{A}_{N_T-1}] \in \mathbb{C}^{L \times N_T L^2}$ repeated in each of its $N_R$ diagonal blocks.
With this notation, (7.18) becomes (7.20).
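
The block structure of A is easy to get wrong by an index permutation, so a small numerical check can help. The sketch below is an illustration under stated assumptions, not the chapter's implementation: the function names are hypothetical, the problem sizes are tiny, and the atom is flattened in the order (v, p, k) to match the block-diagonal layout. It builds A from the coefficients $a_{p,k,j}$ and compares $\mathbf{A}\mathbf{f}(\mathbf{r})$ with a direct evaluation for a delay on the grid, $\tau = m/L$. For such a delay the expansion above reduces the time shift to a cyclic shift by m samples, which follows from summing the complex exponentials over k.

```python
import numpy as np

def coeffs(x: np.ndarray, N: int) -> np.ndarray:
    """a[p, k] = (1/L) sum_l x[l] exp(i 2 pi (l - p) k / L), indices -N..N."""
    L = 2 * N + 1
    idx = np.arange(-N, N + 1)
    E = np.exp(2j * np.pi * np.outer(idx, idx) / L)  # E[l, k] = e^{i 2 pi l k / L}
    x_hat = x @ E                                    # sum over l for each k
    return x_hat[None, :] * np.conj(E) / L           # multiply by e^{-i 2 pi p k / L}

def build_A(X: np.ndarray, N: int, n_r: int) -> np.ndarray:
    """Block-diagonal A with [A_0, ..., A_{n_t - 1}] repeated n_r times."""
    n_t, L = X.shape
    block = np.zeros((L, n_t * L * L), dtype=complex)
    for j in range(n_t):
        a = coeffs(X[j], N)
        for p in range(L):
            start = j * L * L + p * L          # p-th diagonal block of A_j
            block[p, start:start + L] = a[p]
    return np.kron(np.eye(n_r), block)

def atom(beta, tau, nu, N, n_t, n_r) -> np.ndarray:
    """f(r) with entries exp(i 2 pi (v beta + k tau + p nu)), ordered (v, p, k)."""
    idx = np.arange(-N, N + 1)
    v = np.arange(n_t * n_r)
    phase = (v[:, None, None] * beta
             + idx[None, :, None] * nu
             + idx[None, None, :] * tau)
    return np.exp(2j * np.pi * phase).ravel()

# Tiny illustrative instance (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
N, n_t, n_r = 2, 2, 3
L = 2 * N + 1
X = (rng.standard_normal((n_t, L)) + 1j * rng.standard_normal((n_t, L))) / np.sqrt(2 * n_t * L)
beta, nu, m = 0.3, 0.17, 2
A = build_A(X, N, n_r)
y = A @ atom(beta, m / L, nu, N, n_t, n_r)

# Direct evaluation: cyclic shift by m samples, modulation by nu, array phases.
p = np.arange(-N, N + 1)
inner = sum(np.exp(2j * np.pi * j * beta) * np.exp(2j * np.pi * p * nu) * np.roll(X[j], m)
            for j in range(n_t))
y_direct = np.concatenate([np.exp(2j * np.pi * r * n_t * beta) * inner for r in range(n_r)])
max_err = np.max(np.abs(y - y_direct))
print(A.shape, max_err)
```

The column ordering of A and the flattening order of the atom must agree; any other consistent ordering would work equally well.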
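As for the SISO radar problem, recovery of the unknowns bk ,rk = [βk ,τk ,νk ]
from $\mathbf{z} = \sum_{k=1}^{S} b_k \mathbf{f}(\mathbf{r}_k)$ is a 3D line spectral estimation problem that
can be solved with standard spectral estimation techniques. In order to recover the
vector z from the measurement y, we use that z is a sparse linear combination of atoms
in the set $\mathcal{A} := \{\mathbf{f}(\mathbf{r}), \mathbf{r} \in [0,1]^3\}$, and estimate z by solving the basis pursuit type atomic
norm minimization problem
\[
\mathrm{AN}(\mathbf{y}): \quad \underset{\tilde{\mathbf{z}}}{\text{minimize}} \; \|\tilde{\mathbf{z}}\|_{\mathcal{A}} \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\tilde{\mathbf{z}},
\]
where
\[
\|\mathbf{z}\|_{\mathcal{A}} := \inf_{b_k \in \mathbb{C},\, \mathbf{r}_k \in [0,1]^3} \Big\{ \sum_{k} |b_k| : \mathbf{z} = \sum_{k} b_k \mathbf{f}(\mathbf{r}_k) \Big\}.
\]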
To summarize, as for the SISO radar problem, we estimate the parameters bk ,rk from y
by:
i. solving AN(y) in order to obtain z,
ii. estimating the rk from z by solving the corresponding 3D-line spectral estimation
problem, and

iii. solving the linear system of equations $\mathbf{y} = \sum_{k=1}^{S} b_k \mathbf{A}\mathbf{f}(\mathbf{r}_k)$ for the bk .
7.6.4 Recovery Guarantees for MIMO Atomic Norm Minimization

As before, we take the probing signals to be random by choosing their samples, i.e., the
entries of the xj , as i.i.d. Gaussian (or sub-Gaussian) zero-mean random variables with
variance 1/(NT L). Moreover, we again require a minimum separation condition to be
satisfied.
Definition 7.5 (MIMO minimum separation condition) We say the triplets
(βj ,τj ,νj ) ∈ [0,1]^3, j = 1,. . .,S, satisfy the minimum separation condition if for
all j, j′ with j ≠ j′,
\[
|\beta_j - \beta_{j'}| \ge \frac{10}{N_T N_R - 1} \quad \text{or} \quad |\tau_j - \tau_{j'}| \ge \frac{5}{N} \quad \text{or} \quad |\nu_j - \nu_{j'}| \ge \frac{5}{N}. \tag{7.21}
\]
As before, |τj − τj′ | is the wraparound distance on the unit circle.
Note that the triplets need not be separated in angle, time, and frequency simultaneously;
for the MIMO minimum separation condition to be satisfied, it is sufficient that
they are separated in at least one of those dimensions.
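
The "or" structure of the condition can be made concrete with a short check. The sketch below uses hypothetical function names and applies the wraparound distance to all three coordinates, which is an assumption made here for simplicity; the definition states it explicitly only for the delay.

```python
import numpy as np

def wrap_dist(a, b):
    """Wraparound distance between points of [0, 1) viewed as a circle."""
    d = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) % 1.0
    return np.minimum(d, 1.0 - d)

def satisfies_min_separation(triplets, n_t: int, n_r: int, N: int) -> bool:
    """Check (7.21) for every pair of (beta, tau, nu) triplets."""
    t = np.asarray(triplets, dtype=float)
    for a in range(len(t)):
        for b in range(a + 1, len(t)):
            d_beta, d_tau, d_nu = wrap_dist(t[a], t[b])
            separated = (d_beta >= 10 / (n_t * n_r - 1)
                         or d_tau >= 5 / N
                         or d_nu >= 5 / N)
            if not separated:
                return False
    return True

# Illustrative sizes (assumptions): n_t * n_r = 1024 and L = 2 * 512 + 1.
n_t, n_r, N = 32, 32, 512
far_in_doppler = [(0.1, 0.2, 0.30), (0.1, 0.2, 0.35)]      # separated in nu only
too_close = [(0.1, 0.2, 0.3), (0.101, 0.201, 0.301)]        # close in all three
wrapped = [(0.0, 0.999, 0.5), (0.0, 0.001, 0.5)]            # close across the wraparound
print(satisfies_min_separation(far_in_doppler, n_t, n_r, N))
print(satisfies_min_separation(too_close, n_t, n_r, N))
print(satisfies_min_separation(wrapped, n_t, n_r, N))
```

The first pair passes because a single sufficiently large coordinate difference is enough, while the other two pairs fail in all three coordinates.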
Theorem 7.6 Assume that the samples of the probing signals xj ,j = 0,. . .,NT − 1,
are i.i.d. zero-mean Gaussian random variables with variance 1/(NT L), and let
L = 2N + 1 ≥ 1024 and NT NR ≥ 1024. Consider a signal where the signs of
the attenuation factors $\{b_j\}_{j=1}^{S}$ are i.i.d. uniform on {−1,1} or on the complex unit circle,
and suppose that the triplets $\{(\beta_j,\tau_j,\nu_j)\}_{j=1}^{S}$ obey the MIMO minimum separation
condition. Furthermore, choose δ > 0 and assume that the number of nonzero
attenuation factors, S, obeys
\[
S \le c\, \frac{\min(L, N_T N_R)}{\log^3(L/\delta)}, \tag{7.22}
\]
where c is a numerical constant. Then, with probability at least 1 − δ, $\mathbf{z} = \sum_{k=1}^{S} b_k \mathbf{f}(\mathbf{r}_k)$
is the unique minimizer of AN(y) with y = Az.
Theorem 7.6 guarantees that, with high probability, the attenuation factors and location
parameters can be recovered perfectly from the observation y by solving a convex
program (recall that the parameters bk ,rk can be obtained from z), provided that the
locations rk = [βk ,τk ,νk ] are sufficiently separated in either angle, time, or frequency, and
provided that the total number of objects satisfies condition (7.22). Note that, translated
to the physical parameters τ̄k , ν̄k , the MIMO minimum separation condition becomes:
For all k, k′ with k ≠ k′,
\[
|\beta_k - \beta_{k'}| \ge \frac{10}{N_T N_R - 1} \quad \text{or} \quad |\bar\tau_k - \bar\tau_{k'}| \ge \frac{10.01}{B} \quad \text{or} \quad |\bar\nu_k - \bar\nu_{k'}| \ge \frac{10.01}{T}.
\]
Theorem 7.6 is essentially optimal in the number of objects that can be located, since
S can be linear – up to a log-factor – in min(L,NT NR ), and S ≤ min(L,NT NR ) is a
necessary condition to uniquely recover the attenuation factors bk even if the locations
rk are known. To see this, note that for the linear system of equations (7.20) to have a
unique solution, the vectors Af(rk ) must be linearly independent. If βk = 0 for all k, or
if τk = 0 and νk = 0 for all k, the vectors Af(rk ), rk = (βk ,τk ,νk ), k = 1,. . .,S, can
only be linearly independent provided that S ≤ L and S ≤ NT NR , respectively. This is
seen from
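\[
\mathbf{A}\mathbf{f}(\mathbf{r}) =
\begin{bmatrix}
e^{i2\pi 0 \beta} \sum_{j=0}^{N_T-1} e^{i2\pi j \beta}\, \mathcal{F}_{\nu} \mathcal{T}_{\tau} \mathbf{x}_j \\
\vdots \\
e^{i2\pi N_T (N_R - 1) \beta} \sum_{j=0}^{N_T-1} e^{i2\pi j \beta}\, \mathcal{F}_{\nu} \mathcal{T}_{\tau} \mathbf{x}_j
\end{bmatrix}.
\]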
We finally note that Theorem 7.6 is proven by constructing a dual certificate in a similar
manner to our certificate construction for the SISO case in Section 7.7.
7.6.5 MIMO Super-Resolution Radar on a Fine Grid

As discussed before for the SISO radar setup, a practical approach to estimating the
parameters rk from the received signals is to suppose the angle-time-frequency triplets
lie on a fine grid, and solve the recovery problem on that grid. In general this leads to a
gridding error, which, however, decreases as the grid becomes finer. We next discuss the
corresponding (discrete) sparse signal recovery problem.
Suppose the parameters (βk ,τk ,νk ) lie on a grid with spacing (1/K1,1/K2,1/K3 ),
where K1,K2,K3 are integers obeying K1 ≥ NT NR , K2,K3 ≥ L = 2N + 1. With this
assumption, the super-resolution MIMO radar problem reduces to the recovery of the
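sparse (discrete) signal $\mathbf{b} \in \mathbb{C}^{K_1 K_2 K_3}$ from the measurement
\[
\mathbf{y} = \mathbf{R}\mathbf{b},
\]
where $\mathbf{R} \in \mathbb{C}^{N_R L \times K_1 K_2 K_3}$ is the matrix with $(n_1,n_2,n_3)$-th column given by
\[
\mathbf{A}\mathbf{f}(\mathbf{r}_n), \quad \mathbf{r}_n = (n_1/K_1, n_2/K_2, n_3/K_3).
\]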
Note that the nonzeros of b and their indices correspond to the attenuation factors bk and
the locations rk on the grid. A standard approach to the recovery of the sparse signal b
from the underdetermined linear system of equations y = Rb is to solve the following
convex program:
\[
\mathrm{L1}(\mathbf{y}): \quad \underset{\tilde{\mathbf{b}}}{\text{minimize}} \; \|\tilde{\mathbf{b}}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{R}\tilde{\mathbf{b}}. \tag{7.23}
\]
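
To get a sense of how underdetermined the system y = Rb is, the following sketch counts the rows and columns of R. It assumes the grid sizes are tied to a single refinement factor via $K_1 = \mathrm{SRF} \cdot N_T N_R$ and $K_2 = K_3 = \mathrm{SRF} \cdot L$, which is the choice made in the experiment of Section 7.6.6, and it borrows that experiment's values $N_T = N_R = 3$ and $L = 41$. The function name is hypothetical.

```python
def grid_dimensions(n_t: int, n_r: int, L: int, srf: int):
    """Rows and columns of R for K1 = srf * n_t * n_r and K2 = K3 = srf * L."""
    k1, k2, k3 = srf * n_t * n_r, srf * L, srf * L
    rows = n_r * L
    cols = k1 * k2 * k3
    return rows, cols

sizes = {}
for srf in (1, 2, 3):
    rows, cols = grid_dimensions(3, 3, 41, srf)
    sizes[srf] = (rows, cols)
    # Memory for a dense complex128 matrix, 16 bytes per entry.
    gigabytes = rows * cols * 16 / 1e9
    print(f"SRF={srf}: R is {rows} x {cols}, about {gigabytes:.2f} GB if stored densely")
```

The number of columns grows with the cube of the refinement factor while the number of rows stays fixed, so in practice one would rather apply R through the structure of A than store it explicitly.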
What follows is the main result for recovery on the fine grid.
Theorem 7.7 Assume L = 2N + 1 ≥ 1024, NT NR ≥ 1024, and suppose we
observe y = Rb, where b is a sparse vector with nonzeros indexed by the support set
$\mathcal{S} \subseteq [K_1] \times [K_2] \times [K_3]$, [K] := {0,. . .,K − 1}. Suppose that those indices satisfy the
following minimum separation condition: for all pairs of distinct triplets $(n_1,n_2,n_3), (n_1',n_2',n_3') \in \mathcal{S}$,
\[
\frac{|n_1 - n_1'|}{K_1} \ge \frac{10}{N_T N_R - 1} \quad \text{or} \quad \frac{|n_2 - n_2'|}{K_2} \ge \frac{5}{N} \quad \text{or} \quad \frac{|n_3 - n_3'|}{K_3} \ge \frac{5}{N}.
\]
Moreover, we assume that the signs of the nonzeros of b are chosen independently from
symmetric distributions on the complex unit circle. Choose δ > 0 and assume that the number of nonzeros, $S = |\mathcal{S}|$, obeys
\[
S \le c\, \frac{\min(L, N_T N_R)}{\log^3(L/\delta)},
\]
where c is a numerical constant. Then, with probability at least 1 − δ, b is the unique
minimizer of L1(y) in (7.23).
Note that Theorem 7.7 does not impose any restriction on K1,K2,K3 ; in particular,
they can be arbitrarily large. The proof of Theorem 7.7 is closely linked to that of
Theorem 7.6; specifically, similarly to the SISO case, the existence of a certain dual
certificate guarantees that b is the unique minimizer of L1(y). The dual certificate is
obtained directly from the dual certificate for the continuous case, which, as mentioned
before, is constructed for the MIMO case in the same way as the certificate for the SISO
case is constructed in Section 7.7.
7.6.6 Numerical Results and Robustness

Paralleling the discussion in Section 7.4.2 for SISO radar, we next briefly evaluate
numerically the resolution obtained by the ℓ1-norm minimization approach to
super-resolution in a MIMO radar, and demonstrate robustness to noise. We discuss a
synthetic experiment from [15]. In that experiment, a problem instance was generated
by setting NT = 3, NR = 3, L = 41, and S = 5. Object locations (βk ,τk ,νk )
were drawn uniformly at random from $[0,1] \times [0, 2/\sqrt{L}]^2$. Moreover, we choose
$K_1 = \mathrm{SRF} \cdot N_T N_R$, $K_2 = \mathrm{SRF} \cdot L$, and $K_3 = \mathrm{SRF} \cdot L$, where SRF ≥ 1 can be interpreted as
a super-resolution factor as it determines by how much the (1/K1,1/K2,1/K3 ) grid is
finer than the original, coarse grid (1/(NT NR ),1/L,1/L). To account for additive noise,
as before, we solve the following modification of L1(y) in (7.23):
\[
\text{L1-ERR}: \quad \underset{\tilde{\mathbf{b}}}{\text{minimize}} \; \|\tilde{\mathbf{b}}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{R}\tilde{\mathbf{b}}\|_2^2 \le \delta,
\]
with δ chosen on the order of the noise variance. There are two error sources incurred
by this approach: the gridding error obtained by assuming the points lie on a grid
with spacing (1/K1,1/K2,1/K3 ), which decreases in SRF and becomes negligible, and
the additive noise error, which is constant. The results of the simulations, depicted in
Figure 7.6, show that the object resolution of the super-resolution approach is signifi-
cantly better than that of the compressed sensing-based approach [7,8] corresponding to
recovery on the coarse grid, i.e., SRF = 1. Moreover, the results show that the approach
is robust to noise and that even under noise, the localization accuracy is significantly
improved over a standard approach to radar.

Figure 7.6 Resolution error (vertical axis) versus super-resolution factor SRF (horizontal axis, 1 to 6) for the recovery of S = 5 objects from the samples y, with curves for SNR = 5 dB, 10 dB, 20 dB, and the noiseless case. The additive Gaussian noise n has signal-to-noise ratio $\mathrm{SNR} = \|\mathbf{y}\|_2^2 / \|\mathbf{n}\|_2^2$. The resolution error is defined as the average over $(N_T^2 N_R^2 (\hat\beta_k - \beta_k)^2 + L^2 (\hat\tau_k - \tau_k)^2 + L^2 (\hat\nu_k - \nu_k)^2)^{1/2}$, k = 1,. . .,S, where $(\hat\beta_k, \hat\tau_k, \hat\nu_k)$ are the locations obtained by solving L1-ERR.

Figure 7.7 Resolution error (vertical axis; smaller is better) versus SNR in dB (horizontal axis, 30 down to 10) of L1-ERR and IAA applied to y + n, where $\mathbf{n} \in \mathbb{C}^{N_R L}$ is additive Gaussian noise such that the signal-to-noise ratio is $\mathrm{SNR} := \|\mathbf{y}\|_2^2 / \|\mathbf{n}\|_2^2$. As before, the resolution error is defined as $(N_T^2 N_R^2 (\hat\beta_k - \beta_k)^2 + L^2 (\hat\tau_k - \tau_k)^2 + L^2 (\hat\nu_k - \nu_k)^2)^{1/2}$, where $(\hat\beta_k, \hat\tau_k, \hat\nu_k)$ are the locations obtained by solving L1-ERR.
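
The resolution error used in Figures 7.6 and 7.7 weights each coordinate by the inverse of the coarse grid spacing, so that an error of 1 corresponds to one coarse grid cell. A minimal sketch of this metric follows. It assumes that estimated and true locations have already been matched one to one and ignores wraparound; both are simplifications made here, not details given in the text, and the function name is hypothetical.

```python
import numpy as np

def resolution_error(est, true, n_t: int, n_r: int, L: int) -> float:
    """Average over k of the scaled Euclidean distance between matched triplets."""
    est = np.asarray(est, dtype=float)
    true = np.asarray(true, dtype=float)
    scale = np.array([n_t * n_r, L, L], dtype=float)  # inverse coarse grid spacings
    per_object = np.sqrt(np.sum((scale * (est - true)) ** 2, axis=1))
    return float(np.mean(per_object))

# Illustrative check with the sizes of the experiment (N_T = N_R = 3, L = 41).
true = [[0.1, 0.2, 0.3]]
off_by_one_cell = [[0.1 + 1 / 9, 0.2, 0.3]]   # one coarse cell off in angle
err_zero = resolution_error(true, true, 3, 3, 41)
err_cell = resolution_error(off_by_one_cell, true, 3, 3, 41)
print(err_zero, err_cell)
```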
We next compare our approach to the iterative adaptive approach (IAA) [36], pro-
posed for MIMO radar in [37]. IAA is based on weighted least squares and has been
proposed in the array processing literature. IAA can work well even with only one
snapshot and can therefore be directly applied to the MIMO super-resolution problem.
However, to the best of our knowledge, no analytical performance guarantees are
available in the literature showing that IAA performs similarly to the ℓ1-minimization-
based approach. We compare the IAA algorithm [36, table II, entitled “The IAA-APES
Algorithm”] to L1-ERR, for a problem with parameters NT = 3,NR = 3, and L = 41,
as before, but with SRF = 3 and (βk ,τk ,νk ) = (k/(NR NT ),k/L,k/L),k = 1,. . .,S, so
that the location parameters lie on the fine grid and are separated. As before, we draw
the corresponding attenuation factors bk i.i.d. uniformly at random from the complex
unit disc. Our results, depicted in Figure 7.7, show that L1-ERR performs better in this
experiment than IAA, in particular for small signal-to-noise ratios.
7.7 Discussion and Current and Future Research Directions
In this section we discuss a class of signal recovery problems that are closely related
to the SISO and MIMO radar problem, corresponding results, and open theoretical
research problems, and comment on computational challenges in applying the methods
discussed here in practical radar systems.
The SISO and MIMO radar problems discussed in this chapter are versions of a
more general problem, namely that of recovering a signal that is sparse in a continu-
ously indexed dictionary, with the index corresponding to the locations and velocities of
objects. In contrast, traditional compressive sensing research has focused on the recovery
of signals that are sparse in discretely indexed dictionaries via convex programs [3],
amongst other methods. As discussed in this chapter in the context of radar, signals
that are sparse in continuously indexed dictionaries can be recovered via a convex
program either by solving an atomic norm minimization problem, or by discretizing
the continuous parameter space. However, the discretization step induces a gridding
error. In practice, the gridding error is negligible provided the grid is chosen
sufficiently fine, but fine discretization leads to dictionaries with extremely correlated,
i.e., coherent, columns. The theory of compressive sensing, and many practical
algorithms, rely on the dictionary being incoherent and therefore do not apply to
fine grids. The primary difficulty with recovering signals in such dictionaries is that
the elements of the dictionaries are very close to each other, which is the case both
for continuously indexed dictionaries and for finely discretized ones. Stable
recovery of signals that are sparse in such dictionaries requires excluding signals that
are supported on elements of the dictionary that are very close to each other. Such signals
are excluded here by imposing the minimum separation condition.
More specifically, the SISO and MIMO radar problems belong to a class of signal
recovery problems where the goal is to recover unknown coefficients {bj } and location
parameters {rj } from the measurement

\[
\mathbf{y} = \sum_{j=1}^{S} b_j \mathbf{A}\mathbf{f}(\mathbf{r}_j).
\]
Here, f(r) is a vector containing complex exponentials, and r is a d-dimensional
location parameter. For example, if r is a one-dimensional location parameter τ, then
$[\mathbf{f}(\mathbf{r})]_r = e^{-i2\pi r\tau}$, and if r is a two-dimensional location vector (τ, ν), as in the SISO radar
problem, then $[\mathbf{f}(\mathbf{r})]_{(r,q)} = e^{-i2\pi(r\tau + q\nu)}$. The matrix A is a problem-dependent matrix
that parameterizes the dictionary; in the SISO case it is equal to $\mathbf{A} = \mathbf{G}_x \mathbf{F}^H$; see (7.7). In
other words, radar signals are sparse in a continuously indexed dictionary that is parameterized
by a matrix A. We hasten to add that there are a number of interesting signal recovery
problems in continuously indexed dictionaries that are not of this particular form: the
deconvolution problem in the paper [38] is such an example, and the computational
imaging problem in [39] is another.
7.7.1 Stability to Noise

In this section, we discuss analytical results on the stability to noise of the atomic norm
minimization framework. While currently there are no formal results for the SISO and
MIMO radar problem, we discuss statements pertaining to two closely related problems.
The first is the classical line spectral estimation problem where A is the identity matrix,
and the second is a generalized line spectral estimation problem, where A is a Gaussian
random matrix.
As mentioned before, for A = I, the sparse recovery problem reduces to the clas-
sical line spectral estimation problem studied for the noisy and noiseless case in [10,
12,40,41]. This problem is well understood, and atomic norm minimization succeeds
under very general conditions. Specifically, the paper [10] shows that a convex program
provably recovers the coefficients {bj } and location parameters {rj } perfectly, provided
that the minimum separation condition holds (see Section 7.3.2). While standard spec-
tral estimation techniques such as Prony’s method, MUSIC, and ESPRIT [9] also prov-
ably succeed for the noiseless case, even without requiring a separation condition, an
advantage of the convex program is that it does not require knowledge of S, and perhaps
more importantly is provably robust [12,40,42].
Specifically, Tang et al. [42] show that the atomic norm regularized least squares
estimate enables near-optimal denoising of z from a noisy measurement y = z + e,
where z is a signal of the form $\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\nu_j)$, and e is zero-mean Gaussian noise
with covariance $\sigma^2 \mathbf{I}$. Provided the minimum separation condition holds, the
atomic norm regularized least squares estimate ẑ obeys
\[
\|\hat{\mathbf{z}} - \mathbf{z}\|_2^2 \le c\,\sigma^2 S \log(L),
\]
with high probability. This result is essentially optimal; to see this, note that even if we
knew the location parameters {νj } exactly, the best bound we could achieve would be
$\sigma^2 S$ [43], which is smaller than the bound above only by a logarithmic factor. In addition, the paper [42]
shows that the corresponding estimator localizes the frequencies up to a certain (small)
error, provided that the number of samples is sufficiently large.
Next, suppose that A is an M × L Gaussian random matrix with i.i.d.
$\mathcal{N}(0,1/M) + i\mathcal{N}(0,1/M)$ entries, with M typically much smaller than L. Assume we
are given a noisy measurement y = Az + e, with z a signal of the form $\mathbf{z} = \sum_{j=1}^{S} b_j \mathbf{f}(\nu_j)$
and e noise. To estimate the signal z from such a measurement one can use an atomic
norm optimization problem of the form
\[
\hat{\mathbf{z}} := \arg\min_{\bar{\mathbf{z}}} \; \frac{1}{2} \|\mathbf{y} - \mathbf{A}\bar{\mathbf{z}}\|_2^2 \quad \text{subject to} \quad \|\bar{\mathbf{z}}\|_{\mathcal{A}} \le \tau, \tag{7.24}
\]
with τ a tuning parameter. Theorem 4 in the paper [16] shows that, as long as
\[
M \ge c\,S \log(L),
\]
with c a fixed numerical constant, the minimizer of (7.24) with $\tau = \|\mathbf{z}\|_{\mathcal{A}}$ obeys
\[
\|\hat{\mathbf{z}} - \mathbf{z}\|_2^2 \le c\,S \log(L)\,\sigma^2,
\]
with high probability. Again, this result is essentially optimal both with respect to the
number of measurements M required relative to the number of unknowns S, and
with respect to the best bound we can achieve for estimation under noise. Moreover,
these results do not make any assumptions on the coefficients bj . Unfortunately, the
corresponding proof strategy does not carry over to the case where A is a structured
random matrix, as in the SISO and MIMO radar problems considered here. However,
numerical simulations suggest that for a number of structured random matrices, including
the matrices parametrizing the SISO and MIMO radar problems, the performance of
the atomic norm minimization program, as well as of ℓ1-norm regularized least squares,
is similar.
7.7.2 Computational Challenges

A challenge in applying the convex optimization-based super-resolution methods in the
context of radar is their computational complexity. Specifically, in the context of SISO
radar, if we estimate the time–frequency shifts based on solving (the dual of) the atomic
norm minimization problem, then the optimization variable of the corresponding convex
program has dimensions $L^2 \times L^2$. Thus, an algorithm that solves or approximates the
atomic norm minimization problem has computational complexity at least on the order of $L^4$, which
is infeasible for real-world problems. As discussed in Section 7.4, what comes to our
rescue is that in practice, we can solve the super-resolution radar problem on a fine
grid, and recover the signal by solving an ℓ1-minimization problem. The complexity of
numerically solving the corresponding program with a standard iterative algorithm such
as FISTA [44] depends on the dimension of the matrix (determined by the problem size,
i.e., BT and the number of antennas), as well as the corresponding super-resolution factor
(see Section 7.4.1) and the conditioning of the matrices involved. Increasing the super-
resolution factor leads both to a larger problem size (the number of columns in the
SISO radar problem increases quadratically in the super-resolution factor), which results
in a larger iteration complexity, and in general to slower convergence of the
iterative algorithm, since the conditioning of the matrices involved becomes worse. Thus,
an interesting research direction is to develop computationally efficient algorithms for
the recovery of signals in continuously indexed dictionaries in general, and for the SISO
and MIMO radar problem in particular. See the papers [45,46] for some recent work in
this direction.
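
As a point of reference for the discussion above, here is a minimal FISTA sketch for the ℓ1-regularized least-squares problem $\min_{\mathbf{b}} \frac{1}{2}\|\mathbf{y} - \mathbf{R}\mathbf{b}\|_2^2 + \lambda\|\mathbf{b}\|_1$ with complex data. This is the Lagrangian counterpart of the constrained programs in this chapter, not the constrained form itself. The demonstration uses a generic random complex Gaussian matrix as a stand-in for the radar matrix R, and the function names, sizes, and the value of λ are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v: np.ndarray, t: float) -> np.ndarray:
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(v)
    return v * np.maximum(1.0 - t / np.maximum(mag, 1e-15), 0.0)

def fista_l1(R: np.ndarray, y: np.ndarray, lam: float, n_iter: int = 500) -> np.ndarray:
    """Minimize 0.5 * ||y - R b||_2^2 + lam * ||b||_1 with FISTA."""
    step = 1.0 / np.linalg.norm(R, 2) ** 2   # 1 / Lipschitz constant of the gradient
    b = np.zeros(R.shape[1], dtype=complex)
    w, t = b.copy(), 1.0
    for _ in range(n_iter):
        grad = R.conj().T @ (R @ w - y)       # gradient of the smooth term at w
        b_next = soft_threshold(w - step * grad, step * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = b_next + ((t - 1.0) / t_next) * (b_next - b)   # momentum step
        b, t = b_next, t_next
    return b

# Stand-in problem (assumption): 40 x 80 complex Gaussian matrix, 3 nonzeros.
rng = np.random.default_rng(0)
M, K = 40, 80
R = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2 * M)
support = np.array([5, 30, 62])
b_true = np.zeros(K, dtype=complex)
b_true[support] = np.exp(2j * np.pi * rng.random(3))   # unit-magnitude coefficients
y = R @ b_true
b_hat = fista_l1(R, y, lam=0.01)
recovered = np.sort(np.argsort(np.abs(b_hat))[-3:])
print(recovered, np.linalg.norm(b_hat - b_true))
```

Each iteration costs two matrix-vector products with R, and the step size is set by the largest singular value of R, which is where both the growth in columns and the worsening conditioning on finer grids enter.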
References
[1] D. W. Bliss and K. W. Forsythe, “Multiple-input multiple-output (MIMO) radar and
imaging,” in Asilomar Conf. on Signals, Syst. and Comput., 2003, pp. 54–59.
[2] J. Li and P. Stoica, “MIMO radar with colocated antennas,” IEEE Signal Process. Mag.,
vol. 24, no. 5, pp. 106–114, 2007.
[3] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal
reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory,
vol. 52, no. 2, pp. 489–509, 2006.
[4] M. Herman and T. Strohmer, “High-resolution radar via compressed sensing,” IEEE Trans.
Signal Process., vol. 57, no. 6, pp. 2275–2284, 2009.
[5] R. Baraniuk and P. Steeghs, “Compressive radar imaging,” in IEEE Radar Conf., 2007,
pp. 128–133.
[6] R. Heckel and H. Bölcskei, “Identification of sparse linear operators,” IEEE Trans. Inf.
Theory, vol. 59, no. 12, pp. 7985–8000, 2013.
[7] D. Dorsch and H. Rauhut, “Refined analysis of sparse MIMO radar,” J. Fourier Anal. Appl.,
vol. 23, 2017.
[8] T. Strohmer and H. Wang, “Adventures in compressive sensing based MIMO radar,” in
Excursions in Harm. Anal., ser. Appl. Num. Harm. Anal., 2015, pp. 285–326.
[9] P. Stoica and R. L. Moses, Spectral Analysis of Signals. Pearson Prentice Hall, 2005.
[10] E. J. Candès and C. Fernandez-Granda, “Towards a mathematical theory of super-resolution,”
Comm. Pure Appl. Math., vol. 67, no. 6, pp. 906–956, 2014.
[11] G. Tang, B. N. Bhaskar, P. Shah, and B. Recht, “Compressed sensing off the grid,” IEEE
Trans. Inform. Theory, vol. 59, no. 11, pp. 7465–7490, 2013.
[12] B. N. Bhaskar, G. Tang, and B. Recht, “Atomic norm denoising with applications to line
spectral estimation,” IEEE Trans. Signal Process., vol. 61, no. 23, pp. 5987–5999, 2013.
[13] C. Aubel, D. Stotz, and H. Bölcskei, “A theory of super-resolution from short-time Fourier
transform measurements,” J. Fourier Anal. Appl., vol. 24, 2018.
[14] R. Heckel, V. I. Morgenshtern, and M. Soltanolkotabi, “Super-resolution radar,” Inf.
Inference, vol. 5, no. 1, pp. 22–75, 2016.
[15] R. Heckel, “Super-resolution MIMO radar,” in IEEE International Symposium on Information
Theory, 2016, pp. 1416–1420.
[16] R. Heckel and M. Soltanolkotabi, “Generalized line spectral estimation via convex optimiza-
tion,” IEEE Trans. Inf. Theory, vol. 64, pp. 4001–4023, 2017.
[17] T. Strohmer, “Pseudodifferential operators and Banach algebras in mobile communications,”
Appl. Comput. Harmon. Anal., vol. 20, no. 2, pp. 237–249, 2006.
[18] G. Tauböck, F. Hlawatsch, D. Eiwen, and H. Rauhut, “Compressive estimation of doubly
selective channels in multicarrier systems,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 2,
pp. 255–271, 2010.
[19] W. Bajwa, A. Sayeed, and R. Nowak, “Learning sparse doubly-selective channels,” in
Proc. of 46th Allerton Conf. on Commun., Control, and Comput., Monticello, IL, 2008,
pp. 575–582.
[20] W. U. Bajwa, K. Gedalyahu, and Y. C. Eldar, “Identification of parametric underspread
linear systems and super-resolution radar,” IEEE Trans. Signal Process., vol. 59, no. 6,
pp. 2548–2561, 2011.
[21] D. Slepian, “On bandwidth,” Proc. IEEE, vol. 64, no. 3, pp. 292–300, 1976.
[22] F. Krahmer, S. Mendelson, and H. Rauhut, “Suprema of chaos processes and the restricted
isometry property,” Commun. Pur. Appl. Math., vol. 67, no. 11, pp. 1877–1904, 2014.
[23] A. Gershman and N. Sidiropoulos, Eds., Space-Time Processing for MIMO Communica-
tions. John Wiley & Sons, 2005.
[24] V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky, “The convex geometry of
linear inverse problems,” Found. Comput. Math., vol. 12, no. 6, pp. 805–849, 2012.
[25] Z. Yang, L. Xie, and P. Stoica, “Vandermonde decomposition of multilevel Toeplitz matrices
with application to multidimensional super-resolution,” IEEE Trans. Inform. Theory, vol. 62,
2015.
[26] P. A. Bello, “Characterization of randomly time-variant linear channels,” IEEE Trans.
Commun. Syst., vol. 11, no. 4, pp. 360–393, 1963.
[27] D. L. Donoho, “Superresolution via sparsity constraints,” SIAM J. on Math. Anal., vol. 23,
no. 5, pp. 1309–1331, 1992.
[28] A. Moitra, “Super-resolution, extremal functions and the condition number of Vandermonde
matrices,” in Proc. of the Forty-seventh Annual ACM Symposium on Theory of Computing,
2015, pp. 821–830.
[29] A. Quinquis, E. Radoi, and F. C. Totir, “Some radar imagery results using superresolution
techniques,” IEEE Trans. Antennas Propag., vol. 52, no. 5, pp. 1230–1244, 2004.
[30] A. Jakobsson, A. L. Swindlehurst, and P. Stoica, “Subspace-based estimation of time delays
and Doppler shifts,” IEEE Trans. Signal Process., vol. 46, no. 9, pp. 2472–2483, 1998.
[31] G. Tang, B. N. Bhaskar, and B. Recht, “Sparse recovery over continuous dictionaries-just
discretize,” in Asilomar Conf. on Signals, Syst. and Comput., Pacific Grove, CA, 2013,
pp. 1043–1047.
[32] S. R. Becker, E. J. Candès, and M. C. Grant, “Templates for convex cone problems with
applications to sparse signal recovery,” Math. Prog. Comp., vol. 3, no. 3, pp. 165–218, 2011.
[33] E. van den Berg and M. P. Friedlander, “Probing the Pareto frontier for basis pursuit
solutions,” SIAM J. Sci. Comput., vol. 31, no. 2, pp. 890–912, 2008.
[34] B. Friedlander, “On the relationship between MIMO and SIMO radars,” IEEE Trans. Signal
Process., vol. 57, no. 1, pp. 394–398, Jan. 2009.
[35] T. Strohmer and B. Friedlander, “Analysis of sparse MIMO radar,” Appl. Comput. Harm.
Anal., vol. 37, no. 3, pp. 361–388, 2014.
[36] T. Yardibi, J. Li, P. Stoica, M. Xue, and A. B. Baggeroer, “Source localization and sensing:
A nonparametric iterative adaptive approach based on weighted least squares,” IEEE Trans.
Aerosp. Electron. Syst., vol. 46, no. 1, pp. 425–443, 2010.
[37] W. Roberts, P. Stoica, J. Li, T. Yardibi, and F. Sadjadi, “Iterative adaptive approaches to
MIMO radar imaging,” IEEE J. Sel. Topics Signal Process, vol. 4, no. 1, pp. 5–20, 2010.
[38] B. Bernstein and C. Fernandez-Granda, “Deconvolution of point sources: A sampling
theorem and robustness guarantees,” Commun. Pur. Appl. Math., to appear, 2017.
[39] N. Antipa, G. Kuo, R. Heckel et al., “DiffuserCam: Lensless single-exposure 3D imaging,”
Optica, vol. 5, no. 1, p. 19, 2018.
[40] E. J. Candès and C. Fernandez-Granda, “Super-resolution from noisy data,” J. Fourier Anal.
Appl., vol. 19, no. 6, pp. 1229–1254, 2014.
[41] C. Fernandez-Granda, “Super-resolution of point sources via convex programming,” Infor-
mation and Inference, vol. 5, pp. 251–303, 2016.
[42] G. Tang, B. N. Bhaskar, and B. Recht, “Near minimax line spectral estimation,” IEEE Trans.
Inf. Theory, vol. 61, no. 1, pp. 499–512, 2015.
[43] F. Bunea, A. Tsybakov, and M. Wegkamp, “Sparsity oracle inequalities for the Lasso,”
Electron. J. Stat., vol. 1, pp. 169–194, 2007.
[44] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear
inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
[45] N. Boyd, G. Schiebinger, and B. Recht, “The alternating descent conditional gradient
method for sparse inverse problems,” SIAM J. Optimiz., vol. 27, no. 2, pp. 616–639, 2017.
[46] N. Rao, P. Shah, and S. Wright, “Forward-backward greedy algorithms for atomic norm
regularization,” IEEE Trans. Signal Process., vol. 63, no. 21, pp. 5798–5811, 2015.
8 Adaptive Beamforming via
Sparsity-Based Reconstruction
of Covariance Matrix
Yujie Gu, Nathan A. Goodman, and Yimin D. Zhang
Traditional adaptive beamformers are very sensitive to model mismatch, especially
when the training samples for adaptive beamformer design are contaminated by the
desired signal. In this chapter, we reconstruct a signal-free interference-plus-noise
covariance matrix for adaptive beamformer design. Exploiting the sparsity of sources,
the interference covariance matrix can be reconstructed as a weighted sum of the outer
products of the interference steering vectors, and the corresponding parameters can be
estimated from a sparsity-constrained covariance matrix fitting problem. In contrast
to classical compressive sensing and sparse reconstruction techniques, the sparsity-
constrained covariance matrix fitting problem can be effectively solved as a modified
least-squares solution by using the a priori information on the array structure. Extensive
simulation results demonstrate that the proposed adaptive beamformer almost always
provides near-optimal output performance, regardless of the input signal power.
8.1 Introduction
Adaptive beamforming is an effective spatial filtering technique that adjusts the beam-
forming weight vector to increase the strength of the signal of interest while suppress-
ing interference and noise. As a ubiquitous task in array signal processing, adaptive
beamforming has been widely used in radar, sonar, wireless communications, radio
astronomy, seismology, speech processing, medical imaging, and many other areas (see,
for example, [1–5] and the references therein). Unlike conventional data-independent
beamformers (e.g., fixed or switched beamformers), adaptive beamformers depend on
the array received data and hence are expected to provide better capabilities for inter-
ference suppression and signal enhancement. Nevertheless, it is also well known that
adaptive beamformers are extremely sensitive to model mismatch, especially when the
training samples used for the calculation of the beamforming weight are contaminated
by the desired signal. In practice, such model mismatch commonly occurs. For example,
the data covariance matrix cannot be accurately estimated due to the limited number of
training samples, and the steering vector of the desired signal may also be imprecise
or even unknown due to look direction error, imperfect calibration, and other effects.
Whenever model mismatches exist, classical adaptive beamformers (e.g., the Capon
beamformer [6]) suffer severe performance degradation. Consequently, adaptive
beamformer design with robustness against model mismatch has been an intensive research
topic in the past decades, and various robust adaptive beamforming techniques have
been proposed (see, for example, [4,7,8] and the references therein). Based on the
principle of adaptive beamforming, these robust adaptive beamformers can be classified
into two major categories.
In the first category, robust adaptive beamforming techniques process the sample
covariance matrix, because the exact interference-plus-noise covariance matrix is usu-
ally unavailable in practical applications. The sample covariance matrix is a maximum
likelihood estimate of the data covariance matrix, and thus leads to the optimal output of
the resulting adaptive beamformer when the sample size tends to infinity. Unfortunately,
the sample size is often limited in practice, thus resulting in significant performance
degradation, especially when the desired signal is present in the training samples [9,10].
The most popular robust beamforming technique in this category is the diagonal loading
technique [9,11–13], which adds a scaled identity matrix to the sample covariance
matrix to reduce its condition number. A major problem with diagonal loading is that
there is no clear rule to choose the optimal diagonal loading factor in different scenarios.
In order to choose the diagonal loading factor adaptively, rather than in an ad hoc way,
several user parameter–free adaptive beamforming algorithms were proposed (see, for
example, [14] and the references therein). The shrinkage estimation approach [15] in the
sense of minimizing mean squared error (MSE) can automatically compute the diagonal
loading levels without the need to specify any user parameters. However, this approach
leads to an estimate of the statistical covariance matrix of the array received data rather
than the required interference-plus-noise covariance matrix. In such a case, the perfor-
mance degradation becomes severe with the increase of the desired signal power, even
when the desired signal steering vector is exactly known. The eigenspace decomposition
technique [16,17] is another popular approach for robust adaptive beamforming that is
applicable to an arbitrary steering vector mismatch case. The key idea of this technique
is to use the projection of the presumed steering vector onto the sample signal-plus-
interference subspace. This approach requires the knowledge of the dimension of the
signal-plus-interference subspace. It is known that this approach suffers severe
performance degradation from subspace swaps¹ when the signal-to-noise ratio (SNR) is
low [18,19]. It also suffers from the signal self-nulling problem, especially at high SNR
levels. A sparsity-based iterative adaptive approach (IAA) [20] can iteratively update
the spatial power estimates in the whole observation field and subsequently update
the covariance matrix used for adaptive beamformer design. Although it does improve
the power estimate, the IAA beamforming algorithm is not robust against direction-of-
arrival (DOA) mismatch because its weight is simply that of the scanning grid point
corresponding to the assumed DOA of the desired signal.
¹ A subspace swap occurs when the measured data is better approximated by some components of the noise
subspace than by some components of the signal subspace, i.e., there is a switch of vectors between the
estimated signal and noise subspaces.

In the second category, robust adaptive beamforming techniques process the
presumed desired signal steering vector because the exact knowledge of the steering vector
is not easy to obtain in practice. In practical situations, steering vector mismatch can
easily occur due to look-direction errors [21,22] or imperfect array calibration and
distorted antenna shape [23]. Besides these, other common causes leading to steering vector
mismatches include array manifold mismodeling because of source wavefront distor-
tions that result from environmental inhomogeneities [24,25], near-far problem [26],
source spreading and local scattering [27–30], as well as other effects [10]. In this
category, the linearly constrained minimum variance (LCMV) beamformer [31] is most
commonly used. It provides robustness against uncertainty in the signal look direction
by broadening the main lobe of the beampattern. However, the additional imposed
constraints reduce the degrees of freedom (DOFs) of the resulting adaptive beamformer.
More importantly, the LCMV beamformer becomes less robust when any other types
of steering vector mismatch beyond the look-direction errors become dominant. To
improve the robustness of adaptive beamformers against arbitrary unknown steering
vector mismatches, the worst-case performance optimization-based technique [32–34]
makes explicit use of an uncertainty set of the signal steering vector. This method
requires that the upper bound of the norm of the mismatch vector be known a priori.
Moreover, the worst operating conditions may not always occur. Hence, this adaptive
beamforming technique is also an ad hoc approach and will suffer from performance
degradation whenever the upper bound of the norm of the mismatch vector is either
overestimated or underestimated. Another representative technique in this category is to
estimate the desired signal steering vector by maximizing the beamformer output, under
the constraint that the convergence of the steering vector estimate to any interference
steering vector or their combinations is prohibited [35,36]. However, the imposed norm
constraint on the steering vector is too strict to be satisfied, particularly when there exists
local scattering, as encountered in, e.g., mobile communications and indoor speech signal
processing. In such cases, gain perturbations in different sensors cannot be ignored, and
then the norm constraint no longer holds.
As mentioned previously, these two categories of adaptive beamforming techniques
were developed almost independently in the past four decades. Obviously, these adaptive
beamforming techniques are not optimal, because they respectively assume that either
the desired signal steering vector or the interference-plus-noise covariance matrix is
exactly known. Since the pioneering work of Vorobyov et al. [37], adaptive beamform-
ing has been required to be jointly robust against covariance matrix uncertainty and
steering vector mismatch [38–42]. It is worth noting that, in [40], the interference-plus-
noise covariance matrix is reconstructed by integrating the outer products of interfer-
ence steering vectors weighted by the Capon spatial spectrum over a region separated
from the desired signal direction, thus removing the desired signal component from the
covariance matrix used for adaptive beamformer design. The reconstructed interference-
plus-noise covariance matrix is then used to correct the presumed signal steering vector
in order to maximize the beamformer output power under the only constraint that the
corrected steering vector does not converge to any interference steering vector or their
combinations. Based on the reconstructed interference-plus-noise covariance matrix
and the estimated desired signal steering vector, the resulting adaptive beamformer
provides a near-optimal output performance with a fast convergence rate. However,
the computational complexity of covariance matrix reconstruction is high due to the
integral operation. In addition, there is a certain level of performance loss when the
number of training samples is small, because the source power obtained from the Capon
spatial spectrum is underestimated and, as a result, the estimated interference-plus-noise
covariance matrix is inaccurate.
In this chapter, we elaborate on adaptive beamforming via sparsity-based
reconstruction of the interference-plus-noise covariance matrix [43]. By exploiting the
sparsity of sources in the observed spatial domain, the interference covariance matrix is
reconstructed as a linear combination of the outer products of the interference steering
vectors weighted by their individual power, which can be estimated from a sparsity-
constrained covariance matrix fitting problem. As such, the proposed technique provides
a signal-free interference-plus-noise covariance matrix to enable robust adaptive beam-
former design that avoids the signal self-nulling problem. It requires low computational
complexity, as there is no matrix inversion or eigen-decomposition involved in the
sparsity-constrained covariance matrix fitting problem. Hence, the proposed adaptive
beamforming technique is suitable for an arbitrary number of training samples [44].
When the number of training samples is larger than the number of array sensors, the
formulated sparsity-constrained covariance matrix fitting problem can be effectively
solved by using the known array structure, i.e., by estimating the directions of sources and
their power in turn. The proposed adaptive beamformer is compared to existing state-
of-the-art adaptive beamformers in terms of computational complexity, output signal-
to-interference-plus-noise ratio (SINR) performance, and convergence rate. Numerical
simulations clearly demonstrate the near-optimal output performance and faster conver-
gence rate of the proposed adaptive beamforming algorithm exploiting the sparsity of
sources in the spatial domain.
8.2 Adaptive Beamforming Criterion
In this section, we first build the narrowband array signal model, then briefly review
adaptive beamforming criteria and classical adaptive beamformers.
8.2.1 Array Signal Model

Figure 8.1 System block diagram of the adaptive beamformer.

Consider a narrowband array consisting of $M$ omnidirectional sensors, as depicted in
Figure 8.1. The baseband received signal of the array at time instant $k$,
$\mathbf{x}(k) = [x_1(k), \ldots, x_M(k)]^T \in \mathbb{C}^M$, can be represented as
$$
\mathbf{x}(k) = \mathbf{x}_s(k) + \mathbf{x}_i(k) + \mathbf{n}(k), \tag{8.1}
$$
where $\mathbf{x}_s(k)$, $\mathbf{x}_i(k)$, and $\mathbf{n}(k)$ are the statistically independent components of the desired
signal, interference, and noise, respectively. Here, $(\cdot)^T$ denotes the transpose operator.
Among them, the desired signal vector $\mathbf{x}_s(k)$ is expressed as
$$
\mathbf{x}_s(k) = \mathbf{a}_s s(k), \tag{8.2}
$$
where $s(k)$ is the desired signal waveform, and $\mathbf{a}_s \in \mathbb{C}^M$ is the corresponding signal
steering vector. Ideally, the steering vector is a function of the array geometry
as well as the source direction, e.g., $\mathbf{a}_s \triangleq \mathbf{a}(\theta_s)$, where $\theta_s$ is the direction of the desired
signal impinging on the array. For example, the ideal steering vector of a uniform linear
array (ULA) has the form
$$
\mathbf{a}(\theta) = \left[1, e^{-j\frac{2\pi}{\lambda} d \sin\theta}, \ldots, e^{-j\frac{2\pi}{\lambda}(M-1) d \sin\theta}\right]^T, \tag{8.3}
$$
where $\theta$ is the DOA of the source, $j = \sqrt{-1}$ is the imaginary unit, $\lambda$ is the wavelength
of the narrowband signal, and $d = \lambda/2$ is the inter-element spacing of the array.
The steering vector of each interferer has the same form with a different
source direction. In contrast, there is no such form for the additive noise because noise
does not have a fixed direction.
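
As a small illustration of (8.3), the following sketch evaluates the ideal ULA steering vector numerically. The function name `ula_steering_vector`, its arguments, and the convention that the angle is given in degrees from broadside are assumptions made for this example only.

```python
import numpy as np

def ula_steering_vector(theta_deg, num_sensors, spacing_wavelengths=0.5):
    """Ideal ULA steering vector a(theta) of (8.3).

    theta_deg: DOA in degrees (assumed to be measured from broadside).
    spacing_wavelengths: d / lambda; 0.5 corresponds to d = lambda / 2.
    """
    theta = np.deg2rad(theta_deg)
    m = np.arange(num_sensors)                      # sensor index 0, ..., M-1
    phase = -2j * np.pi * spacing_wavelengths * m * np.sin(theta)
    return np.exp(phase)

# Example: M = 10 sensors, source at 30 degrees.
a = ula_steering_vector(30.0, num_sensors=10)
```

Every entry has unit modulus, so only the phase progression across the sensors carries the direction information.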
8.2.2 Adaptive Beamforming Criteria

The objective of adaptive beamforming is to design a data-dependent beamforming
weight vector $\mathbf{w} = [w_1, \ldots, w_M]^T \in \mathbb{C}^M$, such that the beamformer output
$$
y(k) = \mathbf{w}^H \mathbf{x}(k) \tag{8.4}
$$
is the best estimate of the desired signal waveform $s(k)$, where $(\cdot)^H$ denotes the
Hermitian transpose. To this end, a number of adaptive beamforming criteria have
been developed in the past decades. Among them, maximum SINR [6] is the most
popular one. Other feasible adaptive beamforming criteria include minimum MSE
[3], minimum least-squares error [45], and minimum mutual information [46].
Interested readers are referred to [3,4,47] for the detailed performance
trade-offs among different adaptive beamforming criteria. In this chapter, the maximum SINR
criterion is mainly considered for adaptive beamformer design.
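The SINR maximization problem, defined as
$$
\max_{\mathbf{w}} \ \mathrm{SINR} \triangleq \frac{E\left[\left|\mathbf{w}^H \mathbf{x}_s(k)\right|^2\right]}{E\left[\left|\mathbf{w}^H \left(\mathbf{x}_i(k) + \mathbf{n}(k)\right)\right|^2\right]} = \frac{\sigma_s^2 \left|\mathbf{w}^H \mathbf{a}_s\right|^2}{\mathbf{w}^H \mathbf{R}_{i+n} \mathbf{w}}, \tag{8.5}
$$
is mathematically equivalent to the minimum variance distortionless response (MVDR)
problem [6]
$$
\min_{\mathbf{w}} \ \mathbf{w}^H \mathbf{R}_{i+n} \mathbf{w} \quad \text{s.t.} \quad \mathbf{w}^H \mathbf{a}_s = 1, \tag{8.6}
$$
where $\sigma_s^2 \triangleq E\left[|s(k)|^2\right]$ is the desired signal power, and
$$
\mathbf{R}_{i+n} \triangleq E\left[\left(\mathbf{x}_i(k) + \mathbf{n}(k)\right)\left(\mathbf{x}_i(k) + \mathbf{n}(k)\right)^H\right] \in \mathbb{H}^M \tag{8.7}
$$
is the interference-plus-noise covariance matrix. Here, $E[\cdot]$ denotes the statistical
expectation, and $\mathbb{H}^M$ denotes the set of $M \times M$ Hermitian matrices. Using the Lagrange
multiplier method, the solution of the MVDR problem is easily obtained as
$$
\mathbf{w}_{\mathrm{MVDR}} = \frac{\mathbf{R}_{i+n}^{-1} \mathbf{a}_s}{\mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s}. \tag{8.8}
$$
The MVDR beamformer, which maximizes the output SINR, is sometimes referred to as the Capon
beamformer.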
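
To make (8.5) and (8.8) concrete, the sketch below computes MVDR weights for a hypothetical scenario and evaluates the output SINR. The scenario (a 10-element half-wavelength ULA, one interferer, the chosen powers and angles) and all function names are assumptions for illustration; the covariance matrix is built from exactly known parameters, which is rarely possible in practice.

```python
import numpy as np

def steer(theta_deg, M):
    # Half-wavelength ULA steering vector, as in (8.3).
    return np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def mvdr_weights(R_in, a_s):
    # (8.8): w = R_in^{-1} a_s / (a_s^H R_in^{-1} a_s), using a linear solve.
    v = np.linalg.solve(R_in, a_s)
    return v / (a_s.conj() @ v)

def output_sinr(w, a_s, R_in, sigma_s2):
    # (8.5): sigma_s^2 |w^H a_s|^2 / (w^H R_in w).
    num = sigma_s2 * np.abs(w.conj() @ a_s) ** 2
    den = np.real(w.conj() @ R_in @ w)
    return num / den

# Hypothetical scenario: signal at 5 degrees, interferer at -50 degrees.
M = 10
a_s, a_i = steer(5.0, M), steer(-50.0, M)
sigma_s2, sigma_i2, sigma_n2 = 10.0, 1000.0, 1.0
R_in = sigma_i2 * np.outer(a_i, a_i.conj()) + sigma_n2 * np.eye(M)

w = mvdr_weights(R_in, a_s)
sinr_mvdr = output_sinr(w, a_s, R_in, sigma_s2)
sinr_das = output_sinr(a_s / M, a_s, R_in, sigma_s2)   # data-independent reference
```

With exact quantities, substituting (8.8) into (8.5) gives an output SINR of $\sigma_s^2 \mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s$, which the data-independent weights cannot exceed.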
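Substituting the data covariance matrix
$$
\mathbf{R} = E\left[\mathbf{x}(k)\mathbf{x}^H(k)\right] = \sigma_s^2 \mathbf{a}_s \mathbf{a}_s^H + \mathbf{R}_{i+n}, \tag{8.9}
$$
into (8.6) in lieu of the generally unavailable interference-plus-noise covariance matrix
$\mathbf{R}_{i+n}$, the corresponding solution,
$$
\mathbf{w}_{\mathrm{MPDR}} = \frac{\mathbf{R}^{-1} \mathbf{a}_s}{\mathbf{a}_s^H \mathbf{R}^{-1} \mathbf{a}_s}, \tag{8.10}
$$
is referred to as the minimum power distortionless response (MPDR) beamformer.
Using the matrix inversion lemma, the MPDR beamformer is proven to be equivalent to
the MVDR beamformer as
$$
\begin{aligned}
\mathbf{w}_{\mathrm{MPDR}} &= \frac{1}{\mathbf{a}_s^H \mathbf{R}^{-1} \mathbf{a}_s} \left(\mathbf{R}_{i+n} + \sigma_s^2 \mathbf{a}_s \mathbf{a}_s^H\right)^{-1} \mathbf{a}_s \\
&= \frac{1}{\mathbf{a}_s^H \mathbf{R}^{-1} \mathbf{a}_s} \left(\mathbf{R}_{i+n}^{-1} - \frac{\mathbf{R}_{i+n}^{-1} \mathbf{a}_s \mathbf{a}_s^H \mathbf{R}_{i+n}^{-1}}{\sigma_s^{-2} + \mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s}\right) \mathbf{a}_s \\
&= \alpha \mathbf{w}_{\mathrm{MVDR}},
\end{aligned} \tag{8.11}
$$
where the scalar coefficient
$$
\alpha = \frac{\mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s}{\mathbf{a}_s^H \mathbf{R}^{-1} \mathbf{a}_s} \left(1 - \frac{\mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s}{\sigma_s^{-2} + \mathbf{a}_s^H \mathbf{R}_{i+n}^{-1} \mathbf{a}_s}\right)
$$
does not affect the adaptive beamformer performance in terms of the output SINR.
Hence, the MPDR beamformer is also referred to as an MVDR beamformer in the
majority of the early literature.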
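
The equivalence in (8.11) can be checked numerically. The sketch below is only a sanity check under idealized assumptions (exactly known covariance matrices and steering vector, a hypothetical 8-element half-wavelength ULA with one interferer); the variable names are chosen for this example.

```python
import numpy as np

def steer(theta_deg, M):
    # Half-wavelength ULA steering vector, as in (8.3).
    return np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

M = 8
a_s, a_i = steer(0.0, M), steer(40.0, M)
sigma_s2, sigma_i2, sigma_n2 = 5.0, 100.0, 1.0
R_in = sigma_i2 * np.outer(a_i, a_i.conj()) + sigma_n2 * np.eye(M)
R = sigma_s2 * np.outer(a_s, a_s.conj()) + R_in        # (8.9)

v_in = np.linalg.solve(R_in, a_s)                       # R_in^{-1} a_s
v = np.linalg.solve(R, a_s)                             # R^{-1} a_s
w_mvdr = v_in / (a_s.conj() @ v_in)                     # (8.8)
w_mpdr = v / (a_s.conj() @ v)                           # (8.10)

# Scalar coefficient alpha as written below (8.11).
beta = np.real(a_s.conj() @ v_in)
alpha = (beta / np.real(a_s.conj() @ v)) * (1 - beta / (1 / sigma_s2 + beta))
```

In this exact-knowledge setting both weight vectors satisfy the same distortionless constraint, so `alpha` evaluates to one up to rounding; the two beamformers differ only once $\mathbf{R}$ or $\mathbf{a}_s$ is replaced by an imperfect estimate.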
In practical array applications including radar, however, the data covariance matrix
cannot be accurately estimated due to the limited training samples, and the signal steer-
ing vector may not be precisely known because of the imperfect knowledge of the
source location, propagation environment and/or array calibration. In such cases, the
MPDR beamformer suffers severe performance degradation, which becomes more pronounced
as the input signal power increases. Hence, the MPDR beamformer underperforms
the MVDR beamformer in practical applications.
8.2.3 Adaptive Beamformer Design

Limited by the number of training samples, the exact data covariance matrix $\mathbf{R}$ is not readily
available in practical applications, not to mention the signal-free interference-plus-noise
covariance matrix $\mathbf{R}_{i+n}$. It is usually replaced by the sample covariance matrix
$$
\hat{\mathbf{R}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{x}(k)\mathbf{x}^H(k), \tag{8.12}
$$
where $K$ is the number of snapshots (i.e., training samples). The resulting adaptive
beamformer,
$$
\mathbf{w}_{\mathrm{SMI}} = \frac{\hat{\mathbf{R}}^{-1} \bar{\mathbf{a}}_s}{\bar{\mathbf{a}}_s^H \hat{\mathbf{R}}^{-1} \bar{\mathbf{a}}_s}, \tag{8.13}
$$
is called the sample matrix inversion (SMI) beamformer [48], where $\bar{\mathbf{a}}_s = \mathbf{a}(\theta_s)$ is
the presumed signal steering vector. Whenever there exists a desired signal in the array
received signal $\mathbf{x}(k)$, the SMI beamformer is in essence an MPDR beamformer (8.10)
rather than an MVDR beamformer (8.8). As $K \to \infty$, $\hat{\mathbf{R}}$ converges to $\mathbf{R}$, and
the corresponding output SINR approaches the optimal value under stationary and
ergodic assumptions. However, when $K$ is small, the large gap between $\hat{\mathbf{R}}$ and $\mathbf{R}$ is
known to dramatically affect the output performance of the SMI beamformer, especially
when there is a desired signal in the training samples [9,10].
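
A minimal sketch of (8.12) and (8.13) on simulated snapshots follows. The scenario, the random seed, and the helper names `sample_covariance`, `smi_weights`, and `cgauss` are assumptions for this example; the sources and noise are drawn as circular complex Gaussian samples, and the desired signal is deliberately present in the training data.

```python
import numpy as np

def steer(theta_deg, M):
    # Half-wavelength ULA steering vector, as in (8.3).
    return np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def sample_covariance(X):
    # (8.12): X is M x K with one snapshot per column.
    return X @ X.conj().T / X.shape[1]

def smi_weights(R_hat, a_bar):
    # (8.13): R_hat^{-1} a_bar / (a_bar^H R_hat^{-1} a_bar).
    v = np.linalg.solve(R_hat, a_bar)
    return v / (a_bar.conj() @ v)

rng = np.random.default_rng(0)

def cgauss(power, size):
    # Circular complex Gaussian samples with the given average power.
    return np.sqrt(power / 2) * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

M, K = 10, 200
a_s, a_i = steer(5.0, M), steer(-50.0, M)
X = (np.outer(a_s, cgauss(10.0, K))        # desired signal
     + np.outer(a_i, cgauss(1000.0, K))    # interference
     + cgauss(1.0, (M, K)))                # sensor noise

R_hat = sample_covariance(X)
w_smi = smi_weights(R_hat, a_s)
```

Because `X` contains the desired signal, `R_hat` estimates $\mathbf{R}$ rather than $\mathbf{R}_{i+n}$, which is why the resulting weights behave like the MPDR beamformer.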
In order to reduce the sensitivity of the SMI beamformer to model mismatches,
many different beamforming algorithms have been developed in the past decades and
successfully applied in a wide range of areas (see, for example, [3,4,7,14] and the
references therein). In the following, several classical adaptive beamforming algorithms
are briefly reviewed.
Diagonal Loading Beamforming

Diagonal loading is the most popular adaptive beamforming approach that is robust to
data uncertainty [9,13]. Replacing the sample covariance matrix $\hat{\mathbf{R}}$ in the SMI
beamformer (8.13) by a diagonally loaded sample covariance matrix $\hat{\mathbf{R}} + \xi\mathbf{I}$, the resulting
beamformer,
$$
\mathbf{w}_{\mathrm{DL\text{-}SMI}} = \frac{\left(\hat{\mathbf{R}} + \xi\mathbf{I}\right)^{-1} \bar{\mathbf{a}}_s}{\bar{\mathbf{a}}_s^H \left(\hat{\mathbf{R}} + \xi\mathbf{I}\right)^{-1} \bar{\mathbf{a}}_s}, \tag{8.14}
$$
is referred to as the diagonal loading SMI (DL-SMI) beamformer, where $\xi$ is a diagonal
loading factor, and $\mathbf{I}$ is the identity matrix.
The performance of the DL-SMI beamformer depends on the diagonal loading factor
$\xi$. It is usually chosen in an ad hoc way, typically about ten times the noise power, i.e.,
$\xi = 10\sigma_n^2$, where the noise power $\sigma_n^2$ is assumed to be known [9]. Obviously, this choice is not
optimal, nor is the instantaneous noise power easy to know. In order to adaptively choose
the loading factor, several user parameter–free approaches have been proposed for
adaptive beamforming [14]. However, these approaches lead to an estimate of the data covariance
matrix rather than of the interference-plus-noise covariance matrix. Regardless of
the value of the chosen diagonal loading factor, the performance loss of the DL-SMI
beamformer is inevitable, and this degradation becomes more severe as the desired
signal power increases [49]. The main reason is that the desired signal component is
always present in any kind of diagonal loading beamformer, and its effect becomes more
pronounced as the input SNR increases [40].
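
The following sketch illustrates (8.14) in a snapshot-starved case where the unloaded sample covariance matrix is singular. The loading rule $\xi = 10\sigma_n^2$ is the ad hoc choice quoted above; the noise-only training data, the seed, and the function name `dl_smi_weights` are assumptions for this example.

```python
import numpy as np

def dl_smi_weights(R_hat, a_bar, loading):
    # (8.14): SMI weights computed from the loaded matrix R_hat + xi * I.
    M = R_hat.shape[0]
    v = np.linalg.solve(R_hat + loading * np.eye(M), a_bar)
    return v / (a_bar.conj() @ v)

rng = np.random.default_rng(1)
M, K = 10, 5    # fewer snapshots than sensors, so R_hat is rank deficient
X = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
R_hat = X @ X.conj().T / K
a_bar = np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(5.0)))

sigma_n2 = 1.0              # noise power, assumed known here
xi = 10 * sigma_n2          # ad hoc loading factor
w_dl = dl_smi_weights(R_hat, a_bar, loading=xi)

cond_raw = np.linalg.cond(R_hat)                      # huge: R_hat is singular
cond_loaded = np.linalg.cond(R_hat + xi * np.eye(M))  # modest after loading
```

Loading shifts every eigenvalue of $\hat{\mathbf{R}}$ up by $\xi$, which is what bounds the condition number and makes the inversion in (8.14) well posed even when $K < M$.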
Eigenspace Decomposition Beamforming

Motivated by the success of DOA estimation [50], the idea of eigen-decomposition
has also been introduced for adaptive beamformer design [16,17]. Replacing the
presumed steering vector $\bar{\mathbf{a}}_s$ in (8.13) by the projection of $\bar{\mathbf{a}}_s$ onto the sample
signal-plus-interference subspace, the resulting eigenspace beamformer is given by
$$
\mathbf{w}_{\mathrm{EIG}} = \hat{\mathbf{R}}^{-1} \mathbf{P}_E \bar{\mathbf{a}}_s = \mathbf{E} \boldsymbol{\Lambda}^{-1} \mathbf{E}^H \bar{\mathbf{a}}_s, \tag{8.15}
$$
where $\mathbf{P}_E = \mathbf{E}\mathbf{E}^H$ is the orthogonal projection matrix onto the signal-plus-interference
subspace. Here, the matrix $\mathbf{E}$ contains the signal-plus-interference subspace
eigenvectors of $\hat{\mathbf{R}}$, and the diagonal matrix $\boldsymbol{\Lambda}$ contains the corresponding eigenvalues.
The eigenspace beamformer is robust to arbitrary steering vector mismatch. However,
this approach does not work well at low SNR or at high signal-to-interference
ratio (SIR). In the former case, the estimation of the projection matrix onto the
signal-plus-interference subspace breaks down because of the high probability of
subspace swaps. In the latter case, the desired signal component dominates the sample
covariance matrix, thus degrading the performance of the adaptive beamformer.
Furthermore, the eigenspace beamformer also does not work well when the dimension of the
signal-plus-interference subspace is high and/or difficult to determine.
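
The two forms of (8.15) can be compared numerically. The sketch below assumes that the subspace dimension is known (two sources) and, for simplicity, uses an exact covariance matrix in place of $\hat{\mathbf{R}}$; the function name `eigenspace_weights` and the scenario are hypothetical.

```python
import numpy as np

def steer(theta_deg, M):
    # Half-wavelength ULA steering vector, as in (8.3).
    return np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def eigenspace_weights(R_hat, a_bar, subspace_dim):
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(R_hat)
    E = eigvecs[:, -subspace_dim:]
    lam = eigvals[-subspace_dim:]
    # (8.15): w = E Lambda^{-1} E^H a_bar
    w = E @ ((E.conj().T @ a_bar) / lam)
    return w, E

M = 10
a_s, a_i = steer(5.0, M), steer(-50.0, M)
R_hat = (10.0 * np.outer(a_s, a_s.conj())
         + 1000.0 * np.outer(a_i, a_i.conj())
         + np.eye(M))
a_bar = steer(3.0, M)    # presumed steering vector with a 2-degree look error

w_eig, E = eigenspace_weights(R_hat, a_bar, subspace_dim=2)
P_E = E @ E.conj().T     # orthogonal projector onto the signal-plus-interference subspace
```

The second form avoids inverting the full matrix, but it relies on `subspace_dim` being chosen correctly, which is the practical difficulty noted above.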
Worst-Case Beamforming
The worst-case performance optimization-based adaptive beamforming [32] guarantees
a distortionless response for all possible steering vectors in a predetermined set. The
worst-case adaptive beamforming problem can be formulated as
$$
\min_{\mathbf{w}} \ \mathbf{w}^H \hat{\mathbf{R}} \mathbf{w} \quad \text{s.t.} \quad \min_{\|\mathbf{e}_s\|_2 \le \varepsilon} \left|\mathbf{w}^H \left(\bar{\mathbf{a}}_s + \mathbf{e}_s\right)\right| \ge 1, \tag{8.16}
$$
where $\mathbf{e}_s = \mathbf{a}_s - \bar{\mathbf{a}}_s$ denotes the mismatch vector between the actual signal
steering vector $\mathbf{a}_s$ and the presumed signal steering vector $\bar{\mathbf{a}}_s$, and $\varepsilon$ is the upper bound
of the norm of the mismatch vector $\mathbf{e}_s$. Here, $\|\cdot\|_2$ denotes the $\ell_2$-norm, also called
the Euclidean norm. Because the constraint condition is nonlinear and nonconvex, the
worst-case adaptive beamforming problem (8.16) is a semi-infinite nonconvex quadratic
program, and is NP-hard². By using the special structure of the objective function and
the constraints, the nonconvex optimization problem can be reformulated as a
second-order cone programming (SOCP) problem
$$
\min_{\mathbf{w}} \ \mathbf{w}^H \hat{\mathbf{R}} \mathbf{w} \quad \text{s.t.} \quad \mathbf{w}^H \bar{\mathbf{a}}_s \ge \varepsilon \|\mathbf{w}\|_2 + 1, \quad \mathrm{Im}\left(\mathbf{w}^H \bar{\mathbf{a}}_s\right) = 0, \tag{8.17}
$$
which is convex and can be efficiently solved in polynomial time using
well-established interior point methods. Here, $\mathrm{Im}(\cdot)$ denotes the imaginary part of a
complex number.
The worst-case beamformer is robust to arbitrary unknown signal steering vector
mismatch with an upper-bounded norm. However, in practical applications, neither the
mismatch vector nor its upper bound is known a priori. Either overestimation or
underestimation of the upper bound of the norm of the steering vector mismatch will degrade
the performance of the worst-case beamformer. In addition, the worst-case beamformer
also suffers from the signal self-nulling problem because it uses the sample covariance matrix
$\hat{\mathbf{R}}$ rather than the interference-plus-noise covariance matrix $\mathbf{R}_{i+n}$.
Iterative Adaptive Beamforming

The IAA algorithm [20] is a sparse approach to beamforming that iteratively
updates the spatial spectrum estimate and the beamforming weight vectors
based on a weighted least-squares approach. Because the IAA depends on
the unknown spatial spectrum distribution, it must be implemented in an iterative
way. The initialization is done by a delay-and-sum (DAS) beamformer, which is
a spatial matched filter with a data-independent weight vector $\mathbf{w}_{\mathrm{DAS}} = \mathbf{a}(\theta)/M$, as
$\hat{s}_l(k) = \mathbf{a}^H(\theta_l)\mathbf{x}(k)/M$, $l = 1, \ldots, L$, $k = 1, \ldots, K$, from which the power estimates
are given by $\hat{p}_l = \frac{1}{K} \sum_{k=1}^{K} |\hat{s}_l(k)|^2$, $l = 1, \ldots, L$. Here, $L$ is the number of potential
source locations in the observed field (or the number of scanning points), which is
usually much larger than the true number of sources. Then, the IAA algorithm repeats
the following iterative process until convergence:
$$
\begin{aligned}
&\bar{\mathbf{R}} = \mathbf{A}(\boldsymbol{\theta})\,\mathrm{diag}(\hat{\mathbf{p}})\,\mathbf{A}^H(\boldsymbol{\theta}) \\
&\text{for } l = 1, \ldots, L \\
&\qquad \mathbf{w}_l = \frac{\bar{\mathbf{R}}^{-1} \mathbf{a}(\theta_l)}{\mathbf{a}^H(\theta_l) \bar{\mathbf{R}}^{-1} \mathbf{a}(\theta_l)} \\
&\qquad \hat{p}_l = \mathbf{w}_l^H \bar{\mathbf{R}} \mathbf{w}_l \\
&\text{end for}
\end{aligned} \tag{8.18}
$$
where $\mathbf{A}(\boldsymbol{\theta}) = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \ldots, \mathbf{a}(\theta_L)] \in \mathbb{C}^{M \times L}$ is the array steering
matrix, and $\hat{\mathbf{p}} = [\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_L]^T \in \mathbb{R}_+^L$ is the estimated spatial spectrum. Here, $\mathbb{R}_+^L$ denotes
the set of $L$-dimensional vectors of nonnegative real numbers.
² In optimization theory, NP-hard problems represent a class of extremely difficult problems for which
no polynomial-time algorithm is known.
As such, the IAA algorithm estimates the signal waveform (and hence the signal power)
by way of sparse signal representation. It performs well when there is no
model mismatch. However, even a slight mismatch in the signal steering
vector, e.g., a look-direction mismatch, degrades the performance, and
the degradation becomes more severe as the input SNR increases.
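
The DAS initialization that precedes the iterations in (8.18) can be sketched as follows. Only the initialization and the first model covariance matrix $\bar{\mathbf{R}}$ are shown, not the full IAA loop; the one-degree scanning grid, the single simulated source, the seed, and the function names are assumptions for this example.

```python
import numpy as np

def steering_matrix(grid_deg, M):
    # Columns are half-wavelength ULA steering vectors a(theta_l), cf. (8.3).
    m = np.arange(M)[:, None]
    return np.exp(-1j * np.pi * m * np.sin(np.deg2rad(grid_deg))[None, :])

def das_power(X, A):
    # s_hat_l(k) = a^H(theta_l) x(k) / M, then average |s_hat_l(k)|^2 over k.
    S = A.conj().T @ X / A.shape[0]
    return np.mean(np.abs(S) ** 2, axis=1)

rng = np.random.default_rng(2)
M, K = 10, 100
grid = np.arange(-90.0, 91.0, 1.0)        # L = 181 scanning points
A = steering_matrix(grid, M)

src = A[:, grid == 20.0]                  # M x 1 steering vector of a source at 20 degrees
s = np.sqrt(50.0) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))   # power 100
N = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
X = src @ s[None, :] + N

p_hat = das_power(X, A)                   # initial spatial spectrum estimate
R_bar = A @ np.diag(p_hat) @ A.conj().T   # first model covariance used in (8.18)
peak_deg = grid[np.argmax(p_hat)]
```

The DAS spectrum is coarse (its main lobe is as wide as the array beamwidth), which is why it serves only as a starting point for the iterative refinement.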
8.3 Covariance Matrix Reconstruction-Based Adaptive Beamforming
In order to avoid, or at least mitigate, the signal self-nulling phenomenon prevalent
in adaptive beamformers, in this section we elaborate on a covariance matrix sparse
reconstruction method that provides an estimate of the signal-free interference-plus-noise
covariance matrix for adaptive beamformer design. In such a case, the performance of
the resulting adaptive beamformer will always approach the optimal value in terms of
the output SINR. Moreover, the proposed adaptive beamformer has a faster convergence
rate than classical adaptive beamformers.
8.3.1 Interference-Plus-Noise Covariance Matrix Reconstruction

Similar to the signal covariance matrix in (8.9), i.e., $\mathbf{R}_s = \sigma_s^2 \mathbf{a}(\theta_s)\mathbf{a}^H(\theta_s)$, the
interference-plus-noise covariance matrix has the form
$$
\mathbf{R}_{i+n} = \sum_{q=1}^{Q} \sigma_{i_q}^2 \mathbf{a}(\theta_{i_q})\mathbf{a}^H(\theta_{i_q}) + \sigma_n^2 \mathbf{I}, \tag{8.19}
$$
where $Q$ is the number of interferers, $\mathbf{a}(\theta_{i_q})$ is the steering vector of the $q$-th
interference impinging from the DOA $\theta_{i_q}$, and $\sigma_{i_q}^2$ is the corresponding interference power.
Hence, in order to have an accurate estimate of the interference-plus-noise covariance
matrix $\mathbf{R}_{i+n}$, we need to know the steering vectors of all interferers via DOAs and their
individual power, together with the noise power. When these pieces of information are
unavailable, the interference-plus-noise covariance matrix can be reconstructed as [40]
$$
\hat{\mathbf{R}}_{i+n} = \int_{\bar{\Theta}} p_{\mathrm{Capon}}(\theta)\, \mathbf{a}(\theta)\mathbf{a}^H(\theta)\, d\theta, \tag{8.20}
$$
where $\mathbf{a}(\theta)$ is the steering vector associated with a hypothetical direction $\theta$,
$$
p_{\mathrm{Capon}}(\theta) = \frac{1}{\mathbf{a}^H(\theta) \hat{\mathbf{R}}^{-1} \mathbf{a}(\theta)} \tag{8.21}
$$
is the Capon spatial spectrum estimator, and $\bar{\Theta}$ is the complement sector of $\Theta$. Here, $\Theta$
is a known or estimated angular sector in which the desired signal is located. Hence, the
covariance matrix estimator $\hat{\mathbf{R}}_{i+n}$ collects all interference and noise in the out-of-sector
region $\bar{\Theta}$, which effectively excludes the desired signal component.
Correspondingly, the adaptive beamformer based on interference-plus-noise
covariance matrix reconstruction
$$
\mathbf{w}_{\mathrm{Recon}} = \frac{\hat{\mathbf{R}}_{i+n}^{-1} \bar{\mathbf{a}}_s}{\bar{\mathbf{a}}_s^H \hat{\mathbf{R}}_{i+n}^{-1} \bar{\mathbf{a}}_s} \tag{8.22}
$$
can dramatically improve the performance regardless of the desired signal power
(see [40] and accompanying simulations). Nevertheless, the estimation accuracy of
$\hat{\mathbf{R}}_{i+n}$ in (8.20) is poor because the Capon estimator (8.21) underestimates the true
power, especially when the number of snapshots is limited. On the other hand, the
computational complexity is high because the covariance matrix reconstruction process
introduces an unnecessary integral operation, whereas the number of interferers is
actually countable.
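
A rough numerical sketch of (8.20) to (8.22) is given below, with the integral replaced by a Riemann sum on a 0.5-degree grid. This discretization, the hypothetical scenario (two interferers, a 2-degree look-direction error, a signal sector of 0 to 10 degrees), the use of an exact covariance matrix in place of $\hat{\mathbf{R}}$, and all function names are assumptions for illustration.

```python
import numpy as np

def steering_matrix(grid_deg, M):
    # Columns are half-wavelength ULA steering vectors a(theta), cf. (8.3).
    m = np.arange(M)[:, None]
    return np.exp(-1j * np.pi * m * np.sin(np.deg2rad(grid_deg))[None, :])

def capon_spectrum(R_hat, A):
    # (8.21) for every column of A: 1 / (a^H R_hat^{-1} a).
    Rinv_A = np.linalg.solve(R_hat, A)
    return 1.0 / np.real(np.sum(A.conj() * Rinv_A, axis=0))

def reconstruct_rin(R_hat, grid_deg, sector, M):
    # Riemann sum for (8.20) over the grid points outside the signal sector.
    lo, hi = sector
    outside = grid_deg[(grid_deg < lo) | (grid_deg > hi)]
    A = steering_matrix(outside, M)
    p = capon_spectrum(R_hat, A)
    step = np.deg2rad(grid_deg[1] - grid_deg[0])
    return (A * p) @ A.conj().T * step

def weights(R, a):
    v = np.linalg.solve(R, a)
    return v / (a.conj() @ v)

def sinr(w, a_s, R_in, sigma_s2):
    return sigma_s2 * np.abs(w.conj() @ a_s) ** 2 / np.real(w.conj() @ R_in @ w)

M = 10
grid = np.arange(-90.0, 90.5, 0.5)
a_s = steering_matrix(np.array([5.0]), M)[:, 0]      # actual signal direction
a_bar = steering_matrix(np.array([3.0]), M)[:, 0]    # presumed direction (mismatch)
A_i = steering_matrix(np.array([-50.0, -20.0]), M)   # two interferers
sigma_s2 = 100.0
R_in = 1000.0 * A_i @ A_i.conj().T + np.eye(M)
R = sigma_s2 * np.outer(a_s, a_s.conj()) + R_in      # exact covariance stands in for R_hat

R_rec = reconstruct_rin(R, grid, sector=(0.0, 10.0), M=M)
sinr_recon = sinr(weights(R_rec, a_bar), a_s, R_in, sigma_s2)   # (8.22)
sinr_mpdr = sinr(weights(R, a_bar), a_s, R_in, sigma_s2)        # (8.10) with mismatch
```

In this setting the MPDR weights treat the strong, slightly mismatched signal as interference and null it, whereas the weights built from the reconstructed matrix do not, because the signal sector is excluded from the sum.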
8.3.2 Sparsity-Based Interference-Plus-Noise Covariance Matrix Reconstruction

Because of the DOF requirement, the number of array sensors is typically larger than
the true number of sources. Hence, besides the low-rank characteristic of the array
covariance matrix, the target sources in the observed field are sparse. In
such a case, this sparsity can be leveraged to reconstruct the interference-plus-noise
covariance matrix $\mathbf{R}_{i+n}$, which will provide better estimation accuracy and simplify the
integral operation of (8.20) over the entire complement sector $\bar{\Theta}$.
According to (8.19), the interference-plus-noise covariance matrix is a function of the
directions and power of interferers, as well as the noise power. The estimation accuracy
of these parameters will affect the performance of the adaptive beamformer via the
reconstructed interference-plus-noise covariance matrix. To estimate the parameters of
both the desired signal and interferers, we formulate a sparsity-constrained covariance
matrix fitting problem according to (8.9) as
$$
\begin{aligned}
\min_{\mathbf{p},\, \sigma_n^2} \ \left\|\hat{\mathbf{R}} - \mathbf{A}\mathbf{P}\mathbf{A}^H - \sigma_n^2 \mathbf{I}\right\|_F \quad \text{s.t.} \quad &\|\mathbf{p}\|_0 = Q + 1, \\
&\mathbf{p} \ge \mathbf{0}, \\
&\sigma_n^2 > 0,
\end{aligned} \tag{8.23}
$$
where $\mathbf{p} \in \mathbb{R}_+^L$ is the spatial spectrum distribution on the sample grids of the observed
spatial domain (e.g., $\{\theta_1, \theta_2, \ldots, \theta_L\} \subset \Theta \cup \bar{\Theta}$), $\mathbf{P} = \mathrm{diag}(\mathbf{p}) \in \mathbb{R}_+^{L \times L}$ is the
corresponding diagonal matrix, $\mathbf{A} = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \ldots, \mathbf{a}(\theta_L)] \in \mathbb{C}^{M \times L}$ is the array manifold
matrix, and $\|\cdot\|_F$ and $\|\cdot\|_0$, respectively, denote the Frobenius norm of a matrix
and the $\ell_0$ “norm” of a vector. Note that, although it does not satisfy positive
homogeneity, the $\ell_0$ “norm”, which counts the number of nonzero elements in a vector,
is an ideal measure of sparsity. According to the sparse observation, the number of
potential sources is much larger than the true number of sources, i.e., $L \gg Q + 1$. The
idea behind (8.23) is intuitive in the sense that it tries to find the sparsest spatial spectrum
distribution $\mathbf{p}$ and the noise power $\sigma_n^2$ such that the difference between the resulting
covariance matrix $\mathbf{A}\mathbf{P}\mathbf{A}^H + \sigma_n^2 \mathbf{I}$ and the sample covariance matrix $\hat{\mathbf{R}}$ is minimized.
However, the true number of sources is unknown a priori. Even if it is known,
(8.23) is a difficult combinatorial optimization problem due to the nonconvex
$\ell_0$ “norm” constraint, and is intractable even for moderately sized problems. In the past
decades, many approximation methods have been proposed to solve this nonconvex
optimization problem, such as greedy approximations [51,52] and $\ell_p$ ($p \le 1$)
relaxations [53,54]. When the solution $\mathbf{p}$ is sufficiently sparse, the $\ell_0$ “norm” can be
approximately replaced by the $\ell_1$-norm. By introducing the $\ell_1$-norm convex relaxation,
the nonconvex optimization problem (8.23) can be formulated as a convex one:
$$
\begin{aligned}
\min_{\mathbf{p},\, \sigma_n^2} \ \left\|\hat{\mathbf{R}} - \mathbf{A}\mathbf{P}\mathbf{A}^H - \sigma_n^2 \mathbf{I}\right\|_F \quad \text{s.t.} \quad &\|\mathbf{p}\|_1 \le \sigma_s^2 + \sum_{q=1}^{Q} \sigma_{i_q}^2 + \sigma_n^2 + \delta, \\
&\mathbf{p} \ge \mathbf{0}, \\
&\sigma_n^2 > 0,
\end{aligned} \tag{8.24}
$$
where the $\ell_1$-norm of $\mathbf{p}$ equals the power sum of all sources (i.e., $\sigma_s^2 + \sum_{q=1}^{Q} \sigma_{i_q}^2 + \sigma_n^2$),
and a small number $\delta > 0$ is added to the power constraint in order to leave slack for
the optimization algorithm to search for $\mathbf{p}$. However, in practical applications, the true
number of sources is not easy to know, not to mention their power.
Alternatively, the convex optimization problem in (8.24) can be reformulated as a
basis pursuit denoising (BPDN) problem [55] as
$$
\begin{aligned}
\min_{\mathbf{p},\, \sigma_n^2} \ \left\|\hat{\mathbf{R}} - \mathbf{A}\mathbf{P}\mathbf{A}^H - \sigma_n^2 \mathbf{I}\right\|_F + \gamma \|\mathbf{p}\|_1 \quad \text{s.t.} \quad &\mathbf{p} \ge \mathbf{0}, \\
&\sigma_n^2 > 0,
\end{aligned} \tag{8.25}
$$
where $\gamma$ is a regularization parameter controlling the trade-off between the sparsity of
the spatial spectrum and the residual norm of covariance matrix fitting. The
optimization problem is convex and can be solved using standard and highly efficient interior
point methods. Besides the BPDN, the least absolute shrinkage and selection
operator (LASSO) [56] is another popular formulation based on the $\ell_1$-norm relaxation.
Note that there is no matrix inversion or eigen-decomposition required in the
proposed sparsity-constrained covariance matrix fitting problem. Hence, it is suitable for
an arbitrary number of snapshots from one to infinity. However, the obtained solution is
not exactly sparse because of the $\ell_1$-norm relaxation. In addition, the regularization
parameter $\gamma$ is difficult to determine in different scenarios. Either overestimating or
underestimating it upsets the balance between data fidelity and sparsity, which
subsequently leads to performance degradation of the resulting adaptive beamformer.
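
To give a feel for the $\ell_1$-regularized fit, the sketch below runs a proximal-gradient iteration on a closely related problem. Note the substitutions: it minimizes the squared Frobenius residual, $\tfrac{1}{2}\|\hat{\mathbf{R}} - \mathbf{A}\mathbf{P}\mathbf{A}^H - \sigma_n^2\mathbf{I}\|_F^2 + \gamma\|\mathbf{p}\|_1$, rather than (8.25) itself, it allows $\sigma_n^2 \ge 0$ instead of the strict inequality, and it does not use the interior point methods mentioned above. The grid, the value of $\gamma$, the iteration count, and the function names are assumptions for this example.

```python
import numpy as np

def steering_matrix(grid_deg, M):
    # Columns are half-wavelength ULA steering vectors a(theta_l), cf. (8.3).
    m = np.arange(M)[:, None]
    return np.exp(-1j * np.pi * m * np.sin(np.deg2rad(grid_deg))[None, :])

def l1_covariance_fit(R_hat, A, gamma, num_iters=500):
    M, L = A.shape
    # Gram matrix of the atoms {a_l a_l^H} and I under the Frobenius inner
    # product; its largest eigenvalue is the Lipschitz constant of the gradient.
    G = np.abs(A.conj().T @ A) ** 2
    col = np.full((L, 1), float(M))
    gram = np.block([[G, col], [col.T, np.array([[float(M)]])]])
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]

    p, noise = np.zeros(L), 0.0
    history = []
    for _ in range(num_iters):
        E = R_hat - (A * p) @ A.conj().T - noise * np.eye(M)    # fitting residual
        history.append(0.5 * np.linalg.norm(E, "fro") ** 2 + gamma * p.sum())
        grad_p = -np.real(np.sum(A.conj() * (E @ A), axis=0))   # -a_l^H E a_l
        grad_noise = -np.real(np.trace(E))
        # Gradient step, then soft-threshold and project onto p >= 0.
        p = np.maximum(p - step * (grad_p + gamma), 0.0)
        noise = max(noise - step * grad_noise, 0.0)
    return p, noise, history

M = 10
grid = np.arange(-90.0, 91.0, 1.0)
A = steering_matrix(grid, M)
true_p = np.zeros(grid.size)
true_p[grid == -50.0] = 100.0     # interferer
true_p[grid == 5.0] = 10.0        # desired signal
R = (A * true_p) @ A.conj().T + np.eye(M)    # exact covariance, unit noise power

p_hat, noise_hat, history = l1_covariance_fit(R, A, gamma=1.0)
```

With this step size the recorded objective values in `history` do not increase from one iteration to the next; how sparse `p_hat` ends up depends on `gamma`, which mirrors the tuning difficulty described above.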
When the number of snapshots is larger than the number of array sensors, we can
decompose the sparsity-constrained covariance matrix fitting problem into two associ-
ated subproblems: (1) a source localization problem to find the DOA support of sources;
and (2) a power estimation problem operating on the DOAs that were estimated in the
first subproblem. The combination of these two subproblems represents an approxima-
tion to the solution of the sparsity-constrained covariance matrix fitting problem.
Compared to adaptive beamforming, DOA estimation is a more mature array process-
ing technique, and there are many sophisticated methods available (see, for example,
[3,53] and the references therein). In general, the DOAs are estimated either from a
spectral search algorithm (see, for example, [50,57]) or from a search-free polynomial
rooting algorithm (e.g., [58] and the references therein). For convenience, here we
simply use the classical Capon spatial spectrum $p_{\mathrm{Capon}}(\theta)$ in (8.21) to estimate the
DOAs of sources. The estimated DOAs provide the support of the sparse vector defined
in the proposed sparsity-constrained covariance matrix fitting problem.
Let $\Theta_p$ denote the set of directions corresponding to the peaks of $p_{\mathrm{Capon}}$ on the entire
observed spatial domain (i.e., $\Theta \cup \bar{\Theta}$), for which the cardinality is usually greater than
the true number of sources because of the spurious peaks (i.e., $|\Theta_p| = \|\mathbf{p}\|_0 > Q + 1$).
Here, $|\cdot|$ denotes the cardinality of a set. In order to minimize the $\ell_0$ “norm” $\|\mathbf{p}\|_0$ to
find the sparsest solution of (8.23), a common method is to remove the spurious peaks by
setting a threshold, such as the noise power, which can be approximately estimated as the
minimum eigenvalue of the sample covariance matrix (i.e., $\hat{\sigma}_n^2 = \lambda_{\min}(\hat{\mathbf{R}})$) [59], where
$\lambda_{\min}(\cdot)$ denotes the minimum eigenvalue of a matrix. In theory, there are $M - Q - 1$
eigenvalues that equal the actual noise power $\sigma_n^2$. However, in practical applications with
a limited number of snapshots, the minimum eigenvalue of the sample covariance matrix
is always smaller than the noise power. Hence, if the value of a peak in the Capon spatial
spectrum $p_{\mathrm{Capon}}$ is lower than the threshold, it will be regarded as a spurious peak and
its corresponding direction will be removed from the set $\Theta_p$. After removing all the
spurious peaks, the residual set is denoted as $\tilde{\Theta}_p = \{\tilde{\theta}_{p,1}, \ldots, \tilde{\theta}_{p,\tilde{Q}}\}$ with cardinality
$|\tilde{\Theta}_p| = \tilde{Q} \le |\Theta_p|$. In such a case, $\|\mathbf{p}\|_0 = \tilde{Q} \ge Q + 1$.
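The peak search and thresholding just described can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that (8.21) has the standard Capon form $1/(\mathbf{a}^H(\theta)\hat{\mathbf{R}}^{-1}\mathbf{a}(\theta))$ and that the array is a half-wavelength ULA with the steering-vector convention written in `steering`; the names `steering` and `capon_peaks` are hypothetical.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def capon_peaks(R_hat, grid_deg):
    """Return the DOAs and heights of Capon peaks above lambda_min(R_hat)."""
    M = R_hat.shape[0]
    A = np.stack([steering(t, M) for t in grid_deg], axis=1)      # M x L
    R_inv_A = np.linalg.solve(R_hat, A)
    # Capon spectrum 1 / (a^H R^-1 a), evaluated for every grid direction
    p_capon = 1.0 / np.real(np.sum(A.conj() * R_inv_A, axis=0))
    # Interior grid points that are local maxima of the spectrum
    mid = p_capon[1:-1]
    is_peak = (mid > p_capon[:-2]) & (mid >= p_capon[2:])
    idx = np.where(is_peak)[0] + 1
    # Threshold: noise power estimated as the smallest eigenvalue of R_hat
    noise_floor = np.linalg.eigvalsh(R_hat)[0]
    keep = idx[p_capon[idx] > noise_floor]
    return grid_deg[keep], p_capon[keep]
```

With a uniform grid of 0.1° spacing, the returned directions play the role of the thresholded support $\tilde{\Theta}_p$; any peak whose height falls below the estimated noise power is discarded as spurious.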
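After finding the DOA support $\tilde{\boldsymbol{\theta}}_p = [\tilde{\theta}_{p,1}, \ldots, \tilde{\theta}_{p,\tilde{Q}}]^T$, the sparsity-constrained
covariance matrix fitting problem (8.23) degenerates into an inequality-constrained
least-squares problem:
\[
\min_{\mathbf{p}(\tilde{\boldsymbol{\theta}}_p),\,\sigma_n^2}
\left\| \hat{\mathbf{R}} - \mathbf{A}(\tilde{\boldsymbol{\theta}}_p)\mathbf{P}(\tilde{\boldsymbol{\theta}}_p)\mathbf{A}^H(\tilde{\boldsymbol{\theta}}_p) - \sigma_n^2 \mathbf{I} \right\|_F
\quad \text{s.t.} \quad \mathbf{p}(\tilde{\boldsymbol{\theta}}_p) > \mathbf{0}, \quad \sigma_n^2 > 0, \tag{8.26}
\]
where $\mathbf{P}(\tilde{\boldsymbol{\theta}}_p) = \mathrm{diag}(\mathbf{p}(\tilde{\boldsymbol{\theta}}_p))$ is a diagonal matrix with the power distribution
$\mathbf{p}(\tilde{\boldsymbol{\theta}}_p) \in \mathbb{R}_{++}^{\tilde{Q}}$ on the DOA support $\tilde{\boldsymbol{\theta}}_p$, and
$\mathbf{A}(\tilde{\boldsymbol{\theta}}_p) = [\mathbf{a}(\tilde{\theta}_{p,1}), \ldots, \mathbf{a}(\tilde{\theta}_{p,\tilde{Q}})] \in \mathbb{C}^{M \times \tilde{Q}}$ is the
corresponding array manifold matrix. Here, $\mathbb{R}_{++}^{\tilde{Q}}$ denotes the set of $\tilde{Q}$-dimensional vectors
of positive real numbers. The strict inequality constraint enforced here indicates that
the signal powers on the found DOA support $\tilde{\boldsymbol{\theta}}_p$ are always positive. The optimization
problem (8.26) is convex and can be solved using highly efficient interior point methods.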
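It is noted that covariance matrix reconstruction-based adaptive beamformers are
not very sensitive to the estimation error in the noise power. Therefore, for the sake
of simplicity, the optimization variable of noise power $\sigma_n^2$ is taken to be the
minimum eigenvalue of $\hat{\mathbf{R}}$, which leads to a simplified inequality-constrained least-squares
problem:
\[
\min_{\mathbf{p}(\tilde{\boldsymbol{\theta}}_p)}
\left\| \hat{\mathbf{R}} - \lambda_{\min}(\hat{\mathbf{R}})\mathbf{I} - \mathbf{A}(\tilde{\boldsymbol{\theta}}_p)\mathbf{P}(\tilde{\boldsymbol{\theta}}_p)\mathbf{A}^H(\tilde{\boldsymbol{\theta}}_p) \right\|_F
\quad \text{s.t.} \quad \mathbf{p}(\tilde{\boldsymbol{\theta}}_p) > \mathbf{0}. \tag{8.27}
\]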
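Using the vectorization property, this can be further simplified as
\[
\min_{\mathbf{p}(\tilde{\boldsymbol{\theta}}_p)}
\left\| \mathrm{vec}\big(\hat{\mathbf{R}} - \lambda_{\min}(\hat{\mathbf{R}})\mathbf{I}\big) - \big(\mathbf{A}^*(\tilde{\boldsymbol{\theta}}_p) \odot \mathbf{A}(\tilde{\boldsymbol{\theta}}_p)\big)\mathbf{p}(\tilde{\boldsymbol{\theta}}_p) \right\|_2
\quad \text{s.t.} \quad \mathbf{p}(\tilde{\boldsymbol{\theta}}_p) > \mathbf{0}, \tag{8.28}
\]
where $\mathrm{vec}(\cdot)$ denotes the vectorization operator, $\odot$ denotes the Khatri–Rao product,
and $(\cdot)^*$ denotes complex conjugation. Without the inequality constraint, the closed-form
solution to (8.27) is given by
\[
\mathbf{p}(\tilde{\boldsymbol{\theta}}_p) = \big(\mathbf{G}^H \mathbf{G}\big)^{-1} \mathbf{G}^H \mathbf{r}, \tag{8.29}
\]
where
\[
\mathbf{G} = \mathbf{A}^*(\tilde{\boldsymbol{\theta}}_p) \odot \mathbf{A}(\tilde{\boldsymbol{\theta}}_p)
= \big[\mathbf{a}^*(\tilde{\theta}_{p,1}) \otimes \mathbf{a}(\tilde{\theta}_{p,1}), \ldots, \mathbf{a}^*(\tilde{\theta}_{p,\tilde{Q}}) \otimes \mathbf{a}(\tilde{\theta}_{p,\tilde{Q}})\big]
= \big[\mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,1})\mathbf{a}^H(\tilde{\theta}_{p,1})), \ldots, \mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,\tilde{Q}})\mathbf{a}^H(\tilde{\theta}_{p,\tilde{Q}}))\big] \in \mathbb{C}^{M^2 \times \tilde{Q}}
\]
is obtained by stacking the vectorized outer products of the source steering vectors, and
$\mathbf{r} = \mathrm{vec}(\hat{\mathbf{R}} - \lambda_{\min}(\hat{\mathbf{R}})\mathbf{I}) \in \mathbb{C}^{M^2}$ is the vectorized sample covariance matrix
after subtraction of the estimated noise covariance matrix. Here, $\otimes$ denotes the
Kronecker product. Then, the estimated spatial spectrum of (8.23) is $\tilde{Q}$-sparse and is
expressed as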

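\[
p(\theta) =
\begin{cases}
\mathbf{p}(\tilde{\boldsymbol{\theta}}_p), & \theta \in \tilde{\Theta}_p, \\
0, & \theta \notin \tilde{\Theta}_p.
\end{cases} \tag{8.30}
\]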
Namely, only Q̃ entries of the estimated spatial spectrum p(θ) are nonzero and all
other L − Q̃ entries are zero. An example of spatial spectrum comparison between the
proposed sparse spectrum (8.30) and the Capon spectrum (8.21) is illustrated in Figure
8.2, where three sources impinge from DOAs of −50°, −20°, and 5° with SNRs of
30 dB, 30 dB, and 20 dB, respectively. It is clear that the proposed method achieves a
more accurate estimate of the signal power.
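The closed-form power estimate (8.29) can be illustrated with a short sketch. It assumes a half-wavelength ULA with the steering-vector convention in the hypothetical helper `steering`, builds the columns of $\mathbf{G}$ directly as $\mathrm{vec}(\mathbf{a}\mathbf{a}^H)$, and uses a least-squares solver, which agrees with $(\mathbf{G}^H\mathbf{G})^{-1}\mathbf{G}^H\mathbf{r}$ whenever $\mathbf{G}$ has full column rank. The function name `estimate_powers` is likewise an assumption made for this example.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def estimate_powers(R_hat, doas_deg):
    """Unconstrained least-squares power estimate on a given DOA support."""
    M = R_hat.shape[0]
    # Noise power taken as the smallest eigenvalue of the sample covariance
    sigma_n2 = np.linalg.eigvalsh(R_hat)[0]
    # Columns of G are the column-stacked outer products vec(a a^H)
    G = np.stack(
        [np.outer(a, a.conj()).reshape(-1, order="F")
         for a in (steering(t, M) for t in doas_deg)],
        axis=1,
    )
    r = (R_hat - sigma_n2 * np.eye(M)).reshape(-1, order="F")
    p, *_ = np.linalg.lstsq(G, r, rcond=None)
    # G^H G and G^H r are real for Hermitian R_hat, so p is real up to rounding
    return np.real(p), sigma_n2
```

For an exact covariance matrix built from three sources with powers of 30 dB, 30 dB, and 20 dB over unit noise, this sketch returns the true powers, since the noise-subtracted covariance then lies exactly in the span of the columns of $\mathbf{G}$.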
Figure 8.2 Spatial spectrum comparison: power (dB) versus θ (°) for the Capon spectrum and the sparse spectrum.

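However, when some sources are very weak, there may be negative entries in
$\mathbf{p}(\tilde{\boldsymbol{\theta}}_p)$ (8.29), which is obtained by discarding the inequality constraint in (8.27).
Without loss of generality, assume that the $\tilde{q}$-th entry of $\mathbf{p}(\tilde{\boldsymbol{\theta}}_p)$ is negative, i.e.,
$p(\tilde{\theta}_{p,\tilde{q}}) < 0$. In such a case, the inequality constraint in (8.26) is not satisfied, and the
closed-form solution in (8.29) should be modified. A simple method is to force $p(\tilde{\theta}_{p,\tilde{q}})$ to be a
small positive value $\delta > 0$ (for example, $\delta = 10^{-5}$ is used in our simulations), and to
recompute the power estimates of the other sources, located at
$\bar{\boldsymbol{\theta}}_p = [\tilde{\theta}_{p,1}, \ldots, \tilde{\theta}_{p,\tilde{q}-1}, \tilde{\theta}_{p,\tilde{q}+1}, \ldots, \tilde{\theta}_{p,\tilde{Q}}]^T \in \mathbb{R}^{\tilde{Q}-1}$, as
\[
\bar{\mathbf{p}}(\bar{\boldsymbol{\theta}}_p) = \big[\bar{\mathbf{G}}^H \bar{\mathbf{G}}\big]^{-1} \bar{\mathbf{G}}^H \bar{\mathbf{r}}, \tag{8.31}
\]
where
\[
\bar{\mathbf{G}} = \big[\mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,1})\mathbf{a}^H(\tilde{\theta}_{p,1})), \ldots, \mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,\tilde{q}-1})\mathbf{a}^H(\tilde{\theta}_{p,\tilde{q}-1})),
\mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,\tilde{q}+1})\mathbf{a}^H(\tilde{\theta}_{p,\tilde{q}+1})), \ldots, \mathrm{vec}(\mathbf{a}(\tilde{\theta}_{p,\tilde{Q}})\mathbf{a}^H(\tilde{\theta}_{p,\tilde{Q}}))\big] \in \mathbb{C}^{M^2 \times (\tilde{Q}-1)}
\]
and $\bar{\mathbf{r}} = \mathrm{vec}(\hat{\mathbf{R}} - \lambda_{\min}(\hat{\mathbf{R}})\mathbf{I} - \delta\,\mathbf{a}(\tilde{\theta}_{p,\tilde{q}})\mathbf{a}^H(\tilde{\theta}_{p,\tilde{q}})) \in \mathbb{C}^{M^2}$. In other words, we
recalculate the source powers after fixing the power of the weak sources, thus resulting in a
modified spatial spectrum:
\[
p(\theta) =
\begin{cases}
\bar{\mathbf{p}}(\bar{\boldsymbol{\theta}}_p), & \theta \in \bar{\boldsymbol{\theta}}_p, \\
\delta, & \theta = \tilde{\theta}_{p,\tilde{q}}, \\
0, & \theta \notin \tilde{\boldsymbol{\theta}}_p.
\end{cases} \tag{8.32}
\]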
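The modification in (8.31) and (8.32) can be sketched as below. The sketch is illustrative only: it treats every nonpositive entry of the unconstrained solution as a weak source in a single pass (the text describes one such entry without loss of generality), it takes the noise power as an explicit argument, and it does not iterate if the recomputed powers turn negative again. The helper `steering` and the function name `powers_with_floor` are assumptions made for this example.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def powers_with_floor(R_hat, doas_deg, sigma_n2, delta=1e-5):
    """Least-squares powers; nonpositive entries are fixed to delta, rest re-solved."""
    M = R_hat.shape[0]
    G = np.stack(
        [np.outer(a, a.conj()).reshape(-1, order="F")
         for a in (steering(t, M) for t in doas_deg)],
        axis=1,
    )
    r = (R_hat - sigma_n2 * np.eye(M)).reshape(-1, order="F")
    p = np.real(np.linalg.lstsq(G, r, rcond=None)[0])      # unconstrained (8.29)
    weak = p <= 0                                           # violates p > 0
    if weak.any() and not weak.all():
        # Subtract the fixed contribution delta * a a^H of each weak source
        r_bar = r - delta * G[:, weak].sum(axis=1)
        # Re-solve for the remaining sources only, as in (8.31)
        p[~weak] = np.real(np.linalg.lstsq(G[:, ~weak], r_bar, rcond=None)[0])
    p[weak] = delta
    return p
```

A spurious direction that carries no actual source, combined with a slightly overestimated noise power, is a typical case in which the unconstrained estimate dips below zero and the floor $\delta$ takes effect.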
Using the $\tilde{Q}$-sparse spatial spectrum $p(\theta)$, the interference-plus-noise covariance
matrix can be sparsely reconstructed as
\[
\hat{\mathbf{R}}_{i+n} = \sum_{\theta_{i_q} \in \bar{\Theta} \cap \tilde{\Theta}_p} p(\theta_{i_q})\,\mathbf{a}(\theta_{i_q})\mathbf{a}^H(\theta_{i_q}) + \hat{\sigma}_n^2 \mathbf{I}, \tag{8.33}
\]
where $\mathbf{a}(\theta_{i_q})\mathbf{a}^H(\theta_{i_q})$ is the outer product of the $q$-th interference steering vector $\mathbf{a}(\theta_{i_q})$.
Because there are at most $\tilde{Q}$ elements in the set $\bar{\Theta} \cap \tilde{\Theta}_p$, the integral operation in (8.20)
is effectively simplified to the summation in (8.33) by exploiting the sparsity of the
sources in the observed spatial domain. Note that there is no desired signal
component in the reconstructed interference-plus-noise covariance matrix.
Considering the possible look direction mismatch, the DOA of the desired signal can
be located by searching for the peak of $p_{\mathrm{Capon}}$ in $\Theta$, i.e., $\tilde{\theta}_s = \arg\max_{\theta \in \Theta} p_{\mathrm{Capon}}(\theta)$,
and the corresponding steering vector is denoted as $\tilde{\mathbf{a}}_s = \mathbf{a}(\tilde{\theta}_s)$. When $\Theta \cap \tilde{\Theta}_p$
is empty, which is common at low SNRs, we simply use the presumed signal steering
vector for adaptive beamformer design, i.e., $\tilde{\mathbf{a}}_s = \bar{\mathbf{a}}_s$, even if there is a look direction
mismatch.
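Substituting the reconstructed interference-plus-noise covariance matrix $\hat{\mathbf{R}}_{i+n}$ and
the estimated signal steering vector $\tilde{\mathbf{a}}_s$ into the MVDR beamformer (8.8), we
obtain the proposed adaptive beamformer as
\[
\mathbf{w} = \frac{\hat{\mathbf{R}}_{i+n}^{-1} \tilde{\mathbf{a}}_s}{\tilde{\mathbf{a}}_s^H \hat{\mathbf{R}}_{i+n}^{-1} \tilde{\mathbf{a}}_s}. \tag{8.34}
\]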
The proposed adaptive beamforming algorithm based on sparse reconstruction of the
interference-plus-noise covariance matrix is summarized in Table 8.1.

Table 8.1 Adaptive beamforming algorithm based on sparse reconstruction of interference-plus-noise
covariance matrix.
Step 1: Estimate the DOAs of the sources by, e.g., searching for the peaks of the Capon
spatial spectrum $p_{\mathrm{Capon}}(\theta)$ (8.21);
Step 2: Solve the least-squares problem (8.27) to obtain the $\tilde{Q}$-sparse spatial spectrum
$p(\theta)$ (8.30) or (8.32);
Step 3: Reconstruct the interference-plus-noise covariance matrix $\hat{\mathbf{R}}_{i+n}$ (8.33) and
estimate the signal steering vector $\tilde{\mathbf{a}}_s$;
Step 4: Compute the proposed adaptive beamformer $\mathbf{w}$ (8.34).
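Steps 3 and 4 of Table 8.1 can be sketched as follows, taking the DOA support and the sparse power estimates from Steps 1 and 2 as inputs. The sketch makes several assumptions that are not in the text: a half-wavelength ULA with the convention in the hypothetical helper `steering`, a sector given as a closed interval in degrees, and the in-sector direction with the largest estimated power standing in for the arg max of $p_{\mathrm{Capon}}$ over the sector.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def reconstruct_and_beamform(doas_deg, powers, sigma_n2, M,
                             sector=(0.0, 10.0), presumed_deg=5.0):
    """Build R_{i+n} from out-of-sector sources (8.33) and form w (8.34)."""
    doas = np.asarray(doas_deg, dtype=float)
    powers = np.asarray(powers, dtype=float)
    in_sector = (doas >= sector[0]) & (doas <= sector[1])

    # Step 3a: sum of power-weighted outer products plus the noise term
    R_in = sigma_n2 * np.eye(M, dtype=complex)
    for theta, p in zip(doas[~in_sector], powers[~in_sector]):
        a = steering(theta, M)
        R_in += p * np.outer(a, a.conj())

    # Step 3b: signal DOA from the in-sector support, else the presumed DOA
    if in_sector.any():
        theta_s = doas[in_sector][np.argmax(powers[in_sector])]
    else:
        theta_s = presumed_deg
    a_s = steering(theta_s, M)

    # Step 4: MVDR weights with the reconstructed covariance
    Ri_a = np.linalg.solve(R_in, a_s)
    w = Ri_a / (a_s.conj() @ Ri_a)
    return w, R_in
```

By construction the weights satisfy $\mathbf{w}^H \tilde{\mathbf{a}}_s = 1$, and the response toward strong out-of-sector sources is driven close to zero because only those sources enter the reconstructed covariance matrix.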
The computational complexity of the proposed adaptive beamforming algorithm is
$O(LM^2)$ with $L \gg M$, which is mainly dominated by the spectral search. If a search-free
DOA estimation technique [58] is adopted, the computational complexity can be
further decreased to $O(\max(M^3, \tilde{Q}^2 M^2))$, where $O(M^3)$ is the complexity of DOA
estimation and $O(\tilde{Q}^2 M^2)$ is the complexity of power estimation. Therefore, the proposed
adaptive beamforming algorithm has complexity slightly higher than the DOA
estimation algorithm. Meanwhile, the computational complexity of the covariance matrix
reconstruction-based adaptive beamforming algorithm [40] is
$O\big(\tfrac{|\bar{\Theta}|}{|\Theta \cup \bar{\Theta}|} L M^2\big)$. Note,
however, that if the spatial estimate of the sources in the entire region is desired, the
SMI beamformer has the complexity of $O(LM^2)$ as well.
8.4 Simulation Results
In our simulations, a ULA with M = 10 omnidirectional sensors spaced half a wavelength
apart is considered. It is assumed that there is one desired signal from the presumed
direction $\bar{\theta}_s = 5°$ and two uncorrelated interferers from −50° and −20°, respectively.
The interference-to-noise ratio (INR) at each sensor is equal to 30 dB. The additive
noise is modeled as a complex circularly symmetric Gaussian zero-mean spatially and
temporally white process. When comparing the performance of the adaptive beamform-
ing algorithms with respect to the input SNR, the number of snapshots is fixed to be
K = 30. In the performance comparison of mean output SINR versus the number of
snapshots, the SNR in each sensor is set to be fixed at 20 dB. For each data point (SNR
or number of snapshots), 500 Monte Carlo trials are performed.
The proposed interference-plus-noise covariance matrix sparse reconstruction-
based beamformer (8.34) is compared to the SMI beamformer [48], the DL-SMI
beamformer [9], the eigenspace decomposition-based beamformer [16], the worst-case
performance optimization-based beamformer [32], the IAA beamformer [20], and the
interference-plus-noise covariance matrix reconstruction-based beamformer [40]. All
the tested beamformers are adaptive beamformers, i.e., their weight vectors depend on
the received array data. The diagonal loading factor ξ in the DL-SMI beamformer (8.14)
is assumed to be ten times the noise power, where the noise power is regarded as a priori
known. The eigenspace-based beamformer (8.15) is assumed to know the exact number
of interference sources. In the worst-case beamformer (8.16), the upper bound of the
mismatched vector is ad hoc chosen to be ε = 0.3M, as suggested in [32]. Without
loss of generality, the angular sector covering the direction of the desired signal in
the reconstruction-based beamformers is set to be $\Theta = [\bar{\theta}_s - 5°, \bar{\theta}_s + 5°]$ (namely,
[0°, 10°]), and the corresponding out-of-sector is $\bar{\Theta} = [-90°, \bar{\theta}_s - 5°) \cup (\bar{\theta}_s + 5°, 90°]$
(namely, [−90°, 0°) ∪ (10°, 90°]). The sampling grid is uniform in $\Theta \cup \bar{\Theta}$ with 0.1°
increment between adjacent grid points. As a benchmark, the optimal SINR (8.5)
is also shown in all figures, which is calculated from the exact interference-plus-
noise covariance matrix and the actual desired signal steering vector. Considering
that the output performance of the interference-plus-noise covariance matrix (sparse)
reconstruction-based beamformers is very close to the optimal SINR regardless of
the input signal power [40,43], we also compare the output performance in terms of
deviation from the optimal SINR. For fair comparison, the actual steering vector of the
desired signal is normalized so that $\|\mathbf{a}\|_2^2 = M$ (= 10) [32,33]. The CVX software [60]
is used to solve the related convex optimization problems.
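The data model of this setup can be sketched as below. Two assumptions go beyond what the text states: the source waveforms are drawn as unit-variance complex Gaussian sequences, and the noise power is normalized to one so that SNR and INR translate directly into source powers. The helper `steering` (half-wavelength ULA convention) and the function name `simulate_snapshots` are hypothetical.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def simulate_snapshots(M=10, K=30, snr_db=20.0, inr_db=30.0,
                       theta_s=5.0, theta_i=(-50.0, -20.0), rng=None):
    """Generate K array snapshots and the sample covariance matrix."""
    rng = np.random.default_rng() if rng is None else rng

    def cgauss(shape):
        # Circularly symmetric complex Gaussian samples with unit variance
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    doas = [theta_s, *theta_i]
    levels_db = np.array([snr_db] + [inr_db] * len(theta_i))
    amps = np.sqrt(10.0 ** (levels_db / 10.0))             # amplitudes over unit noise
    A = np.stack([steering(t, M) for t in doas], axis=1)   # M x (1 + number of interferers)
    S = amps[:, None] * cgauss((len(doas), K))              # source waveforms
    X = A @ S + cgauss((M, K))                              # signal + interference + noise
    R_hat = X @ X.conj().T / K                              # sample covariance matrix
    return X, R_hat
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes a Monte Carlo trial reproducible; the training snapshots produced here contain the desired signal, as in the scenario studied in this section.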
8.4.1 Example 1: Exactly Known Signal Steering Vector

In our first example, we consider an ideal scenario where the steering vectors of both
the desired signal and the interferers are exactly known. Namely, there is no steering
vector mismatch. Note that, even in this ideal case, the presence of the desired signal in
the training samples may still substantially degrade the output performance of adaptive
beamformers as compared with the signal-free training case [3,32,40]. However, it can
be seen from Figure 8.3(a) that the output performance of adaptive beamformers based
on interference-plus-noise covariance matrix reconstruction is almost always equal to
the optimal SINR for all SNR values between −30 dB and 50 dB (i.e., SIR ranges from
−60 dB to 20 dB), which illustrates the high dynamic range. In particular, the output
SINR of the proposed interference-plus-noise covariance matrix sparse reconstruction-
based adaptive beamformer is approximated as
\[
\mathrm{SINR} \approx \|\mathbf{a}\|_2^2\,\mathrm{SNR} = M \times \mathrm{SNR}, \tag{8.35}
\]
which achieves the design goal of the adaptive beamformer and outperforms the other
tested beamformers. From Figure 8.3(b), the average SINR performance loss of the
proposed adaptive beamformer is about 0.002 dB. In contrast, there is an average
performance loss of 0.158 dB for the interference-plus-noise covariance matrix
reconstruction-based adaptive beamformer [40], which is because the Capon spatial
spectrum estimator underestimates the interference power and reduces the estimation
accuracy of the interference-plus-noise covariance matrix. It should be noted that
the signal power is 100 times higher than the interference power in the case of
SNR = 50 dB, which can be used to illustrate the situation when the SIR
approaches infinity. Figure 8.3(c) shows the convergence rates of the tested adaptive
beamformers versus the number of snapshots K. It is clear that the adaptive beamformer
based on interference-plus-noise covariance matrix (sparse) reconstruction converges
much faster than the other tested adaptive beamformers.
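The approximation (8.35) can be checked numerically for the simulated geometry. The sketch below assumes that the optimal SINR (8.5) takes the standard form $\sigma_s^2\,\mathbf{a}^H \mathbf{R}_{i+n}^{-1} \mathbf{a}$ with unit noise power, and uses the hypothetical helper `steering` for a half-wavelength ULA.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def optimal_sinr_db(snr_db, inr_db=30.0, M=10, theta_s=5.0, theta_i=(-50.0, -20.0)):
    """Optimal output SINR in dB for exact R_{i+n} and unit noise power."""
    a_s = steering(theta_s, M)
    R_in = np.eye(M, dtype=complex)
    for t in theta_i:
        a = steering(t, M)
        R_in += 10.0 ** (inr_db / 10.0) * np.outer(a, a.conj())
    sinr = 10.0 ** (snr_db / 10.0) * np.real(a_s.conj() @ np.linalg.solve(R_in, a_s))
    return 10.0 * np.log10(sinr)

# Compare with M x SNR expressed in dB, i.e., snr_db + 10 log10(M)
for snr_db in (-30.0, 20.0, 50.0):
    print(snr_db, optimal_sinr_db(snr_db), snr_db + 10.0 * np.log10(10))
```

For this geometry the two values differ only by a small fraction of a decibel, because the desired steering vector is nearly orthogonal to both interferer steering vectors, so nulling the interferers costs very little of the array gain M.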
Figure 8.3 First example: exactly known steering vectors. (a) output SINR versus input SNR;
(b) deviations from optimal SINR versus input SNR; (c) output SINR versus number of
snapshots. Curves: optimal SINR, SMI, DL-SMI, eigenspace, worst-case, IAA, reconstruction, and sparsity.
Figure 8.4 First example: beampattern comparison. Beampattern (dB) versus θ (°); first panel:
optimal, SMI, DL-SMI, eigenspace, and worst-case; second panel: optimal, IAA, reconstruction,
and sparsity.
In Figure 8.4, we compare the beampattern of the proposed beamformer with those
of the other tested beamformers for K = 30 and SNR = 20 dB, where the vertical solid
line denotes the direction of the desired signal and the vertical dashed lines denote the
directions of interference. It is evident that the beampattern of the proposed adaptive
beamformer almost exactly matches that of the optimal one.
8.4.2 Example 2: Fixed Signal DOA Mismatch

In the second example, a scenario with fixed signal DOA mismatch is considered.
We assume that the actual DOA of the desired signal is 8°, while the assumed one
is 5°. Correspondingly, there is a fixed DOA mismatch of 3° for the desired signal.

By comparing Figure 8.5(a) to Figure 8.3(a), we can see that, when the input SNR is
20 dB, there is about 18 dB of performance loss for both the SMI beamformer and the
DL-SMI beamformer. There is no obvious performance change for the worst-case beam-
former, while the eigenspace-based beamformer suffers clear performance loss when
the signal power is higher than the interference power. Compared with the perfor-
mance loss (about 2 dB) of the interference-plus-noise covariance matrix reconstruction-
based beamformer, there is almost no performance loss for the proposed adaptive beam-
former, based on covariance matrix sparse reconstruction when the input SNR is above
0 dB. When the SNR is lower than 0 dB, the performance loss is mainly because of
the DOA estimation accuracy of the Capon spatial spectrum. The output performance
of the proposed beamformer can be further improved by introducing more sophisti-
cated DOA estimation methods, especially for low SNR cases. Similarly, as shown in
Figure 8.5(c), the output performance of the proposed adaptive beamformer is close
to the optimal SINR when the number of snapshots is larger than the number of array
sensors.
8.4.3 Example 3: Random Sources of DOA Mismatch

In the third example, a more practical scenario with random DOA mismatches is con-
sidered. More specifically, random DOA mismatches of both the desired signal and
the interferers are assumed to be uniformly distributed in [−4°, 4°]. That is to say, the
actual DOA of the desired signal is uniformly distributed as $U[\bar{\theta}_s - 4°, \bar{\theta}_s + 4°]$ (i.e.,
U[1°, 9°]), and the DOAs of the interferers are uniformly distributed as U[−54°, −46°]
and U[−24°, −16°], respectively. Note that the random DOAs of the signal and interferers
change from trial to trial but remain fixed from snapshot to snapshot.
It can be seen from Figure 8.6(a) that the output performance of the proposed beam-
former is much closer to the optimal SINR than other tested beamformers. When the
input SNR is less than −10 dB, there is an approximately 0.6 dB performance loss
because there may be no peak in the angular sector $\Theta$ for the Capon spectrum, or
the peak’s value may be less than the threshold; the presumed DOA $\bar{\theta}_s$, the center of
the desired signal sector $\Theta$, is then used, which leaves a random DOA mismatch for the
desired signal. In addition, due to the finite sampling grid, the performance of the
proposed beamformer does not exactly converge to the optimal one when the input SNR
is higher than 0 dB. In detail, the maximum estimation error of source DOAs is 0.05°,
which is half of the grid increment of 0.1°. Such DOA estimation error will degrade
the output performance of the proposed beamformer because both the reconstructed
interference-plus-noise covariance matrix R̂ i+n and the modified signal steering vec-
tor ã s depend on the DOA estimation. The possible solutions to mitigate the effect
of grid limitation include the grid refinement method [53] and the off-grid direction
estimation method [61–64]. In addition to achieving a faster convergence rate than the
interference-plus-noise covariance matrix reconstruction-based beamformer [40], the
proposed interference-plus-noise covariance matrix sparse reconstruction-based beam-
former offers a stable output performance with the increase of SNR while others do not,
as shown in Figure 8.6(c).
Figure 8.5 Second example: fixed signal DOA mismatch. (a) output SINR versus input SNR;
(b) deviations from optimal SINR versus input SNR; (c) output SINR versus number of
snapshots.
Figure 8.6 Third example: random sources look direction mismatch. (a) output SINR versus input
SNR; (b) deviations from optimal SINR versus input SNR; (c) output SINR versus number of
snapshots.
8.4.4 Example 4: Coherent Local Scattering

In the fourth example, we consider a scenario where the spatial signature of the desired
signal is distorted by local scattering effects. Specifically, the desired signal is assumed
to be a plane wave with the presumed steering vector ā s , whereas the actual steering
vector a s is formed as the superposition of five signal paths, including four coherent
scattered paths, as

\[
\mathbf{a}_s = \bar{\mathbf{a}}_s + \sum_{t=1}^{4} e^{j\psi_t}\,\mathbf{a}(\theta_t), \tag{8.36}
\]
where $\mathbf{a}(\theta_t)$, t = 1,2,3,4, correspond to the coherently scattered paths. The steering vector
of the t-th path, $\mathbf{a}(\theta_t)$, is modeled as a plane wave from the direction $\theta_t$.
The DOAs of the scattered paths are drawn independently from the normal distribution
$\theta_t \sim \mathcal{N}(\bar{\theta}_s, 4°)$, t = 1,2,3,4, and the phases of the scattered paths are drawn
independently from the uniform distribution $\psi_t \sim U[0, 2\pi)$, t = 1,2,3,4. Note that the
tested adaptive beamformers are implemented in a block adaptive manner, which means that
both $\theta_t$ and $\psi_t$, t = 1,2,3,4, change from run to run but do not change from snapshot
to snapshot. From Figure 8.7,
the output performance loss of the proposed beamformer is less than 0.7 dB, which is
much smaller than what is suffered by the other tested beamformers.
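One realization of the distorted steering vector (8.36) can be drawn as in the sketch below. Note the assumption about the angular spread: the second argument of $\mathcal{N}(\bar{\theta}_s, 4°)$ is read here as a variance, so the standard deviation is set to 2°; if it is meant as a standard deviation, `spread_deg` should be 4.0 instead. The helper `steering` and the function name `coherent_scattering_steering` are hypothetical.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def coherent_scattering_steering(M=10, presumed_deg=5.0, n_paths=4,
                                 spread_deg=2.0, rng=None):
    """Presumed steering vector plus n_paths coherently scattered paths."""
    rng = np.random.default_rng() if rng is None else rng
    thetas = rng.normal(presumed_deg, spread_deg, n_paths)   # path DOAs (degrees)
    psis = rng.uniform(0.0, 2.0 * np.pi, n_paths)            # path phases
    a_s = steering(presumed_deg, M).astype(complex)
    for theta_t, psi_t in zip(thetas, psis):
        a_s = a_s + np.exp(1j * psi_t) * steering(theta_t, M)
    return a_s
```

In a block adaptive simulation this function would be called once per Monte Carlo run, and the resulting vector would be held fixed across all snapshots of that run.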
8.4.5 Example 5: Wavefront Distortion

In the fifth example, we consider the situation where the desired signal spatial signature
is distorted by wave propagation effects in an inhomogeneous medium. We assume
independent-increment phase distortions of the desired signal wavefront [25,65]. In
each Monte Carlo run, each of these phase distortions is independently drawn from a
Gaussian random generator N (0,0.04), which remains fixed from snapshot to snapshot.
From Figure 8.8, it is clear that the proposed beamformer provides more stable and near-
optimal output performance than the other tested beamformers regardless of the input
signal power or the number of snapshots.
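A possible way to generate such a distorted wavefront is sketched below. It rests on two assumptions that the text does not spell out: the phase distortion at each sensor is the running sum of independent Gaussian increments (a random walk along the array), and 0.04 is the variance of each increment. The helper `steering` and the function name `distorted_steering` are hypothetical.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed convention)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def distorted_steering(M=10, presumed_deg=5.0, incr_var=0.04, rng=None):
    """Presumed steering vector with independent-increment phase distortion."""
    rng = np.random.default_rng() if rng is None else rng
    increments = rng.normal(0.0, np.sqrt(incr_var), M)   # one increment per sensor
    phase = np.cumsum(increments)                         # accumulated distortion (rad)
    return steering(presumed_deg, M) * np.exp(1j * phase)
```

Only the phases are perturbed, so every entry keeps unit magnitude and the norm constraint $\|\mathbf{a}\|_2^2 = M$ is preserved.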
8.4.6 Example 6: Incoherent Local Scattering

Figure 8.7 Fourth example: coherent local scattering. (a) output SINR versus input SNR;
(b) deviations from optimal SINR versus input SNR; (c) output SINR versus number of
snapshots.

Figure 8.8 Fifth example: wavefront distortion. (a) output SINR versus input SNR; (b) deviations
from optimal SINR versus input SNR; (c) output SINR versus number of snapshots.

In the sixth example, we assume incoherent local scattering of the desired signal, which
is common in array applications due to the multipath scattering effects caused by the
presence of local scatterers. In such a case, the desired signal is assumed to have a
time-varying spatial signature as [32,40]
\[
\mathbf{a}_s(k) = s_0(k)\,\bar{\mathbf{a}}_s + \sum_{t=1}^{4} s_t(k)\,\mathbf{a}(\theta_t), \tag{8.37}
\]
where $s_t(k) \sim \mathcal{N}(0,1)$, t = 0,1,2,3,4, are independent and identically distributed (i.i.d.)
zero-mean complex Gaussian random variables that change from snapshot to snapshot, and
$\theta_t \sim \mathcal{N}(\bar{\theta}_s, 4°)$, t = 1,2,3,4, are the random DOAs, which change from run to run while
remaining fixed from snapshot to snapshot. This corresponds to the case of incoherent
local scattering [30], where the rank of the signal covariance matrix $\mathbf{R}_s$ is higher than
one. In the general-rank case, the output SINR should be rewritten as [10]
\[
\mathrm{SINR} = \frac{\mathbf{w}^H \mathbf{R}_s \mathbf{w}}{\mathbf{w}^H \mathbf{R}_{i+n} \mathbf{w}}, \tag{8.38}
\]
which is maximized by [10]
\[
\mathbf{w} = \mathcal{P}\{\mathbf{R}_{i+n}^{-1} \mathbf{R}_s\}, \tag{8.39}
\]
where $\mathcal{P}\{\cdot\}$ stands for the principal eigenvector of a matrix.
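The general-rank solution (8.39) and the SINR (8.38) can be sketched in a few lines. The function names `general_rank_beamformer` and `output_sinr` are hypothetical; the sketch computes the eigendecomposition of $\mathbf{R}_{i+n}^{-1}\mathbf{R}_s$ directly, which is adequate for small arrays, although a generalized Hermitian eigensolver would be a common alternative.

```python
import numpy as np

def general_rank_beamformer(R_in, R_s):
    """Principal eigenvector of R_in^{-1} R_s, as in (8.39)."""
    vals, vecs = np.linalg.eig(np.linalg.solve(R_in, R_s))
    # Eigenvalues are real and nonnegative in theory; drop rounding-level imaginary parts
    return vecs[:, np.argmax(vals.real)]

def output_sinr(w, R_s, R_in):
    """General-rank output SINR (8.38)."""
    return np.real(w.conj() @ R_s @ w) / np.real(w.conj() @ R_in @ w)
```

The SINR attained by this weight vector equals the largest eigenvalue of $\mathbf{R}_{i+n}^{-1}\mathbf{R}_s$, and the scaling of the weight vector does not affect it, since (8.38) is a ratio of quadratic forms.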

It can be seen from Figure 8.9(a) that the proposed beamformer outperforms all other
tested beamformers especially at high SNR. The performance loss of the proposed
beamformer is less than 0.1 dB. In contrast, there is about 7.5 dB performance loss
for the interference-plus-noise covariance matrix reconstruction-based beamformer. The
main reason is that the signal of interest leaks into the out-of-sector $\bar{\Theta}$ due to the
incoherent local scattering, and the reconstructed interference-plus-noise covariance
matrix $\hat{\mathbf{R}}_{i+n}$ (8.20) is then contaminated by the leaked desired signal component.
8.4.7 Discussion
From the extensive simulation results illustrated for different scenarios in this chapter,
it is clear that the proposed adaptive beamformer, based on interference-plus-noise
covariance matrix sparse reconstruction, consistently enjoys the best performance as
compared to other tested beamformers. More specifically, having benefitted from the
interference covariance matrix reconstruction, which excludes the desired signal com-
ponent, the output performance of the proposed beamformer is always close to or equal
to the optimal SINR regardless of the input SNR. In contrast, there is a slight output per-
formance degradation for the interference-plus-noise covariance matrix reconstruction-
based beamformer because the Capon spatial spectrum estimator underestimates the
power of interferers and thus decreases the estimation accuracy of the interference-plus-
noise covariance matrix. On the other hand, other tested beamformers use the signal-
contaminated covariance matrix, and thus degrade the output performance particularly
when the input SNR is high. Another significant advantage of the proposed adaptive
beamformer is that it achieves faster convergence, and only requires the number of
snapshots to be slightly higher than the number of array sensors in order to approach the
optimal SINR. In contrast, if an average performance loss of less than 3 dB is required
for the SMI beamformer, the number of signal-free snapshots must be at least twice the
number of array sensors [48]. However, in the underlying problem where the snapshots
are contaminated by the desired signal, the convergence becomes much slower, i.e., a
much higher number of snapshots are required [17].
Figure 8.9 Sixth example: incoherent local scattering. (a) output SINR versus input SNR;
(b) deviations from optimal SINR versus input SNR; (c) output SINR versus number of
snapshots.

Note that, although the ULA is adopted in our simulations, the idea of covariance
matrix sparse reconstruction proposed in this chapter for the adaptive beamformer
design can be generalized to arbitrary arrays, e.g., the coprime array [66,67]. Having
benefitted from the larger array aperture due to the sparse deployment, the coprime
array adaptive beamforming algorithm proposed in [67] achieves high robustness
against model mismatches with a significant reduction in the number of antennas and
the associated radio frequency chains.
It is also worth noting that the extension of the proposed narrowband adaptive
beamforming technique to the broadband adaptive beamforming is straightforward [3].
For example, applying fast Fourier transform (FFT) to the broadband signal yields nar-
rowband components, which can then be independently processed using the proposed
(narrowband) covariance matrix sparse reconstruction-based adaptive beamforming
technique. Subsequently, the time-domain broadband beamformer output is obtained by
applying an inverse FFT to the output of the individual narrowband beamformers.
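The FFT-based broadband extension can be sketched for a single data block as follows. The per-bin weight matrix `weights_per_bin` is a hypothetical input; in the scheme described above, each of its columns would be obtained by applying the narrowband beamformer (8.34) to the data of the corresponding frequency bin, a step that is omitted here together with any block segmentation or overlap handling.

```python
import numpy as np

def broadband_beamform(X_time, weights_per_bin):
    """Frequency-domain beamforming of one M x N block of sensor samples.

    X_time:          M x N array, one row of time samples per sensor
    weights_per_bin: M x N array, column f holds the weight vector for FFT bin f
    """
    X_freq = np.fft.fft(X_time, axis=1)                          # narrowband components
    Y_freq = np.sum(weights_per_bin.conj() * X_freq, axis=0)     # w_f^H x_f for each bin
    return np.fft.ifft(Y_freq)                                   # time-domain output
```

As a sanity check, weights that select the first sensor in every bin reproduce that sensor's time series at the output, since the FFT and inverse FFT then cancel exactly.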
8.5 Conclusion
In this chapter, we proposed a simple and effective robust adaptive beamforming algorithm
based on interference-plus-noise covariance matrix sparse reconstruction. Specifically,
by exploiting the sparsity of sources distributed in the observed spatial domain, accurate
interference-plus-noise covariance matrix reconstruction can be achieved by estimating
the sparse spatial spectrum distribution from a sparsity-constrained covariance matrix
fitting problem, which provides a signal-free interference-plus-noise covariance matrix
for the beamformer design. The formulated sparsity-constrained covariance matrix
fitting problem can be effectively solved with a priori information of the estimated source
DOAs rather than $\ell_1$-norm relaxation-type approximations. Simulation results
demonstrate the effectiveness of the proposed algorithm. Compared to the existing
techniques, the performance of the proposed method is nearly optimal over a wide range of
input SNR and various error conditions. In addition, the proposed technique also has
low computational complexity.
References
[1] B. D. Van Veen and K. M. Buckley, “Beamforming: A versatile approach to spatial filtering,”
IEEE ASSP Mag., vol. 5, no. 2, pp. 4–24, Apr. 1988.
[2] H. Krim and M. Viberg, “Two decades of array signal processing research: The parametric
approach,” IEEE Signal Process. Mag., vol. 13, no. 4, pp. 67–94, July 1996.
[3] H. L. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation, and
Modulation Theory. John Wiley & Sons, 2002.
[4] J. Li and P. Stoica, Eds., Robust Adaptive Beamforming. John Wiley & Sons, 2005.
[5] W. Liu and S. Weiss, Wideband Beamforming: Concepts and Techniques. Wiley, 2010.
[6] J. Capon, “High-resolution frequency-wavenumber spectrum analysis,” Proc. IEEE, vol. 57,
no. 8, pp. 1408–1418, Aug. 1969.
[7] S. A. Vorobyov, “Principles of minimum variance robust adaptive beamforming design,”
Signal Process., vol. 93, no. 12, pp. 3264–3277, Dec. 2013.
[8] K. Yang, T. Ohira, Y. Zhang, and C.-Y. Chi, “Super-exponential blind adaptive beamform-
ing,” IEEE Trans. Signal Process., vol. 52, no. 6, pp. 1549–1563, June 2004.
[9] H. Cox, R. M. Zeskind, and M. H. Owen, “Robust adaptive beamforming,” IEEE Trans.
Acoust. Speech Signal Process., vol. 35, no. 10, pp. 1365–1376, Oct. 1987.
[10] A. B. Gershman, “Robust adaptive beamforming in sensor arrays,” Int. J. Electron.
Commun., vol. 53, no. 6, pp. 305–314, Dec. 1999.
[11] W. F. Gabriel, “Spectral analysis and adaptive array superresolution techniques,” Proc.
IEEE, vol. 68, no. 6, pp. 654–666, June 1980.
[12] Y. I. Abramovich, “Controlled method for adaptive optimization of filters using the criterion
of maximum SNR,” Radio Eng. Electron. Phys., vol. 26, no. 3, pp. 87–95, Mar. 1981.
[13] B. D. Carlson, “Covariance matrix estimation errors and diagonal loading in adaptive
arrays,” IEEE Trans. Aerosp. Electron. Syst., vol. 24, no. 4, pp. 397–401, July 1988.
[14] L. Du, T. Yardibi, J. Li, and P. Stoica, “Review of user parameter-free robust adaptive
beamforming algorithms,” Digital Signal Process., vol. 19, no. 4, pp. 567–582, July 2009.
[15] P. Stoica, J. Li, X. Zhu, and J. R. Guerci, “On using a priori knowledge in space-time
adaptive processing,” IEEE Trans. Signal Process., vol. 56, no. 6, pp. 2598–2602, June 2008.
[16] L. Chang and C.-C. Yeh, “Performance of DMI and eigenspace-based beamformers,” IEEE
Trans. Antennas Propagat., vol. 40, no. 11, pp. 1336–1347, Nov. 1992.
[17] D. D. Feldman and L. J. Griffiths, “A projection approach for robust adaptive beamforming,”
IEEE Trans. Signal Process., vol. 42, no. 4, pp. 867–876, Apr. 1994.
[18] J. K. Thomas, L. L. Scharf, and D. W. Tufts, “The probability of a subspace swap in the
SVD,” IEEE Trans. Signal Process., vol. 43, no. 3, pp. 730–736, Mar. 1995.
[19] M. Hawkes, A. Nehorai, and P. Stoica, “Performance breakdown of subspace-based
methods: Prediction and cure,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.,
Salt Lake City, UT, May 2001, pp. 4005–4008.
[20] T. Yardibi, J. Li, P. Stoica, M. Xue, and A. B. Baggeroer, “Source localization and sensing:
A nonparametric iterative adaptive approach based on weighted least squares,” IEEE Trans.
Aerosp. Electron. Syst., vol. 46, no. 1, pp. 425–443, Jan. 2010.
[21] L. C. Godara, “The effect of phase-shift errors on the performance of an antenna-array
beamformer,” IEEE J. Ocean. Eng., vol. 10, no. 3, pp. 278–284, July 1985.
[22] J. W. Kim and C. K. Un, “An adaptive array robust to beam pointing error,” IEEE Trans.
Signal Process., vol. 40, no. 6, pp. 1582–1584, June 1992.
[23] N. K. Jablon, “Adaptive beamforming with the generalized sidelobe canceller in the presence
of array imperfections,” IEEE Trans. Antennas Propagat., vol. 34, no. 8, pp. 996–1012, Aug.
1986.
[24] A. B. Gershman, V. I. Turchin, and V. A. Zverev, “Experimental results of localization of
moving underwater signal by adaptive beamforming,” IEEE Trans. Signal Process., vol. 43,
no. 10, pp. 2249–2257, Oct. 1995.
[25] J. Ringelstein, A. B. Gershman, and J. F. Böhme, “Direction finding in random inhomoge-
neous media in the presence of multiplicative noise,” IEEE Signal Process. Lett., vol. 7, no.
10, pp. 269–272, Oct. 2000.
[26] Y. J. Hong, C.-C. Yeh, and D. R. Ucci, “The effect of a finite-distance signal source on a
far-field steering applebaum array – two dimensional array case,” IEEE Trans. Antennas
Propagat., vol. 36, no. 4, pp. 468–475, Apr. 1988.
[27] K. I. Pedersen, P. E. Mogensen, and B. H. Fleury, “A stochastic model of the temporal and
azimuthal dispersion seen at the base station in outdoor propagation environments,” IEEE
Trans. Veh. Technol., vol. 49, no. 2, pp. 437–447, Mar. 2000.
[28] J. Goldberg and H. Messer, “Inherent limitations in the localization of a coherently scattered
source,” IEEE Trans. Signal Process., vol. 46, no. 12, pp. 3441–3444, Dec. 1998.
[29] D. Astely and B. Ottersten, “The effects of local scattering on direction of arrival estimation
with MUSIC,” IEEE Trans. Signal Process., vol. 47, no. 12, pp. 3220–3234, Dec. 1999.
[30] O. Besson and P. Stoica, “Decoupled estimation of DOA and angular spread for a spatially
distributed source,” IEEE Trans. Signal Process., vol. 48, no. 7, pp. 1872–1882, July 2000.
[31] O. L. Frost, “An algorithm for linearly constrained adaptive array processing,” Proc. IEEE,
vol. 60, no. 8, pp. 926–935, Aug. 1972.
[32] S. A. Vorobyov, A. B. Gershman, and Z.-Q. Luo, “Robust adaptive beamforming using
worst-case performance optimization: A solution to the signal mismatch problem,” IEEE
Trans. Signal Process., vol. 51, no. 2, pp. 313–324, Feb. 2003.
[33] J. Li, P. Stoica, and Z. Wang, “On robust Capon beamforming and diagonal loading,” IEEE
Trans. Signal Process., vol. 51, no. 7, pp. 1702–1715, July 2003.
[34] R. G. Lorenz and S. P. Boyd, “Robust minimum variance beamforming,” IEEE Trans. Signal
Process., vol. 53, no. 5, pp. 1684–1696, May 2005.
[35] A. Hassanien, S. A. Vorobyov, and K. M. Wong, “Robust adaptive beamforming using
sequential quadratic programming: An iterative solution to the mismatch problem,” IEEE
Signal Process. Lett., vol. 15, pp. 733–736, Nov. 2008.
[36] A. Khabbazibasmenj, S. A. Vorobyov, and A. Hassanien, “Robust adaptive beamforming
based on steering vector estimation with as little as possible prior information,” IEEE Trans.
Signal Process., vol. 60, no. 6, pp. 2974–2987, June 2012.
[37] S. A. Vorobyov, A. B. Gershman, Z.-Q. Luo, and N. Ma, “Adaptive beamforming with
joint robustness against mismatched signal steering vector and interference nonstationarity,”
IEEE Signal Process. Lett., vol. 11, no. 2, pp. 108–111, Feb. 2004.
[38] Y. J. Gu, W.-P. Zhu, and M. N. S. Swamy, “Adaptive beamforming with joint robustness
against covariance matrix uncertainty and signal steering vector mismatch,” Electronics
Lett., vol. 46, no. 1, pp. 86–88, Jan. 2010.
[39] Y. Gu and A. Leshem, “Robust adaptive beamforming based on jointly estimating covariance
matrix and steering vector,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.,
Prague, Czech Republic, May 2011, pp. 2640–2643.
[40] Y. Gu and A. Leshem, “Robust adaptive beamforming based on interference covariance
matrix reconstruction and steering vector estimation,” IEEE Trans. Signal Process., vol. 60,
no. 7, pp. 3881–3885, July 2012.
[41] L. Huang, J. Zhang, X. Xu, and Z. Ye, “Robust adaptive beamforming with a novel
interference-plus-noise covariance matrix reconstruction method,” IEEE Trans. Signal
Process., vol. 63, no. 7, pp. 1643–1650, Apr. 2015.
[42] J. Yang, G. Liao, J. Li, Y. Lei, and X. Wang, “Robust beamforming with imprecise array
geometry using steering vector estimation and interference covariance matrix reconstruction,”
Multidim. Syst. Signal Process., vol. 28, no. 2, pp. 451–469, Apr. 2017.
[43] Y. Gu, N. A. Goodman, S. Hong, and Y. Li, “Robust adaptive beamforming based on
interference covariance matrix sparse reconstruction,” Signal Process., vol. 96, Part B,
pp. 375–381, Mar. 2014.
[44] Y. Gu and Y. D. Zhang, “Single-snapshot adaptive beamforming,” in Proc. IEEE Sensor
Array Multichannel Signal Process. Workshop, Sheffield, UK, July 2018.
[45] Y. C. Eldar, A. Nehorai, and P. S. La Rosa, “An expected least-squares beamforming
approach to signal estimation with steering vector uncertainties,” IEEE Signal Process. Lett.,
vol. 13, no. 5, pp. 288–291, May 2006.
[46] K. Kumatani, T. Gehrig, U. Mayer et al., “Adaptive beamforming with a minimum mutual
information criterion,” IEEE Trans. Audio Speech Lang. Process., vol. 15, no. 8, pp. 2527–
2541, Nov. 2007.
[47] Y. Rong, Y. C. Eldar, and A. B. Gershman, “Performance tradeoffs among adaptive
beamforming criteria,” IEEE J. Sel. Top. Signal Process., vol. 1, no. 4, pp. 651–659, Dec.
2007.
[48] I. S. Reed, J. D. Mallett, and L. E. Brennan, “Rapid convergence rate in adaptive arrays,”
IEEE Trans. Aerosp. Electron. Syst., vol. 10, no. 6, pp. 853–863, Nov. 1974.
[49] L. Du, J. Li, and P. Stoica, “Fully automatic computation of diagonal loading levels for
robust adaptive beamforming,” IEEE Trans. Aerosp. Electron. Syst., vol. 46, no. 1, pp. 449–
458, Jan. 2010.
[50] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans.
Antennas Propag., vol. 34, no. 3, pp. 276–280, Mar. 1986.
[51] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal
matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[52] A. J. Miller, Subset Selection in Regression. Chapman and Hall, 2002.
[53] D. Malioutov, M. Çetin, and A. S. Willsky, “A sparse signal reconstruction perspective
for source localization with sensor arrays,” IEEE Trans. Signal Process., vol. 53, no. 8,
pp. 3010–3022, Aug. 2005.
[54] Y. C. Eldar and G. Kutyniok, Eds., Compressed Sensing: Theory and Applications.
Cambridge University Press, 2012.
[55] S. Chen, D. Donoho, and M. Saunders, “Atomic decomposition by basis pursuit,” SIAM J.
Sci. Comput., vol. 20, no. 1, pp. 33–61, 1998.
[56] R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Roy. Stat. Soc. B, vol.
58, no. 1, pp. 267–288, 1996.
[57] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance
techniques,” IEEE Trans. Acoust. Speech Signal Process., vol. 37, no. 7, pp. 984–995, July
1989.
[58] A. B. Gershman, M. Rubsamen, and M. Pesavento, “One- and two-dimensional direction-
of-arrival estimation: An overview of search-free techniques,” Signal Process., vol. 90, no.
5, pp. 1338–1349, May 2010.
[59] K. Harmanci, J. Tabrikian, and J. L. Krolik, “Relationships between adaptive minimum
variance beamforming and optimal source localization,” IEEE Trans. Signal Process., vol.
48, no. 1, pp. 1–12, Jan. 2000.
[60] M. Grant, S. Boyd, and Y. Y. Ye, “CVX: MATLAB software for disciplined convex
programming,” Dec. 2017. Available: http://cvxr.com/cvx/
[61] G. Tang, B. N. Bhaskar, P. Shah, and B. Recht, “Compressed sensing off the grid,” IEEE
Trans. Inf. Theory, vol. 59, no. 11, pp. 7465–7490, Nov. 2013.
[62] Y. Li and Y. Chi, “Off-the-grid line spectrum denoising and estimation with multiple
measurement vectors,” IEEE Trans. Signal Process., vol. 64, no. 5, pp. 1257–1269, Mar.
2016.
[63] C. Zhou, Y. Gu, X. Fan et al., “Direction-of-arrival estimation for coprime array via virtual
array interpolation,” IEEE Trans. Signal Process., vol. 66, no. 22, pp. 5956–5971, Nov.
2018.
[64] C. Zhou, Y. Gu, Z. Shi, and Y. D. Zhang, “Off-grid direction-of-arrival estimation using
coprime array interpolation,” IEEE Signal Process. Lett., vol. 25, no. 11, pp. 1710–1714,
Nov. 2018.
[65] O. Besson, F. Vincent, P. Stoica, and A. B. Gershman, “Approximate maximum likelihood
estimators for array processing in multiplicative noise environments,” IEEE Trans. Signal
Process., vol. 48, no. 9, pp. 2506–2518, Sept. 2000.
[66] C. Zhou, Z. Shi, and Y. Gu, “Coprime array adaptive beamforming with enhanced degrees-
of-freedom capability,” in Proc. IEEE Radar Conf., Seattle, WA, May 2017, pp. 1357–1361.
[67] C. Zhou, Y. Gu, S. He, and Z. Shi, “A robust and efficient algorithm for coprime
array adaptive beamforming,” IEEE Trans. Veh. Tech., vol. 67, no. 2, pp. 1099–1112,
Feb. 2018.
9 Spectrum Sensing for Cognitive
Radar via Model Sparsity Exploitation
Augusto Aubry, Vincenzo Carotenuto, Antonio De Maio, and Mark A. Govoni
9.1 Introduction
The radio frequency (RF) electromagnetic spectrum is a limited natural resource necessary
for an ever-growing number of services and systems. Indeed, both high-quality/high-rate
wireless services and accurate and reliable remote-sensing capabilities call
for increased amounts of bandwidth [1], posing a challenge to the overall usability
of noncooperative systems. Not surprisingly, the RF spectrum congestion problem has
attracted the interest of many scientists and engineers in the last few years, and it is
currently becoming one of the hot topics in both the regulatory and research fields [2].
In the RF spectrum congestion context, spectral cognizance is envisioned as a key
enabler to the next-generation cognitive radars, which take advantage of the perception-
action cycle [3]. Based on the characteristics of frequency overlaid emitters and rely-
ing on the waveform diversity paradigm, a primary objective of cognitive radar is to
utilize probing waveforms that optimize radar performance while guaranteeing spectral
coexistence [4–9]. This requires electromagnetic awareness of the operative scenario,
gathered via the dynamic estimation of the spectrum occupancy, which is mandatory for
a flexible and efficient utilization/management of the frequency resources [2]. In this
context, a multitude of spectrum-sensing algorithms has been developed mainly in the
field of communication networks to counter spectrum scarcity in some frequency bands
and to increase the degree of utilization of certain spectrum portions whose occupancy
varies sharply from time to time and location to location [10,11].
The available techniques can be clustered in two main classes [12]. On one side, there
are algorithms known as supervised procedures, which exploit intrinsic and specific
emitter characteristics by correlating actual measurements with known signal patterns
[11]. On the other side, there are unsupervised methods that rely on appropriate statistics
of the collected measurements, such as cyclostationary parameters [13] or data sam-
ple covariance matrix attributes [14] (for example the condition number, the largest
eigenvalue, and the trace). Other approaches, which take advantage of some degrees of
freedom available at the receiver, have also been proposed to accomplish multi-domain
spectral sensing. For instance, in [15,16] two-dimensional (2-D) sensing strategies relying
on the cooperation between multiple single-channel sensors are proposed to improve
detection performance. In [17–19], array signal-processing techniques are developed
to detect a possible primary user in a given bandwidth. An adaptive cyclostationary
beamforming-based spectrum sensing method for a multiple-antenna sensor is designed
in [20]. Moreover, in [21], multidimensional spectrum data are described via a spectrum
tensor model, and an algorithm is developed to construct spectrum maps by jointly
exploiting tensor completion and prediction schemes.
In this chapter, focusing on the modern cognitive radar context, we deal with
2-D spectrum sensing, namely, the space–frequency characterization of the radio
environment surrounding the radar of interest, without any statistical assumptions
on the sources’ signals (see the discussion in Section 9.3 about the interest toward
emitters’ angular information). To this end, we consider a sensor that, unlike usual
5G sensing devices, is equipped with multiple receive antennas (many modern radars
feature more than 500 receive channels, often more than 1,000) as well as high-speed
processors, and we develop a formal discrete-time sensing signal model. We then describe
two different signal-processing strategies to get accurate space–frequency awareness
via block-sparsity exploitation at the recovery stage.
• The first technique [22] leans on the iterative adaptive algorithm (IAA) [23–25],
which is a sequential procedure aimed at enhancing the spectrum estimate by
suitably reducing the leakage effect suffered by the conventional filter bank, and
includes a Bayesian information criterion (BIC)-based stage [26] to promote
block-sparsity in the recovery process;
• The second approach [27] retrieves the space–frequency profile as the solution to a
regularized maximum likelihood (RML) estimation problem, where a term promoting
the block-sparsity of the 2-D profile is directly included in the objective.
This penalty function is tied to the “ℓq-norm” (0 < q ≤ 1) of the
vector containing the space–frequency source energies, thus pushing for a sparse
2-D profile estimate. Since this procedure extends the sparse learning via iterative
minimization (SLIM) algorithm developed in [28] to the block-sparsity
scenario, it is referred to as block SLIM (BSLIM) in the following.
At the analysis stage, some case studies are illustrated to assess the effectiveness of
the proposed 2-D spectrum sensing strategies. Specifically, the capabilities of the differ-
ent signal processing techniques to recover the actual space–frequency occupancy maps
are evaluated for both simulated and measured data (acquired via the software-defined
radio (SDR) device “RTL-SDR R820T2 RTL2832U 1PPM TCXO”). The results high-
light that both BSLIM and IAA may provide a reliable space–frequency electromagnetic
cognizance with significant performance improvements as compared with the classic
filter bank at the price of increased computational complexity.
To summarize, the main contributions of this chapter are:
• The introduction of a 2-D sensing signal model exploiting the data collected at the
end of each radar pulse repetition interval (PRI) to address the spectrum-sensing
task for a modern (potentially cognitive) radar equipped with a phased-array
antenna and several TX-RX modules. These data are already used in modern
radars to establish (with heuristic tools) the “less disturbed frequency” and
accomplish frequency diverse transmissions.
• A formal connection between the 2-D spectrum-sensing task and underdetermined
linear models with a sparse vector of unknowns. This paves the way for the
exploitation of a plethora of technically sound algorithms that are able to recover
the unknown parameters of interest, i.e., angular location and spectral support of
the emitters.
• The application of theoretically grounded criteria, borrowed from statistical
signal processing (i.e., matched filters, IAA, IAA plus BIC, BSLIM), to perform
the sensing task in the context of a cognitive radar.
• The performance assessment (in terms of 2-D spectrum recovery capabilities) of
the aforementioned strategies on both simulated and measured data, highlighting
the practical effectiveness of BSLIM and of the BIC-based IAA approach in providing
valuable and reliable information.
The chapter exploits the results in the papers [22,27], and it is organized as follows.
Section 9.2 introduces the system model for the considered spectrum-sensing scenario.
In Section 9.3, some signal-processing algorithms to perform the recovery of the 2-D
radio environmental map are described. Section 9.4 is devoted to the performance anal-
ysis of these procedures for both simulated and real data. Finally, Section 9.5 concludes
the chapter and highlights some possible future research.
9.2 System Model and Problem Formulation
Let us consider a sensor equipped with $M$ antennas¹ that collects the signals
transmitted by $K$ sources located in specific, but unknown, angular directions
$\bar{\theta}_1, \ldots, \bar{\theta}_K$, with given, but unknown, spectral extensions, i.e., bandwidths.²
Figure 9.1 provides a pictorial representation of the considered sensing scenario, where
the goal is to characterize the space–frequency features of the radio environment
surrounding the radar of interest.
The baseband discrete-time signal at the output of the receiving array (obtained by
sampling the continuous-time signals according to the Nyquist rate for the sensing
bandwidth $B$) can be written as
\[
\mathbf{y}(n) = \sum_{h=1}^{K} \mathbf{s}(\bar{\theta}_h)\, x_h(n) + \mathbf{w}(n), \quad n = 1, \ldots, N, \tag{9.1}
\]
where $N$ is the number of available snapshots in the observation time or data-window,
$\mathbf{y}(n) \in \mathbb{C}^{M}$, $n = 1, \ldots, N$, is the n-th observed snapshot, $\mathbf{s}(\theta) \in \mathbb{C}^{M}$ is the
spatial steering vector associated with the angular location $\theta$ (depending on the array
configuration), $x_h(n) \in \mathbb{C}$, $n = 1, \ldots, N$, refers to the signal emitted by the source
located at angle $\bar{\theta}_h$, $h = 1, \ldots, K$, and $\mathbf{w}(n) \in \mathbb{C}^{M}$, $n = 1, \ldots, N$, are independent and
identically distributed (i.i.d.) zero-mean circularly symmetric white Gaussian vectors,
i.e., $E[\mathbf{w}(n)\mathbf{w}(m)^H] = \sigma^2 \mathbf{I}$ if $n = m$, and $E[\mathbf{w}(n)\mathbf{w}(m)^H] = \mathbf{0}$ otherwise.
1 A large number of channels/antennas is generally assumed, since the sensing task is performed by a modern
(possibly cognitive) radar.
2 Symbols with an overbar usually refer to the true parameters.
Figure 9.1 Sensing scenario: a sensor equipped with multiple receive antennas is used to collect
signals emitted by multiple transmit sources. ©[2018] IEEE. Reprinted, with permission from
[“Two-dimensional spectrum sensing for cognitive radar,” 2018 IEEE Radar Conference
(RadarConf18), Oklahoma City, OK, USA, 2018.].
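The snapshot model (9.1) can be simulated in a few lines. The following is only a minimal sketch: it assumes a half-wavelength uniform linear array for s(θ) (the chapter leaves the array configuration open), two complex tones as source signals, and hypothetical function names; NumPy is assumed to be available.

```python
import numpy as np

def ula_steering(theta_deg, M, spacing=0.5):
    """Spatial steering vector s(theta) of an M-element array.

    Assumption: uniform linear array with element spacing given in
    wavelengths (0.5 by default); this geometry is not taken from the text.
    """
    m = np.arange(M)
    return np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))

def simulate_snapshots(thetas_deg, x, M, sigma2, rng):
    """Generate y(n) = sum_h s(theta_h) x_h(n) + w(n), n = 1..N, as in (9.1).

    x has shape (K, N): row h holds the samples emitted by source h.
    Returns an (M, N) array with one snapshot per column.
    """
    K, N = x.shape
    S = np.column_stack([ula_steering(t, M) for t in thetas_deg])  # (M, K)
    # Circularly symmetric white Gaussian noise with E[w w^H] = sigma2 * I.
    w = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, N))
                               + 1j * rng.standard_normal((M, N)))
    return S @ x + w

rng = np.random.default_rng(0)
M, N = 8, 16
# Two sources emitting complex tones at normalized frequencies 0.10 and 0.25.
x = np.exp(2j * np.pi * np.outer([0.10, 0.25], np.arange(N)))
snapshots = simulate_snapshots([-20.0, 35.0], x, M, sigma2=0.1, rng=rng)
clean = simulate_snapshots([-20.0, 35.0], x, M, sigma2=0.0, rng=rng)
```

Setting `sigma2=0.0` returns the noiseless superposition, which is convenient for checking the steering vectors before noise is added.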
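In order to highlight the spectral characteristics of the different sources, let us resort
to the frequency representation of each signal $x_h(n)$, $h = 1, \ldots, K$, i.e.,
\[
\mathbf{x}_h = [x_h(1), x_h(2), \ldots, x_h(N)]^T = \sum_{k=1}^{N_F} \mathbf{s}_F(\omega_k)\, \bar{a}_{h,k}, \tag{9.2}
\]
where $N_F \geq N$ is the number of frequency bins,
\[
\mathbf{s}_F(\omega_k) = \frac{1}{\sqrt{N}} \left[1, \exp(j 2\pi \omega_k), \ldots, \exp(j 2\pi (N-1) \omega_k)\right]^T \in \mathbb{C}^{N},
\]
with $\omega_k = (k-1)/N_F$, and $\bar{a}_{h,k}$ is proportional, via $\sqrt{N}/N_F$, to the Fourier transform of the
signal samples emitted by the h-th source (during the observation time) evaluated at the
frequency $\omega_k$. Hence, (9.1) can be written as
\[
\mathbf{y} = [\mathbf{y}(1)^H, \mathbf{y}(2)^H, \ldots, \mathbf{y}(N)^H]^H = \sum_{h=1}^{K} \sum_{k=1}^{N_F} \bar{\mathbf{s}}(\bar{\theta}_h, \omega_k)\, \bar{a}_{h,k} + \mathbf{w}, \tag{9.3}
\]
where
• $\mathbf{w} = [\mathbf{w}(1)^H, \mathbf{w}(2)^H, \ldots, \mathbf{w}(N)^H]^H \in \mathbb{C}^{NM}$ is the overall noise vector;
• $\bar{\mathbf{s}}(\theta, \omega_k) = \mathbf{s}_F(\omega_k) \otimes \mathbf{s}(\theta)$ is the steering vector associated with the angle $\theta$ and
the k-th frequency bin.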
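As a sanity check of (9.2) and (9.3), the sketch below rebuilds a block of N samples from its N_F frequency coefficients and verifies that the stacked snapshot equals the superposition of Kronecker-structured steering vectors. It assumes a half-wavelength uniform linear array for s(θ) and takes the coefficients as √N/N_F times the zero-padded N_F-point DFT computed with `numpy.fft.fft`; all function names are hypothetical.

```python
import numpy as np

def freq_steering(k, N, NF):
    """s_F(omega_k) with omega_k = (k - 1) / NF, k = 1..NF (unit norm)."""
    omega = (k - 1) / NF
    return np.exp(2j * np.pi * omega * np.arange(N)) / np.sqrt(N)

def ula_steering(theta_deg, M, spacing=0.5):
    """Assumed half-wavelength uniform linear array steering vector."""
    m = np.arange(M)
    return np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))

def space_freq_steering(theta_deg, k, M, N, NF):
    """s_bar(theta, omega_k) = s_F(omega_k) (Kronecker) s(theta)."""
    return np.kron(freq_steering(k, N, NF), ula_steering(theta_deg, M))

M, N, NF = 4, 8, 16
rng = np.random.default_rng(1)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # one source, N samples

# Frequency coefficients: sqrt(N)/NF times the zero-padded NF-point DFT.
a_bar = np.sqrt(N) / NF * np.fft.fft(x, n=NF)

# (9.2): the samples are recovered as a combination of frequency steering vectors.
x_rebuilt = sum(freq_steering(k, N, NF) * a_bar[k - 1] for k in range(1, NF + 1))

# (9.3), noiseless and single source: stacking y(n) = s(theta) x(n) over n
# gives x (Kronecker) s(theta), which must match the space-frequency expansion.
theta = 10.0
y_stacked = np.kron(x, ula_steering(theta, M))
y_model = sum(space_freq_steering(theta, k, M, N, NF) * a_bar[k - 1]
              for k in range(1, NF + 1))
```

The two reconstructions agree to numerical precision, which also confirms the √N/N_F scaling stated in the text.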
Based on (9.3), in the presence of $N_1$ data-windows each composed of $N$ snapshots, the
baseband discrete-time signal at the output of the receiving array for the h-th space–time
observation, $h = 1, \ldots, N_1$, can be expressed as
\[
\mathbf{y}_h = \sum_{m=1}^{K} \sum_{k=1}^{N_F} \bar{\mathbf{s}}(\bar{\theta}_m, \omega_k)\, \bar{a}_{m,k,h} + \mathbf{w}_h, \tag{9.4}
\]
where
• $\mathbf{w}_1, \ldots, \mathbf{w}_{N_1} \in \mathbb{C}^{NM}$ are the interference vectors affecting the different data-windows,
modeled as i.i.d. zero-mean circularly symmetric white Gaussian random
vectors, with covariance matrix $\sigma^2 \mathbf{I}$;
• $\bar{a}_{m,k,h}$ is proportional to the Fourier transform of the signal emitted by the m-th
source (during the h-th data-window), evaluated at the frequency $\omega_k$.
In order to proceed, let $\{\theta_i\}_{i=1}^{K_1}$ be a grid of angular locations that is assumed
fine enough such that the true positions of the existing sources lie on the grid, i.e.,
$\{\bar{\theta}_1, \ldots, \bar{\theta}_K\} \subseteq \{\theta_i\}_{i=1}^{K_1}$. Under this mild assumption³, the signal model (9.4) can be
recast as
\[
\mathbf{y}_h = \sum_{i=1}^{K_1} \sum_{k=1}^{N_F} \bar{\mathbf{s}}(\theta_i, \omega_k)\, a_{i,k,h} + \mathbf{w}_h, \tag{9.5}
\]
where $\bar{\mathbf{s}}(\theta_i, \omega_k)$, $i = 1, \ldots, K_1$, $k = 1, \ldots, N_F$, define the overall dictionary and $a_{i,k,h}$,
$i = 1, \ldots, K_1$, $k = 1, \ldots, N_F$, represent the overall space–frequency profile at the h-th
snapshot. Notice that the space–frequency profile in general depends on $h$, since
the signal transmitted by each emitter changes with the data-window, due to both data
modulation and channel condition state. However, if the space–frequency pair $(i^\star, k^\star)$
does not belong to the space–frequency support of the emitters at the $h^\star$-th snapshot (in
the following referred to as an inactive pair at the $h^\star$-th snapshot), then it is assumed that
$a_{i^\star,k^\star,h} = 0$ for all the snapshots. This is tantamount to requiring that both the angular
location and the spectral support of the emitters are stationary along the overall
data acquisition time. We write this formally as
\[
(i^\star, k^\star) \text{ is inactive at } h^\star \;\Rightarrow\; a_{i^\star,k^\star,h} = 0, \quad \forall h = 1, \ldots, N_1. \tag{9.6}
\]
Based on (9.4) and (9.5), 2-D spectrum sensing is tantamount to recovering the overall
space–frequency profile. Precisely speaking, the space–frequency occupancy map can
be obtained from the estimated profile starting from the set of angle-frequency bins
$(i^\star, k^\star)$ whose energy is different from zero, i.e.,
\[
\sum_{h=1}^{N_1} |a_{i^\star,k^\star,h}|^2 > 0.
\]
3 In the presence of mismatches, some refinements can be accomplished, for instance resorting to the
RELAX algorithm [25,29,30].
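The occupancy rule above amounts to summing the squared magnitudes of the profile over the data-windows and flagging the bins with nonzero energy. The sketch below illustrates this on a synthetic profile; the array layout (angle, frequency, data-window) and the optional `threshold` argument are assumptions made for illustration, since an estimated profile would rarely be exactly zero on inactive bins.

```python
import numpy as np

def occupancy_map(profile, threshold=0.0):
    """Space-frequency occupancy from a profile a[i, k, h].

    profile: complex array of shape (K1, NF, N1).
    Returns (active, energy), where energy[i, k] = sum_h |a[i, k, h]|^2 and
    active[i, k] is True when that energy exceeds `threshold`.
    """
    energy = np.sum(np.abs(profile) ** 2, axis=2)
    return energy > threshold, energy

K1, NF, N1 = 5, 6, 4
rng = np.random.default_rng(2)
profile = np.zeros((K1, NF, N1), dtype=complex)
# Hypothetical emitter at grid angle 1 occupying frequency bins 2 and 3.
profile[1, 2:4, :] = rng.standard_normal((2, N1)) + 1j * rng.standard_normal((2, N1))
# Hypothetical narrowband emitter at grid angle 3, frequency bin 0.
profile[3, 0, :] = rng.standard_normal(N1) + 1j * rng.standard_normal(N1)

active, energy = occupancy_map(profile)
```

With the exact synthetic profile the default threshold of zero reproduces the definition in the text; for estimated profiles a small positive threshold would be a natural, if heuristic, choice.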
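To shed light on the intrinsic structure of the considered signal model (9.5), let us
introduce the following compact vectorial representation
\[
\mathbf{y}_h = \mathbf{H} \mathbf{x}_h + \mathbf{w}_h, \quad h = 1, \ldots, N_1, \tag{9.7}
\]
where
• $\mathbf{x}_h = [a_{1,1,h}, \ldots, a_{1,N_F,h}, a_{2,1,h}, \ldots, a_{K_1,N_F,h}]^T \in \mathbb{C}^{K_1 N_F}$ is the vector containing
the space–frequency profile for the h-th snapshot;
• $\mathbf{H} \in \mathbb{C}^{NM \times K_1 N_F}$ is the model matrix defined as
\[
\mathbf{H} = [\bar{\mathbf{s}}(\theta_1, \omega_1), \ldots, \bar{\mathbf{s}}(\theta_2, \omega_1), \ldots, \bar{\mathbf{s}}(\theta_{K_1}, \omega_{N_F})] = [\mathbf{h}_1, \ldots, \mathbf{h}_{K_1 N_F}]. \tag{9.8}
\]
Hence, in the presence of multiple data-windows, the overall signal snapshots can be
cast as follows
\[
\mathbf{Y} = \mathbf{H} \mathbf{X} + \mathbf{W}, \tag{9.9}
\]
where
\[
\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_{N_1}] \in \mathbb{C}^{MN \times N_1}, \tag{9.10}
\]
\[
\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_{N_1}] \in \mathbb{C}^{K_1 N_F \times N_1}, \tag{9.11}
\]
\[
\mathbf{W} = [\mathbf{w}_1, \ldots, \mathbf{w}_{N_1}] \in \mathbb{C}^{MN \times N_1}. \tag{9.12}
\]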
According to (9.9), the received signal for each data-window is the superposition of
the weighted angular-frequency components associated with the different sources. Such a
configuration, usually referred to as the subspace signal model [31,32], is ever-present
in signal processing applications, including direction-of-arrival estimation and spectral
analysis, where the functional form of the steering vectors is application-dependent.
Regardless of the actual steering vector structure, several algorithms have been proposed
in the open literature to estimate the unknown model parameters [24,33–35]. In the
following section, some advanced recovery strategies exploiting the block-sparsity of the
unknown matrix X induced by condition (9.6) are presented.
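To make the compact model (9.7) to (9.9) concrete, the sketch below assembles a small dictionary H and generates Y = HX + W with a block-sparse (row-sparse) X. The uniform linear array, the grid, the problem sizes, and the chosen active rows are all illustrative assumptions; the column ordering follows (9.8), i.e., angle-major and frequency-minor.

```python
import numpy as np

def build_dictionary(theta_grid_deg, M, N, NF, spacing=0.5):
    """Model matrix H of (9.8) for an assumed half-wavelength ULA.

    Columns are s_F(omega_k) (Kronecker) s(theta_i), ordered as
    (theta_1, omega_1), ..., (theta_1, omega_NF), (theta_2, omega_1), ...
    """
    n = np.arange(N)
    m = np.arange(M)
    cols = []
    for theta in theta_grid_deg:
        s = np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta)))
        for k in range(NF):
            s_f = np.exp(2j * np.pi * (k / NF) * n) / np.sqrt(N)
            cols.append(np.kron(s_f, s))
    return np.column_stack(cols)  # shape (N*M, K1*NF)

M, N, NF, N1 = 4, 6, 8, 5
theta_grid = np.linspace(-60.0, 60.0, 7)
K1 = len(theta_grid)
H = build_dictionary(theta_grid, M, N, NF)

rng = np.random.default_rng(3)
X = np.zeros((K1 * NF, N1), dtype=complex)
# Row index of the pair (angle i, frequency k) is i*NF + k (0-based).
active_rows = [2 * NF + 1, 2 * NF + 2, 5 * NF + 6]
X[active_rows] = rng.standard_normal((3, N1)) + 1j * rng.standard_normal((3, N1))

sigma2 = 0.01
W = np.sqrt(sigma2 / 2) * (rng.standard_normal((N * M, N1))
                           + 1j * rng.standard_normal((N * M, N1)))
Y = H @ X + W  # (9.9)
```

With these sizes H has 24 rows and 56 columns, so the linear model is underdetermined and only the row-sparsity of X makes its recovery meaningful.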
Before concluding this section, some interesting remarks about the importance of 2-D
spectrum awareness for cognitive radars are now provided. Precisely:
• Cognitive radars equipped with multiple-input multiple-output (MIMO) capability
can realize adaptive beamforming at both the transmitter and the receiver
ends [36]. This is a relevant feature, as both the transmit and receive gains
appear in the numerator of the radar equation.
Hence, suppressing a sidelobe direction at both the transmitter and the receiver
could be much more effective for interference cancellation purposes than doing
this only at the receiver side. Besides, transmit beampattern shaping can be very
helpful to mitigate the interference induced by the radar on spectrally overlaid
communication systems. Now, in order to shape the beampattern on transmit,
exogenous information on the angular locations of jammers and/or coexisting
networks is needed to force the required angular constraints [37–41]. This
can be achieved via 2-D spectrum sensing, whose output can be used to shape the
transmit waveform in the frequency domain and to suppress desired sidelobe
directions. The net result is an improvement in the coexistence between radar
and overlaid communication systems. Furthermore, a space–time waveform
design accounting for specific space–frequency constraints can also be conceived
[42]. In this case, leveraging the information provided by the 2-D spectrum
sensing, the space–time transmit waveform can be synthesized so as to exhibit an
appropriate frequency behavior in specific angular directions.
• Another important motivation for 2-D spectrum sensing stems from the possibility
of exploiting the frequency and angle of a specific emitter to recover, with
a suitable beamformer and frequency filter, the signal of interest for additional
analysis, e.g., for classification purposes (continuous wave (CW) vs. pulsed,
modulation characteristics, cooperative vs. noncooperative emitters, and so on).
• Last but not least, the proposed 2-D approach paves the way to more advanced
forms of environmental awareness. Precisely, multiple sensors (with the related
2-D recovered maps) can be employed to acquire the knowledge of the spatial
coordinates and hence the power of emitters via triangulation on each active band-
width. By doing so, the coverage area of any emitter can be predicted and highly
appropriate space–frequency constraints can be forced so as to meet compatibility
requirements.
Finally, it is worth observing that, in the presence of multipath, virtual sources will
appear at the receiver side, i.e., K is larger than the actual number of emitters. In such a
case, under the nonrestrictive assumption⁴ that K < M, the 2-D spectrum sensing has
the potential to recover not only the line of sight (LOS) components but also the
virtual sources. Once these are recovered, specific postprocessing is also conceivable, aimed
at detecting the paths associated with the same source via simple cross-correlation
approaches or via advanced clustering after a specific feature extraction process.
9.3 2-D Radio Environmental Map Recovery Strategies
In this section, two adaptive signal processing techniques with the ability to recover
the space–frequency occupancy map via block-sparsity exploitation are described:
BIC-based IAA processing and BSLIM.
9.3.1 BIC-based IAA Processing

A sequential procedure that jointly exploits the profile estimate provided by the IAA
strategy [24,25] and the BIC framework [26] is adopted at the recovery stage to reliably
evaluate the space–frequency occupancy map [22]. The main idea is to select the minimum
number of space–frequency sources able to appropriately fit the available data by
means of a smart successive interference cancellation approach.
4 As already mentioned, modern phased array radars exploit several receive channels (often more than 500).
In order to proceed, let us introduce the IAA technique. It is a sequential
algorithm that evaluates the unknown 2-D profile by iteratively refining the estimates
provided by the conventional filter bank approach, which usually suffers from the so-called
leakage effect [25]. Focusing on the i-th steering vector, the idea is to consider all the other
contributions as interference. In this respect, assuming that the phases of $X_{l,h}$, $l \in
\{1, \ldots, i-1, i+1, \ldots, K_1 N_F\}$, $h = 1, \ldots, N_1$, are i.i.d. random variables uniformly
distributed over $[0, 2\pi]$⁵, the average (over the $N_1$ data-windows) covariance matrix
of the interference experienced by the source corresponding to the i-th steering vector is
\[
\mathbf{Q}_i = \sum_{\substack{l=1 \\ l \neq i}}^{K_1 N_F} P_l\, \mathbf{h}_l \mathbf{h}_l^H + \sigma^2 \mathbf{I}, \quad i = 1, \ldots, K_1 N_F, \tag{9.13}
\]
with $P_l = \frac{1}{N_1} \|\mathbf{X}_l\|_2^2$, $l = 1, \ldots, K_1 N_F$, where
\[
\mathbf{X}_k = [X_{k,1}, X_{k,2}, \ldots, X_{k,N_1}]^T \in \mathbb{C}^{N_1}
\]
is the k-th row of the matrix $\mathbf{X}$. In other words, $P_l$ is the average power (over the different
data-windows) of the l-th source. As shown in [24], the best linear unbiased estimator
of $\mathbf{X}_i$ (notice that the different data-windows are statistically uncorrelated) is given by
\[
\hat{\mathbf{X}}_i = \frac{\mathbf{h}_i^H \mathbf{Q}_i^{-1} \mathbf{Y}}{\mathbf{h}_i^H \mathbf{Q}_i^{-1} \mathbf{h}_i}, \quad i = 1, \ldots, K_1 N_F. \tag{9.14}
\]
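Now, letting
\[
\mathbf{R} = \sum_{l=1}^{K_1 N_F} P_l\, \mathbf{h}_l \mathbf{h}_l^H + \sigma^2 \mathbf{I}, \tag{9.15}
\]
it follows that
\[
\mathbf{Q}_i = \mathbf{R} - P_i\, \mathbf{h}_i \mathbf{h}_i^H. \tag{9.16}
\]
Therefore, applying the matrix inversion lemma [24,34] to (9.16),
\[
\mathbf{h}_i^H \mathbf{Q}_i^{-1} = \frac{\mathbf{h}_i^H \mathbf{R}^{-1}}{1 - P_i\, \mathbf{h}_i^H \mathbf{R}^{-1} \mathbf{h}_i}, \quad i = 1, \ldots, K_1 N_F, \tag{9.17}
\]
and, consequently, the estimator of $\mathbf{X}_i$ can be written as
\[
\hat{\mathbf{X}}_i = \frac{\mathbf{h}_i^H \mathbf{R}^{-1} \mathbf{Y}}{\mathbf{h}_i^H \mathbf{R}^{-1} \mathbf{h}_i}, \quad i = 1, \ldots, K_1 N_F. \tag{9.18}
\]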
5 This is in accordance with the key references [24,25,43], which lay the theoretical background of the
IAA approach. Nevertheless, at the analysis stage arbitrary signals, i.e., not necessarily characterized by
frequency bin weights with i.i.d. phases, are considered.
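The step from (9.14) to (9.18) can be checked numerically: the scalar factor introduced by the matrix inversion lemma in (9.17) cancels between numerator and denominator, so both estimators coincide. The sketch below uses a random dictionary, random powers, and small hypothetical sizes purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
D, L, N1 = 6, 10, 3  # D plays the role of N*M, L of K1*NF (hypothetical sizes)
H = rng.standard_normal((D, L)) + 1j * rng.standard_normal((D, L))
Y = rng.standard_normal((D, N1)) + 1j * rng.standard_normal((D, N1))
P = rng.uniform(0.1, 1.0, L)  # source powers P_l
sigma2 = 0.5

# (9.15): R = sum_l P_l h_l h_l^H + sigma^2 I (column l of H scaled by P_l).
R = (H * P) @ H.conj().T + sigma2 * np.eye(D)

i = 3
h = H[:, i]
# (9.16): interference covariance seen by the i-th steering vector.
Q = R - P[i] * np.outer(h, h.conj())

hQ = h.conj() @ np.linalg.inv(Q)  # row vector h_i^H Q_i^{-1}
hR = h.conj() @ np.linalg.inv(R)  # row vector h_i^H R^{-1}

# (9.17): matrix inversion lemma applied to Q_i = R - P_i h_i h_i^H.
lemma_rhs = hR / (1.0 - P[i] * (hR @ h))

# (9.14) and (9.18): the two forms of the estimator of X_i.
est_Q = (hQ @ Y) / (hQ @ h)
est_R = (hR @ Y) / (hR @ h)
```

The practical benefit is that a single matrix R (and a single inverse) serves all K1 NF steering vectors, instead of one Q_i per vector.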
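Algorithm 1 IAA for 2-D Spectrum Sensing
1: Initialization. Set $p = 0$ and $\hat{\mathbf{X}}_i^{(0)} = \frac{\mathbf{h}_i^H \mathbf{Y}}{\|\mathbf{h}_i\|_2^2}$, $i = 1, \ldots, K_1 N_F$.
2: repeat
3:   $p = p + 1$.
4:   $\hat{P}_i^{(p)} = \frac{1}{N_1} \|\hat{\mathbf{X}}_i^{(p-1)}\|_2^2$, $i = 1, \ldots, K_1 N_F$.
5:   $\hat{\mathbf{R}}^{(p)} = \sum_{i=1}^{K_1 N_F} \hat{P}_i^{(p)}\, \mathbf{h}_i \mathbf{h}_i^H + \sigma^2 \mathbf{I}$.
6:   $\hat{\mathbf{X}}_i^{(p)} = \frac{\mathbf{h}_i^H (\hat{\mathbf{R}}^{(p)})^{-1} \mathbf{Y}}{\mathbf{h}_i^H (\hat{\mathbf{R}}^{(p)})^{-1} \mathbf{h}_i}$, $i = 1, \ldots, K_1 N_F$.
7: until $\sum_{i=1}^{K_1 N_F} \left| \|\hat{\mathbf{X}}_i^{(p)}\|_2 - \|\hat{\mathbf{X}}_i^{(p-1)}\|_2 \right| \leq \bar{\epsilon}$ or $p = \bar{p}$.
8: Output. Estimated 2-D profile $\hat{\mathbf{X}}_i = \hat{\mathbf{X}}_i^{(p)}$, $i = 1, \ldots, K_1 N_F$.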
Now, since $\mathbf{R}$ depends on the unknowns $\|\mathbf{X}_i\|_2^2$, $i = 1, \ldots, K_1 N_F$, the IAA iteratively
replaces $P_i$ with the estimates $\hat{P}_i = \frac{1}{N_1} \|\hat{\mathbf{X}}_i\|_2^2$ in (9.15) and initializes the process with
the filter bank outputs [22]. Otherwise stated, at each step the IAA computes an estimate
$\hat{\mathbf{R}}$ of the matrix $\mathbf{R}$ by using in (9.15) the source contributions $\hat{\mathbf{X}}$ that were estimated at the previous
step according to (9.18). This procedure is iterated until a convergence condition is
reached, e.g., a maximum number of iterations $\bar{p}$ is performed, or
\[
\sum_{i=1}^{K_1 N_F} \left| \|\hat{\mathbf{X}}_i^{(p)}\|_2 - \|\hat{\mathbf{X}}_i^{(p-1)}\|_2 \right| \leq \bar{\epsilon},
\]
where $p > 0$ is the iteration step and $\bar{\epsilon}$ is the maximum tolerated distance between two successive
profile estimates. The IAA process is summarized in Algorithm 1.
Finally, the per-iteration computational complexity of the IAA [24,25] is $O(N_F K_1 N^2 M^2)$.
Some computationally efficient implementations can also be conceived; see
[43–46].
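A compact NumPy sketch of Algorithm 1 is given next. It is an illustration rather than the authors' implementation: the noise power σ² is assumed known, the function name `iaa`, the default iteration cap, and the tolerance are hypothetical, and the demo uses a generic oversampled Fourier dictionary instead of the space–frequency dictionary H of (9.8).

```python
import numpy as np

def iaa(Y, H, sigma2, max_iter=15, tol=1e-6):
    """Sketch of Algorithm 1. Y: (D, N1) data, H: (D, L) dictionary.

    Returns the estimated profile X_hat with shape (L, N1).
    """
    D, N1 = Y.shape
    # Step 1: filter-bank initialization, X_i = h_i^H Y / ||h_i||^2.
    X_hat = (H.conj().T @ Y) / np.sum(np.abs(H) ** 2, axis=0)[:, None]
    for _ in range(max_iter):
        # Step 4: average power of each candidate source over the data-windows.
        P = np.sum(np.abs(X_hat) ** 2, axis=1) / N1
        # Step 5: covariance estimate built from the current powers.
        R = (H * P) @ H.conj().T + sigma2 * np.eye(D)
        # Step 6: X_i = h_i^H R^{-1} Y / (h_i^H R^{-1} h_i), for all i at once.
        Rinv_Y = np.linalg.solve(R, Y)
        Rinv_H = np.linalg.solve(R, H)
        num = H.conj().T @ Rinv_Y                         # (L, N1)
        den = np.real(np.sum(H.conj() * Rinv_H, axis=0))  # (L,)
        X_new = num / den[:, None]
        # Step 7: stop when the row norms of successive estimates barely change.
        change = np.sum(np.abs(np.linalg.norm(X_new, axis=1)
                               - np.linalg.norm(X_hat, axis=1)))
        X_hat = X_new
        if change <= tol:
            break
    return X_hat

rng = np.random.default_rng(5)
D, L, N1 = 16, 32, 8
H = np.exp(2j * np.pi * np.outer(np.arange(D), np.arange(L)) / L)
X_true = np.zeros((L, N1), dtype=complex)
X_true[[5, 20]] = np.exp(2j * np.pi * rng.random((2, N1)))  # unit-power sources
sigma2 = 0.01
W = np.sqrt(sigma2 / 2) * (rng.standard_normal((D, N1))
                           + 1j * rng.standard_normal((D, N1)))
Y = H @ X_true + W

X_hat = iaa(Y, H, sigma2)
power = np.sum(np.abs(X_hat) ** 2, axis=1) / N1
strongest = set(np.argsort(power)[-2:].tolist())
```

Solving the two linear systems with `np.linalg.solve` avoids forming the inverse of R explicitly, which is usually preferable numerically.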
Let us now promote block sparsity in the recovery process jointly using the profile
estimate obtained by Algorithm 1 and the BIC framework. To this end, let us denote by
• K̄ an upper bound to the actual number of space–frequency sources (resulting

from some upper bounds on the number of sources K and their frequency
support);
• I(k) = {h̃
1,. . ., h̃
k−1 } the set of the k − 1 space–frequency sources chosen up to
the iteration k − 1;
•
⎛4 4 2⎞
4 4
4 4
⎜4 4 ⎟
⎜ 4
BICk (h̃) = N MN1 log ⎝4Y − hi X̂ i 4 ⎟
4 ⎠
4 5 6 4 (9.19)
4 i∈ I (k)∪{h̃} 4
2
+ 4k log (2N MN1 ) ,
where N MN1 is the size of the observation data, i.e., the product between the
number of available snapshots, antennas, and data-windows, and the factor 4 in
the second term of (9.19) accounts for the number of unknowns for each source,
i.e., its complex valued amplitude, angle, and frequency.
Then, at step k, 1 ≤ k ≤ K̄, the space–frequency source with index
\[ \tilde{h}^\star_k = \arg\min_{\tilde{h} \notin I(k)} \mathrm{BIC}_k(\tilde{h}) \]
is selected as a new profile entry, namely $\tilde{h}^\star_k$ is included in the index set defining the
updated space–frequency profile ($I(k+1) = I(k) \cup \{\tilde{h}^\star_k\}$).
As to the procedure initialization, I(1) = ∅, i.e., at step k = 1 the set of source indices
is empty. In a nutshell, at the k-th iteration of the algorithm a new source is
selected such that the updated profile minimizes (9.19). This procedure is repeated for
k = 1, . . ., K̄. Hence, denoting by
\[ k^\star = \arg\min_{k \in \{1,\ldots,\bar{K}\}} \mathrm{BIC}_k(\tilde{h}^\star_k), \]
the profile recovery is obtained from $I(k^\star)$. In particular,
\[ \hat{X}_i^{\mathrm{BIC}} = \begin{cases} \hat{X}_i & \text{if } i \in I(k^\star) \\ 0 & \text{otherwise} \end{cases}. \tag{9.20} \]
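It is worth pointing out that the procedure requires the storage of the two K̄-dimensional
real-valued vectors (ordered sets)
\[ [\tilde{h}^\star_1, \tilde{h}^\star_2, \ldots, \tilde{h}^\star_{\bar{K}}], \]
\[ [\mathrm{BIC}_1(\tilde{h}^\star_1), \mathrm{BIC}_2(\tilde{h}^\star_2), \ldots, \mathrm{BIC}_{\bar{K}}(\tilde{h}^\star_{\bar{K}})]. \]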
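The greedy selection just described can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bic_greedy_selection`, the array layout (one dictionary column and one profile row per space–frequency entry), the running-residual bookkeeping, and the choice of keeping the first k⋆ selected sources are all introduced here for illustration.

```python
import numpy as np

def bic_greedy_selection(H, X_hat, Y, K_bar):
    """Greedy source selection driven by the BIC rule (9.19).

    H     : (MN, K1*NF) dictionary, column i is h_i (assumed layout)
    X_hat : (K1*NF, N1) profile estimate, e.g., from IAA; row i is X_hat_i
    Y     : (MN, N1) observations
    K_bar : upper bound on the number of space-frequency sources
    """
    size = Y.size                      # N*M*N1, the size of the observation data
    K_bar = min(K_bar, H.shape[1])
    residual = Y.astype(complex)       # Y minus the contributions selected so far
    remaining = list(range(H.shape[1]))
    selected, bic_path = [], []

    for k in range(1, K_bar + 1):
        penalty = 4 * k * np.log(2 * size)
        best_idx, best_bic = None, np.inf
        for i in remaining:
            # Residual if candidate i were added to the current index set.
            r = residual - np.outer(H[:, i], X_hat[i])
            bic = size * np.log(np.linalg.norm(r) ** 2) + penalty
            if bic < best_bic:
                best_idx, best_bic = i, bic
        selected.append(best_idx)
        bic_path.append(best_bic)
        remaining.remove(best_idx)
        residual = residual - np.outer(H[:, best_idx], X_hat[best_idx])

    # Model order minimizing the stored BIC values, then rule (9.20).
    k_star = int(np.argmin(bic_path)) + 1
    support = sorted(selected[:k_star])
    X_bic = np.zeros_like(X_hat)
    X_bic[support] = X_hat[support]
    return support, X_bic, bic_path
```

The two lists `selected` and `bic_path` play the role of the two K̄-dimensional ordered sets mentioned above; only they need to be stored to pick the final model order.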
9.3.2 BSLIM Approach

In this subsection, a space–frequency profile recovery based on the RML estimation
paradigm is presented [27]. Precisely, the following regularized minimization problem
for block-sparse signal reconstruction is considered

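\[ \mathcal{P}: \quad \begin{cases} \min_{X,\sigma^2} & N M N_1 \log(\sigma^2) + \frac{1}{\sigma^2}\|HX - Y\|_2^2 + f_1(X) \\ \text{s.t.} & \sigma_L^2 \le \sigma^2 \le \sigma_U^2 \end{cases} \tag{9.21} \]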
where
\[ f_1(X) = \frac{2}{q} \sum_{k=1}^{K_1 N_F} \left[ \left( \|X_k\|_2^2 + \epsilon \right)^{q/2} - 1 \right] \tag{9.22} \]
is the block-sparsity promoting penalty term, with
\[ X_k = [X_{k,1}, X_{k,2}, \ldots, X_{k,N_1}]^T \in \mathbb{C}^{N_1} \]
the k-th row of the matrix X and ε > 0 a smoothing factor making (9.22) differentiable.
In (9.21), $\sigma_L^2$ and $\sigma_U^2$ are, respectively, a lower bound and an upper bound for the power
level of the white interference. $\sigma_L^2$ can be evaluated by characterizing the power level
associated with the isolated operation of the receiver components, whereas $\sigma_U^2$ can be
obtained via measurements in stressing conditions (for instance, in terms of device operating
temperatures) and accounting for a conservative confidence level on the estimate. Remarkably, when
q = 1 and ε = 0, (9.22) boils down to
\[ 2 \sum_{k=1}^{K_1 N_F} \|X_k\|_2 - 2 K_1 N_F, \]
which is equivalent to the objective function of the mixed ℓ2/ℓ1-optimization
program (L-OPT) that was proposed in [47–49] to develop a block-sparsity recovery
algorithm for noiseless measurements. Furthermore, if the noise power level is
fixed, (9.21) becomes the group version of the basis pursuit denoising algorithm
presented in [50] to perform the recovery of block-sparse signals in the presence
of noisy data. It is also worth pointing out that the regularized minimization
problem P extends the recovery approach developed in [28] to a block-sparsity
situation. Additionally, unlike [50,51], (9.21) accounts for an unknown noise power
level at the recovery stage, providing a more general framework than the regularized
version of the unconstrained smoothed ℓ2/ℓp minimization approach proposed in
[50,51].
Problem P is a nonconvex optimization problem (the objective is a nonconvex
function) and the framework proposed in [52,53] is exploited to systematically solve it and
obtain high-quality solutions to the formulated block-sparse recovery problem.
Specifically, two independent variable blocks are considered: the former is the noise variance
σ² while the latter is the space–frequency profile X. Thus, denoting by
$g(X,\sigma^2) = N M N_1 \log(\sigma^2) + \frac{1}{\sigma^2}\|HX - Y\|_2^2 + f_1(X)$, the procedure developed in [27] and
summarized in Algorithm 2 can be used to solve P.
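Algorithm 2 BSLIM for 2-D Spectrum Sensing

1: Input. $\sigma_L^2$, $\sigma_U^2$, $\epsilon > 0$, $\delta > 0$, and $q \in ]0,1]$.
2: Initialization. Set n = 0, $(\sigma^2)^{(0)} = \sigma_L^2$, and $X^{(0)} = \mathrm{diag}(\|h_1\|^2, \ldots, \|h_{K_1 N_F}\|^2)^{-1} H^H Y$.
3: repeat
4: n = n + 1.
5: $D_1 = \mathrm{diag}(\bar{d}_1, \ldots, \bar{d}_{K_1 N_F})$, with $\bar{d}_i = (\|X_i^{(n-1)}\|_2^2 + \epsilon)^{1 - q/2}$, $i = 1, \ldots, K_1 N_F$.
6: $X^{(n)} = D_1 H^H (H D_1 H^H + (\sigma^2)^{(n-1)} I)^{-1} Y$.
7: $(\sigma^2)^{(n)} = \min(\max(\sigma_L^2, \hat{\sigma}^2), \sigma_U^2)$, with $\hat{\sigma}^2 = \frac{1}{N M N_1} \|H X^{(n)} - Y\|_2^2$.
8: until $g(X^{(n-1)}, (\sigma^2)^{(n-1)}) - g(X^{(n)}, (\sigma^2)^{(n)}) \le \delta$.
9: Output. Estimated 2-D profile $\hat{X} = X^{(n)}$.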
Remarkably, as shown in [27],
• the sequence of points $(X^{(n)}, (\sigma^2)^{(n)})$ generated by Algorithm 2 decreases the
objective function in P;
• any cluster point of the produced sequence $(X^{(n)}, (\sigma^2)^{(n)})$ is a Karush–Kuhn–Tucker
(KKT) point to P.
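The iterations of Algorithm 2 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions rather than the reference implementation of [27]: the function name `bslim`, the `max_iter` safeguard, and the default parameter values are introduced here, the initialization is taken to normalize each matched-filter output by the squared column norm, and the loop is assumed to stop once the decrease of the objective g falls to δ or below.

```python
import numpy as np

def bslim(H, Y, sigma2_L, sigma2_U, q=1.0, eps=1e-6, delta=1e-1, max_iter=200):
    """Sketch of the BSLIM iterations (Algorithm 2).

    H : (MN, K1*NF) dictionary, Y : (MN, N1) observations.
    Returns the estimated profile X (K1*NF, N1) and the noise power estimate.
    """
    MN, N1 = Y.shape
    size = MN * N1                                    # N*M*N1

    def objective(X, s2):
        # g(X, sigma^2) = N*M*N1*log(sigma^2) + ||HX - Y||^2 / sigma^2 + f1(X)
        row_energy = np.sum(np.abs(X) ** 2, axis=1)
        f1 = (2.0 / q) * np.sum((row_energy + eps) ** (q / 2.0) - 1.0)
        return size * np.log(s2) + np.linalg.norm(H @ X - Y) ** 2 / s2 + f1

    # Step 2: normalized matched-filter initialization and sigma^2 = sigma_L^2.
    col_energy = np.sum(np.abs(H) ** 2, axis=0)
    X = (H.conj().T @ Y) / col_energy[:, None]
    s2 = sigma2_L
    g_old = objective(X, s2)

    for _ in range(max_iter):                         # max_iter is an added safeguard
        # Step 5: diagonal of D_1 from the previous profile estimate.
        d = (np.sum(np.abs(X) ** 2, axis=1) + eps) ** (1.0 - q / 2.0)
        HD = H * d                                    # equals H @ diag(d)
        # Step 6: X = D_1 H^H (H D_1 H^H + sigma^2 I)^{-1} Y (d is real).
        G = HD @ H.conj().T + s2 * np.eye(MN)
        X = HD.conj().T @ np.linalg.solve(G, Y)
        # Step 7: noise power update clipped to [sigma_L^2, sigma_U^2].
        s2_hat = np.linalg.norm(H @ X - Y) ** 2 / size
        s2 = min(max(sigma2_L, s2_hat), sigma2_U)
        # Step 8: stop when the objective decrease is no larger than delta.
        g_new = objective(X, s2)
        if g_old - g_new <= delta:
            break
        g_old = g_new
    return X, s2
```

Solving the linear system with `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice of this sketch; it leaves the update of step 6 unchanged.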
To gain more insight into the effectiveness of the proposed recovery approach, let us
observe that when q → 0, (9.22) converges to
\[ \sum_{k=1}^{K_1 N_F} \log\left( \|X_k\|_2^2 + \epsilon \right). \tag{9.23} \]
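The limit can be checked term by term. The following short derivation is a sketch that uses only the definition (9.22); the shorthand $a_k = \|X_k\|_2^2 + \epsilon > 0$ is introduced here for convenience:

\[
\frac{2}{q}\left(a_k^{q/2} - 1\right)
= \frac{2}{q}\left(e^{\frac{q}{2}\log a_k} - 1\right)
= \log a_k + \frac{q}{4}\log^2 a_k + O(q^2)
\;\xrightarrow[q \to 0]{}\; \log a_k ,
\]

so each summand of (9.22) tends to the logarithm of the smoothed energy of the corresponding row.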
As a consequence, the penalty term (9.23) promotes block-sparsity in the profile
recovery, since small values of $\|X_k\|_2$ lead to very low values of the objective to
minimize. Otherwise stated, (9.21) pushes for multiple null rows in the recovered
matrix X. Before proceeding, it is worth noting that the BSLIM approach could
have also been framed in the context of Bayesian estimation. Specifically, the solution
to (9.21) defines the maximum a posteriori (MAP) estimate of X, σ² for the following
Bayesian model
\[ y_i | X, \sigma^2 \sim \mathcal{N}(H x_i, \sigma^2 I), \quad i = 1, \ldots, N_1, \]
with $y_i | X, \sigma^2$, $i = 1, \ldots, N_1$, statistically independent random vectors,
\[ \sigma^2 \sim \mathcal{U}(\sigma_L^2, \sigma_U^2), \]
\[ f_X(X) \propto \exp\left\{ -\frac{2}{q} \sum_{k=1}^{K_1 N_F} \left( \|X_k\|_2^2 + \epsilon \right)^{q/2} \right\}, \tag{9.24} \]
where σ² and X are statistically independent quantities, whereas $f_X(X)$ is a
block-sparsity promoting prior for X. Indeed, the MAP estimate of X and σ² is given by
the optimal solution to the following optimization problem
\[ \max_{X,\sigma^2} f_{Y|X,\sigma^2}(Y|X,\sigma^2)\, f_X(X)\, f_{\sigma^2}(\sigma^2), \tag{9.25} \]
which is equivalent to P (it is enough to consider the negative logarithm of the objective
in (9.25)). Notice that, as $\sigma_U^2 \to \infty$, a so-called improper prior is assumed for σ² at the
recovery stage [54,55], tantamount to assuming that σ² has equal probability over the
range $[\sigma_L^2, \infty[$.
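The equivalence can be sketched explicitly. Assuming that the Gaussian model above is circularly symmetric complex with MN-dimensional observations $y_i$ (an assumption made here, consistent with the factor $N M N_1$ in (9.21)), and denoting by $c_1$ the logarithm of the normalization constant of the prior (a symbol introduced only for this sketch), the negative logarithms of the three factors in (9.25) read

\[
-\log f_{Y|X,\sigma^2}(Y|X,\sigma^2) = N M N_1 \log(\pi \sigma^2) + \frac{1}{\sigma^2}\|HX - Y\|_2^2 ,
\]
\[
-\log f_X(X) = \frac{2}{q}\sum_{k=1}^{K_1 N_F}\left(\|X_k\|_2^2 + \epsilon\right)^{q/2} + c_1 = f_1(X) + \frac{2 K_1 N_F}{q} + c_1 ,
\]
\[
-\log f_{\sigma^2}(\sigma^2) =
\begin{cases}
\log(\sigma_U^2 - \sigma_L^2) & \text{if } \sigma_L^2 \le \sigma^2 \le \sigma_U^2 \\
+\infty & \text{otherwise.}
\end{cases}
\]

Summing the three terms gives the objective of P up to additive constants, while the infinite cost outside $[\sigma_L^2, \sigma_U^2]$ reproduces the constraint on σ².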
Adaptive Selection of the Parameter q

To make the developed BSLIM algorithm user parameter–free, an adaptive computation
of q is reported (the approach can be easily extended to account also for the smoothing
factor ε > 0). Precisely, inspired by [28], a procedure that jointly exploits the profile
estimates provided by the BSLIM strategy for different values of q and the BIC
framework [24,26] is described.
Let BSLIM be run with q = q̄ and let
• $R_{\bar{q}}$ be the set of selected active row indices (see equation (9.27) and the related
description for more details on its evaluation);
• $h(\bar{q}) = |R_{\bar{q}}|$, namely the number of selected active rows;
• $\bar{X}^{\bar{q}}$ be the least-squares estimate of X associated with the selected active rows
$R_{\bar{q}}$.
Hence, denoting by $I_q \subseteq ]0,1]$ the discrete set of the considered q̄ values, the overall
space–frequency profile is recovered as $\hat{X}^{\mathrm{BSLIM}} = \bar{X}^{q^\star}$, where
\[ q^\star = \arg\min_{q \in I_q} \mathrm{BIC}(q), \]
with the BIC-based objective function BIC(q), $q \in I_q$, defined as
\[ \mathrm{BIC}(q) = 2 N M N_1 \log\left( \|H \bar{X}^{q} - Y\|_2^2 \right) + (2N_1 + 2)\, h(q) \log(2 N M N_1). \tag{9.26} \]
In equation (9.26), N MN1 is the size of the observation data, i.e., the product between
the number of available snapshots, antennas, and data-windows, while the factor
(2N1 + 2) in the second term of (9.26) represents the number of unknowns for each
source, i.e., the N1 complex valued amplitudes, the angle, and the frequency.
As to the evaluation of $R_{\bar{q}}$, again a BIC-based strategy is employed. Let $\hat{X}^{\bar{q}}$ be the
profile recovered by Algorithm 2 with q = q̄, $\hat{X}_o^{\bar{q}}$ be the matrix obtained from $\hat{X}^{\bar{q}}$ by sorting
its rows so that $\|\hat{X}_{o,1}^{\bar{q}}\|_2^2 \ge \|\hat{X}_{o,2}^{\bar{q}}\|_2^2 \ge \ldots \ge \|\hat{X}_{o,K_1 N_F}^{\bar{q}}\|_2^2$, namely, the per-row energy of
$\hat{X}_o^{\bar{q}}$ is arranged in decreasing order, and $r_o^{\bar{q}} = [r_o^{\bar{q}}(1), r_o^{\bar{q}}(2), \ldots, r_o^{\bar{q}}(K_1 N_F)]^T \in \mathbb{N}^{K_1 N_F}$
be the vector containing the corresponding ordered row indices, i.e., $\hat{X}_{o,i}^{\bar{q}} = \hat{X}^{\bar{q}}_{r_o^{\bar{q}}(i)}$,
$i = 1, \ldots, K_1 N_F$. Then
\[ R_{\bar{q}} = \left\{ r_o^{\bar{q}}(1), \ldots, r_o^{\bar{q}}(k^\star_{\bar{q}}) \right\}, \tag{9.27} \]
where
\[ k^\star_{\bar{q}} = \arg\min_{k \in \{1,\ldots,\bar{K}\}} \mathrm{BIC}_{\bar{q}}(k), \]
with
\[ \mathrm{BIC}_{\bar{q}}(k) = 2 N M N_1 \log\left( \|H_o^{\bar{q},k} \hat{X}_o^{\bar{q},k} - Y\|_2^2 \right) + (2N_1 + 2)\, k \log(2 N M N_1), \quad k = 1, \ldots, \bar{K}. \tag{9.28} \]
In (9.28)
• $H_o^{\bar{q},k} \in \mathbb{C}^{MN \times k}$ is the matrix containing the first k columns of $H_o^{\bar{q}}$, with $H_o^{\bar{q}}$ the
matrix obtained from H by sorting its columns according to the permutation induced
by the vector $r_o^{\bar{q}}$;
• $\hat{X}_o^{\bar{q},k}$ is the matrix containing the first k rows of $\hat{X}_o^{\bar{q}}$;
• K̄ is an upper bound to the actual number of space–frequency sources (resulting
from some upper bounds on the number of sources K and their frequency
support);
• the second term of (9.28) represents the BIC-based penalty.
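The adaptive selection of q can be sketched in Python with NumPy as follows. The sketch is illustrative only: the function names `select_active_rows`, `bic_for_q`, and `select_q` are hypothetical, the BSLIM profiles are assumed to be precomputed (one per candidate q) and passed in as a dictionary, and the least-squares step uses `np.linalg.lstsq` on the selected dictionary columns.

```python
import numpy as np

def select_active_rows(H, X_hat, Y, K_bar):
    """Active rows R_q from (9.27)-(9.28) for one BSLIM profile X_hat."""
    size, N1 = Y.size, Y.shape[1]
    # Row indices sorted by decreasing per-row energy (the vector r_o).
    order = np.argsort(-np.sum(np.abs(X_hat) ** 2, axis=1), kind="stable")
    K_bar = min(K_bar, order.size)
    bic = np.empty(K_bar)
    for k in range(1, K_bar + 1):
        idx = order[:k]
        resid = H[:, idx] @ X_hat[idx] - Y
        bic[k - 1] = (2 * size * np.log(np.linalg.norm(resid) ** 2)
                      + (2 * N1 + 2) * k * np.log(2 * size))
    k_star = int(np.argmin(bic)) + 1
    return np.sort(order[:k_star]), bic

def bic_for_q(H, Y, rows):
    """BIC(q) of (9.26), using the least-squares estimate on the active rows."""
    size, N1 = Y.size, Y.shape[1]
    X_ls, *_ = np.linalg.lstsq(H[:, rows], Y, rcond=None)
    resid = H[:, rows] @ X_ls - Y
    bic = (2 * size * np.log(np.linalg.norm(resid) ** 2)
           + (2 * N1 + 2) * len(rows) * np.log(2 * size))
    return bic, X_ls

def select_q(H, Y, profiles, K_bar):
    """Pick q minimizing BIC(q); `profiles` maps each q to its BSLIM output."""
    best = None
    for q, X_hat in profiles.items():
        rows, _ = select_active_rows(H, X_hat, Y, K_bar)
        bic, X_ls = bic_for_q(H, Y, rows)
        if best is None or bic < best[0]:
            best = (bic, q, rows, X_ls)
    _, q_star, rows, X_ls = best
    X_out = np.zeros((H.shape[1], Y.shape[1]), dtype=complex)
    X_out[rows] = X_ls                 # zero rows outside the selected support
    return q_star, X_out
```

In this sketch the same BIC penalty weight, (2N1 + 2) per active row, is used both for choosing the number of active rows and for comparing the candidate values of q, mirroring (9.26) and (9.28).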
9.4 Performance Analyses
In this section, the capability to perform 2-D spectrum sensing of both BSLIM and
BIC-based IAA processing algorithms is assessed on both simulated and measured data
scenarios.
9.4.1 Simulated Data

A spectrum-sensing bandwidth of B = 500 MHz around the carrier frequency f0 =
2.4 GHz is considered. Moreover, the sensor is equipped with a uniform linear array
of M = 10 antennas where the spacing between the antennas is d = λ 0 /2 (λ0 the
operating wavelength) and σ2 = 1.
In the analyzed scenario K = 4 emitters with θ̄1 = π/(10 K1), θ̄2 = π/40, θ̄3 =
π/5, and θ̄4 = −π/10 are present. The first three emitters are communication sources
operating on $[-0.1B - \frac{1+\beta}{2T_s}, -0.1B + \frac{1+\beta}{2T_s}]$, $[0.2B - \frac{1+\beta}{2T_s}, 0.2B + \frac{1+\beta}{2T_s}]$,
and $[-\frac{1+\beta}{2T_s}, \frac{1+\beta}{2T_s}]$, respectively, with β = 0.5 and $T_s = 10/B$.
The fourth emitter is a jammer radiating a zero-mean circularly symmetric Gaussian
signal with a flat spectrum over [−0.1B, 0.1B]. As to the communication sources, they
transmit data via a quadrature phase shift keying (QPSK) modulation employing a
root-raised-cosine filter with roll-off parameter β and symbol period $T_s$ as reference pulse.
Finally, the signal-to-noise ratio (SNR) of the different emitters is 10 dB.
In the following, K1 = 40 uniformly spaced discrete angles over the interval
[−π/2,π/2] are considered. N1 = 10 independent data-windows are processed each
with N = 100 snapshots and NF = 2N , i.e., twice the frequency resolution induced
by the number of available samples. Figure 9.2 displays the nominal space–frequency
profile of the analyzed scenario, obtained evaluating for each angle bin the average,
over the N1 data-windows, energy spectral density (ESD) of the signal received on
the considered angle direction, normalized to the maximum value. The plot clearly
highlights the space–frequency portions occupied by the K emitters.
Figure 9.2 Space–frequency profile of the analyzed scenario (horizontal axis: f (Hz), from −2.5 × 10^8 to 2.5 × 10^8; vertical axis: θ (deg), from −90 to 90). The gray scale is proportional to the
energy in each space–frequency bin.
Figure 9.3 Space–frequency occupancy map, first trial: (a) BSLIM; (b) BIC-based IAA
processing (same axes as Figure 9.2). The detected bins correspond to the hotter pixels.
Figures 9.3 and 9.4 show the space–frequency occupancy maps recovered via the
BSLIM technique with adaptive selection of q and the BIC-based IAA processing for
two independent trials. In Algorithm 1, the exit condition based on the maximum
number of iterations p̄ = 4 is considered and the BIC-based IAA processing assumes
K̄ = 180. In Algorithm 2, $\delta = 10^{-1}$, $\epsilon = 10^{-6}$, $\sigma_L^2 = 1$, and $\sigma_U^2 = 10$. Also, the
adaptive selection of q supposes K̄ = 180 and $I_q$ = {0.01, 0.12, 0.23, 0.34, 0.45, 0.56,
0.67, 0.78, 0.89, 1}.
The obtained maps clearly show that both BSLIM and BIC-based IAA are able to
provide almost exact recovery, accurately localizing the sources in both angle and frequency
domains. However, BSLIM exhibits fewer false alarms than its counterpart (there is no
angular dithering). More importantly, BSLIM presents a lower computational complexity
than the BIC-based IAA processing. Finally, due to the dictionary redundancy, both
approaches detect just a subset of frequency bins for each emitter. Indeed, the considered
temporal directions are linearly dependent vectors (due to the frequency oversampling)
and both BSLIM and BIC-based IAA processing are devised to automatically pick up
an essential subset of temporal signatures to reconstruct the signal. Otherwise stated, the
frequency representation of the signals is not unique and the results highlight the ability
of the considered approaches to retrieve a sparse signal description.
Figure 9.4 Space–frequency occupancy map, second trial: (a) BSLIM; (b) BIC-based IAA
processing (horizontal axis: f (Hz); vertical axis: θ (deg)). The detected bins correspond to the hotter pixels.
To gain further insight into the performance of the considered recovery strategies,
Table 9.1 reports the empirical false alarm rate, PFA, and the empirical detection probability,
PD, assuming the same simulation setup as in Figure 9.2. Specifically,
for both BSLIM and BIC-based IAA techniques, a moving average filtering (with 2
equal weights) along the frequency dimension is performed over the recovered profile
(containing in each space–frequency bin the estimated average energy) so as to handle
the intrinsic on-off behavior of the map. Hence, PFA is evaluated counting the detections
(over 20 independent trials, each composed of N1 = 10 independent data-windows⁶) in
the space–frequency bins where no emitters are present. Analogously, PD is obtained
counting the number of detections, over the same 20 independent trials, where
space–frequency sources are located. Inspection of the table confirms the previous considerations
and highlights that BSLIM can provide, for the analyzed situations, better performance
than BIC-based IAA.
Table 9.1 Empirical false alarm rate, PFA, and empirical detection
probability, PD, for BSLIM and BIC-based IAA processing for the
sensing scenario of Figure 9.2.
Recovery Algorithm          PFA              PD
BSLIM                       5.0852 × 10⁻⁵    0.9812
BIC-based IAA processing    0.0027           0.9197
Figure 9.5 Empirical ROC for the filter bank approach (solid curve), IAA algorithm (dashed
curve), BSLIM q = 1 (dashed-dotted curve), and BSLIM q = 0.45 (dotted curve).
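The post-processing behind these empirical rates can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the map is assumed to be stored with angle along the first axis and frequency along the second, the first frequency bin is averaged with itself (an edge-handling choice made here), and a bin is declared detected whenever a boolean detection mask says so.

```python
import numpy as np

def smooth_along_frequency(profile):
    """Two-tap, equal-weight moving average along the frequency axis (axis 1)."""
    # Repeat the first bin so that the output keeps the input shape.
    padded = np.concatenate([profile[:, :1], profile], axis=1)
    return 0.5 * (padded[:, :-1] + padded[:, 1:])

def empirical_rates(detections, truth):
    """Empirical PFA and PD from boolean detection and ground-truth masks."""
    detections = np.asarray(detections, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    pfa = np.count_nonzero(detections & ~truth) / np.count_nonzero(~truth)
    pd = np.count_nonzero(detections & truth) / np.count_nonzero(truth)
    return pfa, pd
```

To pool several independent trials, the detection and ground-truth masks of all trials can be stacked along a new leading axis before calling `empirical_rates`, since the counts are taken over all array entries.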
Finally, in Figure 9.5 the empirical receiver operating characteristic (ROC) curves of
the filter bank approach, IAA processing, and BSLIM (with q equal to either 0.45 or 1)
are reported, assuming the same simulation setup as in Figure 9.2, but with SNR = 5 dB.
Specifically, for each approach, the threshold is set so as to ensure the desired empirical
PFA over 20 independent trials. Hence, PD is evaluated comparing the actual threshold
with the final estimated map. The latter is obtained performing a noncoherent energy
integration of the N1 estimated space–frequency profiles, i.e., for any space–frequency
bin the mean of the square modulus for the available N1 estimates is evaluated regardless
of the approach. In addition, BSLIM employs a moving average filtering (with 2 equal
weights) along the frequency dimension for both values of q.
The results clearly corroborate the superiority of BSLIM and IAA over the filter bank
approach, especially for low values of the false alarm rate. This result reflects the leakage
effect suffered by the filter bank. Finally, BSLIM uniformly outperforms IAA when
q = 1.
6 This setup corresponds to 157,320 emitter-free space–frequency bins and 2,080 bins occupied
by RF sources, where some guard cells for the communication transmitters have been included (the 3-dB
bandwidth measure is considered for each emitter).
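One point of such an empirical ROC can be sketched in Python with NumPy as follows. The function names and the tie-free thresholding rule are assumptions of this sketch: the threshold is placed on the sorted energies of the emitter-free bins so that the requested fraction of them exceeds it, and PD is then read off the occupied bins.

```python
import numpy as np

def noncoherent_map(profiles):
    """Mean of the square modulus over the N1 estimates (first axis)."""
    return np.mean(np.abs(profiles) ** 2, axis=0)

def roc_point(energy_map, truth, pfa_target):
    """Threshold giving the desired empirical PFA, with the resulting PFA and PD."""
    energy_map = np.asarray(energy_map, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    noise_vals = np.sort(energy_map[~truth])[::-1]       # emitter-free bins, descending
    n_fa = int(np.floor(pfa_target * noise_vals.size))   # allowed false alarms
    # With distinct values, exactly n_fa emitter-free bins exceed this threshold.
    thr = noise_vals[n_fa] if n_fa < noise_vals.size else -np.inf
    pfa = np.mean(energy_map[~truth] > thr)
    pd = np.mean(energy_map[truth] > thr)
    return thr, pfa, pd
```

Sweeping `pfa_target` over a grid of values and collecting the returned pairs traces the empirical curve; the maps of several trials can be concatenated beforehand so that the threshold is set over all of them.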
9.4.2 Measured Dataset

In this subsection, the performance of the developed algorithms on measured data is
assessed via the SDR device “RTL-SDR R820T2 RTL2832U 1PPM TCXO.” It works
over the frequency range 24–1766 MHz and allows sensing of an arbitrary bandwidth
within that interval. The discrete-time signal is obtained by first down-converting
the received continuous-time signal to the intermediate frequency (IF) of
3.57 MHz and sampling it at 28 MHz. Then the baseband discrete-time
signal at the Nyquist rate for the sensing bandwidth of interest is collected via digital
processing. Additional information about this device is available in [56].
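The digital step from IF samples to a baseband signal can be sketched in Python with NumPy as follows. This is a generic digital down-conversion under stated assumptions, not a description of the processing actually performed inside the device: the IF samples are assumed real-valued, the function name `to_baseband`, the Hamming-windowed sinc low-pass filter, and the number of taps are choices made here for illustration.

```python
import numpy as np

def to_baseband(x_if, fs, f_if, decim, num_taps=129):
    """Mix IF samples to baseband, low-pass filter, and decimate by `decim`."""
    n = np.arange(x_if.size)
    # Complex mixing moves the component at +f_if to 0 Hz.
    mixed = x_if * np.exp(-2j * np.pi * f_if * n / fs)
    # Windowed-sinc low-pass with cutoff fs / (2 * decim), i.e., the
    # Nyquist frequency of the decimated signal.
    m = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(m / decim) * np.hamming(num_taps)
    h /= h.sum()                                  # unit gain at 0 Hz
    filtered = np.convolve(mixed, h, mode="same")
    return filtered[::decim]
```

For instance, with fs = 28e6, f_if = 3.57e6, and decim = 14, the output rate is 2 complex Mega-samples per second.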
The experimental setup herein considered is illustrated in Figure 9.6 where a log-
periodic antenna is connected to the SDR device to sense the environment. In this
context, no angle discrimination can be performed among the RF sources since just one
spatial channel is available at the receiver. Otherwise stated, the 2-D spectrum sensing
reduces to the conventional 1-D problem where the goal is to establish which frequency
bins are occupied by RF emitters. Hence, in the following, the radio environment state
characterization is described via the frequency occupancy map associated with a refer-
ence angle, i.e., θ = 0 without loss of generality.
The first conducted experiment refers to a sensing bandwidth lying in the range
of frequencies occupied by terrestrial trunked radio (TETRA) communications [57].
Precisely, a center frequency f0 = 393 MHz, a one-sided bandwidth of B = 2 MHz,
and a sampling frequency of 2 complex Mega-samples per second are considered.
As to the parameters used by the filter bank approach, the IAA-based algorithms, and
the BSLIM techniques, it is assumed:
• M = 1 receive antenna;
• N = 100 samples per snapshot;
• N1 = 10 snapshots;
• NF = 2N frequency bins.
As in Subsection 9.4.1, Algorithm 1 involves a maximum number of iterations given by
p̄ = 4 and the BIC-based IAA processing uses K̄ = 180. Additionally, in Algorithm 2,
$\delta = 10^{-1}$, $\epsilon = 10^{-6}$, $\sigma_L^2 = 0.004$, and $\sigma_U^2 = 10$. The lower bound to the noise
power level is estimated via a measured data spectrum where no RF sources are present,
and this value is also used to evaluate the interference covariance matrix in the
IAA-based procedures. The upper bound is higher than 10 times the power of the available
data. Moreover, the adaptive selection of q assumes K̄ = 180 and $I_q$ = {0.01, 0.12,
0.23, 0.34, 0.45, 0.56, 0.67, 0.78, 0.89, 1}.
Figure 9.6 Experimental setup for measured data acquisition via SDR device.
For comparison purposes also the behavior of the filter bank technique, Algorithm 1
without BIC-based stage, and Algorithm 2 for a fixed q is analyzed. To this end, it is
assumed that they perform a noncoherent energy integration of the N1 = 10 estimated
profiles, and the resulting value in each bin is compared with a threshold, i.e., the noise
power level times a multiplicative factor (empirically set).
In Figure 9.7, the results corresponding to the 1,000 collected samples are displayed.
In particular, Figure 9.7(a) illustrates the square modulus of the Fourier transform of the
acquired data versus frequency, i.e., the data spectrum, while Figures 9.7(b), 9.7(c), 9.7(d),
9.7(e), and 9.7(f) illustrate the frequency occupancy maps obtained via the filter bank
approach, the IAA algorithm, the BIC-based IAA processing, the BSLIM algorithm
with adaptive selection of q, and Algorithm 2 with q = 1, respectively. Specifically,
the vertical yellow lines in these maps identify the frequency bins where RF sources
are detected.
Inspection of the maps highlights the superiority of the IAA-based strategies and
BSLIM techniques over the conventional filter bank approach. Indeed, the latter
provides an unreliable recovery of the frequency occupancy profile mainly due to the energy
spillover effect induced by the multiple spectral contributions. On the other hand, both
IAA-based algorithms and BSLIM strategies are able to accurately retrieve the actual
frequency bins occupied by RF sources. However, BSLIM has a lower computational
complexity than its counterpart. Notice that a possible missed emitter could
be present around the frequency −0.9 MHz. However, it cannot be claimed a priori whether it
is an emitter or a noise spike. Remarkably, the BSLIM algorithm with adaptive selection of
q and the BIC-based IAA processing are totally adaptive and do not require any ad hoc
threshold selection, which is a valuable feature from a practical point of view.
Figure 9.7 Results associated with the first set of 1,000 samples collected in the TETRA
bandwidth. (a) Data spectrum. Frequency occupancy map: (b) filter bank approach; (c) IAA
algorithm; (d) BIC-based IAA processing; (e) BSLIM (q = 1); (f) BSLIM (adaptive q
selection). The detected frequency bins correspond to the hotter pixels.
Figure 9.8 Results associated with the second set of 1,000 samples collected in the TETRA
bandwidth.
Figure 9.8 reports the results of the described spectrum-sensing algorithms for
another dataset still composed of 1,000 samples, assuming the same setup as in
Figure 9.7. The preceding remarks hold true also in this case, namely the IAA-based
strategies and BSLIM techniques prove effective to recover the frequency occupancy
maps.
In the second experiment the focus is on a sensing bandwidth belonging to the spec-
trum occupied by global system for mobile (GSM) communications [58]. Specifically,
a center frequency f0 = 943 MHz, a one-sided bandwidth of B = 2.8 MHz, and a
sampling frequency of 2.8 complex Mega-samples per second are considered.
Figure 9.9 Results associated with the first set of 1,000 samples collected in the GSM bandwidth.
(a) Data spectrum. Frequency occupancy map: (b) filter bank approach; (c) IAA algorithm;
(d) BIC-based IAA processing; (e) BSLIM (q = 1); (f) BSLIM (adaptive q selection).
The detected frequency bins correspond to the hotter pixels.
In this analysis, the parameters used by all the procedures are the same as in the
first experiment⁷ but for $\sigma_L^2 = 0.01$ and the threshold used after noncoherent energy
integration in the filter bank technique, Algorithm 1 without BIC-based stage, and
Algorithm 2 with q = 1. Indeed, a different multiplicative factor to the noise power level is
considered.
Figures 9.9 and 9.10 show the results associated with two different experiments,
each corresponding to a dataset of 1,000 samples. As in the previous analysis, both the
acquired data spectrum and the obtained frequency occupancy maps are provided for
each dataset.
The obtained maps confirm the ability of the IAA-based and BSLIM strategies to
correctly detect and identify frequency bins occupied by RF sources. Finally, the
conventional filter bank approach exhibits performance close to the other algorithms for the
datasets in the GSM bandwidth, unlike the other analyses.
7 Specifically, the adopted parameters are: p̄ = 4, K̄ = 180, $I_q$ = {0.01, 0.12, 0.23, 0.34, 0.45, 0.56, 0.67,
0.78, 0.89, 1}, $\delta = 10^{-1}$, $\epsilon = 10^{-6}$, and $\sigma_U^2 = 10$.
Figure 9.10 Results associated with the second set of 1,000 samples collected in the GSM
bandwidth.
As a concluding remark, the reported analysis has highlighted the effectiveness
of the introduced approaches in providing spectrum awareness. Moreover, accounting
for both computational complexity and recovery ability, BSLIM appears to be the best
choice.
9.5 Conclusions
2-D spectrum sensing has been considered to gather real-time space–frequency scenario
awareness, assuming a sensor equipped with multiple receive channels. A formal
discrete-time sensing signal model has been developed and two adaptive signal
processing algorithms have been introduced for recovering the space–frequency
occupancy map via block-sparsity exploitation. The former employs the IAA technique
and incorporates a BIC-based stage to promote block-sparsity in the recovery process.
The latter applies the RML estimation paradigm to automatically push for block-sparsity
in the 2-D profile evaluation.
At the analysis stage both simulated and measured data scenarios are considered to
evaluate the capability of the proposed procedures to retrieve the actual space–frequency
occupancy maps. The reported results clearly show the effectiveness of the developed
tools and their superiority over the conventional filter bank approach. Precisely, both
BIC-based IAA and BSLIM with adaptive q selection are able to grant an accurate
and reliable space–frequency occupancy map recovery without requiring an ad hoc
threshold selection. Furthermore, better performance than the classic filter bank is
obtained at the price of a higher computational complexity. Finally, considering both
computational complexity and recovery ability, it is reasonable to use BSLIM for 2-D
spectrum sensing.
Future research might concern the use of constant false alarm rate (CFAR) strategies
to pick up the most relevant space–frequency contributions within the estimated profile,
as well as the analysis of different model order selection rules both in the IAA context
and for the adaptive selection of the parameter q.
Acknowledgment
The work of Drs. Augusto Aubry and Antonio De Maio was sponsored by the US Army
RDECOM International Technology Center – Atlantic in partnership with the US Army
Research Laboratory under USAITC-A Seedling Project W911NF-17-2-0134.
References
[1] M. Wicks, “Spectrum crowding and cognitive radar,” in 2010 2nd International Workshop
on Cognitive Information Processing, Italy, June 2010, pp. 452–457.
[2] H. Griffiths, L. Cohen, S. Watts et al., “Radar spectrum engineering and management:
Technical and regulatory issues,” Proceedings of the IEEE, vol. 103, no. 1, pp. 85–102,
2015.
[3] A. Farina, A. De Maio, and S. Haykin, Eds., The Impact of Cognition on Radar Technology.
SciTech Publishing, 2017.
[4] H. He, P. Stoica, and J. Li, “Waveform design with stopband and correlation constraints for
cognitive radar,” in 2010 2nd International Workshop on Cognitive Information Processing,
Elba, Italy, June 2010.
[5] M. A. Govoni and R. A. Elwell, “Qualitative analysis of interference on receiver
performance using advanced pulse compression noise (APCN),” in SPIE Defense, Security, and
Sensing Conference, Baltimore, MD, May 2015.
[6] M. A. Govoni, “Enhancing spectrum coexistence using radar waveform diversity,” in IEEE
Radar Conference, Philadelphia, PA, May 2016.
[7] A. Aubry, A. De Maio, M. Piezzo et al., “Cognitive radar waveform design for spectral
coexistence in signal-dependent interference,” in IEEE Radar Conference, Cincinnati, OH,
May 2014.
[8] A. Aubry, V. Carotenuto, A. De Maio, A. Farina, and L. Pallotta, “Optimization theory-based
radar waveform design for spectrally dense environments,” IEEE Aerospace and Electronic
Systems Magazine, vol. 31, no. 12, pp. 14–25, 2016.
[9] A. Aubry, V. Carotenuto, and A. De Maio, “Forcing multiple spectral compatibility
constraints in radar waveforms,” IEEE Signal Processing Letters, vol. 23, no. 4, pp. 483–
487, 2016.
[10] Y. Zhao, J. Gaeddert, K. Bae, and J. Reed, “Radio environment map-enabled situation-aware
cognitive radio learning algorithms,” in Proceedings of Software Defined Radio Technical
Conference, Orlando, FL, November 2006.
[11] H. Tang, “Some physical layer issues of wide-band cognitive radio systems,” in IEEE Inter-
national Symposium on New Frontiers in Dynamic Spectrum Access Networks, Baltimore,
MD, November 2005.
[12] T. Yucek and H. Arslan, “A survey of spectrum sensing algorithms for cognitive radio
applications,” IEEE Communications Surveys & Tutorials, vol. 11, no. 1, pp. 116–130, 2009.
[13] A. Fehske, J. Gaeddert, and J. Reed, “A new approach to signal classification using spectral
correlation and neural networks,” in IEEE International Symposium on New Frontiers in
Dynamic Spectrum Access Networks, Baltimore, MD, November 2005.
[14] D. Guimaraes, R. A. de Souza, and A. Barreto, “Performance of cooperative eigenvalue
spectrum sensing with a realistic receiver model under impulsive noise,” Journal of Sensor
and Actuator Networks, vol. 2, pp. 46–69, 2013.
[15] Q. Wu, G. Ding, J. Wang, and Y. D. Yao, “Spatial-temporal opportunity detection for
spectrum-heterogeneous cognitive radio networks: Two-dimensional sensing,” IEEE Trans-
actions on Wireless Communications, vol. 12, no. 2, pp. 516–526, 2013.
[16] G. Ding, J. Wang, Q. Wu, F. Song, and Y. Chen, “Spectrum sensing in opportunity-
heterogeneous cognitive sensor networks: How to cooperate?” IEEE Sensors Journal,
vol. 13, no. 11, pp. 4247–4255, 2013.
[17] P. Wang, J. Fang, N. Han, and H. Li, “Multiantenna-assisted spectrum sensing for cognitive
radio,” IEEE Transactions on Vehicular Technology, vol. 59, no. 4, pp. 1791–1800, 2010.
[18] R. Zhang, T. J. Lim, Y. C. Liang, and Y. Zeng, “Multi-antenna based spectrum sensing for
cognitive radios: A GLRT approach,” IEEE Transactions on Communications, vol. 58, no. 1,
pp. 84–88, 2010.
[19] A. Taherpour, M. Nasiri-Kenari, and S. Gazor, “Multiple antenna spectrum sensing in
cognitive radios,” IEEE Transactions on Wireless Communications, vol. 9, no. 2, pp. 814–
823, 2010.
[20] K. L. Du and W. H. Mow, “Affordable cyclostationarity-based spectrum sensing for
cognitive radio with smart antennas,” IEEE Transactions on Vehicular Technology, vol. 59,
no. 4, pp. 1877–1886, 2010.
[21] M. Tang, G. Ding, Q. Wu, Z. Xue, and T. A. Tsiftsis, “A joint tensor completion and
prediction scheme for multi-dimensional spectrum map construction,” IEEE Access, vol. 59,
no. 4, pp. 8044–8052, 2016.
[22] A. Aubry, A. De Maio, and M. Govoni, “Two-dimensional spectrum sensing for cognitive
radar,” in IEEE Radar Conference, Oklahoma City, OK, April 2018.
[23] S. D. Blunt and K. Gerlach, “A novel pulse compression scheme based on minimum
mean-square error reiteration,” in Proceedings of the International Conference on Radar,
Adelaide, Australia, September 2003.
[24] W. Roberts, P. Stoica, J. Li, T. Yardibi, and F. A. Sadjadi, “Iterative adaptive approaches to
MIMO radar imaging,” IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 1,
pp. 5–20, 2010.
[25] T. Yardibi, J. Li, P. Stoica, M. Xue, and A. B. Baggeroer, “Source localization and
sensing: A nonparametric iterative adaptive approach based on weighted least squares,”
[26] P. Stoica and Y. Selen, “Model-order selection: A review of information criterion rules,”
IEEE Signal Processing Magazine, vol. 21, no. 4, pp. 36–47, 2004.
[27] A. Aubry, V. Carotenuto, A. De Maio, and M. Govoni, “Multi-snapshot spectrum sensing for
cognitive radar via block-sparsity exploitation,” IEEE Transactions on Signal Processing,
vol. 67, no. 6, pp. 1396–1406, 2019.
[28] X. Tan, W. Roberts, J. Li, and P. Stoica, “Sparse learning via iterative minimization with
application to MIMO radar imaging,” IEEE Transactions on Signal Processing, vol. 59, no. 3,
pp. 1088–1101, 2011.
[29] J. Li and P. Stoica, “Efficient mixed-spectrum estimation with applications to target feature
extraction,” IEEE Transactions on Signal Processing, vol. 44, no. 2, pp. 281–295, 1996.
[30] J. Li, P. Stoica, and D. Zheng, “Angle and waveform estimation via RELAX,” IEEE
Transactions on Aerospace and Electronic Systems, vol. 33, no. 3, pp. 1077–1087, 1997.
[31] S. M. Kay, Ed., Fundamentals of Statistical Signal Processing, Volume II: Detection Theory.
Prentice Hall, 1998.
[32] L. L. Scharf, Ed., Statistical Signal Processing. Addison-Wesley Reading, 1991.
[33] M. A. Richards, J. A. Scheer, and W. A. Holm, Eds., Principles of Modern Radar: Basic
Principles. SciTech Publishing, 2010.
[34] P. Stoica and R. L. Moses, Eds., Spectral Analysis of Signals. Pearson Prentice Hall, 2006.
[35] S. D. Blunt, K. Gerlach, and T. Higgins, “Aspects of radar range super-resolution,” in IEEE
Radar Conference, Boston, MA, April 2007.
[36] J. R. Guerci, Ed., Cognitive Radar: The Knowledge-Aided Fully Adaptive Approach. Artech
House Inc., 2010.
[37] D. R. Fuhrmann and G. San Antonio, “Transmit beamforming for MIMO radar systems using
signal cross-correlation,” IEEE Transactions on Aerospace and Electronic Systems, vol. 44, no. 1,
pp. 171–186, 2008.
[38] N. Shariati, D. Zachariah, and M. Bengtsson, “Minimum sidelobe beampattern design for
MIMO radar systems: A robust approach,” in IEEE International Conference on Acoustics,
Speech, and Signal Processing, Florence, Italy, May 2014.
[39] A. Aubry, A. De Maio, and Y. Huang, “MIMO radar beampattern design via PSL/ISL
optimization,” IEEE Transactions on Signal Processing, vol. 64, no. 15, pp. 3955–3976,
2016.
[40] A. Konar and N. D. Sidiropoulos, “Hidden convexity in QCQP with Toeplitz-Hermitian
quadratics,” IEEE Signal Processing Letters, vol. 10, no. 22, pp. 1623–1627, 2015.
[41] A. Aubry, V. Carotenuto, and A. De Maio, “New results on generalized fractional program-
ming problems with Toeplitz quadratics,” IEEE Signal Processing Letters, vol. 10, no. 22,
pp. 1623–1627, 2015.
[42] O. Aldayel, V. Monga, and M. Rangaswamy, “Successive QCQP refinement for MIMO radar
waveform design under practical constraints,” IEEE Transactions on Signal Processing,
vol. 14, no. 64, pp. 1623–1627, 2015.
[43] M. Xue, L. Xu, and J. Li, “IAA spectral estimation: Fast implementation using the Gohberg–
Semencul factorization,” IEEE Transactions on Signal Processing, vol. 59, no. 7, pp. 3251–
3261, 2011.
[44] G. O. Glentis and A. Jakobsson, “Efficient implementation of iterative adaptive approach
spectral estimation techniques,” IEEE Transactions on Signal Processing, vol. 59, no. 9,
pp. 4154–4167, 2011.
[45] G. O. Glentis and A. Jakobsson, “Superfast approximative implementation of the iaa spectral
estimate,” IEEE Transactions on Signal Processing, vol. 60, no. 1, pp. 472–478, 2012.
[46] W. Sun, H. C. So, Y. Chen, L. T. Huang, and L. Huang, “Approximate subspace-based
iterative adaptive approach for fast two-dimensional spectral estimation,” IEEE Transactions
on Signal Processing, vol. 62, no. 12, pp. 3220–3231, 2014.
[47] Y. C. Eldar and M. Mishali, “Robust recovery of signals from a structured union of
subspaces,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 5302–5316,
2009.
[48] Y. C. Eldar, P. Kuppinger, and H. Bolcskei, “Block-sparse signals: Uncertainty relations and
efficient recovery,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3042–3052,
2010.
[49] Y. C. Eldar and G. Kutyniok, Eds., Compressed Sensing: Theory and Applications.
Cambridge University Press, 2012.
[50] Y. Wang, J. Wang, and Z. Xu, “On recovery of block-sparse signals via mixed l2 / lq (0 <
q ≤ 1) norm minimization,” EURASIP Journal on Advances in Signal Processing, vol. 76,
pp. 1–17, 2013.
[51] M. Lai, Y. Xu, and W. Yin, “Improved iteratively reweighted least squares for unconstrained
smoothed lq minimization,” SIAM Journal on Numerical Analysis, vol. 51, pp. 927–957,
2013.
[52] M. Razaviyayn, M. Hong, and Z.-Q. Luo, “A unified convergence analysis of block succes-
sive minimization methods for nonsmooth optimization,” SIAM Journal on Optimization,
vol. 23, no. 2, pp. 1126–1153, 2013.
[53] A. Aubry, A. D. Maio, A. Zappone, M. Razaviyayn, and Z. Luo, “A new sequential
optimization procedure and its applications to resource allocation for wireless systems,”
[54] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Eds., Bayesian Data Analysis. Boca
Raton, FL: Chapman & Hall/CRC, 2004.
[55] C. P. Robert and G. Casella, Eds., Monte Carlo Statistical Methods. Springer Science +
Business Media, 2004.
[56] R. W. Stewart, K. W. Barlee, D. S. Atkinson, and L. H. Crockett, Software Defined Radio
Using MATLAB & Simulink and the RTL-SDR. Strathclyde Academic Media, 2015.
[57] J. Dunlop, D. Girma, and J. Irvine, Digital Mobile Communications and the TETRA System.
John Wiley & Sons, 2013.
[58] M. Rahnema, “Overview of the GSM system and protocol architecture,” IEEE Communica-
tions Magazine, vol. 31, no. 4, pp. 92–100, 1993.
10 Cooperative Spectrum Sharing between Sparse Sensing-Based Radar and Communication Systems
Bo Li∗ and Athina P. Petropulu∗∗
10.1 Introduction
Radio spectrum has been essential to a wide array of technologies, including com-
munications, position and navigation systems, and radars. Until recently, radar and
communication systems jointly consumed most of the spectrum below 6 GHz, with
commercial and noncommercial (i.e., military radar) uses assigned on distinct bands.
However, recent studies have shown that large chunks of spectrum designated for
radar applications are underutilized [1], while there is spectrum congestion in bands
devoted to commercial wireless communications. Further, as the number of connected
devices grows, the spectrum congestion problem will worsen. The nonuniform spectrum
utilization is clearly illustrated in the data collected in downtown Berkeley, California,
as shown in Figure 10.1 [2].
To address the need for more efficient use of spectrum, US government agencies have
been examining the possibility of allowing wireless broadband systems to operate in
the 3500–3650 MHz band, previously used exclusively by high-powered shipborne, air-
borne, and ground-based radar systems operated by the Department of Defense [3,4]. On
examining the viability of coexistence of radar and communication systems in that band,
the National Telecommunications and Information Administration (NTIA) proposed the
use of exclusion zones [5], which would protect base stations from radar interference.
However, those zones cover large US metropolitan areas, and such an approach would
not solve the problem of improving spectral efficiency.
The main problem with different systems using the same spectrum is the interference
that one system exerts on the other. According to a 2006 NTIA report [6], levels of
interference to noise ratio between −9 dB and −2 dB, which are below the thermal noise
floor level of the radar receiver, can reduce the probability of target detection. Also, the
interference generated by the radar reduces the throughput of a communication system
operating nearby.
∗ Bo Li is now with Aurora Innovation, Inc. This work was supported by NSF under Grant ECCS-1408437.
∗∗ This work was supported by NSF under Grant ECCS-1408437.
Figure 10.1 Spectrum utilization measurement in 0–6 GHz band [2].
Spectrum-sharing methods are focused on enabling radar and communication
systems to share the spectrum efficiently by minimizing interference effects. The
spectrum-sharing literature for controlling interference includes works that explore
large physical separation [7–9], or dynamic spectrum access methods [10–14] using
orthogonal frequency division multiplexing (OFDM) signals and optimally allocating
subcarriers between the two systems [15–17]. Methods that use specially designed
radar waveforms have also been considered [18–21]. There are also works that explore
the spatial degrees of freedom enabled by the use of multiple antennas at both systems
[22–28]. Multiple-input multiple-output (MIMO) radars offer spatial degrees of freedom
and have been used for spectrum sharing with communication systems. In earlier MIMO
radar works, the interference mitigation was considered either for the communication
system [22–25], or for the radar [28], but not for both. For example, in the well studied
null space projection based scheme of [22–27], the radar eliminates its interference
to the communication system by projecting its waveforms onto the null space of
the interference channel. However, in those works, the interference generated by the
communication system to the radar was not addressed. By applying spatial multiplexing
to each system in isolation, one may miss out on the potential performance improvement
stemming from a coordinated operation of the two systems. Indeed, recent advances
in cognitive radio and cloud technology provide a framework that can be leveraged
for a coordinated operation. A related emerging technology is the dual-function radar-
communication system, which achieves the objectives of two systems by using the
same transmitter or receiver resources. Readers can refer to [29] for an overview of
dual-function radar-communication systems.
In this chapter, we discuss some approaches that rely on spatial multiplexing and
sparse sensing, and exploit a cooperative framework for the use of spectrum [30–36]. In
particular, we study cooperative spectrum sharing between MIMO radars that employ
sparse sensing and matrix completion (MIMO-MC), and MIMO communication sys-
tems. As will be discussed in this chapter, MIMO-MC radars are particularly well
suited for spectrum sharing [30,31]. This is because each radar receive antenna uses a
time-varying sparse sampling scheme, which effectively modulates the communication-
radar interference channel and enlarges its null space. This gives the communication
system the opportunity to transmit along that null space and thus avoid interfering with
the radar. Also, a cooperative design has the potential to improve spectrum utilization
due to the increased number of design degrees of freedom. Co-design requires access
to physical-layer information on both systems. For example, both systems would have
to share physical-layer information with a node designated as the control center, which
would optimally design the signaling schemes of each system. Obviously, this requires
the systems’ willingness to cooperate, with the payback being reduced interference for the radar
and higher throughput for the communication system. Of course, the amount of infor-
mation that can be shared and the privacy issues involved would have to be evaluated
in each case. Examples of radar systems that could be amenable to such cooperation
include radar for autonomous vehicles, weather monitoring, etc.
The remainder of this chapter is organized as follows. Section 10.2 provides some
background on MIMO-MC radars. Section 10.3 presents the coexistence model between
a MIMO-MC radar and a MIMO communication system, and outlines all assumptions
made. Section 10.4 formulates the problem of spectrum sharing and provides efficient
algorithms for solving it. Finally, Section 10.5 presents simulation results demonstrating
the performance gains of cooperation, and Section 10.6 offers some concluding remarks.
10.2 MIMO Radars Using Sparse Sensing
MIMO radars transmit different waveforms from their transmit (TX) antennas. Their
receive (RX) antennas forward their measurements to a fusion center, where a data
matrix is populated with the information received by each RX antenna. This matrix is
then used in array processing methods to estimate target information. For a relatively
small number of targets as compared to the number of TX and RX antennas, the data
matrix is low rank [37], and thus can be reconstructed with provable accuracy (under
certain conditions) based on a small set of its entries via matrix completion (MC)
[38,39]. This observation is the basis of MIMO-MC radars. In MIMO-MC radars, each
RX antenna forwards to the fusion center a small number of pseudo-randomly sub-
Nyquist sampled measurements, along with their sampling times, and partially fills a
row of the data matrix. Under certain conditions [39], the full data matrix, corresponding
to Nyquist sampled data, can be stably recovered via MC techniques, with reconstruc-
tion error proportional to the noise level. The recovered matrix can be subsequently used
for target detection via standard array processing methods [37]. The sub-sampling at
the antennas avoids the need for high-rate analog-to-digital converters, and the reduced
number of samples translates into power and bandwidth savings over the antenna-fusion
center link. Because MIMO-MC radars recover all missing entries of the data matrix,
they do not suffer from signal-to-noise ratio (SNR) loss due to data sub-sampling.
MIMO-MC radars can achieve the high resolution of traditional MIMO radars with
significantly fewer samples and reduced hardware complexity. We should note that
compressive sensing (CS)–based MIMO radars [40–42] are another sample-reduction
approach in the literature. Compared to MIMO-CS radars, MIMO-MC radars avoid
basis mismatch problems.
Let us consider a collocated MIMO radar system using uniform linear arrays (ULA)
for both transmission and reception. Let $M_{t,R}$ and $M_{r,R}$ be the number of TX and
RX antennas, respectively, and $d_t$ and $d_r$ the inter-element spacing at the transmit and
receive arrays, respectively. The radar transmits $L$ pulses, with pulse repetition interval
(PRI) $T_{\mathrm{PRI}}$ and carrier wavelength $\lambda_c$. The radar operates in two phases; in the first phase
the TX antennas transmit waveforms and the RX antennas receive target returns, while
in the second phase, the RX antennas forward their measurements to a fusion center.
The $K$ far-field targets have distinct angles $\{\theta_k\}$, target reflection coefficients $\{\beta_k\}$,
and Doppler shifts $\{\nu_k\}$, and are assumed to fall in the same range bin. In the absence of
clutter, the noisy data matrix at the fusion center can be formulated as [37,39,43]
$$\mathbf{Y}_R = \mathbf{V}_r \boldsymbol{\Sigma} \mathbf{V}_t^T \mathbf{P} \mathbf{S} + \mathbf{W}_R, \tag{10.1}$$
where the $m$-th row of $\mathbf{Y}_R \in \mathbb{C}^{M_{r,R} \times L}$ contains the $L$ fast-time raw samples
forwarded by the $m$-th antenna [44]; $\mathbf{S} = [\mathbf{s}(1),\ldots,\mathbf{s}(L)]$ is the waveform matrix, with
$\mathbf{s}(l) = [s_1(l),\ldots,s_{M_{t,R}}(l)]^T$ being the $l$-th snapshot across the transmit antennas; the
transmit waveforms are assumed to be orthogonal, i.e., $\mathbf{S}\mathbf{S}^H = \mathbf{I}_{M_{t,R}}$
[37]; $\mathbf{W}_R$ denotes additive noise; and $\mathbf{P} \in \mathbb{C}^{M_{t,R} \times M_{t,R}}$ denotes the transmit precoding
matrix. $\mathbf{V}_t \triangleq [\mathbf{v}_t(\theta_1),\ldots,\mathbf{v}_t(\theta_K)]$ and $\mathbf{V}_r \triangleq [\mathbf{v}_r(\theta_1),\ldots,\mathbf{v}_r(\theta_K)]$ respectively denote
the transmit and receive steering matrices, and $\mathbf{v}_r(\theta) \in \mathbb{C}^{M_{r,R}}$ is the receive steering vector
defined as
$$\mathbf{v}_r(\theta) \triangleq \left[ e^{-j2\pi \cdot 0 \cdot \vartheta^r}, \ldots, e^{-j2\pi (M_{r,R}-1) \vartheta^r} \right]^T, \tag{10.2}$$
where $\vartheta^r = d_r \sin(\theta)/\lambda_c$ denotes the spatial frequency with respect to the receive array.
$\mathbf{v}_t(\theta) \in \mathbb{C}^{M_{t,R}}$ is the transmit steering vector and is defined analogously. Matrix $\boldsymbol{\Sigma}$ is
defined as $\mathrm{diag}([\beta_1 e^{j2\pi\nu_1},\ldots,\beta_K e^{j2\pi\nu_K}])$. $\mathbf{D} \triangleq \mathbf{V}_r \boldsymbol{\Sigma} \mathbf{V}_t^T$ is the target response
matrix. At the fusion center, $\mathbf{Y}_R$ passes through matched filters, after which target
estimation is performed via standard array processing methods [45].
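The structure of (10.1) can be sketched numerically. The NumPy fragment below is only an illustration under assumed values: the array sizes, target angles, reflection coefficients, Doppler shifts, noise level, and the helper name `steering` are hypothetical and not taken from the chapter. The diagonal matrix of target coefficients is called `Sigma` in the code. The sketch builds the noise-free data matrix and checks that its rank equals the number of targets.

```python
import numpy as np

def steering(num_antennas, theta, spacing_over_wavelength=0.5):
    """ULA steering vector as in (10.2), with spatial frequency d*sin(theta)/lambda."""
    spatial_freq = spacing_over_wavelength * np.sin(theta)
    return np.exp(-2j * np.pi * np.arange(num_antennas) * spatial_freq)

rng = np.random.default_rng(0)
Mt, Mr, L, K = 8, 10, 32, 2                      # assumed array and block sizes
thetas = np.deg2rad([-20.0, 15.0])               # assumed target angles
betas = np.array([1.0 + 0.5j, 0.7 - 0.2j])       # assumed reflection coefficients
nus = np.array([0.01, -0.02])                    # assumed Doppler shifts

Vt = np.stack([steering(Mt, th) for th in thetas], axis=1)   # Mt x K
Vr = np.stack([steering(Mr, th) for th in thetas], axis=1)   # Mr x K
Sigma = np.diag(betas * np.exp(2j * np.pi * nus))            # K x K target matrix

# Waveform matrix with orthonormal rows, so that S S^H = I.
G = rng.standard_normal((L, Mt)) + 1j * rng.standard_normal((L, Mt))
Q, _ = np.linalg.qr(G)          # L x Mt with orthonormal columns
S = Q.T                         # Mt x L with orthonormal rows
P = np.eye(Mt)                  # trivial precoder

M = Vr @ Sigma @ Vt.T @ P @ S   # noise-free data matrix, Mr x L
noise_std = 0.01                # assumed noise level
W = noise_std * (rng.standard_normal((Mr, L))
                 + 1j * rng.standard_normal((Mr, L))) / np.sqrt(2)
Y = M + W                       # noisy data matrix of (10.1)
rank_M = np.linalg.matrix_rank(M)
```

With two targets at distinct angles, `rank_M` is 2 even though the matrix has 10 rows and 32 columns, which is the low-rank property that matrix completion relies on.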
When $K$ is smaller than $M_{r,R}$ and $L$, the noise-free data matrix, $\mathbf{M} \triangleq \mathbf{D}\mathbf{P}\mathbf{S}$, is
low rank and under certain conditions can be provably recovered based on a subset of
its entries. This observation gave rise to MIMO-MC radars [37,39,43], where each RX
antenna sub-samples the target returns and forwards the samples to the fusion center.
The partially filled data matrix at the fusion center can be mathematically expressed as
follows (see [37, scheme I]):
$$\boldsymbol{\Omega} \circ \mathbf{Y}_R = \boldsymbol{\Omega} \circ (\mathbf{M} + \mathbf{W}_R), \tag{10.3}$$
where $\circ$ denotes the Hadamard product and $\boldsymbol{\Omega}$ is the sub-sampling matrix containing 0’s
and 1’s. The sub-sampling rate $p$ equals $\|\boldsymbol{\Omega}\|_0/(L M_{r,R})$, where $\|\boldsymbol{\Omega}\|_0$ denotes the number
of 1’s in $\boldsymbol{\Omega}$. When $p$ equals 1, the matrix $\boldsymbol{\Omega}$ is filled with 1’s, and the MIMO-MC
radar is identical to the traditional MIMO radar. At the fusion center, the completion of
$\mathbf{M}$ can be achieved by the following nuclear norm minimization problem [38]
$$\min_{\mathbf{M}} \; \|\mathbf{M}\|_* \quad \text{s.t.} \quad \|\boldsymbol{\Omega} \circ \mathbf{M} - \boldsymbol{\Omega} \circ \mathbf{Y}_R\|_F \le \delta, \tag{10.4}$$
where $\|\cdot\|_*$ denotes the matrix nuclear norm and $\delta > 0$ is a parameter determined by the
sampled entries of the noise matrix, i.e., $\boldsymbol{\Omega} \circ \mathbf{W}_R$. It is shown in [38] that the recovery
of $\mathbf{M}$ is stable against noise. The matrix recovery error is proportional to the noise level
$\delta$, while the recovery conditions are given in terms of the coherence of $\mathbf{M}$ [38].
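To make the sub-sampling and completion steps concrete, the following sketch fills in a randomly sub-sampled low-rank matrix. It does not solve the nuclear norm program (10.4); as a simpler stand-in it alternates between a rank-$K$ SVD truncation and re-imposing the observed samples, which assumes the rank is known. The function name `complete_rank_k`, the matrix sizes, and the sampling rate are all hypothetical choices for illustration, and the example is noise-free.

```python
import numpy as np

def complete_rank_k(Y_obs, mask, rank, num_iters=500):
    """Fill unobserved entries by alternating between a rank-`rank` SVD
    truncation and re-imposing the observed samples (a simple heuristic,
    not the nuclear norm program)."""
    X = np.where(mask, Y_obs, 0)
    for _ in range(num_iters):
        U, s, Vh = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :rank] * s[:rank]) @ Vh[:rank]   # best rank-`rank` fit
        X = np.where(mask, Y_obs, X_low)               # keep observed entries
    return X_low

rng = np.random.default_rng(1)
Mr, L, K, p = 20, 40, 2, 0.7                 # assumed sizes and sampling rate
A = rng.standard_normal((Mr, K)) + 1j * rng.standard_normal((Mr, K))
B = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
M = A @ B                                    # rank-K stand-in for the data matrix
mask = rng.random((Mr, L)) < p               # sub-sampling pattern (True = observed)

M_hat = complete_rank_k(mask * M, mask, K)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

Here `mask` plays the role of the 0/1 sub-sampling matrix and `mask * M` is the Hadamard product of (10.3). A convex solver for (10.4) would replace `complete_rank_k` when the rank is unknown or noise is present.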
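The coherence parameters of $\mathbf{M}$, $(\mu_0,\mu_1)$, are defined as
$$\mu_0 \ge \max(\mu(\mathbf{U}),\mu(\mathbf{V})), \tag{10.5}$$
$$\mu_1 \sqrt{\frac{K}{M_{r,R} L}} \ge \left\| \sum_{k=1}^{K} \mathbf{U}_{\cdot k} \mathbf{V}_{\cdot k}^H \right\|_\infty, \tag{10.6}$$
where $\mathbf{U} \in \mathbb{C}^{M_{r,R} \times K}$ and $\mathbf{V} \in \mathbb{C}^{L \times K}$ contain the left and right singular vectors of $\mathbf{M}$;
$\mathbf{U}_{\cdot k}$ denotes the $k$-th column vector of $\mathbf{U}$; $\|\cdot\|_\infty$ denotes the largest entry magnitude of the
matrix; and $\mu(\mathbf{V})$ is the coherence of the subspace spanned by the basis matrix $\mathbf{V}$, defined as
$$\mu(\mathbf{V}) \triangleq \frac{L}{K} \max_{1 \le l \le L} \|\mathbf{V}_{l\cdot}\|_2^2 \in \left[ 1, \frac{L}{K} \right], \tag{10.7}$$
where $\mathbf{V}_{l\cdot}$ denotes the $l$-th row vector of $\mathbf{V}$. The coherence of subspace $\mathbf{U}$ is similarly
defined.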
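The definitions (10.5)–(10.7) translate directly into a few lines of NumPy. The sketch below returns the smallest values of $\mu_0$ and $\mu_1$ that satisfy the two inequalities for a given matrix of known rank; the function names and the two toy matrices are assumptions made for illustration.

```python
import numpy as np

def subspace_coherence(basis):
    """Coherence (n/K) * max_l ||row_l||^2 of an n x K orthonormal basis, as in (10.7)."""
    n, k = basis.shape
    return (n / k) * np.max(np.sum(np.abs(basis) ** 2, axis=1))

def coherence_parameters(M, rank):
    """Smallest (mu0, mu1) satisfying (10.5) and (10.6) for a matrix of the given rank."""
    U, _, Vh = np.linalg.svd(M, full_matrices=False)
    U, V = U[:, :rank], Vh[:rank].conj().T
    mu0 = max(subspace_coherence(U), subspace_coherence(V))
    n1, n2 = M.shape
    mu1 = np.max(np.abs(U @ V.conj().T)) * np.sqrt(n1 * n2 / rank)
    return mu0, mu1

flat = np.ones((4, 6))            # energy spread evenly over all entries
spiky = np.zeros((4, 6))
spiky[0, 0] = 1.0                 # all energy in a single entry
mu_flat = coherence_parameters(flat, 1)
mu_spiky = coherence_parameters(spiky, 1)
```

The flat matrix attains the lower limit of (10.7), while the spiky matrix attains the upper limit $L/K$ for its right singular subspace, which is why such matrices cannot be recovered from a small random subset of entries.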
According to [46], assuming that matrix $\mathbf{M}$ is sampled uniformly at random at $m$
points, and with $n \triangleq \max\{M_{r,R}, L\}$, there exist constants $C$ and $c$ such that if
$$m \ge C \max\left\{ \mu_1^2, \; \mu_0^{1/2}\mu_1, \; \mu_0 n^{1/4} \right\} n K \beta \log n \tag{10.8}$$
for some $\beta > 2$, the minimizer of the nuclear norm problem is unique and equal to $\mathbf{M}$
with probability at least $1 - c n^{-\beta}$.
The condition in (10.8) implies that the smaller the coherence parameters, the fewer
samples are needed for recovering the full matrix $\mathbf{M}$. Upper bounds on the coherence
parameters can give us an idea of how small those parameters can be.
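The coherence parameters of $\mathbf{M}$ for the case $\mathbf{P} = \mathbf{I}_{M_{t,R}}$ are given in the following
theorem [43].
Theorem 10.1 (Coherence of $\mathbf{M}$ when $\mathbf{P} = \mathbf{I}_{M_{t,R}}$) Let the minimum spatial
frequency separation of the $K$ targets be $\xi_t$ and $\xi_r$ with respect to the transmit and receive
arrays, respectively. On denoting the Fejér kernel by $F_n(x) = \frac{1}{n} \frac{\sin^2(\pi n x)}{\sin^2(\pi x)}$, and for $d_t = d_r = \lambda_c/2$
and
$$K \le \min\left\{ \sqrt{\frac{M_{r,R}}{F_{M_{r,R}}(\xi_r)}}, \; \sqrt{\frac{M_{t,R}}{F_{M_{t,R}}(\xi_t)}} \right\}, \tag{10.9}$$
it holds that
$$\mu(\mathbf{U}) \le \frac{\sqrt{M_{r,R}}}{\sqrt{M_{r,R}} - (K-1)\sqrt{F_{M_{r,R}}(\xi_r)}} \triangleq \mu_0^r. \tag{10.10}$$
Further, if every snapshot of the waveforms $\mathbf{S}_{\cdot l} \equiv \mathbf{s}(l)$ satisfies the following equation:
$$|\mathbf{S}_{\cdot l}^T \mathbf{v}_t(\theta)|^2 = \frac{M_{t,R}}{L}, \quad \forall l \in \mathbb{N}_L^+, \; \theta \in \left[ -\frac{\pi}{2}, \frac{\pi}{2} \right], \tag{10.11}$$
then $\mu(\mathbf{V})$ is upper bounded by
$$\mu(\mathbf{V}) \le \frac{\sqrt{M_{t,R}}}{\sqrt{M_{t,R}} - (K-1)\sqrt{F_{M_{t,R}}(\xi_t)}} \triangleq \mu_0^t. \tag{10.12}$$
Consequently, the matrix $\mathbf{M}$ has coherence parameters $\mu_0 \triangleq \max\{\mu_0^r,\mu_0^t\}$ and
$\mu_1 \triangleq \sqrt{K}\mu_0$.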
The bounds in Theorem 10.1, along with the orthogonality property of the radar wave-
forms in (10.11), were used to design waveforms with good incoherence properties. The
work of [43] involves numerical optimization on the complex Stiefel manifold, which
has high computational complexity. Moreover, radar waveforms need to be updated
frequently as a security measure against adversaries, which makes the computational cost
an even more pressing issue. It was also observed in [43] via simulations that using a random unitary
matrix [47] as the waveform matrix results in performance very close to that of the opti-
mum waveform, indicating that a random unitary matrix might be a good approximation
of the optimal waveform.
The matrix completion performance degrades when the signal-to-interference-plus-noise
ratio (SINR) drops below 10 dB [37], which suggests that, along with “good” radar
waveforms, a precoder designed to mitigate interference would be very important. In
the following subsection, we consider a MIMO-MC radar that uses a random unitary
matrix as the waveform matrix $\mathbf{S}$ and a nontrivial precoder matrix $\mathbf{P}$ [35,36].
10.2.1 MIMO-MC Radar Using Random Unitary Matrix

A random unitary matrix [47] can be obtained through Gram–Schmidt orthogonalization
of a random matrix with entries distributed as independent and identically distributed
(i.i.d.) Gaussian. This means that such waveforms can be generated easily. The follow-
ing theorem provides an upper bound of the coherence μ(V ) of M when a random
unitary waveform is used and a nontrivial radar precoder P is employed.
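The waveform generation step can be sketched as follows. The code uses NumPy's Householder-based QR factorization in place of explicit Gram–Schmidt, which yields the same kind of orthonormal factor, and applies the standard phase correction on the diagonal of the triangular factor so that the result is uniformly (Haar) distributed. The function name `random_unitary_waveform` and the sizes are assumptions made for illustration.

```python
import numpy as np

def random_unitary_waveform(num_tx, num_samples, rng):
    """Draw the first `num_tx` rows of a random num_samples x num_samples unitary
    matrix, giving a waveform matrix S with S S^H = I."""
    G = (rng.standard_normal((num_samples, num_samples))
         + 1j * rng.standard_normal((num_samples, num_samples))) / np.sqrt(2)
    Q, R = np.linalg.qr(G)                 # orthogonalize the Gaussian matrix
    d = np.diagonal(R)
    Q = Q * (d / np.abs(d))                # column phase fix for a Haar-distributed Q
    return Q[:num_tx, :]                   # rows of a unitary matrix are orthonormal

rng = np.random.default_rng(0)
S = random_unitary_waveform(num_tx=4, num_samples=16, rng=rng)
orthogonality_error = np.linalg.norm(S @ S.conj().T - np.eye(4))
```

Each call costs one QR factorization of an $L \times L$ matrix, so a fresh waveform can be drawn cheaply whenever the radar needs to change it.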
Theorem 10.2 Consider the MIMO-MC radar presented in Section 10.2, using a
random unitary waveform matrix $\mathbf{S}$, and with $\mathbf{M}$ as defined in (10.3). For any transmit
precoder $\mathbf{P}$ such that for $K_0 = \mathrm{Rank}(\mathbf{M})$ it holds that $K_0 \le K$, and for an arbitrary
transmit array geometry and target angles, the coherence of the right singular vector
subspace of $\mathbf{M}$ is bounded as
$$\mu(\mathbf{V}) \le \frac{K_0 + 2\sqrt{3 K_0 \ln L} + 6 \ln L}{K_0} \triangleq \tilde{\mu}_0^t, \tag{10.13}$$
with probability $1 - L^{-2}$, and the coherence of subspace $\mathbf{U}$ obeys $\mu(\mathbf{U}) \le \frac{K}{K_0}\mu_0^r$,
where $\mu_0^r$ is defined in Theorem 10.1.
Proof The following lemma is used in the proof [48].
Lemma 10.3 Let $S_N$ be a $\chi^2$ random variable with $N$ degrees of freedom. Then for
each $t > 0$
$$P\left( S_N - N \ge t\sqrt{2N} + t^2 \right) \le e^{-t^2/2}. \tag{10.14}$$
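It is clear that $K_0$ is not larger than $K$. Recall that $\mathbf{M}$ has a compact singular-value
decomposition (SVD) given as
$$\mathbf{M} = \mathbf{U} \boldsymbol{\Gamma} \mathbf{V}^H, \tag{10.15}$$
where $\mathbf{U} \in \mathbb{C}^{M_{r,R} \times K_0}$ and $\mathbf{V} \in \mathbb{C}^{L \times K_0}$ contain the left and right singular vectors of $\mathbf{M}$;
$\boldsymbol{\Gamma} \in \mathbb{R}^{K_0 \times K_0}$ is diagonal, containing the singular values. Consider the QR decomposition
of $\mathbf{V}_r$ and $\mathbf{S}^T \mathbf{P}^T \mathbf{V}_t$:
$$\mathbf{V}_r = \mathbf{Q}_r \mathbf{R}_r, \qquad \mathbf{S}^T \mathbf{P}^T \mathbf{V}_t = \mathbf{Q}_t \mathbf{R}_t, \tag{10.16}$$
where $\mathbf{Q}_r \in \mathbb{C}^{M_{r,R} \times K}$ and $\mathbf{Q}_t \in \mathbb{C}^{L \times K_0}$ have orthonormal columns, $\mathbf{R}_r \in \mathbb{C}^{K \times K}$ is
upper triangular, and $\mathbf{R}_t \in \mathbb{C}^{K_0 \times K}$ has an upper staircase form. The matrix $\mathbf{R}_r \boldsymbol{\Sigma} \mathbf{R}_t^T \in \mathbb{C}^{K \times K_0}$
is full column rank with a compact SVD given by $\mathbf{U}_1 \boldsymbol{\Gamma}_1 \mathbf{V}_1^H$, where $\mathbf{U}_1 \in \mathbb{C}^{K \times K_0}$,
$\mathbf{V}_1 \in \mathbb{C}^{K_0 \times K_0}$, $\mathbf{U}_1^H \mathbf{U}_1 = \mathbf{V}_1^H \mathbf{V}_1 = \mathbf{I}_{K_0}$, and $\boldsymbol{\Gamma}_1$ is diagonal, containing the
singular values of $\mathbf{R}_r \boldsymbol{\Sigma} \mathbf{R}_t^T$. Therefore, we have
$$\mathbf{M} = \mathbf{Q}_r \mathbf{U}_1 \boldsymbol{\Gamma}_1 \mathbf{V}_1^H \mathbf{Q}_t^T = \mathbf{Q}_r \mathbf{U}_1 \boldsymbol{\Gamma}_1 (\mathbf{Q}_t^* \mathbf{V}_1)^H, \tag{10.17}$$
which is a valid SVD of $\mathbf{M}$. The uniqueness of the singular values of a matrix indicates
that $\boldsymbol{\Gamma} \equiv \boldsymbol{\Gamma}_1$. Therefore, we can choose $\mathbf{U} = \mathbf{Q}_r \mathbf{U}_1$ and $\mathbf{V} = \mathbf{Q}_t^* \mathbf{V}_1$. We have
$$\mu(\mathbf{U}) = \frac{M_{r,R}}{K_0} \sup_{m \in \mathbb{N}_{M_{r,R}}^+} \|(\mathbf{Q}_r)_{m\cdot} \mathbf{U}_1\|_2^2 \le \frac{M_{r,R}}{K_0} \sup_{m \in \mathbb{N}_{M_{r,R}}^+} \|(\mathbf{Q}_r)_{m\cdot}\|_2^2 \le \frac{K}{K_0} \mu_0^r, \tag{10.18}$$
where $\mu_0^r$ is the upper bound on $\mu(\mathbf{U})$ defined in Theorem 10.1. We also have
$$\mu(\mathbf{V}) = \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \|(\mathbf{Q}_t^*)_{l\cdot} \mathbf{V}_1\|_2^2 = \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \|(\mathbf{Q}_t)_{l\cdot}\|_2^2. \tag{10.19}$$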
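If $K_0$ is strictly smaller than $K$, we cannot represent $\mathbf{Q}_t$ in terms of $\mathbf{S}^T \mathbf{P}^T \mathbf{V}_t$ and $\mathbf{R}_t$
because of the singularity of $\mathbf{R}_t$. To mitigate this issue, we apply column permutations
$\mathbf{F}$ on $\mathbf{R}_t$ to bring forward the first nonzero elements in each row, $\mathbf{R}_t \mathbf{F} = (\mathbf{R}_1 \; \mathbf{R}_2)$, such
that $\mathbf{R}_1 \in \mathbb{C}^{K_0 \times K_0}$ is square, upper triangular, and invertible. The QR decomposition of
$\mathbf{S}^T \mathbf{P}^T \mathbf{V}_t$ can be rewritten as
$$\mathbf{S}^T \mathbf{P}^T \mathbf{V}_t \mathbf{F} = \mathbf{Q}_t (\mathbf{R}_1 \; \mathbf{R}_2). \tag{10.20}$$
We can represent $\mathbf{Q}_t$ as
$$\mathbf{Q}_t = \mathbf{S}^T \mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0} \mathbf{R}_1^{-1}, \tag{10.21}$$
where $\mathbf{F}_{K_0}$ denotes the first $K_0$ columns of $\mathbf{F}$. Substituting $\mathbf{Q}_t$ into $\mu(\mathbf{V})$, we obtain
$$\begin{aligned}
\mu(\mathbf{V}) &= \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \left\| \left(\mathbf{S}^T\right)_{l\cdot} \mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0} \mathbf{R}_1^{-1} \right\|_2^2 \\
&= \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \left(\mathbf{S}^T\right)_{l\cdot} \mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0} \mathbf{R}_1^{-1} \left(\mathbf{R}_1^{-1}\right)^H \mathbf{F}_{K_0}^H \mathbf{V}_t^H \mathbf{P}^* (\mathbf{S}^*)_{\cdot l}.
\end{aligned} \tag{10.22}$$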
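It holds that
$$\begin{aligned}
\mathbf{R}_1^{-1} \left(\mathbf{R}_1^{-1}\right)^H &= \left(\mathbf{R}_1^H \mathbf{R}_1\right)^{-1} \\
&= \left(\mathbf{R}_1^H \mathbf{Q}_t^H \mathbf{Q}_t \mathbf{R}_1\right)^{-1} \\
&= \left(\mathbf{F}_{K_0}^H \mathbf{V}_t^H \mathbf{P}^* \mathbf{S}^* \mathbf{S}^T \mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0}\right)^{-1} \\
&= \left(\mathbf{F}_{K_0}^H \mathbf{V}_t^H \mathbf{P}^* \mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0}\right)^{-1},
\end{aligned} \tag{10.23}$$
where the last equality holds because $\mathbf{S}\mathbf{S}^H = \mathbf{I}_{M_{t,R}}$. Consider the QR decomposition of
$\mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0}$ given by
$$\mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0} = \mathbf{Q}_a \mathbf{R}_a, \tag{10.24}$$
where $\mathbf{Q}_a \in \mathbb{C}^{M_{t,R} \times K_0}$ contains orthonormal columns, and $\mathbf{R}_a \in \mathbb{C}^{K_0 \times K_0}$ is upper
triangular and full rank. Substituting (10.23) and (10.24) into (10.22), we have
$$\begin{aligned}
\mu(\mathbf{V}) &= \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \mathbf{s}_l^T \mathbf{R}_a \left(\mathbf{R}_a^H \mathbf{R}_a\right)^{-1} \mathbf{R}_a^H \mathbf{s}_l^* \\
&= \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \mathbf{s}_l^T \mathbf{s}_l^* = \frac{L}{K_0} \sup_{l \in \mathbb{N}_L^+} \|\mathbf{s}_l\|_2^2,
\end{aligned} \tag{10.25}$$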
where $\mathbf{s}_l \triangleq \mathbf{Q}_a^T \mathbf{S}_{\cdot l}$, and the second equality holds because $\mathbf{R}_a$ is invertible. Based on
[49, theorem 3], if $M_{t,R} = O(L/\ln L)$, the entries of $\mathbf{S}$ can be approximated by i.i.d.
Gaussian random variables with distribution $\mathcal{CN}(0,1/L)$. Since $\mathbf{Q}_a$ has orthonormal
columns, $\mathbf{s}_l \in \mathbb{C}^{K_0}, \forall l \in \mathbb{N}_L^+$, also contains i.i.d. Gaussian random variables with
distribution $\mathcal{CN}(0,1/L)$, and $L\|\mathbf{s}_l\|_2^2$ is distributed according to $\chi^2_{K_0}$. Based on Lemma 10.3
and setting $t = \sqrt{6 \ln L}$, it holds that
$$P\left( L\|\mathbf{s}_l\|_2^2 \ge K_0 + 2\sqrt{3 K_0 \ln L} + 6 \ln L \right) \le L^{-3}. \tag{10.26}$$
Applying the union bound, we have that
$$P\left[ \sup_{l \in \mathbb{N}_L^+} \|\mathbf{s}_l\|_2^2 \ge \frac{K_0 + 2\sqrt{3 K_0 \ln L} + 6 \ln L}{L} \right] \le L^{-2}. \tag{10.27}$$
Combining (10.25) and (10.27) gives
$$P\left[ \mu(\mathbf{V}) \ge \frac{K_0 + 2\sqrt{3 K_0 \ln L} + 6 \ln L}{K_0} \right] \le L^{-2}. \tag{10.28}$$
From the derivation, the bound on $\mu(\mathbf{V})$ holds for any target angles, array geometry, and
precoding matrix $\mathbf{P}$ as long as $\mathbf{P}^T \mathbf{V}_t \mathbf{F}_{K_0}$ has full column rank $K_0$. This completes the
proof of Theorem 10.2.
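A quick numerical sanity check of the bound is possible along the lines of the proof. The sketch below draws one random unitary waveform and one arbitrary full-rank precoder, computes $\mu(\mathbf{V})$ through the orthonormal factor of $\mathbf{S}^T\mathbf{P}^T\mathbf{V}_t$ as in (10.19), and compares it with the right-hand side of (10.13). All sizes, angles, and the random precoder are assumed values, and a single draw illustrates the bound rather than verifying its probability statement.

```python
import numpy as np

rng = np.random.default_rng(2)
Mt, L, K = 8, 64, 2                                   # assumed sizes

# Random unitary waveform: first Mt rows of an L x L unitary matrix.
G = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
S = np.linalg.qr(G)[0][:Mt, :]

# Half-wavelength transmit ULA steering matrix for two assumed target angles.
thetas = np.deg2rad([-10.0, 25.0])
Vt = np.exp(-2j * np.pi * 0.5 * np.outer(np.arange(Mt), np.sin(thetas)))

# Arbitrary (almost surely full-rank) precoder, so that K0 = K here.
P = rng.standard_normal((Mt, Mt)) + 1j * rng.standard_normal((Mt, Mt))
K0 = K

# Orthonormal basis Q_t of range(S^T P^T V_t); its row norms give mu(V).
Qt, _ = np.linalg.qr(S.T @ P.T @ Vt)                  # L x K0
mu_V = (L / K0) * np.max(np.sum(np.abs(Qt) ** 2, axis=1))

# Right-hand side of the probabilistic bound.
bound = (K0 + 2 * np.sqrt(3 * K0 * np.log(L)) + 6 * np.log(L)) / K0
```

Repeating the experiment with a different `P` changes $\mathbf{Q}_t$ but not the bound, which depends only on $K_0$ and $L$.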
Based on Theorem 10.2, we have the following theorem for the coherence parameters
of $\mathbf{M}$.
Theorem 10.4 (Coherence of $\mathbf{M}$ with random unitary waveform matrix) Consider
the MIMO-MC radar presented in Section 10.2 with $\mathbf{S}$ being random unitary. For $d_r = \lambda_c/2$,
arbitrary transmit array geometry, and for
$$K \le \sqrt{\frac{M_{r,R}}{F_{M_{r,R}}(\xi_r)}}, \tag{10.29}$$
the matrix $\mathbf{M}$ has coherence parameters
$$\mu_0 \triangleq \max\left\{ \frac{K}{K_0}\mu_0^r, \; \tilde{\mu}_0^t \right\}, \tag{10.30}$$
$$\mu_1 \triangleq \sqrt{K_0}\,\mu_0 \tag{10.31}$$
with probability $1 - L^{-2}$, where $\tilde{\mu}_0^t$ is defined in Theorem 10.2 and $\mu_0^r$ is the upper bound
on $\mu(\mathbf{U})$ given in Theorem 10.1.
The above result holds for any precoding matrix $\mathbf{P}$ such that the rank of $\mathbf{M}$ is $K_0$.
Proof The theorem can be proved by substituting the bounds on $\mu(\mathbf{U})$ and $\mu(\mathbf{V})$ in
Theorem 10.1 with the bounds derived in Theorem 10.2.
Some comments are in order. First, if $K_0$ is on the order of $\ln L$, the upper bound $\tilde{\mu}_0^t > 1$ is a
small constant $O(1)$; this can be seen from the definition of $\tilde{\mu}_0^t$ in (10.13) by choosing
$K_0 > \ln L$. Therefore, based on (10.13), the coherence parameter of $\mathbf{M}$ is close to 1,
which means that $\mathbf{M}$ has a good incoherence property. A similar bound was provided on
the coherence of the subspaces spanned by a random orthogonal basis in [46]. Second,
unlike the results in [43, theorem 2], the probabilistic bound on $\mu(\mathbf{V})$ is independent
of the target angles and array geometry. Third, the results in Theorem 10.4 hold for
any random unitary matrix $\mathbf{S}$. The radar waveform can therefore be changed periodically,
which is desirable for security reasons, without affecting the matrix completion
performance. Fourth, the probabilistic bound on $\mu(\mathbf{V})$ in Theorem 10.2 is independent of $\mathbf{P}$.
This means that we can design P , for example for the purpose of transmit beamforming
and interference suppression, without affecting the incoherence property of M. This key
observation validates the feasibility of precoding for spectrum sharing between MIMO-
MC radar and communication systems, a topic that will be discussed next. Note that this
chapter focuses on the design of radar precoder in the spatial domain, not the waveform.
The radar precoder will not affect the waveform ambiguity properties in the time and
Doppler domains.
10.3 Coexistence System Model
We consider a coexistence scenario as shown in Figure 10.2, where a MIMO-MC radar

system and a MIMO communication system operate using the same carrier frequency.
Note that the coexistence model also applies to MIMO radar, because when full sam-
pling is adopted the MIMO-MC radar becomes equivalent to a traditional MIMO radar.
In the coexistence system, $\mathbf{H} \in \mathbb{C}^{M_{r,C} \times M_{t,C}}$ denotes the communication channel, where
$M_{r,C}$ and $M_{t,C}$ denote respectively the number of RX and TX antennas of the communication
system. $\mathbf{G}_1 \in \mathbb{C}^{M_{r,C} \times M_{t,R}}$ and $\mathbf{G}_2 \in \mathbb{C}^{M_{r,R} \times M_{t,C}}$ denote the interference channels
between the communication and radar systems. The radar operates in pulsed mode; in
each pulse, it first transmits a short pulse waveform, and then listens for target echoes
for a much longer period. The duration of these two phases comprises the PRI. Figure
10.3 shows the radar-communication coexistence signal model during two periods of
one radar PRI. At the communication receiver, radar interference is present only during
the radar transmit period. On the other hand, the communication interference at the radar
receiver is present during the entire radar PRI. In this chapter, we focus on joint radar
and communication waveform design during Period 1. Readers can refer to [33] for
communication waveform design schemes that can reduce the interference to the radar
receiver during the entire radar PRI.
Figure 10.2 A MIMO communication system sharing spectrum with a collocated MIMO radar system.
Figure 10.3 Radar-communication coexistence signal model during one radar PRI.
Figure 10.4 The spectrum-sharing architecture. The cooperation is coordinated by the control
center, a node with high computing power that also serves as the radar fusion center. The control
center collects information from radar and communication systems, computes jointly optimal
signaling schemes for both systems and sends each scheme back to the corresponding system.
Cooperative spectrum sharing can be implemented via the system architecture of

Figure 10.4. The coordination of the cooperation is conducted by a control center,
which collects information from the two systems, formulates and solves an optimization
problem, and passes to each system its optimal parameters. The control center can be
thought of as an enhanced spectrum access system (SAS) used in the FCC release [50],
and is connected to the radar/communication system via either a wireless link, or a
backhaul channel.
The control center can also integrate the functionality of the radar fusion center, i.e.,
target detection, estimation and tracking, and specifically for the MIMO-MC radar,
also matrix completion. There are several advantages in having a control center that
encompasses the radar fusion center. First, a powerful all-in-one center greatly reduces
the complexity of the overall network. Second, radar operators, especially in military
applications, are not willing to share information directly with civilian cellular systems
out of security concerns. In such cases the control center can be operated by the radar,
and enable cooperation while maintaining the isolation of the radar and communica-
tion systems. Third, the radar and communication systems only need communication
interfaces with the control system.
The coexistence model considered here relies on the following assumptions.
Transmitted Signals
It is assumed that the two systems transmit narrowband waveforms with the same sym-
bol period. To evaluate the feasibility of radar and communication systems having the
same symbol period, let us consider an S-band search and acquisition radar with range
resolution equal to 300 m (a typical range resolution is between 100 m and 600 m
[51,52]). The corresponding radar sub-pulse duration is 2 μs. Communication symbol
duration of 2 μs is quite typical in modern cellular systems [53]. The transmitted signal is
narrowband if the channel coherence bandwidth is larger than the signal bandwidth [53–
55]. In a macro-cell, typical values for the channel coherence bandwidth are of the order
of 1 MHz [56,57], which is larger than the signal bandwidth of 0.5 MHz (or symbol
interval 2 μs). Thus, the narrowband assumption is typically valid.
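The numbers in this argument follow from two standard relations: a sub-pulse of duration $\tau$ gives range resolution $c\tau/2$, and the signal bandwidth is roughly $1/\tau$. The short sketch below reproduces the arithmetic; the function name is hypothetical, and the 1 MHz coherence bandwidth is the order-of-magnitude figure quoted above.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s

def subpulse_duration(range_resolution_m):
    """Sub-pulse duration tau for a given range resolution, from resolution = c * tau / 2."""
    return 2.0 * range_resolution_m / SPEED_OF_LIGHT

tau = subpulse_duration(300.0)          # about 2 microseconds
bandwidth_hz = 1.0 / tau                # about 0.5 MHz
coherence_bw_hz = 1e6                   # typical macro-cell value quoted in the text
is_narrowband = coherence_bw_hz > bandwidth_hz
```

For the quoted 100 m to 600 m range of resolutions, the same relation gives sub-pulse durations of roughly 0.67 μs to 4 μs, so the narrowband condition is comfortable only toward the coarser end of that range.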
If higher signal bandwidth is needed, OFDM signaling can be used for both radar
[15,17] and the communication system [56,57]. Our coexistence model can still be valid
on each OFDM carrier, over which the signal can be considered as narrowband.
Fading
We assume that H , G1 , and G2 are flat fading, which is valid when the transmitted
signals are narrowband. The flat-fading assumption is common practice in the radar-
communication system coexistence literature [22–26]. In addition, all channels are
assumed to be block fading over the radar PRI. For a radar with medium pulse repetition
frequency, the PRI is usually between 30 μs and 0.3 ms. The typical channel coherence
time for 2.5 GHz and 5.8 GHz carrier frequency ranges from 2 ms up to 200 ms [58].
The channel coherence time is much larger than the radar PRI. As for the moving
targets, the resulting Doppler shifts are usually assumed to be constant during one PRI
[44,59]. Therefore, channel block fading is a reasonable assumption.
Channel State Information (CSI)

The channel H is assumed to be perfectly known at the communication transmitter. The
channels G1 and G2 are also assumed to be perfectly known at the radar. CSI estimation
can be achieved using pilot channels [22,60] scheduled by the control center in time-
division multiplexing (TDM) fashion. As a simple example, based on Figure 10.4,
the communication transmitter, i.e., the base station (BS), transmits a reference signal in
pilot burst A, and this is used by the radar to estimate G2 . The communication receiver,
i.e., a user entity, transmits a reference signal in pilot bursts B, and this is used by the
BS and the radar to estimate H and G1 , respectively, based on channel reciprocity [61].
All estimated CSI is sent to the control center by the radar and the BS, where it is used
to jointly optimize the spatial multiplexing. Note that CSI estimation and feedback can
be scheduled based on the channel coherence time, which is much larger than the radar
PRI. Figure 10.5 shows a simplified schematic diagram for CSI estimation/feedback
and receiving design results from the control center based on TDM. Existing techniques
in cognitive radios and multiuser MIMO (MU-MIMO) [62–68] can also be applied to
reduce the overhead for CSI feedback.
The Radar Mode of Operation

We consider the target tracking scenario, in which the radar searches for targets with
unknown radar cross section (RCS) variances, in particular directions of interest, given
by set {θk }, and at a particular range bin of interest [69,70]. In such scenarios, the target
parameters have typically been obtained from previous tracking cycles, and are used to
optimize the transmission for better SINR performance [44].
Figure 10.5 TDM-based CSI estimation and feedback and reception of design results from the
control center.
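Under the aforementioned assumptions on transmitted signals, fading, CSI, and the
radar mode of operation, let us consider a target scene at a particular range bin as in
Section 10.2 but with clutter. The baseband signal received by the radar receivers during
$L$ symbol durations in one radar PRI can be expressed as
$$\boldsymbol{\Omega} \circ \mathbf{Y}_R = \boldsymbol{\Omega} \circ \Big( \underbrace{\mathbf{D}\mathbf{P}\mathbf{S}}_{\text{signal}} + \underbrace{\mathbf{C}\mathbf{P}\mathbf{S} + \mathbf{G}_2 \mathbf{X} \boldsymbol{\Lambda}_2}_{\text{interference}} + \underbrace{\mathbf{W}_R}_{\text{noise}} \Big). \tag{10.32}$$
The signal received by the communication receivers can be expressed as
$$\mathbf{Y}_C = \underbrace{\mathbf{H}\mathbf{X}}_{\text{signal}} + \underbrace{\mathbf{G}_1 \mathbf{P} \mathbf{S} \boldsymbol{\Lambda}_1}_{\text{interference}} + \underbrace{\mathbf{W}_C}_{\text{noise}}. \tag{10.33}$$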
$\mathbf{Y}_R$, $\mathbf{D}$, $\mathbf{P}$, $\mathbf{S}$, $\mathbf{W}_R$, and $\boldsymbol{\Omega}$ appearing in the above equations are as defined in Section
10.2. Note that the delay in the radar signal model is assumed to be known and
thus appropriately compensated for. The waveform-dependent interference term $\mathbf{C}\mathbf{P}\mathbf{S}$
contains interference from point scatterers (clutter or interfering objects). If there are
$K_c$ point scatterers at angles $\{\theta_k^c\}$, with reflection coefficients $\{\beta_k^c\}$, within the same range
bin as the targets, then $\mathbf{C} \triangleq \sum_{k=1}^{K_c} \beta_k^c \mathbf{v}_r(\theta_k^c) \mathbf{v}_t^T(\theta_k^c)$ denotes the clutter response matrix.
$\mathbf{Y}_C$ and $\mathbf{W}_C$ denote the received signal and additive noise at the communication RX
antennas, respectively. $\mathbf{X} \triangleq [\mathbf{x}(1),\ldots,\mathbf{x}(L)]$ is a matrix whose columns $\mathbf{x}(i)$ are
codewords from the codebook of the communication system. $\mathbf{W}_{R/C}$ contains i.i.d. random
entries distributed as $\mathcal{CN}(0,\sigma_{R/C}^2)$. $\boldsymbol{\Lambda}_i$, $i \in \{1,2\}$, are diagonal matrices containing the
random phase offset $e^{j\alpha_{il}}$ between the MIMO-MC radar and the communication system
at the $l$-th symbol. These phase offsets are time-varying and arise due to the random
phase jitters of the oscillators between the radar transmitter and the communication
receiver [31,71]. Note that the Doppler shift will not be an issue in the design considered
in this chapter. This is because the radar signal model in (10.32) is for the fast-time
samples received in one radar pulse, and during a pulse, the Doppler shift is usually
assumed to be constant [41,44,59,72]. In that case, the Doppler shift can be absorbed
into the target RCS, and does not affect our design.
The control center aims to protect the radar system and maximize the spectrum effi-
ciency. In the following, we present a joint design of the communication and radar
transmissions, which will be implemented at the control center, so that the interference
at the radar RX antennas is minimized, thus allowing for successful matrix completion,
while certain communication system requirements are met [36].
10.4 Cooperative Spectrum Sharing
In this section, we formulate the MIMO-MC radar and MIMO communication spectrum
sharing problem and present an algorithm to solve it [36].
For the communication system, the covariance of interference plus noise is given by
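$$
\mathbf{R}_{Cin} \triangleq \mathbf{G}_1\boldsymbol{\Phi}\mathbf{G}_1^H + \sigma_C^2\mathbf{I}, \tag{10.34}
$$
where $\boldsymbol{\Phi} \triangleq \mathbf{P}\mathbf{P}^H/L$ is positive semidefinite. For $l \in \mathbb{N}_L^+$, the instantaneous information
rate is unknown because the interference plus noise is not necessarily Gaussian due to
the random phase offset $e^{j\alpha_{il}}$. However, a lower bound of the rate equals [73]
$$
C(\mathbf{R}_{xl},\boldsymbol{\Phi}) \triangleq \log_2\left|\mathbf{I} + \mathbf{R}_{Cin}^{-1}\mathbf{H}\mathbf{R}_{xl}\mathbf{H}^H\right|. \tag{10.35}
$$
The above bound is achieved when the codeword $\mathbf{x}(l)$, $l \in \mathbb{N}_L^+$, is distributed as
$\mathcal{CN}(\mathbf{0},\mathbf{R}_{xl})$. The average communication rate over $L$ symbols is as follows
$$
C_{\mathrm{avg}}(\{\mathbf{R}_{xl}\},\boldsymbol{\Phi}) \triangleq \frac{1}{L}\sum_{l=1}^{L} C(\mathbf{R}_{xl},\boldsymbol{\Phi}), \tag{10.36}
$$
where $\{\mathbf{R}_{xl}\}$ denotes the set of all $\mathbf{R}_{xl}$'s.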
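As an illustration of (10.34) to (10.36), the following NumPy sketch evaluates the rate lower bound and its average over the $L$ symbols. The helper names (`rate_lower_bound`, `average_rate`, `crandn`), the channel scalings, and the choice of uniform $\boldsymbol{\Phi}$ and identity $\mathbf{R}_{xl}$ are assumptions made for the example and are not taken from the chapter.

```python
import numpy as np

def rate_lower_bound(H, G1, Phi, Rx, sigma2_c):
    """Rate lower bound (10.35), in bits per symbol."""
    n_rx = H.shape[0]
    # Interference-plus-noise covariance at the communication receiver, (10.34).
    R_cin = G1 @ Phi @ G1.conj().T + sigma2_c * np.eye(n_rx)
    inner = np.eye(n_rx) + np.linalg.solve(R_cin, H @ Rx @ H.conj().T)
    # slogdet avoids overflow; the determinant here is real and >= 1.
    _, logabsdet = np.linalg.slogdet(inner)
    return logabsdet / np.log(2.0)

def average_rate(H, G1, Phi, Rx_list, sigma2_c):
    """Average rate (10.36) over the L per-symbol covariances."""
    return float(np.mean([rate_lower_bound(H, G1, Phi, Rx, sigma2_c)
                          for Rx in Rx_list]))

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Illustrative sizes and powers (assumptions, not the chapter's simulation setup).
rng = np.random.default_rng(0)
M_rc, M_tc, M_tr, L = 4, 4, 16, 16
H = crandn(rng, M_rc, M_tc)
G1 = np.sqrt(0.1) * crandn(rng, M_rc, M_tr)
Phi = np.eye(M_tr) / M_tr                     # uniform radar precoding, tr(Phi) = 1
Rx_list = [np.eye(M_tc) for _ in range(L)]    # same covariance on every symbol

c_avg = average_rate(H, G1, Phi, Rx_list, sigma2_c=1.0)
c_no_radar = average_rate(H, G1, np.zeros((M_tr, M_tr)), Rx_list, sigma2_c=1.0)
print(f"with radar interference: {c_avg:.2f}, without: {c_no_radar:.2f} bits/symbol")
```

Removing the radar term from $\mathbf{R}_{Cin}$ can only increase the bound, which is why the second value is never smaller than the first.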

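The MIMO-MC radar only partially samples $\mathbf{Y}_R$, so only the sampled target
signal and the sampled interference determine the matrix completion performance.
It therefore makes sense to define the effective signal power (ESP) and effective
interference power (EIP) at the radar RX node with respect to the sampled entries only, i.e.,
$$
\begin{aligned}
\mathrm{ESP} &\triangleq \mathbb{E}\Big\{\mathrm{tr}\Big(\boldsymbol{\Omega}\circ(\mathbf{D}\mathbf{P}\mathbf{S})\,\big(\boldsymbol{\Omega}\circ(\mathbf{D}\mathbf{P}\mathbf{S})\big)^H\Big)\Big\} \\
&= \mathbb{E}\Big\{\mathrm{tr}\Big(\sum_{k}\beta_k\,\boldsymbol{\Omega}\circ(\mathbf{D}_k\mathbf{P}\mathbf{S})\Big(\sum_{k}\beta_k\,\boldsymbol{\Omega}\circ(\mathbf{D}_k\mathbf{P}\mathbf{S})\Big)^H\Big)\Big\} \\
&= \mathbb{E}\Big\{\mathrm{tr}\Big(\sum_{k}\sum_{j}\beta_k\beta_j^{*}\,\boldsymbol{\Omega}\circ(\mathbf{D}_k\mathbf{P}\mathbf{S})\,\big(\boldsymbol{\Omega}\circ(\mathbf{D}_j\mathbf{P}\mathbf{S})\big)^H\Big)\Big\} \\
&= \mathrm{tr}\Big(\sum_{k}\sum_{j}\mathbb{E}\{\beta_k\beta_j^{*}\}\sum_{l}\boldsymbol{\Delta}_l\mathbf{D}_k\mathbf{P}\,\mathbb{E}\{\mathbf{s}_l\mathbf{s}_l^H\}\,\mathbf{P}^H\mathbf{D}_j^H\boldsymbol{\Delta}_l\Big) \\
&\overset{(a)}{=} \mathrm{tr}\Big(\sum_{k}\sigma_{\beta_k}^2\sum_{l}\boldsymbol{\Delta}_l\mathbf{D}_k\boldsymbol{\Phi}\mathbf{D}_k^H\boldsymbol{\Delta}_l\Big) \\
&\overset{(b)}{=} \mathrm{tr}\Big(\sum_{k}\sigma_{\beta_k}^2\boldsymbol{\Delta}\mathbf{D}_k\boldsymbol{\Phi}\mathbf{D}_k^H\Big) = \mathrm{tr}\Big(\sum_{k}\sigma_{\beta_k}^2\boldsymbol{\Phi}\mathbf{D}_k^H\boldsymbol{\Delta}\mathbf{D}_k\Big) \\
&= \mathrm{tr}\Big(\boldsymbol{\Phi}\sum_{k}\sigma_{\beta_k}^2\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_r^H(\theta_k)\boldsymbol{\Delta}\mathbf{v}_r(\theta_k)\mathbf{v}_t^T(\theta_k)\Big) \\
&\overset{(c)}{=} pLM_{r,R}\,\mathrm{tr}\Big(\boldsymbol{\Phi}\sum_{k}\sigma_{\beta_k}^2\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)\Big) \\
&= pLM_{r,R}\,\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t),
\end{aligned}
\tag{10.37}
$$
$$
\begin{aligned}
\mathrm{EIP} &\triangleq \mathbb{E}\Big\{\mathrm{tr}\Big(\boldsymbol{\Omega}\circ(\mathbf{C}\mathbf{P}\mathbf{S})\,\big(\boldsymbol{\Omega}\circ(\mathbf{C}\mathbf{P}\mathbf{S})\big)^H\Big)\Big\}
+ \mathbb{E}\Big\{\mathrm{tr}\Big(\boldsymbol{\Omega}\circ(\mathbf{G}_2\mathbf{X}\boldsymbol{\Lambda}_2)\,\big(\boldsymbol{\Omega}\circ(\mathbf{G}_2\mathbf{X}\boldsymbol{\Lambda}_2)\big)^H\Big)\Big\} \\
&= pLM_{r,R}\,\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{G}_{2l}\mathbf{R}_{xl}\mathbf{G}_{2l}^H\big),
\end{aligned}
\tag{10.38}
$$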
Figure 10.6 The sub-sampling at the radar receiver modulates the interference channel from the
communication transmitter to the radar receiver G2 . As shown in the left figure, the null space of
G2 is typically empty; thus the communication system transmission would always introduce
interference to the radar. Due to the random sub-sampling, the null space of the modulated
interference channel G2l shown in the right figure becomes nonempty, and thus, it is possible for
the communication system to introduce zero EIP to the radar receiver if it transmits in the null
space of G2l .
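where $\mathbf{D}_k \triangleq \mathbf{v}_r(\theta_k)\mathbf{v}_t^T(\theta_k)$ for $k \in \mathbb{N}_K^+$, $\mathbf{s}_l \triangleq \mathbf{s}(l)$, and $\boldsymbol{\Delta}_l = \mathrm{diag}(\boldsymbol{\Omega}_{\cdot l})$. In (10.37),
(a) follows from the fact that $\mathbb{E}\{\beta_k\beta_j^{*}\} = \delta_{jk}\sigma_{\beta_k}^2$, where $\delta_{jk}$ denotes the Kronecker
delta; (b) follows from the fact that $\boldsymbol{\Delta}_l = \boldsymbol{\Delta}_l\boldsymbol{\Delta}_l$ and $\boldsymbol{\Delta} = \sum_{l=1}^{L}\boldsymbol{\Delta}_l$; (c) follows from
the fact that $\mathbf{v}_r^H(\theta_k)\boldsymbol{\Delta}\mathbf{v}_r(\theta_k) = \mathrm{tr}(\boldsymbol{\Delta}) = pLM_{r,R}$. Additionally, we have the following
definitions: $\mathbf{D}_t = \sum_{k=1}^{K}\sigma_{\beta_k}^2\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)$, $\mathbf{C}_t = \sum_{k=1}^{K_c}\sigma_{\beta_k^c}^2\mathbf{v}_t^{*}(\theta_k^c)\mathbf{v}_t^T(\theta_k^c)$; $\sigma_{\beta_k}$ and $\sigma_{\beta_k^c}$
denote the standard deviation of $\beta_k$ and $\beta_k^c$, respectively; $\mathbf{G}_{2l} \triangleq \boldsymbol{\Delta}_l\mathbf{G}_2$. The derivation
for EIP is similar to that for ESP and is omitted for brevity. The derivation in (10.37)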
and (10.38) assumes that the target and clutter reflection coefficients are independent
complex Gaussian with zero mean, which are typical assumptions in the literature
[70,74,75].
The sub-sampling at the radar receiver modulates the interference channel $\mathbf{G}_2$ (see
Figure 10.6). At sampling instance $l$, only the interference at radar RX antennas
corresponding to 1's in $\boldsymbol{\Omega}_{\cdot l}$ is sampled. Thus, the effective interference channel during the $l$-th
symbol duration is $\mathbf{G}_{2l}$. To match the interference channel variation, the communication
system should use adaptive transmission with symbol-dependent covariance matrix $\mathbf{R}_{xl}$
[31]. This would be the optimal approach; however, it would involve high computational
cost. A suboptimal alternative would be constant-rate communication transmission, i.e.,
$\mathbf{R}_{xl} \equiv \mathbf{R}_x$, $\forall l \in \mathbb{N}_L^+$, outlined in Section 10.4.3.
Incorporating the expressions for effective target signal, interference and additive
noise, the effective radar SINR becomes
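$$
\mathrm{ESINR} = \frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{G}_{2l}\mathbf{R}_{xl}\mathbf{G}_{2l}^H\big)/(pLM_{r,R}) + \sigma_R^2}.
$$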
One can see that the joint design of the communication TX covariance matrices $\{\mathbf{R}_{xl}\}$,
the radar precoder $\mathbf{P}$ (embedded in $\boldsymbol{\Phi}$), and the radar sub-sampling scheme $\boldsymbol{\Omega}$ is necessary
to maximize the ESINR. In Theorem 10.4, we prove that the radar precoder $\mathbf{P}$ can
be designed without affecting the incoherence property of $\mathbf{M}$.
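The ESINR expression can be evaluated directly from its ingredients. The sketch below is a minimal NumPy illustration; the steering-vector form in `steering` is an assumed half-wavelength ULA model (the chapter's own definition is in (10.2)), and the sizes, variances, and channel scaling are placeholders rather than the chapter's simulation settings.

```python
import numpy as np

def steering(theta_deg, n):
    """Assumed half-wavelength ULA steering vector with unit-modulus entries."""
    return np.exp(-1j * np.pi * np.arange(n) * np.sin(np.deg2rad(theta_deg)))

def angle_matrix(angles_deg, variances, n):
    """sum_k var_k * conj(v_t) v_t^T, the common form of D_t and C_t."""
    out = np.zeros((n, n), dtype=complex)
    for th, var in zip(angles_deg, variances):
        v = steering(th, n)
        out += var * np.outer(v.conj(), v)
    return out

def esinr(Phi, D_t, C_t, G2, Omega, Rx_list, sigma2_r):
    """Effective radar SINR for a binary sampling matrix Omega (M_r x L)."""
    M_r, L = Omega.shape
    n_samples = Omega.sum()                      # plays the role of p*L*M_r
    eip_comm = 0.0
    for l in range(L):
        G2l = Omega[:, [l]] * G2                 # Delta_l @ G2: keep sampled antennas only
        eip_comm += np.trace(G2l @ Rx_list[l] @ G2l.conj().T).real
    num = np.trace(Phi @ D_t).real
    den = np.trace(Phi @ C_t).real + eip_comm / n_samples + sigma2_r
    return num / den

rng = np.random.default_rng(1)
M_t, M_r, M_tc, L, p = 16, 16, 4, 16, 0.5
D_t = angle_matrix([-10, 15, 30], [0.5, 0.5, 0.5], M_t)
C_t = angle_matrix([-45, -30, 10, 45], [1.0] * 4, M_t)
Phi = np.eye(M_t) / M_t
G2 = np.sqrt(0.1 / 2) * (rng.standard_normal((M_r, M_tc))
                         + 1j * rng.standard_normal((M_r, M_tc)))
Rx = np.eye(M_tc)
flat = np.zeros(M_r * L)
flat[rng.choice(M_r * L, int(p * L * M_r), replace=False)] = 1.0
Omega = flat.reshape(M_r, L)

esinr_val = esinr(Phi, D_t, C_t, G2, Omega, [Rx] * L, sigma2_r=1.0)
print(f"ESINR = {esinr_val:.3f}")
```

With an all-ones $\boldsymbol{\Omega}$ and a constant $\mathbf{R}_x$, the same function reduces to the SINR of a MIMO radar without sub-sampling, which gives a quick sanity check.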
At the control center, the spectrum sharing problem can be formulated as follows:
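$$
\begin{aligned}
(\mathrm{P}_1)\quad \max_{\{\mathbf{R}_{xl}\}\succeq 0,\ \boldsymbol{\Phi}\succeq 0,\ \boldsymbol{\Omega}}\quad & \mathrm{ESINR}\big(\{\mathbf{R}_{xl}\},\boldsymbol{\Omega},\boldsymbol{\Phi}\big), \\
\text{s.t.}\quad & C_{\mathrm{avg}}(\{\mathbf{R}_{xl}\},\boldsymbol{\Phi}) \ge C, && \text{(10.39a)} \\
& \sum_{l=1}^{L}\mathrm{tr}(\mathbf{R}_{xl}) \le P_C,\quad L\,\mathrm{tr}(\boldsymbol{\Phi}) \le P_R, && \text{(10.39b)} \\
& \mathrm{tr}(\boldsymbol{\Phi}\mathbf{V}_k) \ge \xi\,\mathrm{tr}(\boldsymbol{\Phi}),\ \forall k \in \mathbb{N}_K^+, && \text{(10.39c)} \\
& \boldsymbol{\Omega}\ \text{is proper}, && \text{(10.39d)}
\end{aligned}
$$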
where $\mathbf{V}_k \triangleq \mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)$. The constraint of (10.39a) restricts the communication
rate to be at least $C$, in order to support reliable communication and avoid service
outage. The constraints of (10.39b) restrict the total communication and radar transmit
power to be no larger than $P_C$ and $P_R$, respectively. The constraints of (10.39c) restrict
the power of the radar probing signal in directions of interest to be no smaller than
$\xi$ times the power achieved by the uniform precoding matrix $\frac{\mathrm{tr}(\boldsymbol{\Phi})}{M_{t,R}}\mathbf{I}$, i.e., $\mathbf{v}_t^T(\theta_k)\boldsymbol{\Phi}\mathbf{v}_t^{*}(\theta_k) \ge
\xi\,\mathbf{v}_t^T(\theta_k)\frac{\mathrm{tr}(\boldsymbol{\Phi})}{M_{t,R}}\mathbf{I}\,\mathbf{v}_t^{*}(\theta_k) = \xi\,\mathrm{tr}(\boldsymbol{\Phi})$. It holds that $\xi \ge 1$, which is used to control the
beampattern at the target angles of interest. The purpose of this constraint is to ensure
fairness across the multiple targets. The constraint in (10.39d) imposes restrictions on
the radar sub-sampling matrix $\boldsymbol{\Omega}$ such that it corresponds to a fixed sub-sampling rate $p$
and has a large spectral gap. In the matrix completion literature, $\boldsymbol{\Omega}$ is either a uniformly
random sub-sampling matrix [38], or the adjacency matrix of a regular bipartite graph
with large spectral gap [76]. The spectral gap of a matrix is defined as the difference
between the largest and the second largest singular values [76].
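The spectral gap is straightforward to compute numerically. The following sketch, with hypothetical helper names `spectral_gap` and `random_sampling_matrix`, draws a uniformly random binary sampling matrix with a fixed number of ones and evaluates its gap; it also illustrates that permuting rows and columns leaves the singular values, and hence the gap, unchanged.

```python
import numpy as np

def spectral_gap(Omega):
    """Difference between the largest and second largest singular values."""
    s = np.linalg.svd(np.asarray(Omega, dtype=float), compute_uv=False)  # descending order
    return s[0] - s[1]

def random_sampling_matrix(M_r, L, p, rng):
    """Binary M_r x L matrix with floor(p*L*M_r) ones at uniformly random positions."""
    n_ones = int(np.floor(p * L * M_r))
    flat = np.zeros(M_r * L)
    flat[rng.choice(M_r * L, size=n_ones, replace=False)] = 1.0
    return flat.reshape(M_r, L)

rng = np.random.default_rng(1)
Omega0 = random_sampling_matrix(16, 16, 0.5, rng)

# Row and column permutations are orthogonal transformations, so the gap is preserved.
Omega_perm = Omega0[rng.permutation(16)][:, rng.permutation(16)]
gap0, gap_perm = spectral_gap(Omega0), spectral_gap(Omega_perm)
print(f"gap = {gap0:.3f}, after permutation = {gap_perm:.3f}")
```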
In order for the control center to formulate and solve the problem of (10.39) it needs:
(i) the communication and radar system CSI; estimation and feedback of CSI are
discussed in Section 10.3.
(ii) target angles, and clutter parameters $\{\sigma_{\beta_k^c}^2\}$ and $\{\theta_k^c\}$. Since the control center
integrates the radar fusion center functionality, the target angles obtained from
the previous tracking cycle will be available. In practice, the clutter parameters
could be estimated when the targets are absent [74]. If the parameters $\{\sigma_{\beta_k^c}^2\}$ are
not known, we can use a single value, $\sigma_0^2$, for all the targets. With such a choice,
the objective treats all target directions equally. One possible choice for $\sigma_0^2$ is the
smallest target RCS variance that could be detected by the radar. Note that the
solution of $(\mathrm{P}_1)$ is independent of the specific value of $\sigma_0^2$.
(iii) all parameters in the constraints. Parameters like power budget and required com-
munication rate could be provided by the radar and communication systems.
Problem $(\mathrm{P}_1)$ is nonconvex with respect to the optimization variable triplet $(\{\mathbf{R}_{xl}\},\boldsymbol{\Omega},\boldsymbol{\Phi})$.
In Subsection 10.4.1 we present an algorithm to find a local solution via alternating
optimization, while in Subsection 10.4.2, we discuss the feasibility and solution properties
of $(\mathrm{P}_1)$ [36].
10.4.1 Solution to the Spectrum Sharing Problem Using Alternating Optimization

In this section we discuss the alternating iterations with respect to $\{\mathbf{R}_{xl}\}$, $\boldsymbol{\Omega}$, and $\boldsymbol{\Phi}$.
The Alternating Iteration with Respect to {R xl }

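We first solve for $\{\mathbf{R}_{xl}\}$ while setting $\boldsymbol{\Omega}$ and $\boldsymbol{\Phi}$ to be equal to the solution from the
previous iteration, i.e., we formulate the following problem:
$$
\begin{aligned}
(\mathrm{P}_{\mathbf{R}})\quad \min_{\{\mathbf{R}_{xl}\}\succeq 0}\quad & \sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{G}_{2l}\mathbf{R}_{xl}\mathbf{G}_{2l}^H\big) \\
\text{s.t.}\quad & C_{\mathrm{avg}}(\{\mathbf{R}_{xl}\},\boldsymbol{\Phi}) \ge C,\quad \sum_{l=1}^{L}\mathrm{tr}(\mathbf{R}_{xl}) \le P_C.
\end{aligned}
\tag{10.40}
$$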
Problem $(\mathrm{P}_{\mathbf{R}})$ is convex but involves multiple matrix variables, and jointly optimizing
over them requires high computational complexity. The semidefinite matrix
variables $\{\mathbf{R}_{xl}\}$ have $LM_{t,C}^2$ real scalar variables, which would result in a complexity
of $O((LM_{t,C}^2)^{3.5})$ if an interior-point method [77] was used. An efficient algorithm for
solving Problem $(\mathrm{P}_{\mathbf{R}})$ can be implemented based on the Lagrangian dual decomposition
[77]. The Lagrangian of $(\mathrm{P}_{\mathbf{R}})$ can be written as
$$
\mathcal{L}(\{\mathbf{R}_{xl}\},\lambda_1,\lambda_2) = \sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{G}_{2l}\mathbf{R}_{xl}\mathbf{G}_{2l}^H\big)
+ \lambda_1\Big(\sum_{l=1}^{L}\mathrm{tr}(\mathbf{R}_{xl}) - P_C\Big) + \lambda_2\big(C - C_{\mathrm{avg}}(\{\mathbf{R}_{xl}\})\big),
$$
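where $\lambda_1 \ge 0$ and $\lambda_2 \ge 0$ are the dual variables associated with the transmit power and
the communication rate constraints, respectively. The dual problem of $(\mathrm{P}_{\mathbf{R}})$ is
$$
(\mathrm{P}_{\mathbf{R}}\text{-D})\quad \max_{\lambda_1,\lambda_2 \ge 0}\ g(\lambda_1,\lambda_2),
$$
where $g(\lambda_1,\lambda_2)$ is the dual function defined as
$$
g(\lambda_1,\lambda_2) = \inf_{\{\mathbf{R}_{xl}\}\succeq 0}\ \mathcal{L}(\{\mathbf{R}_{xl}\},\lambda_1,\lambda_2).
$$
The domain of the dual function, i.e., $\mathrm{dom}\,g$, is $\lambda_1,\lambda_2 \ge 0$ such that $g(\lambda_1,\lambda_2) > -\infty$.
The problem is called dual feasible if $(\lambda_1,\lambda_2) \in \mathrm{dom}\,g$. The dual function $g(\lambda_1,\lambda_2)$
can be obtained by solving $L$ independent subproblems, each of which can be written
as follows
$$
(\mathrm{P}_{\mathbf{R}}\text{-sub})\quad \min_{\mathbf{R}_{xl}\succeq 0}\ \mathrm{tr}\Big(\big(\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2 + \lambda_1\mathbf{I}\big)\mathbf{R}_{xl}\Big)
- \lambda_2\log_2\left|\mathbf{I} + \mathbf{R}_{wl}^{-1}\mathbf{H}\mathbf{R}_{xl}\mathbf{H}^H\right|. \tag{10.41}
$$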
Before giving the solution of (PR -sub), let us first state some observations.
Observation 1) The average capacity constraint should be active at the optimal point.
This means that the achieved capacity is always C and λ2 > 0. To show this, let us
assume that the optimal point {R ∗xl } achieves Cavg ({R ∗xl }) > C. Then we can always
shrink {R ∗xl } until the average capacity reduces to C, which would also reduce the
objective. Thus, we end up with a contradiction. By complementary slackness, the
corresponding dual variable is positive, i.e., λ2 > 0.
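Observation 2) $\big(\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2 + \lambda_1\mathbf{I}\big)$ is positive definite for all $l \in \mathbb{N}_L^+$. This can be shown
by contradiction. Suppose that there exists $l$ such that $\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2 + \lambda_1\mathbf{I}$ is singular. Then
it must hold that $\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2$ is singular and $\lambda_1 = 0$. Therefore, we can always find a
nonzero vector $\mathbf{v}$ lying in the null space of $\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2$. At the same time, it holds that
$\mathbf{R}_{wl}^{-1/2}\mathbf{H}\mathbf{v} \ne \mathbf{0}$ with very high probability, because $\mathbf{H}$ is a realization of the random
channel. If we choose $\mathbf{R}_{xl} = \alpha\mathbf{v}\mathbf{v}^H$ and $\alpha \to \infty$, the Lagrangian $\mathcal{L}(\{\mathbf{R}_{xl}\},0,\lambda_2)$ will
be unbounded from below, which indicates that $\lambda_1 = 0$ is not dual feasible. This means
that $\lambda_1$ is strictly larger than 0 if $\mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2$ is singular for any $l$. Thus, the claim is
proven.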
Based on these observations, we have the following lemma.
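lemma 10.5 [63,64] For given feasible dual variables $\lambda_1,\lambda_2 \ge 0$, the optimal solution
of $(\mathrm{P}_{\mathbf{R}}\text{-sub})$ is given by
$$
\mathbf{R}_{xl}^{*}(\lambda_1,\lambda_2) = \boldsymbol{\Phi}_l^{-1/2}\mathbf{U}_l\boldsymbol{\Sigma}_l\mathbf{U}_l^H\boldsymbol{\Phi}_l^{-1/2}, \tag{10.42}
$$
where $\boldsymbol{\Phi}_l \triangleq \mathbf{G}_2^H\boldsymbol{\Delta}_l\mathbf{G}_2 + \lambda_1\mathbf{I}$; $\mathbf{U}_l$ is the right singular matrix of $\tilde{\mathbf{H}}_l \triangleq \mathbf{R}_{wl}^{-1/2}\mathbf{H}\boldsymbol{\Phi}_l^{-1/2}$;
$\boldsymbol{\Sigma}_l = \mathrm{diag}(\beta_{l1},\ldots,\beta_{lr})$ with $\beta_{li} = (\lambda_2 - 1/\sigma_{li}^2)^+$, $r$ and $\sigma_{li}$, $i = 1,\ldots,r$, respectively,
being the rank and the positive singular values of $\tilde{\mathbf{H}}_l$. It also holds that
$$
\log_2\left|\mathbf{I} + \mathbf{R}_{wl}^{-1}\mathbf{H}\mathbf{R}_{xl}^{*}\mathbf{H}^H\right| = \sum_{i=1}^{r}\big(\log(\lambda_2\sigma_{li}^2)\big)^+. \tag{10.43}
$$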
Based on Lemma 10.5, the solution of (PR ) can be obtained by finding the optimal
dual variables λ∗1,λ∗2 . The cooperative spectrum sharing problem (PR ) can be solved via
the procedure outlined in Algorithm 1. The convergence of Algorithm 1 is guaranteed by
the convergence of the ellipsoid method [78]. The complexity of the dual decomposition
based algorithm is only linearly dependent on L.
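Algorithm 1 Algorithm for the alternating iteration $(\mathrm{P}_{\mathbf{R}})$

1: Input: $\mathbf{H}$, $\mathbf{G}_1$, $\mathbf{G}_2$, $\boldsymbol{\Omega}$, $P_C$, $C$, $\sigma_C^2$
2: Initialization: $\lambda_1 \ge 0$, $\lambda_2 \ge 0$
3: repeat
4: Calculate $\mathbf{R}_{xl}^{*}(\lambda_1,\lambda_2)$ according to (10.42) with the given $\lambda_1$ and $\lambda_2$;
5: Compute the subgradient of $g(\lambda_1,\lambda_2)$ as $\sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{R}_{xl}^{*}(\lambda_1,\lambda_2)\big) - P_C$ and
$C - C_{\mathrm{avg}}(\{\mathbf{R}_{xl}^{*}(\lambda_1,\lambda_2)\})$ respectively for $\lambda_1$ and $\lambda_2$;
6: Update $\lambda_1$ and $\lambda_2$ accordingly based on the ellipsoid method [78];
7: until $\lambda_1$ and $\lambda_2$ converge to a prescribed accuracy.
8: Output: $\mathbf{R}_{xl}^{*} = \mathbf{R}_{xl}^{*}(\lambda_1,\lambda_2)$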
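Step 4 of Algorithm 1 amounts to a water-filling computation per symbol. The sketch below evaluates the closed form of Lemma 10.5 for one symbol and fixed dual variables, and checks the identity (10.43) numerically with base-2 logarithms on both sides. The function names, the stand-in covariance `R_w`, and the values of `lam1` and `lam2` are assumptions for illustration; the ellipsoid update of the dual variables is not shown.

```python
import numpy as np

def inv_sqrt_psd(A):
    """Inverse square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.conj().T

def solve_pr_sub(H, G2, delta_l, R_w, lam1, lam2):
    """Closed form (10.42) for one symbol; delta_l is the diagonal of Delta_l."""
    Phi_l = G2.conj().T @ (delta_l[:, None] * G2) + lam1 * np.eye(G2.shape[1])
    Phi_l_is = inv_sqrt_psd(Phi_l)
    H_tilde = inv_sqrt_psd(R_w) @ H @ Phi_l_is
    _, s, Vh = np.linalg.svd(H_tilde, full_matrices=False)
    keep = s > 1e-12 * s.max()                  # positive singular values only
    s, U = s[keep], Vh[keep].conj().T           # columns of U: right singular vectors
    beta = np.maximum(lam2 - 1.0 / s**2, 0.0)   # water-filling levels
    Rx = Phi_l_is @ (U * beta) @ U.conj().T @ Phi_l_is
    return Rx, s

rng = np.random.default_rng(2)
def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M_rc, M_tc, M_rr = 4, 4, 16
H = crandn(M_rc, M_tc)
G2 = np.sqrt(0.1) * crandn(M_rr, M_tc)
delta_l = (rng.random(M_rr) < 0.5).astype(float)   # sampled radar antennas at symbol l
R_w = 2.0 * np.eye(M_rc)                           # stand-in interference-plus-noise covariance
lam1, lam2 = 0.1, 3.0

Rx, s = solve_pr_sub(H, G2, delta_l, R_w, lam1, lam2)
lhs = np.linalg.slogdet(np.eye(M_rc) + np.linalg.solve(R_w, H @ Rx @ H.conj().T))[1] / np.log(2)
rhs = np.sum(np.maximum(np.log2(lam2 * s**2), 0.0))
print(f"log-det = {lhs:.6f}, sum of positive parts = {rhs:.6f}")
```

Because each symbol is handled independently, the cost of evaluating the dual function grows linearly with $L$, in line with the complexity remark above.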
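The Alternating Iteration with Respect to $\boldsymbol{\Omega}$

Via simple algebraic manipulations, the EIP from the communication transmission can
be reformulated as
$$
\sum_{l=1}^{L}\mathrm{tr}\big(\mathbf{G}_{2l}\mathbf{R}_{xl}\mathbf{G}_{2l}^H\big) \equiv \mathrm{tr}(\boldsymbol{\Omega}^T\mathbf{Q}),
$$
where the $l$-th column of $\mathbf{Q}$ contains the diagonal entries of $\mathbf{G}_2\mathbf{R}_{xl}\mathbf{G}_2^H$. With fixed
$\{\mathbf{R}_{xl}\}$ and $\boldsymbol{\Phi}$, we can solve for $\boldsymbol{\Omega}$ via
$$
\min_{\boldsymbol{\Omega}}\ \mathrm{tr}(\boldsymbol{\Omega}^T\mathbf{Q})\quad \text{s.t.}\ \boldsymbol{\Omega}\ \text{is proper}. \tag{10.44}
$$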

Recall that the sampling matrix is required to have a large spectral gap. However, it is
difficult to incorporate such conditions in the optimization problem (10.44). Based on
the fact that row and column permutations of the sampling matrix do not affect its
singular values and thus the spectral gap, a suboptimal approach is to search for the best
sampling scheme by permuting rows and columns of an initial sampling matrix $\boldsymbol{\Omega}_0$, i.e.,
$$
\min_{\boldsymbol{\Omega}}\ \mathrm{tr}(\boldsymbol{\Omega}^T\mathbf{Q})\quad \text{s.t.}\ \boldsymbol{\Omega} \in \wp(\boldsymbol{\Omega}_0), \tag{10.45}
$$
where $\wp(\boldsymbol{\Omega}_0)$ denotes the set of matrices obtained by arbitrary row and/or column
permutations. $\boldsymbol{\Omega}_0$ is generated with binary entries and $\lfloor pLM_{r,R}\rfloor$ ones, where $\lfloor x\rfloor$ denotes
the largest integer smaller than or equal to $x$. Therefore, the constraint on the number of 1's
in $\boldsymbol{\Omega}$ can also be satisfied. One good candidate for $\boldsymbol{\Omega}_0$ would be a uniformly random
sampling matrix, as such a matrix exhibits a large spectral gap with high probability [76].
Multiple trials with different $\boldsymbol{\Omega}_0$'s can be used to further improve the choice of $\boldsymbol{\Omega}$.
However, the search space is very large since the cardinality of $\wp(\boldsymbol{\Omega}_0)$ is $M_{r,R}!\,L!$,
where the operator $!$ denotes the factorial. One can reduce the search space as follows [36]:
$$
\min_{\boldsymbol{\Omega}}\ \mathrm{tr}(\boldsymbol{\Omega}^T\mathbf{Q}) \equiv \mathrm{tr}(\mathbf{Q}\boldsymbol{\Omega}^T)\quad \text{s.t.}\ \boldsymbol{\Omega} \in \wp_r(\boldsymbol{\Omega}_0), \tag{10.46}
$$
where $\wp_r(\boldsymbol{\Omega}_0)$ denotes the set of matrices obtained by arbitrary row permutations. The
search space in (10.46) has size $M_{r,R}!$, i.e., the cardinality of $\wp_r(\boldsymbol{\Omega}_0)$, which is greatly
reduced compared to that in (10.45). Furthermore, the following proposition shows that
such reduction of search space comes without any performance loss.
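proposition 10.6 For any $\boldsymbol{\Omega}_0$, searching for an $\boldsymbol{\Omega}$ in $\wp_r(\boldsymbol{\Omega}_0)$ can achieve the same
EIP as searching in $\wp(\boldsymbol{\Omega}_0)$.
Proof We can prove the proposition by showing that the EIP achieved by any $\boldsymbol{\Omega}_1 \in
\wp(\boldsymbol{\Omega}_0)$ can also be achieved by a certain $\boldsymbol{\Omega}_2 \in \wp_r(\boldsymbol{\Omega}_0)$. For the pair $(\boldsymbol{\Omega}_1,\{\mathbf{R}_{xl}\})$, the
same EIP can be achieved by the pair $(\boldsymbol{\Omega}_2,\{\tilde{\mathbf{R}}_{xl}\})$, where
• $\boldsymbol{\Omega}_2$ is constructed by performing on $\boldsymbol{\Omega}_0$ the row permutations performed from $\boldsymbol{\Omega}_0$
to $\boldsymbol{\Omega}_1$, and
• $\{\tilde{\mathbf{R}}_{xl}\}$ is a permutation of $\{\mathbf{R}_{xl}\}$ according to the column permutations performed
from $\boldsymbol{\Omega}_0$ to $\boldsymbol{\Omega}_1$.
In other words, the column permutations on $\boldsymbol{\Omega}$ are unnecessary because $\{\mathbf{R}_{xl}\}$ will be
automatically optimized to match the column pattern of $\boldsymbol{\Omega}$. The claim is proven.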
The problem in (10.46) aims to find the best one-to-one match between the rows
of $\boldsymbol{\Omega}_0$ and the rows of $\mathbf{Q}$. Let us construct a cost matrix $\mathbf{C}^r \in \mathbb{R}^{M_{r,R}\times M_{r,R}}$ with
$[\mathbf{C}^r]_{ml} \triangleq [\boldsymbol{\Omega}_0]_{m\cdot}(\mathbf{Q}_{l\cdot})^T$. The problem turns out to be a linear assignment problem with
cost matrix $\mathbf{C}^r$, which can be solved efficiently in polynomial time $O(M_{r,R}^3)$ using the
Hungarian algorithm [79].
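The row-matching step can be sketched with SciPy's `linear_sum_assignment`, which solves the linear assignment problem (the SciPy routine is used here as a convenient stand-in for the Hungarian algorithm cited above). The sizes, the random $\mathbf{G}_2$ and $\mathbf{R}_{xl}$, and the helper name `best_row_permutation` are assumptions for illustration; a brute-force search over all row permutations is included only to confirm optimality on this small instance.

```python
import numpy as np
from itertools import permutations
from scipy.optimize import linear_sum_assignment

def best_row_permutation(Omega0, Q):
    """Row permutation of Omega0 minimizing tr(Omega^T Q), as in (10.46)."""
    # cost[m, l]: contribution to the EIP if row m of Omega0 is placed at row l.
    cost = Omega0 @ Q.T
    rows, cols = linear_sum_assignment(cost)
    Omega = np.zeros_like(Omega0)
    Omega[cols] = Omega0[rows]
    return Omega, cost[rows, cols].sum()

rng = np.random.default_rng(3)
M_r, L, M_tc, p = 5, 6, 3, 0.5

# Q[:, l] holds the (real, nonnegative) diagonal of G2 Rx_l G2^H.
G2 = rng.standard_normal((M_r, M_tc)) + 1j * rng.standard_normal((M_r, M_tc))
Q = np.zeros((M_r, L))
for l in range(L):
    B = rng.standard_normal((M_tc, M_tc)) + 1j * rng.standard_normal((M_tc, M_tc))
    Rx_l = B @ B.conj().T
    Q[:, l] = np.einsum('ij,jk,ik->i', G2, Rx_l, G2.conj()).real

flat = np.zeros(M_r * L)
flat[rng.choice(M_r * L, int(np.floor(p * L * M_r)), replace=False)] = 1.0
Omega0 = flat.reshape(M_r, L)

Omega_best, eip_opt = best_row_permutation(Omega0, Q)

# Exhaustive check over all M_r! row permutations (feasible only for tiny M_r).
cost = Omega0 @ Q.T
brute = min(sum(cost[m, perm[m]] for m in range(M_r))
            for perm in permutations(range(M_r)))
print(f"assignment EIP = {eip_opt:.4f}, brute-force EIP = {brute:.4f}")
```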
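The Alternating Iteration with Respect to $\boldsymbol{\Phi}$

For the optimization of $\boldsymbol{\Phi}$ with fixed $\{\mathbf{R}_{xl}\}$ and $\boldsymbol{\Omega}$, the constraint in (10.39a) is nonconvex
with respect to $\boldsymbol{\Phi}$. The first order Taylor expansion of $C(\mathbf{R}_{xl},\boldsymbol{\Phi})$ at $\bar{\boldsymbol{\Phi}}$ is given as
$$
C(\mathbf{R}_{xl},\boldsymbol{\Phi}) \approx C(\mathbf{R}_{xl},\bar{\boldsymbol{\Phi}}) - \mathrm{tr}\big(\mathbf{A}_l(\boldsymbol{\Phi} - \bar{\boldsymbol{\Phi}})\big), \tag{10.47}
$$
where $\mathbf{A}_l$ is defined as
$$
\begin{aligned}
\mathbf{A}_l &\triangleq -\left(\frac{\partial C(\mathbf{R}_{xl},\boldsymbol{\Phi})}{\partial\,\mathrm{Re}(\boldsymbol{\Phi})}\right)^T\Bigg|_{\boldsymbol{\Phi}=\bar{\boldsymbol{\Phi}}} \\
&= \mathbf{G}_1^H\Big[\big(\mathbf{G}_1\boldsymbol{\Phi}\mathbf{G}_1^H + \sigma_C^2\mathbf{I}\big)^{-1} - \big(\mathbf{G}_1\boldsymbol{\Phi}\mathbf{G}_1^H + \sigma_C^2\mathbf{I} + \mathbf{H}\mathbf{R}_{xl}\mathbf{H}^H\big)^{-1}\Big]\mathbf{G}_1\Big|_{\boldsymbol{\Phi}=\bar{\boldsymbol{\Phi}}}.
\end{aligned}
\tag{10.48}
$$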
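The sequential convex programming technique is applied to solve for $\boldsymbol{\Phi}$ by repeatedly
solving the following approximate optimization problem
$$
\begin{aligned}
(\mathrm{P}_{\boldsymbol{\Phi}})\quad \max_{\boldsymbol{\Phi}\succeq 0}\quad & \frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \rho}, \\
\text{s.t.}\quad & \mathrm{tr}(\boldsymbol{\Phi}) \le P_R/L,\quad \mathrm{tr}(\boldsymbol{\Phi}\mathbf{A}) \le \tilde{C}, \\
& \mathrm{tr}(\boldsymbol{\Phi}\mathbf{V}_k) \ge \xi\,\mathrm{tr}(\boldsymbol{\Phi}),\ \forall k \in \mathbb{N}_K^+,
\end{aligned}
\tag{10.49}
$$
where
$$
\begin{aligned}
\tilde{C} &= \sum_{l=1}^{L}\Big(C(\mathbf{R}_{xl},\bar{\boldsymbol{\Phi}}) + \mathrm{tr}(\bar{\boldsymbol{\Phi}}\mathbf{A}_l) - C\Big), \\
\mathbf{A} &= \sum_{l=1}^{L}\mathbf{A}_l, \\
\rho &= \frac{1}{pLM_{r,R}}\sum_{l=1}^{L}\mathrm{tr}\big(\boldsymbol{\Delta}_l\mathbf{G}_2\mathbf{R}_{xl}\mathbf{G}_2^H\big) + \sigma_R^2.
\end{aligned}
\tag{10.50}
$$
$\tilde{C}$ and $\rho$ are real positive constants with respect to $\boldsymbol{\Phi}$, and $\bar{\boldsymbol{\Phi}}$ is updated as the solution
of the previous repeated problem. Problem (10.49) could be equivalently formulated
as a semidefinite programming problem (SDP) via the Charnes–Cooper transformation
[74,80].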
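Algorithm 2 The overall spectrum sharing algorithm.

1: Input: $\mathbf{D}_t$, $\mathbf{C}_t$, $\mathbf{H}$, $\mathbf{G}_1$, $\mathbf{G}_2$, $P_{C/R}$, $C$, $\sigma_{C/R}^2$, $\delta_1$
2: Initialization: $\boldsymbol{\Phi} = \frac{P_R}{LM_{t,R}}\mathbf{I}$, $\boldsymbol{\Omega} = \boldsymbol{\Omega}_0$;
3: repeat
4: Update $\{\mathbf{R}_{xl}\}$ by solving $(\mathrm{P}_{\mathbf{R}})$ using Algorithm 1 with fixed $\boldsymbol{\Omega}$ and $\boldsymbol{\Phi}$;
5: Update $\boldsymbol{\Omega}$ by solving (10.46) with fixed $\{\mathbf{R}_{xl}\}$ and $\boldsymbol{\Phi}$;
6: Update $\boldsymbol{\Phi}$ by solving a sequence of approximated SDP problems (10.49) with
fixed $\{\mathbf{R}_{xl}\}$ and $\boldsymbol{\Omega}$;
7: until ESINR increases by an amount smaller than $\delta_1$
8: Output: $\{\mathbf{R}_{xl}\}$, $\boldsymbol{\Omega}$, $\mathbf{P} = \sqrt{L}\,\boldsymbol{\Phi}^{1/2}$

$$
\begin{aligned}
\max_{\tilde{\boldsymbol{\Phi}}\succeq 0,\ \phi>0}\quad & \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{D}_t), \\
\text{s.t.}\quad & \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{C}_t) = 1 - \phi\rho, \\
& \mathrm{tr}(\tilde{\boldsymbol{\Phi}}) \le \phi P_R/L,\quad \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{A}) \le \phi\tilde{C}, \\
& \mathrm{tr}\big(\tilde{\boldsymbol{\Phi}}(\mathbf{V}_k - \xi\mathbf{I})\big) \ge 0,\ \forall k \in \mathbb{N}_K^+.
\end{aligned}
\tag{10.51}
$$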
The optimal solution of (10.51), denoted by $(\tilde{\boldsymbol{\Phi}}^{*},\phi^{*})$, can be obtained by using any
standard interior-point-method-based SDP solver with a complexity of $O((M_{t,R}^2)^{3.5})$.
The solution of (10.49) is given by $\tilde{\boldsymbol{\Phi}}^{*}/\phi^{*}$. In each alternating iteration with respect
to $\boldsymbol{\Phi}$, it is required to solve several iterations of SDP due to the sequential convex
programming.
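The step from (10.49) to (10.51) can be made explicit. The following is a sketch of the standard Charnes–Cooper change of variables applied to the linear-fractional objective of (10.49), using the scalar $\phi$ and the scaled matrix $\tilde{\boldsymbol{\Phi}}$ of (10.51); it is included as a worked derivation and is not quoted from [74,80].

```latex
\begin{aligned}
\phi &\triangleq \frac{1}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t)+\rho} > 0,
\qquad \tilde{\boldsymbol{\Phi}} \triangleq \phi\,\boldsymbol{\Phi}, \\
\frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t)+\rho}
 &= \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{D}_t),
\qquad \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{C}_t) + \phi\rho = 1, \\
\mathrm{tr}(\boldsymbol{\Phi}) \le P_R/L
 &\iff \mathrm{tr}(\tilde{\boldsymbol{\Phi}}) \le \phi P_R/L, \\
\mathrm{tr}(\boldsymbol{\Phi}\mathbf{A}) \le \tilde{C}
 &\iff \mathrm{tr}(\tilde{\boldsymbol{\Phi}}\mathbf{A}) \le \phi\tilde{C}, \\
\mathrm{tr}(\boldsymbol{\Phi}\mathbf{V}_k) \ge \xi\,\mathrm{tr}(\boldsymbol{\Phi})
 &\iff \mathrm{tr}\big(\tilde{\boldsymbol{\Phi}}(\mathbf{V}_k-\xi\mathbf{I})\big) \ge 0 .
\end{aligned}
```

Since $\rho > 0$ and $\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) \ge 0$, the scalar $\phi$ is strictly positive, so every inequality keeps its direction under the scaling and $\boldsymbol{\Phi} = \tilde{\boldsymbol{\Phi}}/\phi$ recovers the original variable.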
It is easy to show that the objective function, i.e., ESINR, is nondecreasing during the
alternating iterations of $\{\mathbf{R}_{xl}\}$, $\boldsymbol{\Omega}$, and $\boldsymbol{\Phi}$, and is upper bounded. According to the
monotone convergence theorem [81], the alternating optimization is guaranteed to converge.
The cooperative spectrum sharing algorithm in the presence of clutter maximizing the
effective radar SINR is summarized in Algorithm 2.
10.4.2 Insight on the Feasibility and Solutions of the Spectrum-Sharing Problem

In this subsection, we provide some key insights on the feasibility of (P1 ) and the rank
of the solutions obtained by Algorithm 2.
Feasibility
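A necessary condition on $C$ for the feasibility of $(\mathrm{P}_1)$ with respect to $\{\mathbf{R}_{xl}\}$ is $C \le
C_{\max}(P_C)$ where
$$
\begin{aligned}
C_{\max}(P_C) \triangleq \max_{\{\mathbf{R}_{xl}\}\succeq 0}\quad & \frac{1}{L}\sum_{l=1}^{L}\log_2\left|\mathbf{I} + \sigma_C^{-2}\mathbf{H}\mathbf{R}_{xl}\mathbf{H}^H\right|, \\
\text{s.t.}\quad & \sum_{l=1}^{L}\mathrm{tr}(\mathbf{R}_{xl}) \le P_C.
\end{aligned}
\tag{10.52}
$$
The above optimization problem is convex and has a closed-form solution based on
water-filling [54]. It can be shown that $C_{\max}(P_C)$ is essentially the largest achievable
communication rate when there is no interference from radar transmitters to the
communication receivers. Note that $C = C_{\max}(P_C)$ will generate a nonempty feasible set
for $\{\mathbf{R}_{xl}\}$ only if $\mathbf{G}_1\boldsymbol{\Phi}\mathbf{G}_1^H = \mathbf{0}$ (omitting the trivial case $\boldsymbol{\Phi} = \mathbf{0}$), i.e., the radar transmits
in the null space of the interference channel $\mathbf{G}_1$ to the communication receivers.
A necessary condition on $\xi$ for the feasibility of $(\mathrm{P}_1)$ with respect to $\boldsymbol{\Phi}$ is $\xi \le \xi_{\max}$,
where
$$
\begin{aligned}
\xi_{\max} \triangleq \max_{\boldsymbol{\Phi}\succeq 0,\ \xi\ge 0}\quad & \xi, \\
\text{s.t.}\quad & \mathrm{tr}(\boldsymbol{\Phi}\mathbf{V}_k) \ge \xi\,\mathrm{tr}(\boldsymbol{\Phi}),\ \forall k \in \mathbb{N}_K^+.
\end{aligned}
\tag{10.53}
$$
Note that the above optimization problem is independent of $\mathrm{tr}(\boldsymbol{\Phi})$. Without loss of
generality, we assume that $\mathrm{tr}(\boldsymbol{\Phi}) = 1$, based on which we have the following equivalent
SDP formulation
$$
\begin{aligned}
\xi_{\max} \triangleq \max_{\boldsymbol{\Phi}\succeq 0,\ \xi\ge 0}\quad & \xi, \quad \text{s.t.}\ \mathrm{tr}(\boldsymbol{\Phi}) = 1, \\
& \mathrm{tr}(\boldsymbol{\Phi}\mathbf{V}_k) \ge \xi,\ \forall k \in \mathbb{N}_K^+.
\end{aligned}
\tag{10.54}
$$
It is easy to check that $\xi_{\max} \ge 1$, since $\xi = 1$ is achieved by setting $(\boldsymbol{\Phi},\xi)$ to $(\mathbf{I}/M_{t,R},1)$.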
The following proposition provides a sufficient condition for the feasibility of (P1 ).
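proposition 10.7 If $C$, $\xi$, $P_C > 0$, $P_R > 0$ are chosen such that $C < C_{\max}(P_C)$ and
$\xi \le \xi_{\max}$, then $(\mathrm{P}_1)$ is feasible.
Proof If $C < C_{\max}(P_C)$, the feasible set for $\{\mathbf{R}_{xl}\}$ determined by constraints
in (10.39a) and (10.39b), $\mathcal{F}_{\{\mathbf{R}_{xl}\}}$, is nonempty as long as $\mathrm{tr}(\boldsymbol{\Phi})$ is sufficiently small.
If $\xi \le \xi_{\max}$, the feasible set for $\boldsymbol{\Phi}$ determined by constraints in (10.39c), $\mathcal{F}_{\boldsymbol{\Phi}1}$, is
nonempty and has no restriction on $\mathrm{tr}(\boldsymbol{\Phi})$. If $\boldsymbol{\Phi} \in \mathcal{F}_{\boldsymbol{\Phi}1}$, then $\alpha\boldsymbol{\Phi} \in \mathcal{F}_{\boldsymbol{\Phi}1}$, $\forall\alpha > 0$.
The overall feasible set for $\boldsymbol{\Phi}$, $\mathcal{F}_{\boldsymbol{\Phi}}$, is the intersection of feasible sets determined
by (10.39a), (10.39b), and (10.39c). $\mathcal{F}_{\boldsymbol{\Phi}}$ is nonempty as long as $\mathcal{F}_{\boldsymbol{\Phi}1}$ and $\mathcal{F}_{\{\mathbf{R}_{xl}\}}$ are
nonempty because we can choose any $\boldsymbol{\Phi} \in \mathcal{F}_{\boldsymbol{\Phi}1}$ and scale it down to make $(\mathrm{P}_1)$ feasible.
The claim is proven.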
The Rank of Solutions

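We are also particularly interested in the rank of $\boldsymbol{\Phi}$ obtained using Algorithm 2.
Since the sequential convex programming technique is used for finding $\boldsymbol{\Phi}$, it suffices to
focus on the rank of the solution of $(\mathrm{P}_{\boldsymbol{\Phi}})$. To achieve this goal, we first introduce the
following SDP problem
$$
\begin{aligned}
\min_{\boldsymbol{\Phi}\succeq 0}\ \mathrm{tr}(\boldsymbol{\Phi})\quad \text{s.t.}\quad & \mathrm{tr}(\boldsymbol{\Phi}\mathbf{A}) \le \tilde{C},\quad \frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \rho} \ge \gamma, \\
& \mathrm{tr}\big(\boldsymbol{\Phi}(\mathbf{V}_k - \xi\mathbf{I})\big) \ge 0,\ \forall k \in \mathbb{N}_K^+,
\end{aligned}
\tag{10.55}
$$
where $\gamma$ is a real positive constant. The following proposition relates the optimal
solutions of problems (10.49) and (10.55).
proposition 10.8 If $\gamma$ in (10.55) is chosen to be the maximum achievable SINR
of (10.49), denoted as $\mathrm{SINR}_{\max}$, the optimal $\boldsymbol{\Phi}$ of (10.55) is also optimal for (10.49).
Proof Denote by $\boldsymbol{\Phi}_1^{*}$ and $\boldsymbol{\Phi}_2^{*}$ the optimal solutions of (10.49) and (10.55), respectively. It
is clear that $\boldsymbol{\Phi}_1^{*}$ is a feasible point of (10.55). This means that $\mathrm{tr}(\boldsymbol{\Phi}_2^{*}) \le \mathrm{tr}(\boldsymbol{\Phi}_1^{*}) \le P_R/L$.
Therefore, $\boldsymbol{\Phi}_2^{*}$ is a feasible point of (10.49). It holds that
$$
\mathrm{SINR}_{\max} \equiv \frac{\mathrm{tr}(\boldsymbol{\Phi}_1^{*}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}_1^{*}\mathbf{C}_t) + \rho} \ge \frac{\mathrm{tr}(\boldsymbol{\Phi}_2^{*}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}_2^{*}\mathbf{C}_t) + \rho} \ge \mathrm{SINR}_{\max}. \tag{10.56}
$$
This is possible only when both inequalities hold with equality. In other words, $\boldsymbol{\Phi}_2^{*}$ is optimal for (10.49).
This completes the proof.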
In order to characterize the optimal solution of (10.55), we need the following key
lemma:
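lemma 10.9 Matrix $\mathbf{A}_l$ defined in (10.48) is positive semidefinite. In addition,
$\mathbf{A} = \sum_{l=1}^{L}\mathbf{A}_l$ is also positive semidefinite.
Proof For simplicity of notation, we denote $\mathbf{X} \triangleq \mathbf{G}_1\boldsymbol{\Phi}\mathbf{G}_1^H + \sigma_C^2\mathbf{I} \succ 0$ and $\mathbf{Y} \triangleq
\mathbf{H}\mathbf{R}_{xl}\mathbf{H}^H \succeq 0$. Let us rewrite $\mathbf{A}_l$ as $\mathbf{A}_l = \mathbf{G}_1^H[\mathbf{X}^{-1} - (\mathbf{X} + \mathbf{Y})^{-1}]\mathbf{G}_1$. It is clear
that $\mathbf{A}_l$ is Hermitian because both $\mathbf{X}^{-1}$ and $(\mathbf{X} + \mathbf{Y})^{-1}$ are Hermitian. It is sufficient to
show that $\mathbf{Z} \triangleq \mathbf{X}^{-1} - (\mathbf{X} + \mathbf{Y})^{-1}$ is positive semidefinite. We have that
$$
\mathbf{X}^{-1} - (\mathbf{X} + \mathbf{Y})^{-1} = \mathbf{X}^{-1}\mathbf{Y}(\mathbf{X} + \mathbf{Y})^{-1}, \tag{10.57}
$$
which can be shown by right multiplying both sides of the equality by $(\mathbf{X} + \mathbf{Y})$. Since
$\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$ are Hermitian, we have
$$
\mathbf{Z} = \mathbf{X}^{-1}\mathbf{Y}(\mathbf{X} + \mathbf{Y})^{-1} = (\mathbf{X} + \mathbf{Y})^{-1}\mathbf{Y}\mathbf{X}^{-1}. \tag{10.58}
$$
Since $(\mathbf{X} + \mathbf{Y})^{-1}$ is positive definite, there exists a unique positive definite matrix $\mathbf{V}$ such that
$(\mathbf{X} + \mathbf{Y})^{-1} = \mathbf{V}^2$. Simple algebraic manipulation shows that
$$
\begin{aligned}
\mathbf{V}^{-1}\mathbf{Z}\mathbf{V}^{-1} &= (\mathbf{V}^{-1}\mathbf{X}^{-1}\mathbf{V}^{-1})(\mathbf{V}\mathbf{Y}\mathbf{V}) \\
&= (\mathbf{V}\mathbf{Y}\mathbf{V})(\mathbf{V}^{-1}\mathbf{X}^{-1}\mathbf{V}^{-1}),
\end{aligned}
\tag{10.59}
$$
i.e., $\mathbf{V}^{-1}\mathbf{Z}\mathbf{V}^{-1}$ is a product of two commuting positive semidefinite matrices
$\mathbf{V}^{-1}\mathbf{X}^{-1}\mathbf{V}^{-1}$ and $\mathbf{V}\mathbf{Y}\mathbf{V}$. Therefore, $\mathbf{V}^{-1}\mathbf{Z}\mathbf{V}^{-1}$ and thus $\mathbf{Z}$ is positive semidefinite. This
proves that $\mathbf{A}_l$ is positive semidefinite. Further, $\mathbf{A}$ is also positive semidefinite because it is the sum of
$L$ positive semidefinite matrices.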
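Lemma 10.9 and the identity (10.57) are easy to check numerically. The sketch below draws random matrices of illustrative sizes (an assumption, as are the variable names), forms $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{Z}$, and $\mathbf{A}_l$, and reports the residual of (10.57) and the smallest eigenvalue of $\mathbf{A}_l$; it is a sanity check on one random instance, not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
def crandn(*shape):
    """Circularly symmetric complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M_rc, M_tc, M_tr = 4, 4, 16
sigma2_c = 1.0
H = crandn(M_rc, M_tc)
G1 = crandn(M_rc, M_tr)
B = crandn(M_tr, M_tr)
Phi_bar = B @ B.conj().T / M_tr          # random positive semidefinite expansion point
Bx = crandn(M_tc, 2)
Rx = Bx @ Bx.conj().T                    # rank-2 positive semidefinite covariance

X = G1 @ Phi_bar @ G1.conj().T + sigma2_c * np.eye(M_rc)
Y = H @ Rx @ H.conj().T
X_inv, XY_inv = np.linalg.inv(X), np.linalg.inv(X + Y)
Z = X_inv - XY_inv
A_l = G1.conj().T @ Z @ G1               # matrix of (10.48) evaluated at Phi_bar

identity_gap = np.linalg.norm(Z - X_inv @ Y @ XY_inv)          # residual of (10.57)
min_eig = np.linalg.eigvalsh((A_l + A_l.conj().T) / 2).min()   # about 0: A_l has rank <= M_rc
print(f"residual of (10.57): {identity_gap:.2e}, smallest eigenvalue of A_l: {min_eig:.2e}")
```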
Based on Lemma 10.9, we prove the following result by following the approach
in [80]:
proposition 10.10 Suppose that (10.55) is feasible when $\gamma$ is set to $\mathrm{SINR}_{\max}$. Then,
the following claims hold:
1) Any optimal solution of (10.55) has rank at most $K$.
2) All rank-$K$ solutions $\boldsymbol{\Phi}_K^{*}$ of (10.55) have the same range space.
3) Any solution $\boldsymbol{\Phi}_{K-}^{*}$ with rank less than $K$ has range space such that $\mathrm{Range}(\boldsymbol{\Phi}_{K-}^{*}) \subset
\mathrm{Range}(\boldsymbol{\Phi}_K^{*})$.
4) (10.49) and (10.51) always have solutions with rank at most $K$ and with the same
range space properties as that for (10.55).
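Proof Problem (10.55) is an SDP, whose Karush–Kuhn–Tucker (KKT) conditions [77]
are given as
$$
\begin{aligned}
& \boldsymbol{\Psi} + \lambda_2\mathbf{D}_t + \sum_{k=1}^{K}\nu_k\mathbf{V}_k = \mathbf{I} + \lambda_1\mathbf{A} + \lambda_2\gamma\mathbf{C}_t + \sum_{k=1}^{K}\nu_k\xi\mathbf{I}, && \text{(10.60a)} \\
& \boldsymbol{\Psi}\boldsymbol{\Phi} = \mathbf{0}, && \text{(10.60b)} \\
& \boldsymbol{\Psi} \succeq 0,\ \boldsymbol{\Phi} \succeq 0,\ \lambda_1 \ge 0,\ \lambda_2 \ge 0,\ \{\nu_k\} \ge 0, && \text{(10.60c)} \\
& \mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t) \ge \gamma\,\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \gamma\rho, && \text{(10.60d)} \\
& \mathrm{tr}\big(\boldsymbol{\Phi}(\mathbf{V}_k - \xi\mathbf{I})\big) \ge 0,\ \forall k \in \mathbb{N}_K^+, && \text{(10.60e)}
\end{aligned}
$$
where $\boldsymbol{\Psi} \succeq 0$, $\lambda_1 \ge 0$, $\lambda_2 \ge 0$, and $\{\nu_k\} \ge 0$ are dual variables. From (10.60a),
we obtain the rank inequality
$$
\mathrm{Rank}(\boldsymbol{\Psi}) + \mathrm{Rank}\Big(\lambda_2\mathbf{D}_t + \sum_{k=1}^{K}\nu_k\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)\Big)
\ge \mathrm{Rank}\Big(\mathbf{I} + \lambda_1\mathbf{A} + \lambda_2\gamma\mathbf{C}_t + \sum_{k=1}^{K}\nu_k\xi\mathbf{I}\Big). \tag{10.61}
$$
Recall that $\mathbf{D}_t = \sum_k\sigma_{\beta_k}^2\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)$. It holds that $\lambda_2\mathbf{D}_t + \sum_k\nu_k\mathbf{v}_t^{*}(\theta_k)\mathbf{v}_t^T(\theta_k)$
has rank at most $K$. Since $\mathbf{A}$ and $\mathbf{C}_t$ are positive semidefinite, the matrix on the right
hand side of (10.61) has full rank. Therefore, $\mathrm{Rank}(\boldsymbol{\Psi})$ is not smaller than $M_{t,R} - K$.
From (10.60b) and (10.60d) we conclude that any optimal solution $\boldsymbol{\Phi}$ must have rank at
most $K$.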
The second claim asserts that if there are multiple solutions with rank $K$, they all have
the same range space. This can be proved by contradiction. Suppose that $\boldsymbol{\Phi}_1^{*}$ and $\boldsymbol{\Phi}_2^{*}$
are rank-$K$ solutions of (10.55) and $\mathrm{Range}(\boldsymbol{\Phi}_1^{*}) \ne \mathrm{Range}(\boldsymbol{\Phi}_2^{*})$. By convexity,
any convex combination of $\boldsymbol{\Phi}_1^{*}$ and $\boldsymbol{\Phi}_2^{*}$, say $\boldsymbol{\Phi}_3^{*} \triangleq \alpha\boldsymbol{\Phi}_1^{*} + (1 - \alpha)\boldsymbol{\Phi}_2^{*}$, $\forall\alpha \in (0,1)$, is
also a solution of (10.55). However, $\boldsymbol{\Phi}_3^{*}$ has rank at least $K + 1$, which contradicts
the fact that any solution must have rank at most $K$. The third claim can also be
proved by contradiction. Suppose that $\boldsymbol{\Phi}_1^{*}$ and $\boldsymbol{\Phi}_2^{*}$ are, respectively, a rank-$K$ solution
and a solution with rank smaller than $K$, and $\mathrm{Range}(\boldsymbol{\Phi}_2^{*}) \setminus \mathrm{Range}(\boldsymbol{\Phi}_1^{*})$ is nonempty. Then
any convex combination of $\boldsymbol{\Phi}_1^{*}$ and $\boldsymbol{\Phi}_2^{*}$, say $\boldsymbol{\Phi}_3^{*} \triangleq \alpha\boldsymbol{\Phi}_1^{*} + (1 - \alpha)\boldsymbol{\Phi}_2^{*}$, $\forall\alpha \in (0,1)$,
is also a solution of (10.55). However, $\boldsymbol{\Phi}_3^{*}$ again has rank at least $K + 1$, which
contradicts the fact that any solution must have rank at most $K$.
The last claim on the solutions of (10.49) and (10.51) follows from Proposition
10.8.
Proposition 10.10 indicates that the rank of the optimal precoding matrix will not be
larger than the number of the targets.
10.4.3 Constant-Rate Communication Transmission for Spectrum Sharing

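Adaptive communication transmission for spectrum sharing involves high
complexity. A suboptimal transmission approach of constant rate, i.e., $\mathbf{R}_{xl} \equiv \mathbf{R}_x$,
$\forall l \in \mathbb{N}_L^+$, has lower implementation complexity. In such case, the spectrum sharing
problem can be reformulated as
$$
\begin{aligned}
(\mathrm{P}_1')\quad \max_{\mathbf{R}_x\succeq 0,\ \boldsymbol{\Phi}\succeq 0}\quad & \mathrm{ESINR}'(\mathbf{R}_x,\boldsymbol{\Omega},\boldsymbol{\Phi}), \\
\text{s.t.}\quad & C(\mathbf{R}_x,\boldsymbol{\Phi}) \ge C, \\
& L\,\mathrm{tr}(\mathbf{R}_x) \le P_C,\quad L\,\mathrm{tr}(\boldsymbol{\Phi}) \le P_R, \\
& \mathrm{tr}\big(\boldsymbol{\Phi}(\mathbf{V}_k - \xi\mathbf{I})\big) \ge 0,\ \forall k \in \mathbb{N}_K^+,
\end{aligned}
\tag{10.62}
$$
where
$$
\mathrm{ESINR}' = \frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \mathrm{tr}\big(\boldsymbol{\Delta}\mathbf{G}_2\mathbf{R}_x\mathbf{G}_2^H\big)/(pLM_{r,R}) + \sigma_R^2} \tag{10.63}
$$
and $\boldsymbol{\Delta} = \sum_{l=1}^{L}\boldsymbol{\Delta}_l$ is diagonal, with each entry equal to the number of 1's in the
corresponding row of $\boldsymbol{\Omega}$. Techniques similar to those in Algorithm 2 can be used to solve $(\mathrm{P}_1')$.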
We can see that $(\mathrm{P}_1')$ has much lower complexity because there is only one matrix
variable for the communication transmission. However, the drawback of the constant-rate
communication is that $\mathbf{R}_x$ cannot adapt to the variation of the effective interference
channel $\mathbf{G}_{2l}$. On the other hand, the adaptive communication transmission considered
in $(\mathrm{P}_1)$ can fully exploit the channel diversity introduced by the radar sub-sampling
procedure.
Another consequence is that the $\mathrm{ESINR}'$ depends on $\boldsymbol{\Omega}$ only through $\boldsymbol{\Delta}$. Since $\boldsymbol{\Omega}$
is searched among the row permutations of a uniformly random sampling matrix, the
number of 1's in each row of $\boldsymbol{\Omega}$ is close to $pL$, or equivalently, $\boldsymbol{\Delta}$ will be very close
to the scaled identity matrix $pL\mathbf{I}$. To further reduce the complexity, the optimization
with respect to $\boldsymbol{\Omega}$ in $(\mathrm{P}_1')$ is omitted because all row permutations of $\boldsymbol{\Omega}$ will result in a
very similar $\mathrm{ESINR}'$. From a different perspective, if the radar sub-sampling matrix $\boldsymbol{\Omega}$
is not available for the radar and communication cooperation, we can safely replace $\boldsymbol{\Delta}$
with $pL\mathbf{I}$ in the $\mathrm{ESINR}'$. The discussion in this paragraph asserts that, for the case of
constant-rate communication transmission, almost no performance degradation occurs
due to the absence of the knowledge of $\boldsymbol{\Omega}$.
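The approximation of $\boldsymbol{\Delta}$ by $pL\mathbf{I}$ can be examined on a single random draw. In the sketch below (illustrative sizes and random matrices, all assumptions), the communication interference term of (10.63) is computed once with the exact $\boldsymbol{\Delta}$ and once with $pL\mathbf{I}$; the printed relative gap shows how close the two are for that draw and is not a general guarantee.

```python
import numpy as np

rng = np.random.default_rng(4)
M_r, L, M_tc, p = 16, 16, 4, 0.5

# Uniformly random sampling matrix with floor(p*L*M_r) ones.
n_ones = int(np.floor(p * L * M_r))
flat = np.zeros(M_r * L)
flat[rng.choice(M_r * L, n_ones, replace=False)] = 1.0
Omega = flat.reshape(M_r, L)

# Delta counts the ones in each row of Omega.
row_counts = Omega.sum(axis=1)
Delta = np.diag(row_counts)

G2 = (rng.standard_normal((M_r, M_tc)) + 1j * rng.standard_normal((M_r, M_tc))) / np.sqrt(2)
B = rng.standard_normal((M_tc, M_tc)) + 1j * rng.standard_normal((M_tc, M_tc))
Rx = B @ B.conj().T

interference = G2 @ Rx @ G2.conj().T
exact = np.trace(Delta @ interference).real / (p * L * M_r)   # term in (10.63)
approx = np.trace(interference).real / M_r                    # Delta replaced by p*L*I
rel_gap = abs(exact - approx) / approx

print("ones per row:", row_counts.astype(int))
print(f"exact = {exact:.4f}, with Delta = pL*I: {approx:.4f}, relative gap = {rel_gap:.3f}")
```

The average number of ones per row equals $pL$ exactly by construction; only the spread around that average determines the gap.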
10.4.4 Traditional MIMO Radars for Spectrum Sharing

The traditional MIMO radars without sub-sampling can be considered as a special case
with p = 1, and thus there is no need for the matrix completion. In such case, the
constant-rate communication transmission becomes the optimal scheme because the
interference channel G2 stays as a constant for the period of L symbol time due to
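the block fading assumption. The spectrum sharing problem has the same form as $(\mathrm{P}_1')$
with the objective function being
$$
\mathrm{SINR} = \frac{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{D}_t)}{\mathrm{tr}(\boldsymbol{\Phi}\mathbf{C}_t) + \mathrm{tr}\big(\mathbf{G}_2\mathbf{R}_x\mathbf{G}_2^H\big)/M_{r,R} + \sigma_R^2}. \tag{10.64}
$$
Note that $\mathrm{SINR} \approx \mathrm{ESINR}'$ because $\boldsymbol{\Delta} \approx pL\mathbf{I}$. Therefore, traditional MIMO radars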
can achieve approximately the same spectrum sharing performance as MIMO-MC
radars when the communication system transmits at a constant rate. However, for
MIMO-MC radars, the adaptive communication transmission and the radar sub-
sampling matrix can be designed to achieve significant radar SINR increase over
the traditional MIMO radars. This advantageous flexibility is introduced by the sparse
sensing (i.e., sub-sampling) in MIMO-MC radars. Performance results comparing
MIMO-MC radars with different p values against the traditional MIMO radars are
provided in Section 10.5.4.
10.5 Numerical Results
In this section, we provide simulation examples to quantify the performance of the

jointly designed spectrum-sharing method described in this chapter for the coexistence
of the MIMO-MC radars and communication systems.
Unless otherwise stated, we use the following default values for the system param-
eters. The MIMO radar system consists of collocated Mt,R = 16 TX and Mr,R = 16
RX antennas, respectively forming transmit and receive half-wavelength uniform linear
arrays. The radar waveforms are chosen from the rows of a random orthonormal matrix
[30]. We set the length of the radar waveforms to L = 16. The wireless communication
system consists of collocated Mt,C = 4 TX and Mr,C = 4 RX antennas, respectively
forming transmit and receive half-wavelength uniform linear arrays. For the communi-
cation capacity and power constraints, we take C = 16 bits/symbol and PC = 6400
(the power is normalized by the additive noise power). The radar transmit power budget
is $P_R = 1000 \times P_C$, which is typical for radar systems; high power is needed to
combat path loss associated with far-field targets [44]. The additive white Gaussian
noise variances are $\sigma_C^2 = \sigma_R^2 = 1$. There are three stationary targets with RCS variance
$\sigma_{\beta 0}^2 = 0.5$, located in the far field with path loss of 30 dB. Clutter is generated by four
point scatterers, all having the same RCS variance, $\sigma_\beta^2$; the variance is determined by the
clutter-to-noise ratio (CNR) $10\log(\sigma_\beta^2/\sigma_R^2)$. Based on these numbers, the possible range
of SNR at the communication receiver is between 12 dB and 26 dB, which is supported

by LTE systems [82,83]. The radar power budget corresponds to a per-receive-antenna
SNR of about 23 dB when only additive noise is considered. For a typical radar system
with a single antenna, operating with probability of detection of 0.9 and probability of
false alarm of $10^{-6}$, the required SNR is about 13.2 dB [44]. However, the actual SNR
may be much smaller because spatial degrees of freedom are used to mitigate clutter
and interference from the communication systems.
The channel H is modeled as Rayleigh fading, i.e., it contains independent entries,

distributed as CN (0,1). The interference channels G1 and G2 are modeled as Rician
fading. The power in the direct path is 0.1, and the variance of Gaussian components
contributed by the scattered paths is $10^{-3}$.
The performance metrics considered include the following:
• The radar effective SINR, i.e., the objective of the spectrum sharing problem;
• The matrix completion relative recovery error, defined as $\|\mathbf{M} - \hat{\mathbf{M}}\|_F/\|\mathbf{M}\|_F$,
where $\hat{\mathbf{M}}$ is the completed data matrix at the radar fusion center;
• The radar transmit beampattern, i.e., the transmit power for different azimuth
angles vTt (θ)P v∗t (θ);
• The MUSIC pseudo-spectrum and the relative target RCS estimation RMSE
obtained using the least-squares estimation on the completed data matrix $\hat{\mathbf{M}}$.
Monte Carlo simulations with 100 independent trials are carried out to get an average
performance.
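The recovery-error metric and the Monte Carlo averaging can be sketched as follows. The function name `relative_recovery_error` is hypothetical, and the "completed" matrix in the loop is only a placeholder (the true matrix plus a small perturbation) standing in for the output of a matrix completion solver, which is not implemented here.

```python
import numpy as np

def relative_recovery_error(M, M_hat):
    """||M - M_hat||_F / ||M||_F for a true matrix M and a completed matrix M_hat."""
    return np.linalg.norm(M - M_hat) / np.linalg.norm(M)   # Frobenius norm by default

rng = np.random.default_rng(5)
n_trials, n_rows, n_cols, rank = 100, 16, 16, 3

errors = []
for _ in range(n_trials):
    # Low-rank stand-in for the radar data matrix.
    A = rng.standard_normal((n_rows, rank)) + 1j * rng.standard_normal((n_rows, rank))
    B = rng.standard_normal((rank, n_cols)) + 1j * rng.standard_normal((rank, n_cols))
    M = A @ B
    # Placeholder for a matrix completion result: truth plus a small perturbation.
    noise = rng.standard_normal((n_rows, n_cols)) + 1j * rng.standard_normal((n_rows, n_cols))
    M_hat = M + 0.01 * noise
    errors.append(relative_recovery_error(M, M_hat))

avg_error = float(np.mean(errors))
print(f"average relative recovery error over {n_trials} trials: {avg_error:.4f}")
```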
10.5.1 The Radar Transmit Beampattern and the MUSIC Spectrum

In this subsection, we present an example demonstrating the advantages of the jointly
designed radar precoding scheme described above as compared to uniform precoding,
i.e., $\mathbf{P} = \sqrt{LP_R/M_{t,R}}\,\mathbf{I}$, and null space projection (NSP) precoding, i.e., $\mathbf{P} =
\sqrt{LP_R/M_{t,R}}\,\mathbf{V}\mathbf{V}^H$, where $\mathbf{V}$ contains the basis of the null space of $\mathbf{G}_1$ [25]. For the
joint-design-based scheme of (10.39), we choose $\xi = \xi_{\max}$. The target angles with
respect to the array are respectively $-10^\circ$, $15^\circ$, and $30^\circ$; the four point scatterers are
at angles $-45^\circ$, $-30^\circ$, $10^\circ$, and $45^\circ$. The CNR is 30 dB. In this simulation, the direct
path in $\mathbf{G}_1$ is generated as $\sqrt{0.1}\,\mathbf{v}_t(\phi)\mathbf{v}_t^H(\phi)$, where $\phi = 15^\circ$, with $\mathbf{v}_t(\phi)$ being defined
in (10.2). In other words, the communication receiver is taken at the same azimuth angle
as the second target.
Recall that the NSP technique projects the radar waveform onto the null space of the
interference channel $\mathbf{G}_1$ in order to avoid creating interference to the communication
receiver. Because the null space and row space of a matrix are orthogonal to each
other, there will be no radar power radiated along the row space of $\mathbf{G}_1$; thus, targets
in those directions will be missed. The precoding approach presented here does not
suffer from this problem, because the precoding is computed via a joint design method
instead of projecting onto the null space of $\mathbf{G}_1$. The radar transmit beampattern and the
spatial pseudo-spectrum obtained using the MUSIC algorithm are shown in Figure 10.7.
The achieved ESINR, MC relative recovery error, and relative target RCS estimation
RMSE are listed in Table 10.1. We observe that the jointly designed precoding scheme
achieves significant improvement in ESINR, MC relative recovery error, and target
RCS estimation accuracy. As expected, the uniform precoding scheme just spreads
the transmit power uniformly in all directions. The NSP precoding scheme achieves
a similar beampattern as the uniform precoding scheme, with the exception of the deep
null that the NSP places in the direction of the communication receiver. The null means
that the transmit power towards the second target is severely attenuated and thus the
Table 10.1 The radar ESINR, MC relative recovery errors, and the relative target
RCS estimation RMSE for MIMO-MC radar and communication spectrum sharing.

Precoding scheme          ESINR       MC relative recovery error   Relative RCS est. RMSE
Joint-design precoding    31.3 dB     0.038                        0.028
Uniform precoding         −44.3 dB    1.00                         1.000
NSP-based precoding       −46.3 dB    1.00                         0.995
[Figure 10.7 panels: radar TX beampattern (dB) versus azimuth angle, and spatial spectrum (dB) versus azimuth angle. Legend: jointly designed precoding scheme, uniform precoding scheme, null space projection scheme.]
Figure 10.7 The radar transmit beampattern and the MUSIC spatial pseudo-spectrum for
MIMO-MC radar and communication spectrum sharing. The true positions of the targets and
clutters are labeled using solid and dashed vertical lines, respectively.
probability of missing the second target is increased. We should note that neither the
uniform nor the NSP precoding schemes have any capability of clutter mitigation. From
Figure 10.7, we observe that the jointly designed precoding scheme successfully focuses
the transmit power toward the three targets and nullifies the power toward the point
scatterers. The three targets can be accurately estimated from the pseudo-spectrum
obtained by the joint design. Meanwhile, the communication system can still achieve
the required rate by aligning its transmission along a channel subspace that does not
interfere with the radar emissions. This significant advantage is enabled by the joint
design of radar and communication transmissions.
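The deep null created by NSP precoding can be reproduced in a toy setting. The sketch below assumes a half-wavelength uniform linear array with 16 elements, a rank-one interference channel formed by the direct path only, and the convention that the field radiated toward angle $\theta$ is $\mathbf{a}^H(\theta)\mathbf{x}$; the steering vector of (10.2) is not reproduced here, so all of these are assumptions made for illustration, and the function names are hypothetical.

```python
import cmath
import math


def steering_vector(theta_deg, n_antennas):
    """Half-wavelength ULA steering vector (assumed form)."""
    s = math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * math.pi * m * s) for m in range(n_antennas)]


def null_space_projector(v):
    """I - v v^H / (v^H v): projector onto the null space of the rank-one matrix v v^H."""
    n = len(v)
    energy = sum(abs(x) ** 2 for x in v)
    return [[(1.0 if i == k else 0.0) - v[i] * v[k].conjugate() / energy
             for k in range(n)] for i in range(n)]


def radiated_power(projector, theta_deg):
    """a^H Q a: power sent toward theta when the precoder is the projector Q."""
    a = steering_vector(theta_deg, len(projector))
    qa = [sum(row[k] * a[k] for k in range(len(a))) for row in projector]
    return sum(a[i].conjugate() * qa[i] for i in range(len(a))).real


n_antennas = 16                        # hypothetical array size
phi = 15.0                             # communication receiver direction
Q = null_space_projector(steering_vector(phi, n_antennas))
for theta in (-10.0, 15.0, 30.0):      # the three target directions
    print(theta, radiated_power(Q, theta))
```

Under these assumptions the power toward 15 degrees vanishes (up to rounding), while the other two target directions receive nearly the full array gain, which mirrors the behavior of the NSP beampattern described above.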
10.5.2 Comparison of Different Levels of Cooperation

In this subsection, we compare several algorithms with different levels of radar and
communication cooperation. The compared algorithms include:
• Uniform radar precoding and selfish communication: the radar transmit antennas
use the trivial precoding, i.e., $\mathbf{P} = L P_R / M_{t,R}\,\mathbf{I}$, and the communication
system minimizes its transmit power to achieve a certain average capacity without
any concern about the interference it exerts on the radar system. This algorithm
involves no radar and communication cooperation.
• NSP-based radar precoding and selfish communication: the radar transmit antennas
use the fixed precoding, i.e., $\mathbf{P} = L P_R / M_{t,R}\,\mathbf{V}\mathbf{V}^H$, while the selfish
communication scheme is the same as in the previous case.
• Uniform radar precoding and joint design of $\mathbf{R}_{xl}$ and $\boldsymbol{\Omega}$ to minimize the effective
interference to the radar receiver.
• Design of $\mathbf{P}$ and selfish communication: only the radar precoding matrix $\mathbf{P}$ is
designed to maximize the radar ESINR.
• Joint design of $\mathbf{P}$, $\mathbf{R}_{xl}$, and $\boldsymbol{\Omega}$ in (10.39).
We use the same values for all parameters as in the previous simulation except that the
radar transmit power budget $P_R$ changes from 51,200 to $2.56 \times 10^6$. Figure 10.8 shows
[Figure 10.8 panels: ESINR (dB), MC relative recovery error, and relative RCS estimation RMSE, each versus the radar TX power budget $P_R$ ($\times 10^5$). Legend: uniform precoding + selfish comm.; NSP precoding + selfish comm.; uniform precoding + design of $\mathbf{R}_{xl}$ and $\boldsymbol{\Omega}$; design of $\mathbf{P}$ + selfish comm.; jointly designed precoding.]
Figure 10.8 Comparison of spectrum sharing with different levels of cooperation between the
MIMO-MC radar and the communication system under different PR .
the achieved ESINR, the MC relative recovery error, and the relative target RCS
estimation RMSE. The algorithms that use trivial uniform and NSP-based radar precoding
perform poorly because the point scatterers are not properly mitigated. The scheme that
designs only $\mathbf{P}$ can mitigate the scatterers, but it leaves the interference from the
communication transmission uncontrolled. The joint design of $\mathbf{P}$, $\mathbf{R}_{xl}$, and $\boldsymbol{\Omega}$ simultaneously
addresses the clutter and the mutual interference between the radar and the
communication systems, and thus achieves the best performance among all the algorithms. The
performance gains come from high-level cooperation between the two systems.
10.5.3 Comparison between Adaptive and Constant-Rate Communication Transmissions

In this subsection, we evaluate the performance of two communication transmission
schemes, namely, adaptive transmission with different $\mathbf{R}_{xl}$'s for all $l \in \mathbb{N}_L^+$, and
constant-rate transmission with the same $\mathbf{R}_x$ across all pulses. We use the following parameter
setting: $M_{t,R} = 16$, $M_{r,R} = M_{t,C} = 8$, $M_{r,C} = 2$, $C = 10$ bits/symbol, $P_C = 64$,
and $P_R = 1000 \times P_C$. For $\mathbf{G}_1$ and $\mathbf{G}_2$, Rayleigh fading is used with fixed $\sigma_{G_1}^2$ and
varying $\sigma_{G_2}^2$. The results of ESINR, MC relative recovery error, and the relative target
RCS estimation RMSE for different values of $\sigma_{G_2}^2$ are shown in Figure 10.9. The value
of $\sigma_{G_2}^2$ varies from 0.05 to 0.5, which effectively simulates different distances between
the communication transmitter and the radar receiver. It is clear that the adaptive
communication transmission outperforms the constant-rate counterpart under various
values of interference channel strength. As discussed in Section 10.4.3, the adaptive
Figure 10.9 Comparison of spectrum sharing with adaptive and constant-rate communication
transmissions under different levels of variance of the interference channel from the
communication transmitter to the radar receiver.
communication transmission can fully exploit the channel diversity of $\mathbf{G}_{2l}$ introduced
by the radar sub-sampling procedure. The price for this performance advantage is higher
complexity. The average running times on a laptop with an Intel Core i5 dual-core 2.4 GHz
CPU for the adaptive and constant-rate communication transmissions are
15.6 and 4.8 seconds, respectively. The choice between these two transmission schemes can be made
depending on the available computing resources.
10.5.4 Comparison between MIMO-MC Radars and Traditional MIMO Radars

In this subsection, we present a simulation to show the advantages of MIMO-MC radars
compared to traditional fully sampled MIMO radars. The parameters are the same as
those in the simulation in Section 10.5.3, but with fixed $\sigma_{G_1}^2 = 0.3$ and $\sigma_{G_2}^2 = 1$, which
indicates strong mutual interference, especially interference from the communication
transmitter to the radar receiver. The radar transmit power budget $P_R$ is taken to be
equal to $10 \times P_C$. We consider two targets; one is randomly located and the other is
taken to be $25^\circ$ away. We also consider four randomly located point scatterers. Figure 10.10
shows the results under different MIMO-MC sub-sampling rates $p$. Note that full
sampling is used for the traditional MIMO radar. The MC relative recovery error for the
traditional radar is actually the output distortion-to-signal ratio. A smaller
distortion-to-signal ratio corresponds to a larger output SNR. For ease of comparison, a black
dashed line is used for the traditional MIMO radar. We observe that the MIMO-MC
radar achieves better performance in ESINR than the traditional radar. This is due to
the fact that the communication system can effectively prevent its transmission from
Figure 10.10 Comparison of spectrum sharing with traditional MIMO radars and MIMO-MC
radars with different sub-sampling rates p.
interfering with the radar system when the number of actively sampled radar RX antennas
is small, i.e., when the sub-sampling rate is small. In addition, the larger ESINR of the MIMO-MC
radar results in a larger output SINR than that of the traditional radar. Furthermore, the
MIMO-MC radar achieves better target RCS estimation accuracy than the traditional
radar if its sub-sampling rate is between 0.4 and 0.7. For p larger than 0.7, the target
RCS estimation accuracy achieved by the MIMO-MC radar is worse than that achieved
by the traditional radar because small ESINRs for p ≥ 0.7 introduce high distortion
in the completed data matrix. The results in Figure 10.10 can be used to guide the
selection of the radar sub-sampling rate $p$. For the best target RCS estimation accuracy,
$p = 0.6$ is the best choice, while for the biggest savings in terms of samples with
performance similar to that of traditional radars, $p = 0.4$ is the best choice. Since there is no
closed-form solution for the joint design problem, it is difficult to provide a theoretical
justification for these observations.
Based on these results, we conclude that MIMO-MC radars can coexist with
communication systems and achieve better target RCS estimation than traditional radars, while
saving up to 60% of data samples. This significant advantage is introduced by sparse
sensing (i.e., sub-sampling) in MIMO-MC radars, as discussed in Section 10.4.4.
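The link between the sub-sampling rate $p$ and the sample savings can be illustrated with a random sampling mask. The sketch below draws each entry of the receive data matrix independently with probability $p$; this uniformly random scheme is an assumption made for illustration (the chapter designs the sub-sampling scheme jointly with the precoders), and the matrix size and function names are hypothetical.

```python
import random


def bernoulli_sampling_mask(rng, n_rows, n_cols, p):
    """0/1 mask in which each entry is kept independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n_cols)]
            for _ in range(n_rows)]


def sample_savings(mask):
    """Fraction of entries that are NOT acquired."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total


rng = random.Random(1)
# 8 receive antennas (as in Section 10.5.3) and a hypothetical 512 samples per pulse.
mask = bernoulli_sampling_mask(rng, 8, 512, p=0.4)
savings = sample_savings(mask)
print(f"savings: {savings:.1%}")
```

With $p = 0.4$ roughly 60% of the entries are skipped, which is the savings figure quoted above; the exact fraction fluctuates slightly from one random mask to the next.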
10.6 Conclusions
In this chapter, we have considered the coexistence of a MIMO-MC radar and a wire-
less MIMO communication system by sharing a common carrier frequency. The radar
transmits random unitary waveforms, and both radar and communication systems use
precoders. The precoders and the radar sub-sampling scheme have been jointly designed
by the control center to maximize the radar SINR while meeting certain rate and power
constraints for the communication system. Random unitary waveforms can be easily
generated and updated for waveform security. We should note that the presented joint
design–based spectrum sharing method can also be applied to traditional MIMO radars,
which are a special case of MIMO-MC radars with a 100% sub-sampling rate, i.e., p = 1.
The jointly designed spectrum sharing scheme has been evaluated via extensive sim-
ulations. Specifically, we have shown that cooperative design brings a significant perfor-
mance advantage as compared to noncooperative design. The jointly designed spectrum
sharing scheme successfully focuses the transmit power towards the targets and nulli-
fies the power towards the clutter. It achieves significant improvement in ESINR, MC
relative recovery error, and target RCS estimation accuracy. We have also compared
the performance and complexity of the adaptive and the constant-rate communication
transmission schemes for radar-communication spectrum sharing. Finally, we have pro-
vided simulation-based comparison of MIMO-MC radars and traditional MIMO radars
coexisting with communication systems. We have observed that the MIMO-MC radar
achieves better performance in terms of ESINR and output SNR. Our simulations sug-
gest that MIMO-MC radars can coexist with communication systems and achieve better
target RCS estimation than traditional radars, while saving up to 60% in data samples.
Of course, these advantages come at the cost of increased computation for MC.
We should note that the constraint requiring that the number of targets is smaller
than the number of radar antennas results in an inefficient usage of the MIMO radar
degrees of freedom. However, the high resolution of traditional MIMO radar is retained
by MIMO-MC radars with a great reduction of sample and hardware complexity. The
considered signal model is for narrow-band waveforms. Broadband MIMO systems
typically use OFDM waveforms [16]. In such a case, the joint design still applies to
individual component carriers. This would substantially expand the application scenarios of
the results presented in this chapter.
References
[1] “Realizing the full potential of government-held spectrum to spur economic growth,” The
President’s Council of Advisors on Science and Technology (PCAST), technical report, July
2012. [Online]. www.dtic.mil/dtic/tr/fulltext/u2/a565091.pdf.
[2] D. Cabric, I. D. O’Donnell, M. S. Chen, and R. W. Brodersen, “Spectrum sharing radios,”
IEEE Circuits and Systems Magazine, vol. 6, no. 2, pp. 30–45, 2006.
[3] “FCC proposes innovative small cell use in 3.5 GHz band,” Federal Communications
Commission (FCC), news release, December 2012. [Online]. apps.fcc.gov/edocs_public/
attachmatch/DOC-317911A1.pdf.
[4] G. Locke and L. E. Strickling, “An assessment of the near-term viability of accommodating
wireless broadband systems in the 1675–1710 MHz, 1755–1780 MHz, 3500–3650 MHz,
and 4200–4220 MHz, 4380–4400 MHz bands,” US Dept. of Commerce, the National
Telecommunications and Information Administration, technical report TR-13-490, 2012.
[5] E. Drocella, J. Richards, R. Sole, F. Najmy, A. Lundy, and P. McKenna, “3.5 GHz exclusion
zone analyses and methodology,” US Dept. of Commerce, the National Telecommunications
and Information Administration, technical report TR-15-517, 2015.
[6] F. H. Sanders, R. L. Sole, B. L. Bedford, D. Franc, and T. Pawlowitz, “Effects of RF
interference on radar receivers,” US Dept. of Commerce, the National Telecommunications
[7] A. Lackpour, M. Luddy, and J. Winters, “Overview of interference mitigation techniques
between WiMAX networks and ground based radar,” in 20th Annual Wireless and Optical
Communications Conference, April 2011, pp. 1–5.
[8] F. H. Sanders, R. L. Sole, J. E. Carroll, G. S. Secrest, and T. L. Allmon, “Analysis and res-
olution of RF interference to radars operating in the band 2700–2900 MHz from broadband
communication transmitters,” US Dept. of Commerce, the National Telecommunications
[9] M. R. Bell, N. Devroye, D. Erricolo, T. Koduri, S. Rao, and D. Tuninetti, “Results on
spectrum sharing between a radar and a communications system,” in 2014 International
Conference on Electromagnetics in Advanced Applications (ICEAA), 2014, pp. 826–829.
[10] Q. Zhao and B. M. Sadler, “A survey of dynamic spectrum access,” IEEE Signal Processing
Magazine, vol. 24, no. 3, pp. 79–89, May 2007.
[11] E. Hossain, D. Niyato, and Z. Han, Dynamic Spectrum Access and Management in Cognitive
Radio Networks. Cambridge University Press, 2009.
[12] L. S. Wang, J. P. McGeehan, C. Williams, and A. Doufexi, “Application of cooperative
sensing in radar-communications coexistence,” IET Communications, vol. 2, no. 6, pp. 856–
868, July 2008.
[13] S. S. Bhat, R. M. Narayanan, and M. Rangaswamy, “Bandwidth sharing and scheduling
for multimodal radar with communications and tracking,” in IEEE Sensor Array and
Multichannel Signal Processing Workshop, June 2012, pp. 233–236.
[14] R. Saruthirathanaworakun, J. M. Peha, and L. M. Correia, “Opportunistic sharing between
rotating radar and cellular,” IEEE Journal on Selected Areas in Communications, vol. 30,
no. 10, pp. 1900–1910, 2012.
[15] S. C. Surender, R. M. Narayanan, and C. R. Das, “Performance analysis of communications
& radar coexistence in a covert UWB OSA system,” in IEEE Global Telecommunications
Conference, 2010, pp. 1–5.
[16] S. Gogineni, M. Rangaswamy, and A. Nehorai, “Multi-modal OFDM waveform design,” in
IEEE Radar Conference, April 2013, pp. 1–5.
[17] A. Turlapaty and Y. Jin, “A joint design of transmit waveforms for radar and communications
systems in coexistence,” in IEEE Radar Conference, 2014, pp. 0315–0319.
[18] A. Aubry, A. De Maio, M. Piezzo, and A. Farina, “Radar waveform design in a
spectrally crowded environment via nonconvex quadratic optimization,” IEEE Transactions on
Aerospace and Electronic Systems, vol. 50, no. 2, pp. 1138–1152, 2014.
[19] A. Aubry, A. De Maio, Y. Huang, M. Piezzo, and A. Farina, “A new radar waveform
design algorithm with improved feasibility for spectral coexistence,” IEEE Transactions on
Aerospace and Electronic Systems, vol. 51, no. 2, pp. 1029–1038, April 2015.
[20] K. Huang, M. Bica, U. Mitra, and V. Koivunen, “Radar waveform design in spectrum sharing
environment: Coexistence and cognition,” in IEEE Radar Conference, 2015, pp. 1698–1703.
[21] M. Bica, K. W. Huang, V. Koivunen, and U. Mitra, “Mutual information based radar
waveform design for joint radar and cellular communication systems,” in 2016 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), March
2016, pp. 3671–3675.
[22] S. Sodagari, A. Khawar, T. C. Clancy, and R. McGwier, “A projection based approach
for radar and telecommunication systems coexistence,” in IEEE Global Telecommunication
Conference, December 2012, pp. 5010–5014.
[23] A. Babaei, W. H. Tranter, and T. Bose, “A practical precoding approach for
radar/communications spectrum sharing,” in 8th International Conference on Cognitive
Radio Oriented Wireless Networks, July 2013, pp. 13–18.
[24] S. Amuru, R. M. Buehrer, R. Tandon, and S. Sodagari, “MIMO radar waveform design to
support spectrum sharing,” in IEEE Military Communication Conference, November 2013,
pp. 1535–1540.
[25] A. Khawar, A. Abdel-Hadi, and T. C. Clancy, “Spectrum sharing between S-band radar and
LTE cellular system: A spatial approach,” in IEEE International Symposium on Dynamic
Spectrum Access Networks, April 2014, pp. 7–14.
[26] C. Shahriar, A. Abdelhadi, and T. C. Clancy, “Overlapped-MIMO radar waveform design
for coexistence with communication systems,” in IEEE Wireless Communications and
Networking Conference, 2015, pp. 223–228.
[27] A. Khawar, A. Abdelhadi, and T. C. Clancy, MIMO Radar Waveform Design for Spectrum
Sharing with Cellular Systems: A MATLAB Based Approach. Springer, 2016.
[28] H. Deng and B. Himed, “Interference mitigation processing for spectrum-sharing between
radar and wireless communications systems,” IEEE Transactions on Aerospace and Elec-
tronic Systems, vol. 49, no. 3, pp. 1911–1919, July 2013.
[29] A. Hassanien, M. G. Amin, Y. D. Zhang, and F. Ahmad, “Signaling strategies for dual-
function radar-communications: An overview,” IEEE Aerospace and Electronic Systems
Magazine, vol. 31, no. 10, pp. 36–45, October 2016.
[30] B. Li and A. P. Petropulu, “Spectrum sharing between matrix completion based MIMO
radars and a MIMO communication system,” in IEEE International Conference on Acous-
tics, Speech and Signal Processing, April 2015, pp. 2444–2448.
[31] B. Li, A. P. Petropulu, and W. Trappe, “Optimum co-design for spectrum sharing between
matrix completion based MIMO radars and a MIMO communication system,” IEEE
Transactions on Signal Processing, vol. 64, no. 17, pp. 4562–4575, September 2016.
[32] B. Li and A. P. Petropulu, “Radar precoding for spectrum sharing between matrix comple-
tion based MIMO radars and a MIMO communication system,” in IEEE Global Conference
on Signal and Information Processing, December 2015, pp. 737–741.
[33] B. Li, H. Kumar, and A. P. Petropulu, “A joint design approach for spectrum sharing between
radar and communication systems,” in IEEE International Conference on Acoustics, Speech
and Signal Processing, March 2016, pp. 3306–3310.
[34] B. Li and A. P. Petropulu, “MIMO radar and communication spectrum sharing with clutter
mitigation,” in IEEE Radar Conference, May 2016, pp. 1–6.
[35] B. Li and A. P. Petropulu, “Matrix completion based MIMO radars with clutter and
interference mitigation via transmit precoding,” in 2017 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), March 2017, pp. 3216–3220.
[36] B. Li and A. P. Petropulu, “Joint transmit designs for coexistence of MIMO wireless
communications and sparse sensing radars in clutter,” IEEE Transactions on Aerospace and
Electronic Systems, vol. 53, no. 6, pp. 2846–2864, December 2017.
[37] S. Sun, W. Bajwa, and A. P. Petropulu, “MIMO-MC radar: A MIMO radar approach based
on matrix completion,” IEEE Transactions on Aerospace and Electronic Systems, vol. 51,
no. 3, pp. 1839–1852, July 2015.
[38] E. J. Candès and Y. Plan, “Matrix completion with noise,” Proceedings of the IEEE, vol. 98,
no. 6, pp. 925–936, June 2010.
[39] D. S. Kalogerias and A. P. Petropulu, “Matrix completion in colocated MIMO radar: Recov-
erability, bounds and theoretical guarantees,” IEEE Transactions on Signal Processing,
vol. 62, no. 2, pp. 309–321, January 2014.
[40] C. Chen and P. P. Vaidyanathan, “Compressed sensing in MIMO radar,” in Asilomar
Conference on Signals, Systems and Computers, 2008, pp. 41–44.
[41] Y. Yu, A. P. Petropulu, and H. V. Poor, “MIMO radar using compressive sampling,” IEEE
Journal of Selected Topics in Signal Processing, vol. 4, no. 1, pp. 146–163, February 2010.
[42] M. G. Amin, Compressive Sensing for Urban Radar. CRC Press, 2014.
[43] S. Sun and A. P. Petropulu, “Waveform design for MIMO radars with matrix completion,”
IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 8, pp. 1400–1414,
December 2015.
[45] H. Krim and M. Viberg, “Two decades of array signal processing research: The parametric
approach,” IEEE Signal Processing Magazine, vol. 13, no. 4, pp. 67–94, 1996.
[46] E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations
of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2009.
[47] K. Zyczkowski and M. Kus, “Random unitary matrices,” Journal of Physics A: Mathemati-
cal and General, vol. 27, no. 12, p. 4235, 1994.
[48] B. Laurent and P. Massart, “Adaptive estimation of a quadratic functional by model
selection,” Annals of Statistics, vol. 28, no. 5, pp. 1302–1338, 2000.
[49] T. Jiang, “How many entries of a typical orthogonal matrix can be approximated by
independent normals?” The Annals of Probability, vol. 34, no. 4, pp. 1497–1529, 2006.
[50] “Amendment of the Commission’s rules with regard to commercial operations in the
3550–3650 MHz band,” Federal Communications Commission (FCC), technical report,
April 2015. [Online]. https://apps.fcc.gov/edocs_public/attachmatch/FCC-15-47A1.pdf.
[51] C. Kopp, “Search and acquisition radars (S-band, X-band),” Air Power Australia,
technical report APA-TR-2009-0101, 2009. [Online]. www.ausairpower.net/APA-
Acquisition-GCI.html.
[52] “Radar performance,” Radtec Engineering Inc., technical report, 2015. [Online]. http://
radar-sales.com/PDFs/Performance_RDR%26TDR.pdf.
[53] T. Rappaport, Wireless Communications Principles and Practice. Prentice Hall, 2001.
[54] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University
Press, 2005.
[55] A. Goldsmith, Wireless Communications. Cambridge University Press, 2005.
[56] R. P. Jover, “LTE PHY fundamentals,” technical report, 2015. [Online]. www.slideshare.net/
PrashantSengar/lte-phy-fundamentals-50510450.
[57] “LTE in a nutshell: The physical layer,” Telesystem Innovations Inc., white paper, 2010.
[58] J. G. Andrews, A. Ghosh, and R. Muhamed, Fundamentals of WiMAX: Understanding
Broadband Wireless Networking. Prentice Hall, 2007.
[59] B. Li and A. P. Petropulu, “Distributed MIMO radar based on sparse sensing: Analysis and
efficient implementation,” IEEE Transactions on Aerospace and Electronic Systems, vol. 51,
no. 4, pp. 3055–3070, October 2015.
[60] M. Filo, A. Hossain, A. R. Biswas, and R. Piesiewicz, “Cognitive pilot channel: Enabler
for radio systems coexistence,” in 2nd International Workshop on Cognitive Radio and
Advanced Spectrum Management, May 2009, pp. 17–23.
[61] R. Rogalin, O. Y. Bursalioglu, and H. Papadopoulos, “Scalable synchronization and
reciprocity calibration for distributed multiuser MIMO,” IEEE Transactions on Wireless
Communications, vol. 13, no. 4, pp. 1815–1831, 2014.
[62] R. Zhang and Y. Liang, “Exploiting multi-antennas for opportunistic spectrum sharing in
cognitive radio networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 2,
no. 1, pp. 88–102, February 2008.
[63] R. Zhang, Y. Liang, and S. Cui, “Dynamic resource allocation in cognitive radio networks,”
IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 102–114, May 2010.
[64] S. J. Kim and G. B. Giannakis, “Optimal resource allocation for MIMO ad hoc cognitive
radio networks,” IEEE Transactions on Information Theory, vol. 57, no. 5, pp. 3117–3131,
May 2011.
[65] L. Lu, X. Zhou, U. Onunkwo, and G. Y. Li, “Ten years of research in spectrum sensing and
sharing in cognitive radio,” EURASIP Journal on Wireless Communications and Networking,
vol. 2012, p. 28, 2012.
[66] K. T. Phan, S. A. Vorobyov, N. D. Sidiropoulos, and C. Tellambura, “Spectrum sharing in
wireless networks via QoS-aware secondary multicast beamforming,” IEEE Transactions
on Signal Processing, vol. 57, no. 6, pp. 2323–2335, 2009.
[67] H. Du and T. Ratnarajah, “Robust utility maximization and admission control for a MIMO
cognitive radio network,” IEEE Transactions on Vehicular Technology, vol. 62, no. 4, pp.
1707–1718, 2013.
[68] X. Hou and C. Yang, “How much feedback overhead is required for base station cooperative
transmission to outperform non-cooperative transmission?” in 2011 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, pp. 3416–3419.
[69] P. Stoica, J. Li, and Y. Xie, “On probing signal design for MIMO radar,” IEEE Transactions
on Signal Processing, vol. 55, no. 8, pp. 4151–4161, 2007.
[70] G. Cui, H. Li, and M. Rangaswamy, “MIMO radar waveform design with constant modulus
and similarity constraints,” IEEE Transactions on Signal Processing, vol. 62, no. 2, pp. 343–
353, 2014.
[71] R. Mudumbai, G. Barriac, and U. Madhow, “On the feasibility of distributed beamforming
in wireless networks,” IEEE Transactions on Wireless Communications, vol. 6, no. 5,
pp. 1754–1763, 2007.
[72] C. Chen and P. P. Vaidyanathan, “MIMO radar ambiguity properties and optimization using
frequency-hopping waveforms,” IEEE Transactions on Signal Processing, vol. 56, no. 12,
pp. 5926–5936, 2008.
[73] S. N. Diggavi and T. M. Cover, “The worst additive noise under a covariance constraint,”
IEEE Transactions on Information Theory, vol. 47, no. 7, pp. 3072–3081, November 2001.
[74] Z. Chen, H. Li, G. Cui, and M. Rangaswamy, “Adaptive transmit and receive beamforming
for interference mitigation,” IEEE Signal Processing Letters, vol. 21, no. 2, pp. 235–239,
February 2014.
[75] C. Chen and P. P. Vaidyanathan, MIMO Radar Spacetime Adaptive Processing and Signal
Design. John Wiley & Sons, 2008, pp. 235–281.
[76] S. Bhojanapalli and P. Jain, “Universal matrix completion,” in Proceedings of The 31st
International Conference on Machine Learning, 2014, pp. 1881–1889.
[77] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[78] R. G. Bland, D. Goldfarb, and M. J. Todd, “The ellipsoid method: A survey,” Operations
Research, vol. 29, no. 6, pp. 1039–1091, 1981.
[79] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics
Quarterly, vol. 2, no. 1-2, pp. 83–97, 1955.
[80] Q. Li and W.-K. Ma, “Optimal and robust transmit designs for MISO channel secrecy
by semidefinite programming,” IEEE Transactions on Signal Processing, vol. 59, no. 8,
pp. 3799–3812, 2011.
[81] J. Yeh, Real Analysis: Theory of Measure and Integration. World Scientific, 2006.
[82] “3GPP TS LTE evolved universal terrestrial radio access (E-UTRA) physical layer proce-
dures,” 3rd Generation Partnership Project (3GPP), Technical Specification TS 36.213 V8.0,
2009.
[83] M. T. Kawser, B. Hamid, N. Hasan, M. S. Alam, and M. M. Rahman, “Downlink SNR to
CQI mapping for different multiple antenna techniques in LTE,” International Journal of
Information and Electronics Engineering, vol. 2, no. 5, p. 757, 2012.
11 Compressed Sensing Methods for
Radar Imaging in the Presence of
Phase Errors and Moving Objects
Ahmed Shaharyar Khwaja,∗ Naime Ozben Onhon, and Mujdat Cetin
11.1 Introduction and Outline of the Chapter
Compressed sensing (CS) is a useful tool for processing sparse signals, i.e., signals
that have a few large values and many small or zero values. In [1], the authors state
that if an object has a sparse representation in a basis, a small number of nonadaptive
measurements contains enough information to reconstruct the object via basis pursuit.
A dictionary for sparse representation can be implicitly defined through a transform or
explicitly defined through a collection of signals that can represent an object or a signal
in a compressed form, e.g., a piece-wise smooth object can be represented by only a few
coefficients after taking a wavelet transform, or a narrow-band signal can be described
in terms of a few entries of a dictionary consisting of exponential signals with varying
frequencies. In the former case, CS reconstruction can be described as a dictionary-less
approach, whereas in the latter case, it can be described as a dictionary-based approach.
In [2], the authors explain that a sparse or piece-wise constant signal can be recovered
from a randomly chosen partial set of its Fourier coefficients. Candès and Tao show in [3] that
a measured coded signal corrupted by errors can be recovered using linear programming,
provided the signal is sparse and there are certain constraints on the coding matrix.
References [4,5] provide a review of CS, as well as identify potential application areas
for data compression, channel coding, inverse problems, data acquisition, single-pixel
imaging, etc.
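The dictionary-based notion of sparsity mentioned above can be made concrete with a small numerical example. The sketch below takes the discrete Fourier basis as the dictionary of exponential signals (an assumption made for illustration) and shows that a single complex tone whose frequency lies exactly on the dictionary grid is described by one coefficient; the signal length and tone index are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Normalized DFT: correlation of the signal with each exponential dictionary entry."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]


n = 64
tone_bin = 5  # hypothetical on-grid frequency index
signal = [cmath.exp(2j * math.pi * tone_bin * t / n) for t in range(n)]

coeffs = dft(signal)
# Indices of dictionary entries with non-negligible weight.
support = [k for k, c in enumerate(coeffs) if abs(c) > 1e-6]
print(support)
```

Only the dictionary entry matching the tone carries weight here. A tone whose frequency falls between grid points would spread over several entries, which is one reason the choice of dictionary matters in practice.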
The purpose of radar imaging is to extract an underlying signal present in the elec-
tromagnetic data received by an antenna. The underlying signal can be the image of an
area or of an object being observed by the antenna. In many cases, these signals are
sparse, thus making them suitable for applications of CS. Reference [6] introduces a
new approach to radar imaging based on CS where a low-dimensional, nonadaptive,
linear projection is used to acquire an efficient representation of a compressible signal
directly, using just a few measurements. The use of this approach can result in the
design of new radar systems, where a matched filter is not needed at the receiver and
the analog-to-digital conversion bandwidth can be reduced. Reference [7] demonstrates
the advantage of CS for radar imaging by achieving higher resolution and artifact-free
∗ Ahmed Shaharyar Khwaja is supported by The Scientific and Technological Research Council of Turkey
(TUBITAK) - 2236 Co-Funded Brain circulation grant.
images from randomly undersampled data. A new radar imaging system based on CS is
presented in [8].
In this chapter, we present CS and its applications in radar imaging, specifically
synthetic aperture radar (SAR) and inverse synthetic aperture radar (ISAR) imaging,
focusing on scenarios involving unknown motion. This chapter is organized as follows:
• We first provide the relevant mathematical expressions for CS and SAR, and then
formulate the problem of CS SAR imaging.
• Thereafter, we consider the case where there are unknown motion errors present
during the SAR acquisition process. The compensation of these errors blindly is
called autofocus. We formulate the problem, present a general autofocus-based
CS solution, and then review existing literature on this topic.
• In the next section, we formulate the problem of SAR moving target imaging,
discuss the types of existing CS-based solutions and present a survey of existing
methods in the literature.
• In the next section, we present CS ISAR imaging, followed by a literature review
of this topic.
We would like to mention that in this chapter, all equations that represent functions
with arguments show matrices or vectors obtained using each combination of values of
their arguments, e.g., $S(k,y) = \exp\{-j2ky\}$ represents a function. The arguments of
this function are vectors $k$ and $y$ having lengths equal to $N_t$ and $N_y$, respectively. This
function results in the creation of a two-dimensional (2D) matrix $S$ whose numbers of
rows and columns are equal to $N_y$ and $N_t$, respectively. The $q$th row of $S$ is obtained
by using all entries in $k$ and the $q$th entry of $y$, i.e., $\exp\{-j2ky_q\}$. All other equations
not representing functions with arguments, e.g., $z = \Psi\alpha$, are obtained by following the
normal matrix multiplication rules.
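The convention above maps naturally onto array broadcasting. The following is a minimal
sketch in Python/NumPy, with hypothetical sizes and sample values chosen only for
illustration, of how a function with vector arguments produces an $N_y \times N_t$ matrix:

```python
import numpy as np

# Hypothetical sizes and sample values, chosen only for illustration.
Nt, Ny = 4, 3
k = np.linspace(1.0, 2.0, Nt)      # range wavenumber samples (length Nt)
y = np.linspace(-1.0, 1.0, Ny)     # along-track positions (length Ny)

# Broadcasting a (Ny, 1) column against a (1, Nt) row evaluates the function
# for every combination of arguments, giving an Ny x Nt matrix.
S = np.exp(-1j * 2 * k[None, :] * y[:, None])

# The qth row uses all entries of k and the qth entry of y.
q = 1
row_q = np.exp(-1j * 2 * k * y[q])
```

Here `S[q]` and `row_q` hold the same values, which is the row-wise rule stated in the text.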
11.2 Compressed Sensing and Radar Imaging
Compressed sensing has been shown to be a useful tool in applications of radar imaging
[9,10]. Reference [11] demonstrates the advantage of CS for achieving higher resolution
compared to traditional processing techniques. In [12] and [13], the authors validate the
applications of CS with real data by showing that even after many missing samples in the
raw data, an image can still be reconstructed without any loss of resolution. Examples of
traditional reconstruction with full and limited data, and CS-based reconstruction using
limited data, are shown in Figure 11.1. These examples have been taken from [9]. It can
be observed that CS-based reconstruction can generate a high resolution image even
from limited data.
The concept of CS states that a sparse unknown signal can be recovered from incom-
plete sets of linear measurements by a specifically designed nonlinear recovery algo-
rithm, offering the possibility of signal compression, hardware simplification, as well
as emphasizing certain features in an image [14]. The CS theory mainly involves
formulating a system model first and, subsequently, incorporating the model into a
least squares–based solution regularized by $l_0$-norm minimization to encourage sparse
solutions. Recent work on compressed sensing has shown that to make the solution
scalable for large problems, the $l_0$-norm can be replaced by an $l_p$-norm, with $0 < p \le 1$
[15,16].
Figure 11.1 The reconstructions of the Slicy target from the MSTAR data set. (a) The reference
image reconstructed from high-bandwidth data. (b) The conventional image reconstructed from
limited-bandwidth data. (c) Image reconstructed from limited-bandwidth data using compressed
sensing. (Taken from [9] with permission.)
11.2.1 Compressed Sensing

We first provide the basic expressions involved in CS. Let $z$ be a signal that is sampled
and $y$ be the measured samples of the signal. The size of $z$ is $N \times 1$ and the size of
$y$ is $K \times 1$, where $K \ll N$. Let $A$ be a $K \times N$ measurement matrix used to acquire
the measured samples. In many cases, the signal $z$ is compressible or sparse in some
dictionary $\Psi$ of dimensions $N \times M$, $M \ge N$. The signal $z$ can be written as $z = \Psi\alpha$.
The measured samples $y$ can then be written in terms of $A$ and $\Psi$ as
\[ y = A\Psi\alpha + \eta \tag{11.1} \]
or
\[ y = \Theta\alpha + \eta, \tag{11.2} \]
where $\eta$ denotes noise and $\Theta = A\Psi$. If the number of measurements is less than the
number of unknown variables, the recovery of $\alpha$ from $y$ is hampered by the fact that
there can be many solutions to this under-determined problem. However, if $\Theta$ satisfies
a property known as the restricted isometry property (RIP), $\alpha$ can be recovered by
solving the following problem:
\[ \min_{\alpha} \|y - \Theta\alpha\|_2^2 + \lambda\|\alpha\|_0. \tag{11.3} \]
Informally, the RIP requires that the columns of $\Theta$ not be too similar to one another,
which ensures a sufficient number of linearly independent measurements. The above problem is a
combinatorial optimization problem and is solved by either using a relaxed version
of the problem that replaces the $l_0$-norm with an $l_p$-norm, e.g., with $p = 1$, or using
greedy algorithms. The first term in (11.3) represents the closeness of the solution to the
observations, while the second term represents a priori sparse information. The term $\lambda$
is used to balance these two terms and is known as a hyper-parameter. Its choice can
influence the solution. Note that the relaxed version of (11.3) with $p = 1$ can also be
obtained by assuming a Laplacian prior for $\alpha$ and a Gaussian distribution for the noise,
and subsequently solving for $\alpha$ using maximum a posteriori estimation.
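The maximum a posteriori (MAP) remark can be made concrete with a short sketch. The
derivation below writes $\Theta$ for the matrix that multiplies $\alpha$ in (11.2) and assumes
real-valued quantities, independent Gaussian noise with variance $\sigma_\eta^2$, and an
independent Laplacian prior with scale $b$. The symbols $\sigma_\eta$ and $b$ are introduced here
for illustration only. Under these assumptions, the MAP estimate coincides with the
$l_1$-regularized ($p = 1$) form of the problem:

```latex
\begin{align*}
\hat{\alpha}_{\mathrm{MAP}}
  &= \arg\max_{\alpha}\; p(y \mid \alpha)\, p(\alpha) \\
  &= \arg\max_{\alpha}\;
     \exp\!\Big(-\tfrac{1}{2\sigma_\eta^{2}}\,\|y-\Theta\alpha\|_2^2\Big)\,
     \exp\!\Big(-\tfrac{1}{b}\,\|\alpha\|_1\Big) \\
  &= \arg\min_{\alpha}\; \|y-\Theta\alpha\|_2^2 + \lambda\,\|\alpha\|_1,
  \qquad \lambda = \frac{2\sigma_\eta^{2}}{b}.
\end{align*}
```

Read this way, the hyper-parameter grows with the assumed noise variance and with the
tightness of the prior (smaller $b$), which is one way to interpret its balancing role.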
11.2.2 Synthetic Aperture Radar Imaging

A flat scene, divided into nr × ny scatterers in the range (across-track) and azimuth
(along-track) directions, respectively, is considered. A SAR antenna moves along a
certain path in the azimuth direction with velocity $V$. The antenna emits a chirp pulse $\bar{p}(t)$
from every azimuth position, also called aperture position:
\[ \bar{p}(t) = \mathrm{rect}\!\left(\frac{t}{T_p}\right)\cos\!\left(2\pi f_c t + \pi K t^2\right). \tag{11.4} \]
This pulse has a length $T_p$, a carrier frequency $f_c$, and a chirp rate $K$, and $t$ is the
across-track time sampled at a frequency $f_t$, i.e., $t = 0, \frac{1}{f_t}, \frac{2}{f_t}, \ldots, \frac{N_t-1}{f_t}$,
with $N_t$ being the total number of across-track time samples. The received pulse is a
delayed version of the transmitted pulse and
this delay is given by the two-way distance between the antenna and the scatterer. The
total received raw data are given by an integration of the received pulse over the whole
scene. However, as a discretized scene is considered, the raw data are considered as a
sum of signals received from each of the scatterers observed by the antenna at different
positions in the azimuth direction, and can be described by the following equation after
demodulation:
\[ S(t,y) = \sum_{n=1}^{N} \sigma(r_n,y_n)\,P_n(t,y), \quad N = n_r \times n_y, \tag{11.5} \]
where
\[ P_n(t,y) = p\left(t - 2d_n(y)/c\right), \quad n = \{1,2,\ldots,N\} \tag{11.6} \]
and $y$ is the along-track distance sampled at a frequency $f_\tau/V$, i.e.,
$y = 0, \frac{V}{f_\tau}, \frac{2V}{f_\tau}, \ldots, \frac{V(N_y-1)}{f_\tau}$. The total number of along-track
samples is given by $N_y$ and
\[ p\left(t - 2d_n(y)/c\right) = \mathrm{rect}\!\left(\frac{t - 2d_n(y)/c}{T_p}\right)
   \exp\!\left\{-j4\pi f_c\,\frac{d_n(y)}{c} + j\pi K\left(t - 2d_n(y)/c\right)^2\right\}, \tag{11.7} \]
where $c$ is the speed of light.
The backscattering coefficient of the $n$th scatterer is represented by $\sigma(r_n,y_n)$. These
backscattering coefficients constitute a reflectivity map $\sigma(r,y)$, whose discretized
version is defined as follows:
\[ \Sigma(r,y) = \sum_{n=1}^{N} \sigma(r_n,y_n)\,\delta(r - r_n, y - y_n), \tag{11.8} \]
where $\delta(\cdot,\cdot)$ represents a 2D Dirac pulse. This reflectivity map represents the intensity
of each scatterer in an observed scene. The sensor-target distance is given as
$d_n(y) = \sqrt{r_n^2 + (y - y_n)^2}$, where $r_n = \sqrt{x_n^2 + H^2}$ is the slant range, $x_n$ is the ground range,
and $H$ is the height of the SAR antenna above the flat scene. Different SAR processing
algorithms, e.g., the wavefront reconstruction algorithm [17], the omega-k algorithm
[18], the range-Doppler algorithm [19], or the chirp scaling algorithm [20], can be used to
generate images from the raw data.
The expression for raw data in a one-dimensional (1D) wavenumber domain calculated
using the principle of stationary phase (POSP) [19] is given as
\[ S(k,y) = p(\tilde{k}) \sum_{n} \sigma(r_n,y_n) \exp\!\left\{-j2k\sqrt{r_n^2 + (y - y_n)^2}\right\}, \tag{11.9} \]
where $p(\tilde{k})$ is the Fourier transform (FT) of $p(t)$, $\tilde{k}$ is the range wavenumber, i.e.,
$\tilde{k} = \{-N_t/2 \times 2\pi f_t/(N_t c), (-N_t/2+1) \times 2\pi f_t/(N_t c), \ldots, (N_t/2-1) \times 2\pi f_t/(N_t c)\}$,
$k = k_c + \tilde{k}$, and $k_c = 2\pi f_c/c$.
11.2.3 Compressed Sensing SAR Imaging

As a first instance, we note that the received raw data can be considered as a weighted
sum of pulses $P_n(\cdot)$, given by (11.5). We reshape each of these pulses into one column of size
$N_tN_y \times 1$ using a function $\xi(\cdot)$ that converts a matrix into a column, i.e.,
\[ p_n = \xi(P_n(t,y)), \quad n = \{1,2,\ldots,N\} \tag{11.10} \]
and subsequently arrange them as different columns of a matrix of size $N_tN_y \times N$ as
follows:
\[ \Psi = [p_1, p_2, \ldots, p_N]. \tag{11.11} \]
Similarly, arranging the reflectivity as a column of size $N \times 1$ as follows:
\[ \sigma = [\sigma_1, \sigma_2, \ldots, \sigma_N]^T, \tag{11.12} \]
where $\sigma_n = \sigma(r_n,y_n)$, $n = \{1,2,\ldots,N\}$, we can write the received data $s$ as a column of
size $N_tN_y \times 1$ as follows:
\[ s = \Psi\sigma + \eta, \tag{11.13} \]
where $\eta$ denotes noise. A nonzero value of the reflectivity selects the column of $\Psi$
that holds the raw data of the scatterer at the corresponding position.
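The construction in (11.10)-(11.13) can be sketched in a few lines of Python/NumPy. This
is a minimal illustration, not the authors' implementation: all numerical values are
hypothetical, the chirp rate is named `Kr` to avoid confusion with the number of
measurements, the reshaping function $\xi(\cdot)$ is assumed to be row-major flattening, and
noise is omitted.

```python
import numpy as np

# Hypothetical radar and scene parameters, chosen only to keep the example small.
c, fc = 3e8, 1e9             # speed of light (m/s), carrier frequency (Hz)
Kr, Tp = 1e13, 2e-6          # chirp rate (Hz/s), pulse length (s)
Nt, Ny = 64, 32              # across-track and along-track samples
r0 = 1000.0                  # slant range of the scene centre (m)
t = 2 * r0 / c + (np.arange(Nt) - Nt // 2) / 50e6   # fast time around scene centre
y = np.linspace(-50.0, 50.0, Ny)                    # aperture positions (m)

def pulse(r_n, y_n):
    """Demodulated echo P_n(t, y) of (11.6)-(11.7) as an Ny x Nt matrix."""
    d = np.sqrt(r_n**2 + (y[:, None] - y_n) ** 2)   # sensor-target distance d_n(y)
    tau = t[None, :] - 2 * d / c                    # delayed fast time
    rect = (np.abs(tau) <= Tp / 2).astype(float)
    return rect * np.exp(-1j * 4 * np.pi * fc * d / c + 1j * np.pi * Kr * tau**2)

# Scene grid of nr x ny candidate scatterer positions.
r_grid = r0 + np.array([-6.0, 0.0, 6.0])
y_grid = np.array([-10.0, 0.0, 10.0])
positions = [(r, yy) for r in r_grid for yy in y_grid]

# Each vectorized pulse becomes one column of the dictionary Psi, as in (11.11).
Psi = np.stack([pulse(r, yy).ravel() for r, yy in positions], axis=1)

sigma = np.zeros(len(positions), dtype=complex)
sigma[4] = 1.0 + 0.5j        # a single nonzero scatterer at the grid centre
s = Psi @ sigma              # noise-free version of (11.13)
```

With a single nonzero reflectivity, `s` is simply the corresponding column of `Psi` scaled
by that reflectivity, which is the column-selection behaviour described in the text.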
Figure 11.2 Equivalence between CS formulation and SAR imaging.
In case of data loss or undersampling in the azimuth and/or range, we can consider
different rows or columns to be missing in $\Psi$, thus we can rewrite (11.13) as
\[ s = \Theta\sigma + \eta, \tag{11.14} \]
where $\Theta = A\Psi$ can be considered as an undersampled basis, which is equivalent to
the undersampling operator/measurement matrix $A$ multiplied by the basis. The
undersampling operator $A$ can be considered as a downsampling operator in the across-track
direction or along-track direction. It can either represent an analog-to-digital converter
for across-track sampling or a sampling strategy in the along-track direction such that the
RIP is satisfied. The measurement matrix has a size $K \times N_tN_y$, where $K < N_tN_y$.
This model is shown in Figure 11.2. The figure shows pulses being emitted and received
at different positions in the along-track direction. The positions are given by the small
circles placed vertically and the observed scene is represented by a big circle. The pulses
varying according to radar-target distance form the dictionary. The received data consist
of these pulses weighted by the reflectivity of the observed scene. The raw data are
collected from all the positions in the along-track direction and then stored after analog-
to-digital conversion.
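One simple way to picture the undersampling operator is as a selection of rows. The
sketch below, in Python/NumPy, is an illustration under stated assumptions: the
dictionary is a random stand-in rather than a true SAR dictionary, the data are assumed
to be vectorized row by row from an $N_y \times N_t$ matrix, and half of the aperture positions
are discarded at random.

```python
import numpy as np

rng = np.random.default_rng(0)
Nt, Ny, N = 16, 8, 10                 # hypothetical sizes
# Random stand-in for the dictionary (a real one would hold the pulses P_n).
Psi = rng.standard_normal((Nt * Ny, N)) + 1j * rng.standard_normal((Nt * Ny, N))

# Keep a random half of the aperture positions (along-track undersampling).
kept_y = np.sort(rng.choice(Ny, size=Ny // 2, replace=False))
mask = np.zeros((Ny, Nt), dtype=bool)
mask[kept_y, :] = True
rows = np.flatnonzero(mask.ravel())   # retained samples in the vectorized data

A = np.eye(Nt * Ny)[rows]             # K x (Nt*Ny) selection matrix
Theta = A @ Psi                       # undersampled basis
```

Because `A` only selects rows, `Theta` equals `Psi[rows]`. In practice the selection can be
applied by indexing, so that the large matrix `A` never needs to be formed.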
To recover the reflectivity from this under-determined problem, the following condi-
tions should be met:
• The reflectivity vector $\sigma$ is sparse or compressible, which is a valid assumption
in CS SAR imaging when there are a few strong reflecting points in a scene. In
[10], the authors show the compressibility of scenes observed by SAR, which
underlines the suitability of CS for solving SAR imaging problems. An example
from [10] can be seen in Figure 11.3, where standard reconstruction is compared
to CS reconstruction using p = 1 and p = 0.8.
• For a sparsity level $s_l$, the number of linearly independent measurements $M$
should be $M = O(s_l \log_2(N_tN_y))$ [21]. In practice, CS has been successfully
applied to SAR scenes using around 50% of the measurements, which underlines
the utility of CS applied to SAR imaging.
• The matrix $\Theta$ should satisfy the RIP. This property is satisfied by random Gaussian
matrices, Fourier matrices, chirp function matrices [22], etc. The latter two
matrices are directly involved in CS SAR imaging and justify the use of CS for
SAR imaging. In [23], the authors present examples of Fourier domain sampling
strategies that acquire a lower number of samples compared to those required by
standard Nyquist sampling. The authors further demonstrate that the data obtained
by these patterns can be used for successful CS SAR reconstruction.
Figure 11.3 SAR images of a vehicle from 360° aperture. (a) Standard Fourier image.
Compressed sensing-based solution using $l_p$-norm instead of $l_0$-norm in (11.3), where
(b) $p = 1$ and (c) $p = 0.8$. (Taken from [10] with permission.)
Figure 11.4 Original resolution: 0.3 m: (a) Conventional image. (b) Compressed sensing
reconstruction. Original resolution: 0.6 m: (c) Conventional image. (d) Compressed sensing
reconstruction. (Taken from [14] with permission.)
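Verifying the RIP directly is not practical for matrices of realistic size, so a cheaper
proxy is often inspected instead. The sketch below computes the mutual coherence, i.e.,
the largest normalized inner product between two distinct columns, which captures the
idea that columns should not be too similar. It is an illustration only: coherence is
not the RIP itself, and the matrices and sizes are hypothetical.

```python
import numpy as np

def mutual_coherence(Theta):
    """Largest normalized inner product between two distinct columns."""
    cols = Theta / np.linalg.norm(Theta, axis=0, keepdims=True)
    gram = np.abs(cols.conj().T @ cols)
    np.fill_diagonal(gram, 0.0)       # ignore each column's match with itself
    return gram.max()

rng = np.random.default_rng(1)
K, N = 64, 128
gaussian = rng.standard_normal((K, N))

# Partial Fourier matrix: K randomly chosen rows of the N-point DFT matrix.
rows = rng.choice(N, size=K, replace=False)
partial_fourier = np.fft.fft(np.eye(N))[rows]

# Two nearly identical columns make a matrix poorly suited to CS.
bad = gaussian.copy()
bad[:, 1] = bad[:, 0] + 1e-3 * rng.standard_normal(K)

mu_g, mu_f, mu_bad = map(mutual_coherence, (gaussian, partial_fourier, bad))
```

A coherence close to 1, as for `bad`, indicates that two columns are almost
indistinguishable from the measurements, whereas the Gaussian and partial Fourier
examples stay below 1.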
The conditions for CS reconstruction can be satisfied for the SAR imaging problem
when there are a few strong scatterers embedded in a weak background. In this case,
the reflectivity reconstruction can be carried out by solving the following minimization
problem:
\[ \hat{\sigma} = \arg\min_{\sigma} \|s - \Theta\sigma\|_2^2 + \lambda\|\sigma\|_0. \tag{11.15} \]
This equation can be solved using different recovery methods: linear programming [2],
orthogonal matching pursuit (OMP) [24], iterative shrinkage/thresholding [25], modi-
fied quasi-Newton method [14], etc.
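As an illustration of one of these options, the sketch below applies iterative
shrinkage/thresholding (ISTA) to the $l_1$-relaxed form of (11.15). It is a minimal
example under stated assumptions, not the algorithm of [25] verbatim: the matrix `Theta`
is a random stand-in for the undersampled SAR dictionary, the data are noise-free, and
the value of `lam` is hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau, keep phases."""
    mag = np.maximum(np.abs(x), 1e-12)
    return x * np.maximum(1.0 - tau / mag, 0.0)

def ista(Theta, s, lam, n_iter=500):
    """Minimize ||s - Theta sigma||_2^2 + lam * ||sigma||_1 by ISTA."""
    L = np.linalg.norm(Theta, 2) ** 2          # squared spectral norm of Theta
    sigma = np.zeros(Theta.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = Theta.conj().T @ (Theta @ sigma - s)   # half the data-term gradient
        sigma = soft_threshold(sigma - grad / L, lam / (2 * L))
    return sigma

rng = np.random.default_rng(0)
K, N = 60, 120                                  # fewer measurements than unknowns
Theta = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2 * K)
sigma_true = np.zeros(N, dtype=complex)
sigma_true[[7, 40, 99]] = [2.0, -1.5j, 1.0 + 1.0j]
s = Theta @ sigma_true

sigma_hat = ista(Theta, s, lam=0.02)
```

The $l_1$ penalty slightly shrinks the recovered amplitudes, so a common refinement is a
final least-squares fit restricted to the detected support.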
The reconstructed reflectivity vector $\hat{\sigma}$ is then reshaped into a 2D form as follows:
\[ \hat{\Sigma}(r,y) = \xi^{-1}(\hat{\sigma}), \tag{11.16} \]
which shows the estimated value of the reflectivity for different range and azimuth
positions. This estimation does not suffer from artifacts or reduction in resolution, which
may arise if the undersampled data are used for reconstruction using traditional SAR
imaging techniques. Examples of CS reconstruction are shown in Figure 11.4, taken
from [14].
11.3 Synthetic Aperture Radar Autofocus and Compressed Sensing
In SAR imaging, the platform is assumed to be following a straight trajectory. However,
that may not be the case in airborne SAR systems due to human errors or atmospheric
turbulence. Similarly, in the case of satellite SAR systems, the ionosphere can cause
a change in the propagation path, which results in degradation in image quality [26].
Such errors, if unaccounted for during SAR imaging, will lead to appearance of image
artifacts in the processed images. A measure of the trajectory deviations of the platform
is available from the onboard GPS that can be used to compensate for these errors
in the acquired raw data [27]. However, in many cases these measurements are only
available to a certain accuracy level, and residual motion errors still remain, even after
carrying out traditional motion compensation using these measurements. This type of
error causes phase errors in the SAR data, which defocus the reconstructed image. The
compensation of these unknown phase errors is known as autofocus.
The traditional autofocus techniques perform post processing, i.e., they use conven-
tionally reconstructed [28,29] defocused images for phase error estimation. The best
known of these techniques is phase gradient autofocus (PGA) [30], which estimates
phase errors from the defocused images by isolating many single defocused targets via
center-shifting and windowing operations. The aim of the windowing operation is to
preserve the information contained in the blur footprints of the center-shifted targets
and at the same time to leave out all of the contributions from other surrounding targets
with weak reflectivities [30].
11.3.1 Synthetic Aperture Radar Autofocus

The system model for SAR imaging can be modified to take into account these unknown
motion errors. Assuming residual motion errors $r_e^n(y)$ in the range direction for the $n$th
scatterer, and $y_e(y)$ in the azimuth direction, the radar-target distance can now be
written as
\[ d_e^n(y) = \sqrt{\left(r_n - r_e^n(y)\right)^2 + \left(y - y_n - y_e(y)\right)^2}. \tag{11.17} \]
The presence of superscript $n$ in $r_e^n(y)$ indicates that these errors are dependent on the
range position of a scatterer, whereas the absence of $n$ in $y_e(y)$ means that these errors
are only dependent on the aperture positions. Please note that this model does not make
a narrow-beamwidth assumption, but considers the variation of motion errors within the
aperture for each point that is being imaged. Now the expression of raw data in the 1D
wavenumber domain can be written as
\[ S^n(k,y) = p(\tilde{k})\,\sigma(r_n,y_n) \exp\!\left\{-j2k\sqrt{r_n^2 + (y - y_n)^2}\right\} \tilde{S}_e^n(k,y), \tag{11.18} \]
where
\[ \tilde{S}_e^n(k,y) = \exp\!\left\{-j2k\left[\sqrt{\left(r_n - r_e^n(y)\right)^2 + \left(y - y_n - y_e(y)\right)^2}
   - \sqrt{r_n^2 + (y - y_n)^2}\right]\right\} \tag{11.19} \]
is the error term due to the residual motion errors.
Considering the case where the residual motion errors do not create any significant
shift in the range position, we can write the motion errors as
\[ \tilde{s}_e^n(k_c,y) = \exp\!\left\{-j2k_c\left[\sqrt{\left(r_n - r_e^n(y)\right)^2 + \left(y - y_n - y_e(y)\right)^2}
   - \sqrt{r_n^2 + (y - y_n)^2}\right]\right\} \tag{11.20} \]
and the raw data can be written as
\[ S_e^n(t,y) = \sigma(r_n,y_n)\,p\!\left(t - 2d_e^n(y)/c\right)\tilde{s}_e^n(k_c,y), \tag{11.21} \]
which differs from the raw data in the case of no motion errors by the term
$\tilde{s}_e^n(k_c,y)$.
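To give a feeling for the size of the error term in (11.20), the sketch below evaluates
it numerically for a single scatterer. All values are hypothetical, including the
sinusoidal range error of 5 mm amplitude, and the azimuth error is set to zero for
simplicity.

```python
import numpy as np

c, fc = 3e8, 10e9                      # speed of light (m/s), carrier frequency (Hz)
kc = 2 * np.pi * fc / c                # carrier wavenumber
r_n, y_n = 5000.0, 0.0                 # scatterer slant range and azimuth position (m)
y = np.linspace(-100.0, 100.0, 201)    # aperture positions (m)

# Hypothetical residual motion errors: 5 mm sinusoidal range error, no azimuth error.
r_e = 0.005 * np.sin(2 * np.pi * y / 200.0)
y_e = np.zeros_like(y)

d = np.sqrt(r_n**2 + (y - y_n) ** 2)                        # error-free distance
d_e = np.sqrt((r_n - r_e) ** 2 + (y - y_n - y_e) ** 2)      # distance of (11.17)
s_e = np.exp(-1j * 2 * kc * (d_e - d))                      # error term of (11.20)
phase = np.angle(s_e)

# First-order approximation of the phase for small range errors.
phase_approx = 2 * kc * r_e * r_n / d
```

At a 10 GHz carrier, a range error of only a few millimetres already produces a phase
error of about 2 rad, which illustrates why residual motion errors that are negligible
in terms of range shift can still defocus the image.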
11.3.2 Synthetic Aperture Radar Autofocus in a Compressed Sensing Framework

In [31], the authors investigate the effects of the phase errors caused by radar platform
motion, in the CS reconstruction. The experimental results show that for quadratic and
sinusoidal phase errors, the smearing in the defocused image reconstructed by CS is
not symmetrical as in the conventional defocused SAR image due to the nonlinear
effects of CS reconstruction. Therefore, when the PGA is applied to refocus the CS
image, although the responses of prominent points can be refocused, some background
noise remains.
To take into account unknown motion errors, we define motion errors for the $n$th
target as a diagonal matrix of size $N_tN_y \times N_tN_y$, given as
\[ \Delta_\Psi^n = \mathrm{diag}\left[\tilde{s}_e^n(k_c,y), \tilde{s}_e^n(k_c,y), \ldots, \tilde{s}_e^n(k_c,y)\right]. \tag{11.22} \]
We first consider the case where the raw data for the $n$th point can be isolated from total
raw data. This is possible after the normal SAR imaging process that assumes a straight
trajectory, as the point targets in the SAR image will be partially focused. Hence, a small
patch around the partially focused target can be used to extract the target response and
raw data can be generated from it using inverse processing [32]. In this case, the raw
data can be written as
\[ \tilde{s}_n = A\Delta_\Psi^n\Psi\sigma + \eta, \tag{11.23} \]
where $\tilde{s}_n$ is the raw data for the $n$th target of size $N_tN_y \times 1$.
The solution to the above problem can be obtained as follows:
\[ \hat{\sigma}_n, \hat{\Delta}_\Psi^n = \arg\min_{\sigma,\,\Delta_\Psi^n} \|\tilde{s}_n - A\Delta_\Psi^n\Psi\sigma\|_2^2 + \lambda\|\sigma\|_0, \tag{11.24} \]
where now the solution consists of estimating the reflectivity and motion errors for the
$n$th target.
Equation (11.24) can be solved using the method called sparsity-driven autofocus
(SDA) proposed in [33]. This method solves for joint SAR imaging and phase-error
correction. Phase errors are incorporated in the problem as model errors, and phase error
correction is performed during the image formation process. Example results from [33]
are shown in Figure 11.6. The proposed method handles the problem as an optimization
problem in which the cost function is composed of a data fidelity term that depends
on the phase errors and a regularization term, which is the l1 -norm of the reflectivity.
The given cost function is minimized jointly with respect to the reflectivity and the
phase error using coordinate descent technique. The algorithm is an iterative two-step
algorithm, which cycles through steps of image formation and phase error estimation
and compensation.
1. In the first step, the cost function is minimized with respect to the reflectivity as
follows:
\[ \hat{\sigma}_n = \arg\min_{\sigma} \|\tilde{s}_n - A\Delta_\Psi^n\Psi\sigma\|_2^2 + \lambda\|\sigma\|_1. \tag{11.25} \]
2. In the second step, the estimated reflectivity is used to solve for the unknown
motion errors:
\[ \hat{\Delta}_\Psi^n = \arg\min_{\Delta_\Psi^n} \|\tilde{s}_n - A\Delta_\Psi^n\Psi\hat{\sigma}_n\|_2^2. \tag{11.26} \]
The above two steps are repeated until there is no significant change in the values of the
estimates. The solution from this method is dependent on an appropriate choice of the
variable $\lambda$. As the method does not require creating a dictionary of phase errors, it can
be seen as a dictionary-less approach.
Consider another case where the motion errors are not range- and azimuth position–
dependent, but only depend on the aperture position, i.e., the trajectory errors are given
only by $y_e(y)$. Given these motion phase errors as $\phi_e(y_q)$ for the $q$th aperture position,
the problem can be formulated as
\[ s = A\Delta_\Psi\Psi\sigma + \eta, \tag{11.27} \]
where
\[ \Delta_\Psi = \mathrm{diag}\left[\tilde{s}_e(k_c,y), \tilde{s}_e(k_c,y), \ldots, \tilde{s}_e(k_c,y)\right] \tag{11.28} \]
is a diagonal matrix of size $N_tN_y \times N_tN_y$ and
\[ \tilde{s}_e(k_c,y) = \left[\exp\{-jk_c\phi_e(y_1)\}, \exp\{-jk_c\phi_e(y_2)\}, \ldots, \exp\{-jk_c\phi_e(y_{N_y})\}\right] \tag{11.29} \]
is a vector of size $1 \times N_y$. Figure 11.5 shows the system model for CS SAR autofocus.
The figure shows that in the presence of trajectory deviations, the received data have a
different form compared to the case of no trajectory deviations. This difference comes
from a “perturbing” term $\Delta_\Psi$. This term is a result of radar-target distance that is
different in the presence of trajectory deviations compared to the radar-target distance
in the absence of trajectory deviations.
In this case, the solution can be written as follows:
\[ \hat{\sigma}, \hat{\Delta}_\Psi = \arg\min_{\sigma,\,\Delta_\Psi} \|s - A\Delta_\Psi\Psi\sigma\|_2^2 + \lambda\|\sigma\|_1. \tag{11.30} \]
This solution does not require making any sub-patches and can be obtained using the
SDA method. Besides the SDA method, other methods have also been proposed to solve
the CS SAR autofocus problem. A brief overview of some of these references is given
in the following paragraphs.
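Before turning to these references, the two-step structure of the SDA for the
aperture-dependent model of (11.27)-(11.30) can be sketched as follows. This is a
simplified illustration and not the implementation of [33]. It assumes fully sampled
data (the operator $A$ is the identity), a random stand-in dictionary, row-by-row
vectorization of an $N_y \times N_t$ data matrix, ISTA for the image formation step, and
hypothetical parameter values. The function name `sda_sketch` is introduced here for
illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau, keep phases."""
    mag = np.maximum(np.abs(x), 1e-12)
    return x * np.maximum(1.0 - tau / mag, 0.0)

def sda_sketch(Psi, s, Ny, Nt, lam, n_outer=30, n_inner=100):
    """Alternate between l1 image formation and per-aperture phase estimation."""
    L = np.linalg.norm(Psi, 2) ** 2    # unchanged by unit-modulus phase factors
    sigma = np.zeros(Psi.shape[1], dtype=complex)
    phi = np.zeros(Ny)                 # phase error estimate per aperture position
    for _ in range(n_outer):
        # Step 1 (image formation): ISTA with the current phase estimate
        # folded into the dictionary.
        Theta = np.repeat(np.exp(1j * phi), Nt)[:, None] * Psi
        for _ in range(n_inner):
            grad = Theta.conj().T @ (Theta @ sigma - s)
            sigma = soft_threshold(sigma - grad / L, lam / (2 * L))
        # Step 2 (phase estimation): closed-form least-squares phase for
        # each aperture position, given the current image.
        model = (Psi @ sigma).reshape(Ny, Nt)
        phi = np.angle(np.sum(s.reshape(Ny, Nt) * model.conj(), axis=1))
    return sigma, phi

rng = np.random.default_rng(0)
Ny, Nt, N = 8, 16, 40
Psi = (rng.standard_normal((Ny * Nt, N))
       + 1j * rng.standard_normal((Ny * Nt, N))) / np.sqrt(2 * Ny * Nt)
sigma_true = np.zeros(N, dtype=complex)
sigma_true[[3, 17, 30]] = [2.0, 1.5j, -1.0]
phi_true = rng.uniform(-0.8, 0.8, Ny)          # unknown phase error per aperture
s = np.repeat(np.exp(1j * phi_true), Nt) * (Psi @ sigma_true)

sigma_hat, phi_hat = sda_sketch(Psi, s, Ny, Nt, lam=0.05)
```

The phase errors can only be recovered up to a constant offset, because a common phase
can be moved freely between the reflectivity and the phase error vector. Any comparison
with the true values should therefore remove this offset first.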
In [34], the authors propose a sparsity-based SAR autofocus (SBA) method that
corrects phase errors within the image reconstruction process like the SDA. Actually,
both methods aim to solve approximately the same problem within a block relaxation-
based framework. Different from the SDA, the proposed method incorporates an
additional surrogate parameter of the SAR reflectivity field into the optimization problem to
make the algorithm stable and to guarantee the convergence to an accumulation point
or a connected set of accumulation points. Experimentally, it is also shown that this
formulation results in a faster convergence.
Figure 11.5 CS SAR imaging in the presence of trajectory deviations. (Diagram labels: azimuth
axis $y$; range axis $t$; dictionary $\Psi$; radar-target distance $d_e^n(y)$; received data
$s = A\Delta_\Psi\Psi\sigma$ after A/D conversion.)
Figure 11.6 Images reconstructed by (a) conventional imaging method, (b) sparsity-driven
imaging method, (c) sparsity-driven autofocus method. The reconstructed image obtained using
sparsity-driven autofocus method is focused. This can be seen by highly localized four points in
the center of the figure. The images obtained using conventional imaging and sparsity-driven
imaging methods are defocused, as seen by multiple artifacts apparent throughout the
corresponding figures. (Taken from [33] with permission.)
Reference [35] proposes a sparsity-based SAR imaging algorithm, called perturbed
autofocus SAR (PA-SAR), which jointly solves for autofocus and off-grid target errors,
i.e., for scatterers that are not on the discrete grid. The proposed algorithm uses a vari-
ant of the OMP algorithm called a parameter perturbation-based orthogonal matching
pursuit (PPOMP) algorithm for efficient estimation of off-grid scatterer locations. The
location of a target, which is not exactly on the grid, is described using the closest grid
node coordinates with an additive unknown perturbation. Since off-grid targets with
strong reflectivities may adversely affect the reconstruction of neighboring targets with
weaker reflectivities, the off-grid oriented structure of PA-SAR provides an advantage
in terms of the resolvability of these targets.
Reference [36] uses a different sparse reconstruction approach called the expectation
maximization-based matching pursuit (EMMP) algorithm [37] for solving the SAR
autofocus problem. The EMMP algorithm treats the compressive measurements as
incomplete data and constructs through iterative expectation and maximization (EM)
steps the complete data corresponding to a set of SAR data for each strong target. The
EM iterations provide more accurate and efficient estimation of the individual target
parameters as well as enable the estimation of unknown phases for each complete
data component. In conclusion, the proposed EMMP-based SAR imaging algorithm
is described as being more greedy, computationally less complex, and having lower
reconstruction errors compared to l1 -norm minimization.
In [38], the authors compare the results of SDA [33], SBA [34], and EMMP [36] with a
parameter-free variant of EMMP called the OMP-based autofocus algorithm (AOMP),
in terms of mean-square error (MSE), entropy, target-to-background ratio (TBR), and
signal-to-noise ratio (SNR). Furthermore, these sparsity-based techniques are compared
to the well-known PGA [30]. Comparison results show that all of the sparsity-based
techniques form focused and sparse images with some slight qualitative variations.
The performance of these techniques depends highly on the selection of the hyper-
parameters. Moreover it is shown that, in terms of phase error estimation performance,
SBA, SDA, and AOMP work better compared to the PGA and EMMP. As expected,
considering the run time, the PGA is much faster than the sparsity-based techniques.
Reference [39] presents a method called autofocusing iteratively re-weighted aug-
mented Lagrangian method (AIRWALM), which is based on an iteratively re-weighted
augmented Lagrangian method (IRWALM) [40] and a sparsity-driven autofocus (SDA)
method [33]. The proposed method optimizes over the reflectivities and phase errors
jointly to solve a constrained formulation of the sparsity driven autofocus problem
with an $l_p$-norm, $p \le 1$, cost function. Instead of solving the unconstrained problem
in (11.30), the AIRWALM solves the following problem:
\[ \arg\min_{\sigma,\,\Delta_\Psi} \|\sigma\|_p^p \quad \text{s.t.} \quad \|s - A\Delta_\Psi\Psi\sigma\|_2 \le \epsilon. \tag{11.31} \]
The motivation for this problem formulation is that it is easier to determine the
error bound $\epsilon$ than to choose a regularization parameter. Moreover, the use
of $l_p$-norms further enhances the sparsity, which may result in a better phase error
estimation. This formulation offers a reduced computation time compared to the SDA
algorithm.
In [41], the authors propose a novel signal processing algorithm for joint SAR image
formation and autofocus in a synthesis dictionary-based sparse representation frame-
work. The proposed algorithm can be applied broadly to scenes that exhibit sparsity
with respect to any dictionary. This is done by extending the SBA imaging framework
from [34] to joint SAR image formation and autofocus. The phase error vector is estimated
using a MAP estimator and compensated through an iterative algorithm to produce
focused images.
The work in [42] uses a parametric sparse representation to compensate for platform
motion errors in SAR that improves the imaging quality compared to other autofocus
methods. The imaging consists of estimating the reflectivities of the scatterers, followed
by calculating azimuth velocity errors and range acceleration errors.
11.4 Synthetic Aperture Radar Moving Target Imaging and Compressed Sensing
Compressed sensing was initially applied to scenes that were assumed to contain only
static targets. However, an observed scene can have moving targets such as vehicles,
boats, etc. For this case assuming the absence of moving targets can lead to the appear-
ance of artifacts in the processing results, similar to the case when a scene was processed
without taking into account residual motion errors. Therefore, CS SAR processing has to
be modified to take into account moving targets. In the following sections, we describe
the moving target imaging problem, give details about CS SAR processing for moving
targets, and review existing literature on this topic.
11.4.1 Synthetic Aperture Radar Moving Target Imaging

The radar-target distance for an $n$th moving target having a constant radial/range
velocity $v_r^n$ and constant azimuth velocity $v_y^n$ can be modified as
\[ d_m^n(y) = \sqrt{\left(r_n - v_r^n\frac{y}{V}\right)^2 + \left(y - y_n - v_y^n\frac{y}{V}\right)^2} \tag{11.32} \]
that can be approximated as
\[ d_m^n(y) \approx r_n - \frac{v_r^n}{V}(y - y_n) + \frac{\left(1 - \frac{v_y^n}{V}\right)^2 (y - y_n)^2}{2r_n}. \tag{11.33} \]
The resulting raw data can then be written as
\[ S_m^n(t,y) = \sigma(r_n,y_n)\,p\!\left(t - \frac{2d_m^n(y)}{c}\right). \tag{11.34} \]
This movement of the target leads to two effects after processing:
• A shift in azimuth position due to the range velocity $v_r^n$ creating an azimuth
wavenumber shift of $2k_c\frac{v_r^n}{V}$.
• A blurring effect caused by the azimuth velocity $v_y^n$.
These effects can be further seen in Figure 11.7, taken from [43], where two moving
vehicles are shown. One of the vehicles moving in range direction is shifted from its
original position on the road, while the second vehicle moving in azimuth direction is
blurred.
Figure 11.7 Effects of target motion on processed image. Image courtesy Artemis, Inc.
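The distance model of (11.32) and its approximation (11.33) can be compared numerically.
The sketch below uses hypothetical values and places the target at $y_n = 0$, which is the
case where the two expressions are directly comparable. It also evaluates the azimuth
wavenumber shift and, as a rough proxy for the apparent azimuth position, the along-track
position of closest approach.

```python
import numpy as np

c, fc = 3e8, 10e9                       # speed of light (m/s), carrier frequency (Hz)
kc = 2 * np.pi * fc / c
V = 100.0                               # platform velocity (m/s)
r_n, y_n = 5000.0, 0.0                  # target slant range and azimuth position (m)
v_r, v_y = 1.0, 5.0                     # target range and azimuth velocities (m/s)
y = np.arange(-200.0, 201.0)            # aperture positions, 1 m spacing

# Exact distance of (11.32) and its approximation (11.33).
d_exact = np.sqrt((r_n - v_r * y / V) ** 2 + (y - y_n - v_y * y / V) ** 2)
d_approx = (r_n - (v_r / V) * (y - y_n)
            + (1 - v_y / V) ** 2 * (y - y_n) ** 2 / (2 * r_n))

k_shift = 2 * kc * v_r / V              # azimuth wavenumber shift due to v_r

# Closest approach: numerically from (11.32), and as the vertex of (11.33).
y_closest = y[np.argmin(d_exact)]
y_vertex = r_n * (v_r / V) / (1 - v_y / V) ** 2
```

For these values, the approximation agrees with the exact distance to within a few
millimetres, and a range velocity of only 1 m/s moves the point of closest approach by
roughly 55 m along track. This is consistent with the displaced vehicle in Figure 11.7.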
For multiple moving targets, the expression for raw data is given as a weighted sum
of data from individual targets as follows:
\[ S_m(t,y) = \sum_{n=1}^{N_m} \sigma(r_n,y_n)\,P_m^n(t,y), \quad N_m = n_r \times n_y \times n_{v_r} \times n_{v_y}, \tag{11.35} \]
where
\[ P_m^n(t,y) = p\!\left(t - \frac{2d_m^n(y)}{c}\right), \quad n = \{1,2,\ldots,N_m\} \tag{11.36} \]
and $n_{v_r}$ and $n_{v_y}$ are the total number of range and azimuth velocities considered. After
processing, each target will undergo a different amount of azimuth shift and defocusing
due to different velocities. These velocities should be taken into account during process-
ing to compensate for these effects. Therefore, the velocities should be estimated first
so that the estimated values could be used for focusing of the images.
11.4.2 Synthetic Aperture Radar Moving Target Imaging in a Compressed Sensing Framework
Compressed sensing can provide a convenient way of estimating motion parameters
as well as the reflectivities and original positions of moving targets. Considering
constant velocities, the reflectivity can be considered as a four-dimensional matrix
as follows:
\[ \Sigma(r,y,v_r,v_y) = \sum_{n=1}^{N_m} \sigma(r_n,y_n,v_r^n,v_y^n)\,\delta(r - r_n, y - y_n, v_r - v_r^n, v_y - v_y^n). \tag{11.37} \]
Note that if range and/or azimuth acceleration is considered, the reflectivity will have
higher than four dimensions. This matrix shows the reflectivity of a target corresponding
to its range and azimuth positions, and having certain range and azimuth velocities. The
matrix is rearranged in a vector of size $N_m \times 1$ as follows:
\[ \sigma_m = [\sigma_m^1, \sigma_m^2, \ldots, \sigma_m^{N_m}]^T, \tag{11.38} \]
where $\sigma_m^n = \sigma(r_n,y_n,v_r^n,v_y^n)$, $n = \{1,2,\ldots,N_m\}$.
Traditionally, a fixed dictionary–based approach was used for CS-based moving-target
imaging, where the dictionary would consist of data for all possible range and azimuth
velocities for all the points in a considered scene. Stojanovic and Karl [44] use a fixed
dictionary–based approach for moving target imaging and show that CS can be used
to estimate velocities and positions of moving objects considering a single scatterer
in each observation pixel for mono- and multistatic SAR configurations. The authors
consider a high signal-to-clutter ratio (SCR), i.e., a scenario in which the amplitudes of
the moving targets are much higher than those of the stationary targets. In [45], a fixed
dictionary–based CS processing method is used to focus targets moving in the range
direction only, considering a low SCR, where a clutter cancellation filter is used to
increase the SCR. A fixed dictionary is given as
\[ \Psi_m = [p_m^1, p_m^2, \ldots, p_m^{N_m}], \tag{11.39} \]
where $p_m^n$ is the response of the $n$th moving target given as an $N_tN_y \times 1$ vector:
\[ p_m^n = \exp\!\left\{-j\frac{4\pi f_c\,d_m^n(y)}{c} + j\pi K\left(t - \frac{2d_m^n(y)}{c}\right)^2\right\}, \quad n = \{1,2,\ldots,N_m\}. \tag{11.40} \]
Using the undersampling operator $A$, the data corresponding to moving targets can
be expressed as
\[ s_m = A\Psi_m\sigma_m + \eta. \tag{11.41} \]
The dictionary becomes very large if a large number of velocity parameters are
considered. The OMP algorithm is typically used as the recovery algorithm for motion
parameter estimation due to its low computational complexity and ease of implementation
for reconstruction of reflectivities. Moreover, the process of selecting basis vectors one
by one can be useful in cases where correlation exists amongst the basis vectors. The
algorithm solves the following problem:
\[ \hat{\sigma}_m = \arg\min_{\sigma_m} \|s_m - A\Psi_m\sigma_m\|_2^2 + \lambda\|\sigma_m\|_0. \tag{11.42} \]
The steps involved in the OMP algorithm consist of iteratively: 1) correlating the
received data with each column of the sub-sampled dictionary, 2) finding the column
with the maximum correlation, 3) estimating the reflectivity using the selected column
via a least-squares solution, and 4) removing the contribution of the estimation from
the received data. The solution shows the amplitude for each combination of motion
parameter values.
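The four steps above can be sketched in Python/NumPy as follows. This is a minimal
illustration under stated assumptions, not the implementation used in the cited works:
the matrix `Theta` is a random stand-in for the undersampled moving-target dictionary,
the data are noise-free, and the iteration stops after a known number of nonzero
entries, whereas practical implementations often stop on a residual threshold instead.

```python
import numpy as np

def omp(Theta, s, n_nonzero):
    """Orthogonal matching pursuit with a fixed number of selected columns."""
    residual = s.copy()
    support = []
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        # 1) Correlate the residual with every column of the dictionary.
        corr = np.abs(Theta.conj().T @ residual)
        if support:
            corr[support] = 0.0                    # do not reselect a column
        # 2) Pick the column with the maximum correlation.
        support.append(int(np.argmax(corr)))
        # 3) Least-squares estimate of the reflectivity on the selected columns.
        coeffs, *_ = np.linalg.lstsq(Theta[:, support], s, rcond=None)
        # 4) Remove the estimated contribution from the received data.
        residual = s - Theta[:, support] @ coeffs
    sigma = np.zeros(Theta.shape[1], dtype=complex)
    sigma[support] = coeffs
    return sigma

rng = np.random.default_rng(0)
K, N = 64, 128
Theta = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2 * K)
sigma_true = np.zeros(N, dtype=complex)
sigma_true[[5, 60, 111]] = [2.0, -1.5j, 1.0 + 0.5j]
s = Theta @ sigma_true

sigma_hat = omp(Theta, s, n_nonzero=3)
```

In the moving-target setting, each selected column index corresponds to one combination
of range, azimuth, and velocity values, so the support of the result directly gives the
estimated motion parameters.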
The reconstructed reflectivity vector $\hat{\sigma}_m$ can be rearranged as a four-dimensional matrix
as follows:
\[ \hat{\Sigma}_m(r,y,v_r,v_y) = f^{-1}(\hat{\sigma}_m). \tag{11.43} \]
The four-dimensional matrix can be further divided into 2D reflectivity maps showing
estimated reflectivity for range and azimuth positions corresponding to different values
of velocities. Example results from [45] are shown in Figure 11.8.
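The rearrangement into a four-dimensional array and the extraction of a 2D map can be
sketched as follows. The grid sizes, the index ordering (range, azimuth, range velocity,
azimuth velocity), and the single nonzero entry are assumptions made here for
illustration.

```python
import numpy as np

# Hypothetical grid sizes: range, azimuth, range velocity, azimuth velocity.
nr, ny, nvr, nvy = 4, 5, 3, 3
shape = (nr, ny, nvr, nvy)

# Stand-in for a reconstructed reflectivity vector with one moving target.
sigma_m_hat = np.zeros(int(np.prod(shape)), dtype=complex)
sigma_m_hat[np.ravel_multi_index((2, 1, 0, 2), shape)] = 0.8

# Rearrange the vector as a four-dimensional array.
Sigma_m_hat = sigma_m_hat.reshape(shape)

# Locate the strongest entry and read off its position and velocity indices.
i_r, i_y, i_vr, i_vy = np.unravel_index(np.argmax(np.abs(Sigma_m_hat)), shape)

# 2D reflectivity map (range x azimuth) for the detected velocity pair.
map_2d = Sigma_m_hat[:, :, i_vr, i_vy]
```

The velocity indices of the strongest entry identify the estimated range and azimuth
velocities, and `map_2d` is the focused range-azimuth image for that velocity pair.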
In [46], the authors apply the CS SAR imaging approach to estimate the target reflec-
tivities and motion parameters of targets having rotational and vibrational motions based
on fixed dictionaries. In [47], a fixed dictionary–based approach is used for indication
of human motion with through-the-wall imaging.
Different approaches have also been suggested in the literature to reduce the dictionary
size and enable the application of CS SAR moving target imaging to realistic scenes.
In [48], a fixed dictionary–based CS imaging approach is used to focus targets moving in the
azimuth direction, where the range velocity is estimated beforehand using a Radon
transform approach. This decreases the dictionary size, as the fixed dictionary then
consists of azimuth velocities only. The fixed dictionary–based method is also applied
to motion parameter estimation and focusing of processed images in [49], where the
relative localization of the defocused targets makes it possible to generate the
dictionary locations only around a limited region of the scene. Results obtained using
this method are shown in Figure 11.9.
References [49] and [50] examine the CS imaging performance in the presence of
dictionary mismatch, i.e., when the discrete parameters used to generate the dictionary
elements differ from the parameters in the data. In this case, the data are assumed
to be generated from a dictionary $\Psi_m^{\#}$, and hence the reconstruction is carried out as
follows:
$\hat{\sigma}_m^{\#} = \arg\min_{\sigma} \|s - A\Psi_m^{\#}\sigma\|_2^2 + \lambda\|\sigma\|_0$. (11.44)
The reconstructed reflectivity in this case differs from the actual reflectivity, and this
difference depends on the correlation between the actual and the assumed dictionaries.
The reconstructed reflectivity becomes
$\hat{\sigma}_m^{\#} = (\Psi_m^{\#})^H \Psi_m \sigma$. (11.45)
These references also showed that this problem can be dealt with by creating an oversampled
dictionary, and that the oversampling requirement is more severe for range velocity,
i.e., the dictionary has to be created such that the range velocity spacings are very
small. This can be seen in Figure 11.10, which shows the MSE versus dictionary mismatch in
positions and velocities. It can be noted that a small mismatch causes a
sudden increase of the MSE to a maximum value. The reason is that a small mismatch in
range velocity causes the position of the reconstructed reflectivity to differ from
the actual position, as dictionary elements with different combinations of range velocity
and azimuth positions can be highly correlated.
Figure 11.8 Simulated and reconstructed scene. The reconstructed scenes show estimated
reflectivities for different values of velocities: (a) simulated scene containing multiple
moving points in a pixel; (b)–(e) reconstructed scenes for velocities of 3 m/s, 4 m/s, 5 m/s,
and −3 m/s. Axes: azimuth bins, range bins, and amplitude. (Taken from [45] with permission)
The technique in [51] aims to recover moving targets up to a membership in an
equivalence class. These equivalence motion classes involve different combinations of
starting positions and velocities that correspond to the same SAR-to-target range history.
The proposed technique is based on minimization of a cost function to estimate an
image of the stationary background and the equivalence motion classes. The sparsity of
moving targets is incorporated into the cost function with an $\ell_1$-norm regularization term
reflecting the assumption that the number of moving targets in the scene is relatively
small compared to the stationary points.
Figure 11.9 Comparison of input image and images with retrieved motion parameters:
(a) difference of cross-polarized channels (image containing two moving targets);
(b) focused result at velocity 7.9 m/s; (c) focused result at velocity 8.9 m/s. Axes: range
bins and azimuth bins. (Taken from [49] with permission)
Figure 11.10 Mean-square error versus dictionary mismatch (effects of basis mismatch), with
curves for mismatch in range velocity, range position, azimuth position, and azimuth
velocity. (Taken from [49] with permission)
In [52], an off-grid CS method is applied to the problem of ground moving target
indication from SAR images. The proposed method uses a variant of CS [53]
called “continuous basis pursuit,” which aims
to reconstruct the exact velocity of every target by using tools such as Taylor expansion.
Another option is a dictionary-less approach, as presented in [54]. Here,
it is assumed that either the moving targets have azimuth motion only, or any range
motion has been compensated. The approach handles phase errors resulting from target
motion as errors in the observation model of a static scene. The proposed method is an
extension of the SDA algorithm to the moving target imaging problem. It is based on
minimization of a cost function that involves regularization terms imposing sparsity
on the reflectivity field to be imaged, as well as on the spatial structure of the
motion-related phase errors, reflecting the assumption that only a small percentage of the entire
scene contains moving targets. An advantage of this approach is that it can handle
nonconstant velocities.
In this approach, first, the CS SAR moving target imaging problem is formulated as
$s = A(\Phi_m)\sigma_m + \varepsilon$, (11.46)
where $\Phi_m$ is a matrix of size $N_t N_y \times N_m$ containing the phase errors due to motion,
which enter the observation model through element-by-element multiplication. The matrix is given as
$\Phi_m = [\psi_m^1, \psi_m^2, \ldots, \psi_m^{N_m}]$, (11.47)
where the individual vectors making up the matrix are
$\psi_m^n = [\exp\{-jk_c\phi_e^n(y_1)\}, \ldots, \exp\{-jk_c\phi_e^n(y_1)\}, \exp\{-jk_c\phi_e^n(y_2)\}, \ldots, \exp\{-jk_c\phi_e^n(y_2)\}, \ldots, \exp\{-jk_c\phi_e^n(y_{N_y})\}, \ldots, \exp\{-jk_c\phi_e^n(y_{N_y})\}]^T$. (11.48)
In this case, the solution can be written as follows:
$\hat{\sigma}_m = \arg\min_{\sigma_m,\beta} \|s - A(\Phi_m)\sigma_m\|_2^2 + \lambda_1\|\sigma_m\|_1 + \lambda_2\|\beta - 1\|_1$, s.t. $|\beta(k)| = 1\ \forall k$, (11.49)
where
$\beta^T = [\beta_1^T, \beta_2^T, \ldots, \beta_{N_y}^T]$ (11.50)
and
$\beta_n = [\exp(-jk_c\phi_e^1(y_n)), \exp(-jk_c\phi_e^2(y_n)), \ldots, \exp(-jk_c\phi_e^{N_m}(y_n))]^T$. (11.51)
The extra term is used for facilitating the solution. As the term $\beta$ represents the phase
errors due to the moving points, and there are only a few moving points in a scene, the
term $\beta - 1$ becomes sparse, and is hence represented as $\|\beta - 1\|_1$. An example of imaging
carried out using this approach is shown in Figure 11.11.
Figure 11.11 (a) Original scene. (b) Defocused image obtained using traditional processing. (c)
Image reconstructed by sparsity-driven imaging assuming a stationary scene. (d) Image
reconstructed by the sparsity-driven moving object imaging approach. (Taken from [59] with
permission)
In [55], the authors extend the applicability of sparsity-driven moving target–focusing
methods to very low signal-to-clutter ratio environments. First, SAR raw
data are divided into subapertures in the azimuth direction. Subsequently, low-rank and
sparse decomposition is applied to the multiple-subaperture data to separate
the moving targets from the stationary SAR background.
In [56], the authors use a variant of the fractional FT as a basis to estimate the Doppler
rate, and hence the azimuth velocity, from undersampled data. They use the Radon transform
to estimate the range velocity. In [57], moving targets in a processed image are refocused
by first extracting a ROI sub-image and then following a two-step iterative procedure:
a sparse image is first obtained using an initial phase compensation parameter, and
the phase compensation parameter is then estimated.
In [58], the smeared images of moving targets are removed from a processed image
to obtain a clear SAR image from multi-channel SAR images. The presented approach
uses a fixed-dictionary CS-based method to separate SAR data from moving targets and
noise. The fixed dictionary is based on frequency shifts for different range velocities
and on the angle between the flight path and a moving object, which depends on both the range
and azimuth velocity of the moving target.
11.5 Inverse Synthetic Aperture Radar Imaging and Compressed Sensing
In inverse synthetic aperture radar (ISAR) imaging, a moving target is observed by a static
antenna. The target’s motion creates a synthetic aperture, which can be used to obtain
a high-resolution image. However, a target can have an unknown maneuvering motion,
which is similar to the case of SAR imaging in the presence of residual motion errors.
This unknown maneuvering motion needs to be estimated and compensated to obtain
a high-resolution artifact-free image. The moving target in a static background can be
considered as a sparse scene; therefore, CS can be applied for ISAR imaging. In the
following, we formulate the problem of ISAR imaging and CS application for ISAR
imaging, followed by a literature review of existing CS ISAR imaging approaches.
11.5.1 Inverse Synthetic Aperture Radar Imaging

A moving object is considered to have a range velocity $v_n^r$ and a rotational motion given
by an angle $\theta_n(\tau)$ as a function of along-track time $\tau$. The azimuth time-varying
radar-target distance $d_n(\tau)$ for the nth point moving with the range velocity $v_n^r$ and
rotating with the angle $\theta_n(\tau)$ is given as follows [60]:
$d_n(\tau) = r_n + v_n^r\tau + x_n\cos(\theta_n(\tau)) + y_n\sin(\theta_n(\tau))$, (11.52)
where $\tau$ is the along-track time sampled at a frequency $f_\tau$, i.e., $\tau = 0, \frac{1}{f_\tau}, \frac{2}{f_\tau}, \ldots, \frac{N_\tau - 1}{f_\tau}$,
with $N_\tau$ being the total number of along-track time samples. The slant-range distance
to the rotation center of the observed moving point is represented by $r_n$; $x_n$ and $y_n$
denote the range and azimuth distance of the point from the rotation center, respectively.
The rotational motion is responsible for creating a high resolution in the along-track
direction. Note that it is normally assumed that the whole object is rotating with the
same rotation angle, i.e., $\theta_n(\tau) = \theta_0(\tau)\ \forall n$.
Generally, ISAR imaging problems consider range-compressed and range-aligned
ISAR data, where any movement across range cells due to translational motion is
assumed to be eliminated. Therefore, the data are aligned along the azimuth direction
for each range bin. Such data, corresponding to a range bin and a point n, can be
written as
$s_n(\tau) = \sigma_n p_n(\tau)$. (11.53)
The reflectivity of the point n is given by $\sigma_n$, and $p_n(\tau)$ describes the range-aligned
signal based on the rotational motion of the target as follows:
$p_n(\tau) = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-jk_c d_{an}(\tau)\}$, (11.54)
where $\mathrm{rect}\left(\frac{\tau}{T_\tau}\right)$ limits the pulse extent according to the pulse width $T_\tau$ arising from
the total target observation time in the azimuth direction, and
$d_{an}(\tau) = x_n\cos(\theta_n(\tau)) + y_n\sin(\theta_n(\tau))$ is the radar-target distance after range alignment.
The total range-compressed and aligned raw data for a total number of $N_m$ moving points in a single
range bin can then be written as a sum of received data from each point as follows:
$s(\tau) = \sum_{n=1}^{N_m} \sigma_n p_n(\tau)$. (11.55)
The purpose of ISAR imaging is to obtain a focused image for each range bin from
the raw data. For ISAR, if the rotational motion is not properly accounted for in the
imaging process, it will result in blurring in the processed image in the along-track
direction. Therefore, the rotational rate should be estimated and compensated to get a
focused image.
11.5.2 Inverse Synthetic Aperture Radar Moving Target Imaging in a Compressed Sensing Framework
Compressed sensing can be used to estimate motion parameters as well as the
reflectivities and original positions of moving targets for ISAR, similar to SAR moving target
imaging. As in traditional CS SAR moving target imaging, a fixed dictionary–based
approach was used initially for CS ISAR imaging. To carry out ISAR imaging using a
fixed dictionary–based approach, a model for the rotational motion has to be assumed.
First, it is assumed that $\cos(\theta_n(\tau)) \approx 1$ and $\sin(\theta_n(\tau)) \approx \theta_n(\tau)$. This is the
small-angle approximation and can be valid for high-frequency radar imaging [60]. Then,
$\theta_n(\tau)$ is further approximated.
In order to take into account realistic highly maneuvering motion, $\theta_n(\tau)$ can be
approximated up to a third-order term, as in [61]. Accordingly, the time-varying rotation
angle is given as
$\theta_n(\tau) \approx y_n\omega_n\tau + \frac{1}{2}y_n\dot{\omega}_n\tau^2 + \frac{1}{6}y_n\ddot{\omega}_n\tau^3$, (11.56)
where $\omega_n$, $\dot{\omega}_n$, and $\ddot{\omega}_n$ are the rotational rate, rotational acceleration, and the rate of
rotational acceleration, respectively. Let $\alpha_n = k_c y_n\omega_n$ be defined as the rotation rate
phase term. It can be further written as $\alpha_n = 2\pi d_n\Delta f$, where $d_n$ gives the azimuth
frequency/Doppler pixel position of the scatterer corresponding to the phase term $\alpha_n$,
and $\Delta f = \frac{f_\tau}{N_\tau}$ is the frequency resolution. The goal of CS ISAR imaging is to estimate
$d_n$ to form a processed image showing the position of the scatterer in the azimuth
frequency/Doppler location. The rotational acceleration phase term is further defined
as $\beta_n = \frac{k_c}{2}y_n\dot{\omega}_n$, and $\gamma_n = \frac{k_c}{6}y_n\ddot{\omega}_n$ is defined as the rotational acceleration rate phase
term. Then, the pulse $p_n(\tau)$ for a point n can be redefined as
$p_n(\tau,\alpha_n,\beta_n,\gamma_n) = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-j\alpha_n\tau - j\beta_n\tau^2 - j\gamma_n\tau^3\}$. (11.57)
As in Section 11.4, a reflectivity vector $\sigma_m$ of size $N_m \times 1$ is defined such that its
entries have the form $\sigma_m^n = \sigma(\alpha_n,\beta_n,\gamma_n)$, $n = \{1,2,\ldots,N_m\}$, where $N_m = N_\alpha \times N_\beta \times N_\gamma$.
The total numbers of considered points for rotational rate, rotational acceleration,
and rotational acceleration rate are $N_\alpha$, $N_\beta$, and $N_\gamma$, respectively.
The dictionary is defined as
$\Psi_m = [p_m^1, p_m^2, p_m^3, \ldots, p_m^{N_m}]$, (11.58)
where $p_m^n$ is the response of the nth rotating target given as
$p_m^n = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-j\alpha_n\tau - j\beta_n\tau^2 - j\gamma_n\tau^3\}$, $n = \{1,2,\ldots,N_m\}$. (11.59)
The dictionary $\Psi_m$ has a size $N_\tau \times N_m$. Making use of the dictionary definition, the
received raw data for one range bin can be written as
$s = A\Psi_m\sigma_m + \varepsilon$. (11.60)
In Figure 11.12, the model based on (11.60) is illustrated. Equation (11.60) can be
solved for each range bin using the fixed dictionary–based approach as follows:
$\hat{\sigma}_m = \arg\min_{\sigma_m} \|s - A\Psi_m\sigma_m\|_2^2 + \lambda\|\sigma_m\|_0$. (11.61)
The reconstructed reflectivity $\hat{\sigma}_m$ for each range bin is then converted to a 1D form,
which shows the reflectivity estimate versus the Doppler pixel position.
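To make the construction of (11.58)–(11.60) concrete, the following sketch builds a fixed dictionary on a grid of rotation rate, rotational acceleration, and rotational acceleration rate phase terms, and applies a random row selection as the undersampling operator. The grid values, the random sampling pattern, and the function names are illustrative assumptions, and the rect window is taken to be one over all samples.

```python
import numpy as np

def isar_dictionary(tau, alphas, betas, gammas):
    """Fixed ISAR dictionary with one column per (alpha, beta, gamma) triplet.

    Returns the (N_tau, N_alpha * N_beta * N_gamma) dictionary and an array
    holding the parameter triplet of every column.
    """
    a, b, g = (x.ravel() for x in np.meshgrid(alphas, betas, gammas, indexing="ij"))
    t = np.asarray(tau)[:, None]                 # column of sample times
    phase = a * t + b * t**2 + g * t**3          # broadcast to (N_tau, N_m)
    return np.exp(-1j * phase), np.stack([a, b, g], axis=1)

def undersample(n_tau, n_keep, seed=0):
    """Indices of the retained along-track samples (assumed random selection)."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_tau, size=n_keep, replace=False))

# Hypothetical sampling parameters and parameter grids.
f_tau, n_tau = 128.0, 128
tau = np.arange(n_tau) / f_tau
Psi, params = isar_dictionary(
    tau, np.linspace(0.0, 40.0, 9), np.linspace(-5.0, 5.0, 5), np.linspace(-2.0, 2.0, 3)
)
rows = undersample(n_tau, n_tau // 2)
A_Psi = Psi[rows, :]                             # sub-sampled dictionary
```

The number of columns grows as the product of the three grid sizes, which is why finer grids in the acceleration terms quickly make the fixed dictionary expensive to store and search.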
Figure 11.12 Compressed sensing ISAR imaging.
Figure 11.13 MSE versus β mismatch and γ mismatch. The circled region shows variation of
MSE with fixed β mismatch and varying γ mismatch, and vice versa. (Taken from [61] with
permission)
In [61] and [62], it is further shown that to avoid the degradation of performance
caused by dictionary mismatch, the fixed dictionary should be sufficiently upsampled
in rotational acceleration and rotational acceleration rate. Figure 11.13 gives one such
example, showing the degradation of performance with respect to mismatch in rotational
acceleration and rotational acceleration rate. The performance degradation
is evaluated in terms of MSE between actual and reconstructed reflectivity in the
presence of mismatch. Therefore, to avoid any performance degradation, the dictionary
should be constructed with very fine spacing of the rotational acceleration and the rotational
acceleration rate.
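The sensitivity to mismatch can be illustrated by correlating a pulse of the form (11.59) with a copy whose rotational acceleration phase term is offset. The sketch below is only a toy calculation with assumed sampling parameters and hypothetical function names; it does not reproduce the MSE experiment of Figure 11.13.

```python
import numpy as np

def pulse(tau, alpha, beta, gamma):
    """Pulse of the form (11.59); the rect window is assumed to be one."""
    return np.exp(-1j * (alpha * tau + beta * tau**2 + gamma * tau**3))

def normalized_correlation(tau, beta_offset, alpha=10.0, beta=3.0, gamma=1.0):
    """Magnitude of the normalized inner product between the true pulse and a
    dictionary pulse whose beta is offset by beta_offset."""
    p_true = pulse(tau, alpha, beta, gamma)
    p_assumed = pulse(tau, alpha, beta + beta_offset, gamma)
    return np.abs(np.vdot(p_assumed, p_true)) / tau.size

tau = np.arange(256) / 256.0                      # assumed: 256 samples over 1 s
offsets = np.array([0.0, 1.0, 5.0, 20.0])         # assumed beta offsets
correlations = [normalized_correlation(tau, d) for d in offsets]
```

A correlation well below one means that the closest dictionary column no longer matches the data, so the energy of the scatterer is spread over other columns; this is the mechanism behind the need for fine spacing in the acceleration terms.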
A parametric dictionary or a dictionary-less approach can also be used to avoid
the high upsampling required for a fixed dictionary–based approach. In the parametric
dictionary approach, the motion parameters of the moving targets are estimated as part
of the solution directly. In this case, the reconstruction problem can be defined as
$\hat{\sigma}_m = \arg\min_{\sigma_m,\alpha,\beta,\gamma} \|s - A\Psi_m(\alpha,\beta,\gamma)\sigma_m\|_2^2 + \lambda\|\sigma_m\|_0$, (11.62)
where the term $\Psi_m(\alpha,\beta,\gamma)$ now shows that dictionary elements are generated based on
the motion model given in (11.56), and the parameters are found such that the resulting
dictionary elements are well matched to the received raw data. One possible approach
to solve this problem is as part of the OMP algorithm, where the dictionary elements
are iteratively generated such that they are best matched to the received data. This can be
expressed as
$\hat{\alpha}_k, \hat{\beta}_k, \hat{\gamma}_k = \arg\max_{\alpha_k,\beta_k,\gamma_k} |\langle s, A\Psi_m(\alpha_k,\beta_k,\gamma_k)\rangle|$, (11.63)
where k is the iteration number and $|\langle\cdot,\cdot\rangle|$ represents correlation. An example of
results obtained using a parametric dictionary–based approach is shown in Figure 11.14.
This figure was taken from [61].
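One way to picture the parameter search of (11.63) inside an OMP loop is sketched below. It is a simplified illustration and not the algorithm of [61]: the coarse grid followed by coordinate-wise refinement is an assumed search strategy, the correlation is taken against the current residual as in standard OMP (which equals the data in the first iteration), and all function names are hypothetical.

```python
import numpy as np

def atom(tau, alpha, beta, gamma):
    """Pulse of the form (11.59); the rect window is assumed to be one."""
    return np.exp(-1j * (alpha * tau + beta * tau**2 + gamma * tau**3))

def best_atom(residual, tau, grids, n_refine=6):
    """Approximate the argmax of (11.63): coarse grid search plus refinement."""
    def score(p):
        return np.abs(np.vdot(atom(tau, *p), residual))

    # Coarse search over the Cartesian product of the three parameter grids.
    candidates = [(a, b, g) for a in grids[0] for b in grids[1] for g in grids[2]]
    scores = [score(p) for p in candidates]
    best_p = list(candidates[int(np.argmax(scores))])
    best_c = max(scores)

    # Coordinate-wise refinement with a step that is halved at every pass.
    steps = [float(g[1] - g[0]) if len(g) > 1 else 0.0 for g in grids]
    for _ in range(n_refine):
        for i in range(3):
            center = best_p[i]
            for delta in np.linspace(-steps[i], steps[i], 9):
                trial = list(best_p)
                trial[i] = center + delta
                c = score(trial)
                if c > best_c:                    # accept only improvements
                    best_p, best_c = trial, c
            steps[i] /= 2.0
    return tuple(best_p), best_c

def parametric_omp(s, tau, grids, n_atoms):
    """OMP with atoms generated on the fly. `tau` holds the retained
    (undersampled) sample times, so the undersampling is applied implicitly."""
    s = np.asarray(s, dtype=complex)
    residual = s.copy()
    params, atoms = [], []
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(n_atoms):
        p = best_atom(residual, tau, grids)[0]
        params.append(p)
        atoms.append(atom(tau, *p))
        D = np.stack(atoms, axis=1)
        coeffs, *_ = np.linalg.lstsq(D, s, rcond=None)
        residual = s - D @ coeffs
    return params, coeffs, residual
```

Because each atom is generated for the parameters that best match the data, only as many dictionary columns as selected scatterers are ever stored, at the cost of a nonconvex search that a coarse grid can only initialize approximately.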
Figure 11.14 (a) Reference image. Note that the reference and reconstructed images are
overlapped for comparing the reconstruction performance: a successful reconstruction will result
in the overlap looking like the reference image, whereas an unsuccessful reconstruction will
result in the overlap having points around the reference image. (b) Overlap of reference and
reconstructed images using a parametric dictionary approach with 50% downsampled data.
(c) Overlap of reference image and reconstructed image using a technique based on integrated
high-order matched phase transform with 50% data; partial reconstruction can be seen as the
technique was proposed for imaging with fully sampled data. (d) Overlap of reference image
and reconstructed image using a dictionary without a rotational acceleration rate phase term
with 50% data. The resulting reconstruction using this dictionary fails to generate
a discernible image. Axes in all panels: range and azimuth. (Taken from [61] with permission)
Other approaches in the existing literature on CS ISAR imaging approximate $\theta(\tau)$ up to
a first-order term, i.e.,
$\theta(\tau) \approx y_n\omega_n\tau$, (11.64)
where it is assumed that the target is moving with a uniform rotational velocity, or
that only such a small number of continuous samples in the azimuth direction are acquired
that the target can be considered to have uniform motion during the acquisition. Such
an approximation can be seen as a small-angle approximation, or a uniform rotation
approximation. The focusing can be carried out using an FT as part of a range-Doppler
approach (RDA), which results in a low-resolution image. Compressed sensing can be
used to improve resolution in such a case. The pulse making up the dictionary is given as
$p_m^n = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp(-j\alpha_n\tau)$, $n = \{1,2,\ldots,N_\alpha\}$. (11.65)
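Under the first-order model, sampling the along-track time at multiples of 1/f_τ and placing the rotation rate phase terms on the grid α_n = 2π d_n Δf, with Δf = f_τ/N_τ, turns the pulses of (11.65) into the columns of a DFT matrix, which is why FT-based range-Doppler focusing applies. The short check below illustrates this under those assumptions (rect window equal to one, integer Doppler bins d_n = 0, ..., N_τ − 1); the sampling values and variable names are chosen only for the example.

```python
import numpy as np

n_tau, f_tau = 64, 200.0                 # assumed number of samples and rate
tau = np.arange(n_tau) / f_tau           # along-track sample times
delta_f = f_tau / n_tau                  # frequency resolution
d = np.arange(n_tau)                     # Doppler pixel positions
alpha = 2.0 * np.pi * d * delta_f        # rotation rate phase terms on the grid

# Dictionary of first-order pulses: Psi[k, n] = exp(-j * alpha_n * tau_k).
Psi = np.exp(-1j * np.outer(tau, alpha))

# The same matrix obtained from the FFT of the identity.
dft = np.fft.fft(np.eye(n_tau))
is_dft = np.allclose(Psi, dft)

# A single scatterer in Doppler bin 10 is focused by correlating with Psi.
s = 1.5 * Psi[:, 10]
image = Psi.conj().T @ s / n_tau
peak_bin = int(np.argmax(np.abs(image)))
```

With fully sampled data the correlation step is an inverse FFT up to scaling; once rows of `Psi` are removed by undersampling, the columns are no longer orthogonal and a sparse solver takes the place of the plain transform.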
The work in [63] proposes a 2D CS-based reconstruction for a target with
translational motion and a small-angle rotation with limited motion, by decoupling
translational and rotational motions and carrying out CS imaging in the range and azimuth
directions separately. The work in [64] shows that a CS formulation directly in 2D,
instead of the traditional 1D stacking approach, can generate high-resolution images
when the data have a few missing samples or are gapped with a large number of missing
samples. The authors assume small rotation and a dictionary based on Fourier matrices.
The work in [65] applies Bayesian CS and uses a parametric dictionary for CS ISAR
imaging in the presence of uniform rotation. The work in [66] uses 2D CS ISAR imaging
that considers uniform rotational motion, with dictionaries based on the nonuniform
Fourier transform for both the range and azimuth directions. The work in [67] proposes
a CS ISAR image representation based on a sparsity prior and nonlocal total variation that
reconstructs strong scatterers in an observed object and maintains the overall shape of
the object, respectively. The authors assume a uniform rotation rate and a Fourier basis
matrix. The work in [68] proposes sparse imaging to compensate for range migration in
the presence of uniform rotational motion, making use of Bayesian sparse representation.
The proposed imaging is divided into two steps: coarse imaging carried out based
on the minimum entropy criterion, and a residual phase error correction performed using
a 2D Fourier dictionary.
The work in [69] proposes an autofocus-based technique to compensate for phase
errors arising due to imperfect translational motion compensation based on sparse
Bayesian learning. The target is assumed to be rotating uniformly, and these phase
errors due to imperfect translational motion compensation could affect the focusing in
azimuth direction. The technique is based on a two-step process where the parameters
of the scatterer coefficient are calculated in the first step, and the noise parameter and the phase
error due to imperfect translational motion compensation are estimated in the second step.
In [70], the author makes use of a segment from available data in the frequency
domain and generates a prediction of what frequency domain data for other segments
would look like. By comparison of the phases of the measurements and the predictions
it should be possible to derive information about the motion compensation errors. Data
are processed over small segments, such that the phase error history is assumed to
be linear.
The small-angle approach is simple in its implementation, but can result in a loss
of discernibility of the image. Therefore, other approaches based on more complicated
motion models were proposed. One of these approaches assumes a uniform rotation
motion, but considers $\cos(\theta_n(\tau)) \approx 1 - (\theta_n(\tau))^2/2$. These techniques are described
in [71–73], where the object is undergoing uniform rotational motion, resulting in a
second-order phase term that is directly dependent on the first-order phase term:
$\theta_n(\tau) \approx y_n\omega_n\tau - \frac{x_n}{2}\omega_n^2\tau^2$. (11.66)
The dictionary is composed of pulses of the following form:
$p_m^n = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-j\alpha_n\tau - j\zeta_n\tau^2\}$, $n = \{1,2,\ldots,N_\alpha \times N_\zeta\}$, (11.67)
where $\zeta_n = \frac{k_c}{2}x_n\omega_n^2$, $n = \{1,2,\ldots,N_\alpha \times N_\zeta\}$. The authors use a parametric dictionary
for reconstruction. Such a model also allows determination of the actual azimuth
position of the moving targets. In [74], a similar motion model is used, and the phase error
after translational motion compensation is considered. A two-step imaging process is
used, where the reflectivity is first estimated using compressed sensing reconstruction,
starting from an initial phase error estimate, and the phase errors are then estimated
using an entropy minimization method.
In [68], the authors deal with translational motion phase error compensation and
a target undergoing motion according to the model given in (11.70). The proposed
imaging approach involves a two-step process: 1) A coarse error compensation is
achieved using the minimum entropy criterion; 2) a sparsity-driven optimization using
a 2D Fourier-based dictionary is carried out, where the residual phase errors are treated
as model error and removed to achieve a fine correction.
In other approaches, the rotation motion is assumed to consist of nonuniform rotation
up to a second-order term as follows:
$\theta_n(\tau) \approx y_n\omega_n\tau + \frac{y_n}{2}\dot{\omega}_n\tau^2$. (11.68)
The dictionary is composed of the following terms:
$p_m^n = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-j\alpha_n\tau - j\beta_n\tau^2\}$, $n = \{1,2,\ldots,N_\alpha \times N_\beta\}$. (11.69)
In [62], a dictionary is presented that can deal with both first-order rotational velocity
and second-order rotational acceleration phase terms. An analysis is also carried out to
quantify the effects of spacing of different parameters in the dictionary on the imaging
performance. In [75], the authors consider rotational velocity and uniform acceleration.
The work in [76] considers uniform and nonuniform rotational motion in the presence
of unavailable or corrupted data and demonstrates the application of quadratic time
frequency–based representations for CS imaging when rotational acceleration is present
in the data.
In [77], the authors consider rotational velocity and uniform acceleration. Most of the
autofocus approaches lose their efficacy in subaperture cases, because the vacant aper-
tures break the FT relationship between the range-compressed data and the RD image.
Eigenvector-based autofocus is an exception. Rotational acceleration is compensated by
searching, then the residual motion errors are calculated by eigenvector-based autofocus.
The authors in [78] consider uniformly accelerated rotation targets. The maneuvering
signal model is formulated as chirp code and represented using a chirp-Fourier basis.
Then sparse representation is applied to realize range-Doppler imaging from the sparse
apertures, where the superposition of chirp parameters is acquired using the modi-
fied discrete chirp Fourier transform (MDCFT). After preprocessing involving sample
selection, rotation center determination, and noise reduction, the chirp parameters are
used to estimate the parameters of rotational motion using the weighted least squares
(WLS) method. Finally, a high-resolution scaled-ISAR image is achieved by rescaling
the acquired RD image using the estimated rotational velocity.
In [79], the authors propose a weighted eigenvector-based phase correction method to
correct for unknown phase errors, followed by using a partial Fourier matrix to achieve
CS ISAR imaging. In [80], it is assumed that the translational motion compensation has
been successfully accomplished and only rotational motion is considered. Rotational
acceleration is considered and a dictionary based on scaled nonuniform Fourier trans-
form is used. The reflectivity is modeled as a complex Gaussian distribution.
In [81], the authors further extend the technique in [68] to the maneuvering target
model undergoing uniform rotational acceleration. In this case, the rotational motion is
approximated as
$\theta_n(\tau) \approx y_n\omega_n\tau + \frac{y_n}{2}\dot{\omega}_n\tau^2 - \frac{x_n}{2}\omega_n^2\tau^2$, (11.70)
and the dictionary is composed of
$p_m^n = \mathrm{rect}\left(\frac{\tau}{T_\tau}\right)\exp\{-j\alpha_n\tau - j\eta_n\tau^2\}$, $n = \{1,2,\ldots,N_\alpha \times N_\eta\}$, (11.71)
where $\eta_n = \frac{k_c}{2}(y_n\dot{\omega}_n - x_n\omega_n^2)$, $n = \{1,2,\ldots,N_\alpha \times N_\eta\}$.
Some other techniques do not assume any particular form of motion errors and
can deal with random phase errors. In [82], the authors propose an iterative two-step
method: in the first step, an estimate of the reflectivity is obtained using an
expansion-compression variance component–based method. In the second step, the phase errors are
estimated using a maximum likelihood method, similar to the method proposed in
[33] for SAR autofocus. The whole process is repeated until no significant improvement
is seen. The method can deal with any type of motion errors.
In [83], the authors propose a novel sparse Bayesian ISAR imaging algorithm with a
newly proposed logarithmic Laplacian prior, which is achieved by putting a logarithm
on the exponent of the Laplacian prior. Compared to the Gaussian scale mixture and
Laplacian priors, the proposed logarithmic Laplacian prior has a narrower main lobe
and higher tail values, and performs better on sparseness representation. Then, the
logarithmic Laplacian prior-based ISAR image is reconstructed by MAP estimation,
and arbitrary phase errors are estimated based on the minimum entropy criterion during
the iterative process of sparse signal recovery. In [84], the authors impose continuity in
range by imposing a Bayesian model, and propose an autofocus algorithm to compen-
sate for random phase errors.
In [85], the authors assume a target only having a range velocity and micro-Doppler
motion. They use a joint sparsity model to separate micro-Doppler from the main target
body’s signal in the translational motion compensated data. Micro-Doppler is caused
by nonstationary parts of moving targets and is represented by a sinusoidal phase in
the azimuth direction, whereas the main body’s signal can be represented as a linear
phase in the azimuth direction. Due to the linear phase, the main targets’ signals can be
considered to be jointly sparse in the azimuth direction. The authors further assume a
Bernoulli–Gaussian distribution for the signal in each azimuth bin.
In [86], the authors propose CS to compensate for rotating parts in an observed object,
which cause a micro-Doppler effect. A CS-based short-time Fourier transform is used
to identify and remove data due to rotating parts from the entire data. The resulting data
that do not contain the micro-Doppler effect are focused using CS and a dictionary that
assumes uniform motion.
11.6 Conclusions
In this chapter, we presented CS-based synthetic aperture radar autofocus, synthetic
aperture radar moving target imaging, and inverse synthetic aperture radar imaging. We
presented the theory behind each approach, formulated the problem mathematically,
discussed general state-of-the-art solutions, and reviewed existing approaches
presented in the literature to deal with these problems. While CS autofocus approaches
are usually dictionary-less, SAR moving target imaging and ISAR imaging approaches
can be divided into fixed dictionary–based and parametric dictionary–based or
dictionary-less approaches. The former class was used in initial work
on SAR moving target imaging and ISAR imaging. These dictionaries are based on a
fixed motion model, and each entry of the dictionary is generated for one combination
of the parameters of the motion model considered. The parametric dictionary approach was
used in later work; it has the advantage of not being limited to a fixed motion model and
does not require the creation of a huge dictionary.
References
[1] D. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306,
2006.
[2] E. Candès and T. Tao, “Decoding by linear programming,” IEEE Trans. Inf. Theory, vol. 51,
no. 12, pp. 4203–4215, 2005.
[3] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal
reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2,
pp. 489–509, 2006.
[4] E. Candès and M. Wakin, “An introduction to compressive sampling,” IEEE Signal Process.
Mag., vol. 25, no. 2, pp. 21–30, 2008.
[5] R. Baraniuk, “Compressed sensing,” IEEE Signal Process. Mag., vol. 24, no. 4, pp. 14–20,
2007.
[6] R. Baraniuk and P. Steeghs, “Compressive radar imaging,” in Proc. IEEE Radar Conference,
2007, pp. 128–133.
[7] V. Patel, G. Easley, D. Healy, and R. Chellappa, “Compressed synthetic aperture radar,”
IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 244–254, 2010.
[8] K. Aberman and Y. C. Eldar, “Sub-Nyquist SAR via Fourier domain range-Doppler
processing,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 11, pp. 6228–6244, 2017.
[9] M. Cetin, I. Stojanovic, N. Onhon et al. “Sparsity-driven synthetic aperture radar imaging:
Reconstruction, autofocusing, moving targets, and compressed sensing,” IEEE Signal
Process. Mag., vol. 31, no. 4, pp. 27–40, 2014.
[10] L. Potter, E. Ertin, J. Parker, and M. Cetin, “Sparsity and compressed sensing in radar
imaging,” Proc. IEEE, vol. 98, no. 6, pp. 1006–1020, 2010.
[11] M. Herman and T. Strohmer, “High-resolution radar via compressed sensing,” IEEE Trans.
Signal Process., vol. 57, no. 6, pp. 2275–2284, 2009.
[12] M. T. Alonso, P. Lopez-Dekker, and J. J. Mallorqui, “A novel strategy for radar imaging
based on compressed sensing,” IEEE Trans. Geosci. Remote Sensing, vol. 48, no. 12,
pp. 4285–4295, 2010.
[13] X. Dong and Y. Zhang, “A novel compressive sensing algorithm for SAR imaging,” IEEE J.
Select. Topics Appl. Earth Observ. Remote Sensing, vol. 7, no. 2, pp. 708–729, 2014.
[14] M. Cetin and W. C. Karl, “Feature-enhanced synthetic aperture radar image formation based
on nonquadratic regularization,” IEEE Trans. Image Processing, vol. 10, no. 4, pp. 623–631,
2001.
[15] D. L. Donoho and M. Elad, “Optimally sparse representation in general (nonorthogonal)
dictionaries via l1 minimization,” Proceedings of the National Academy of Sciences,
vol. 100, no. 5, pp. 2197–2202, 2003.
[16] D. M. Malioutov, M. Cetin, and A. S. Willsky, “Optimal sparse representations in general
overcomplete bases,” Proc. IEEE International Conference on Acoustics, Speech, and Signal
Processing, pp. 793–796, 2004.
[17] M. Soumekh, Synthetic Aperture Radar Signal Processing. Wiley, 1999.
[18] C. Cafforio, C. Prati, and F. Rocca, “SAR data focusing using seismic migration techniques,”
IEEE Trans. Aerosp. Electron. Syst., vol. 27, no. 2, pp. 194–207, 1991.
[19] I. Cumming and F. Wong, Digital Processing of Synthetic Aperture Radar Data. Artech
House, 2005.
[20] R. K. Raney, H. Runge, R. Bamler, I. Cumming, and F. Wong, “Precision SAR processing
using chirp scaling,” IEEE Trans. Geosci. Remote Sens., vol. 32, no. 4, pp. 786–799, 1994.
[21] R. Baraniuk, V. Cevher, M. Duarte, and C. Hegde, “Model-based compressive sensing,”
IEEE Trans. Inf. Theory, vol. 56, pp. 1982–2001, 2010.
[22] L. Applebaum, S. Howard, S. Searle, and R. Calderbank, “Chirp sensing codes: Determin-
istic compressed sensing measurements for fast recovery,” Appl. Comput. Harmon. Anal.,
vol. 26, pp. 283–290, 2009.
[23] I. Stojanovic, M. Cetin, and W. C. Karl, “Compressed sensing of monostatic and multistatic
SAR,” IEEE Geosci. Remote Sens. Lett., vol. 10, no. 6, pp. 1444–1448, 2013.
[24] J. Tropp and A. Gilbert, “Signal recovery from random measurements via orthogonal
matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
[25] M. Elad and M. Zibulevsky, “Iterative shrinkage algorithms and their acceleration for l1l2
signal and image processing applications,” IEEE Signal Process. Mag., vol. 27, no. 3,
pp. 78–88, 2010.
[26] D. P. Belcher and N. C. Rogers, “Theory and simulation of ionospheric effects on synthetic
aperture radar,” IET Radar, Sonar & Navigation, vol. 5, no. 5, pp. 541–551, 2009.
[27] G. Franceschetti and R. Lanari, Synthetic Aperture Radar Processing. CRC Press, 1999.
[28] W. G. Carrara, R. M. Majewski, and R. S. Goodman, Spotlight Synthetic Aperture Radar:
Signal Processing Algorithms. Artech House, 1995.
[29] J. Walker, “Range-Doppler imaging of rotating objects,” IEEE Trans. Aerosp. Electron. Syst.,
vol. AES-16, pp. 23–52, 1980.
[30] D. Wahl, P. Eichel, D. Ghiglia, and C. Jakowatz, “Phase gradient autofocus: A robust tool for
high resolution SAR phase correction,” IEEE Trans. Aerosp. Electron. Syst., vol. 30, no. 3,
pp. 827–835, 1994.
[31] T. Jihua, S. Jinping, H. Xiao, and Z. Bingchen, “Motion compensation for compressive
sensing SAR imaging with autofocus,” in Proc. 2011 6th IEEE Int. Conf. Ind. Electron.
Appl., 2011, pp. 1564–1567.
[32] A. S. Khwaja, L. Ferro-Famil, and E. Pottier, “Efficient SAR raw data generation for
anisotropic urban scenes based on inverse processing,” IEEE Geosci. Remote Sens. Lett.,
vol. 6, no. 4, pp. 757–761, 2009.
[33] O. Onhon and M. Cetin, “A sparsity-driven approach for joint SAR imaging and phase error
correction,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 2075–2088, 2012.
[34] S. Kelly, M. Yaghoobi, and M. Davies, “Sparsity-based autofocus for undersampled
synthetic aperture radar,” IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 2, pp. 972–986,
2014.
[35] S. Camlica, A. C. Gurbuz, and O. Arikan, “Autofocused spotlight SAR image reconstruction
of off-grid sparse scenes,” IEEE Trans. Aerosp. Electron. Syst., vol. 53, no. 4, pp. 1880–
1892, 2017.
[36] S. Ugur, O. Arikan, and A. Gurbuz, “SAR image reconstruction by expectation maximiza-
tion based matching pursuit,” Digital Signal Processing, vol. 37, pp. 75–84, 2015.
[37] A. C. Gurbuz, M. Pilanci, and O. Arikan, “Expectation maximization based matching pur-
suit,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing,
2012.
[38] S. Camlica, H. E. Guven, A. C. Gurbuz, and O. Arikan, “Analysis of sparsity based joint
SAR image reconstruction and autofocus techniques,” in Proc. 3rd International Workshop
on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing
(CoSeRa), 2015, pp. 99–103.
[39] A. Gungor, M. Cetin, and E. Guven, “An augmented Lagrangian method for autofocused
compressed SAR imaging,” in Proc. 2015 3rd IEEE Int. Workshop Compressed Sens. Theory
Appl. Radar Sonar Remote Sensing, 2015, pp. 1–6.
[40] H. E. Guven and M. Cetin, “An augmented Lagrangian method for sparse SAR imaging,” in
Proc. 10th European Conference on Synthetic Aperture Radar, 2014.
[41] M. J. Hasankhan, S. Samadi, and M. Cetin, “Sparse representation-based algorithm for joint
SAR image formation and autofocus,” Signal, Image and Video Processing, vol. 11, no. 4,
pp. 589–596, 2015.
[42] Y. C. Chen, G. Li, Q. Zhang, Q. J. Zhang, and X. G. Xia, “Motion compensation for airborne
SAR via parametric sparse representation,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 1,
pp. 551–562, 2017.
[43] M. T. Crockett, “Target motion estimation techniques for single-channel SAR,” master’s
thesis, Brigham Young University, 2014.
[44] I. Stojanovic and W. Karl, “Imaging of moving targets with multi-static SAR using an
overcomplete dictionary,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 1, pp. 164–176,
2010.
[45] A. S. Khwaja and J. Ma, “Applications of compressed sensing for SAR moving target
velocity estimation and image compression,” IEEE Trans. Instrum. Meas., vol. 60, no. 8,
pp. 2848–2860, 2011.
[46] S. Zhu, A. M. Djafari, H. Wang et al., “Parameter estimation for SAR micromotion target
based on sparse signal representation,” EURASIP J. Adv. Sig. Proc., vol. 2012, 2012.
[47] F. Ahmad and M. Amin, “Through-the-wall human motion indication using sparsity-driven
change detection,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 2, pp. 881–890, 2013.
[48] Q. Wu, M. Xing, C. Qiu, B. Liu, Z. Bao, and T.-S. Yeo, “Motion parameter estimation in
the SAR system with low PRF sampling,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 3,
pp. 450–454, 2010.
[49] A. S. Khwaja and X. P. Zhang, “Motion parameter estimation and focusing from SAR
images based on sparse reconstruction,” IEEE Geosci. Remote Sens. Lett., vol. 11, no. 8,
pp. 1350–1354, 2014.
[50] A. S. Khwaja, M. Naeem, and A. Anpalagan, “Analysis of moving object imaging from
compressively sensed SAR data in the presence of dictionary mismatch,” Int. J. Antenn.
Propag., vol. 2013, 2013.
[51] J. Gunther, J. Hunsaker, H. Anderson, and T. Moon, “Sparse reconstruction of equivalence
classes of moving targets using single-channel synthetic aperture radar,” in Proc. IEEE
International Conference on Acoustic, Speech and Signal Processing, 2014.
[52] L. Prunte, “GMTI from multichannel SAR images using compressed sensing under off-grid
conditions,” in Proc. International Radar Symposium, 2013.
[53] C. Ekanadham, D. Tranchina, and E. P. Simoncelli, “Recovery of sparse translation-invariant
signals with continuous basis pursuit,” IEEE Trans. on Signal Processing, vol. 59, no. 10,
pp. 4735–4744, 2011.
[54] N. O. Onhon and M. Cetin, “SAR moving target imaging in a sparsity-driven framework,”
in Proc. SPIE Optics+Photonics, Wavelets and Sparsity XIV, 2011, pp. 8138–8139.
[55] M. Yasin, M. Cetin, and A. S. Khwaja, “SAR imaging of moving targets by subaperture
based low-rank and sparse decomposition,” in Proc. IEEE Signal Processing and Commu-
nications Applications Conference, 2017.
[56] X. Zhang, G. Liao, S. Zhu, D. Yang, and W. Du, “Efficient compressed sensing method for
moving-target imaging by exploiting the geometry information of the defocused results,”
IEEE Geosci. Remote Sens. Lett., vol. 12, no. 3, pp. 517–521, 2015.
[57] Y. Chen, Q. Zhang, G. Li, and J. Sun, “Refocusing of moving targets in SAR images via
parametric sparse representation,” Remote Sensing, vol. 9, no. 8, pp. 1–15, 2017.
[58] L. Prunte, “Compressed sensing for removing moving target artifacts and reducing noise in
SAR images,” in Proc. European Conference on Synthetic Aperture Radar, 2016.
[59] N. O. Onhon and M. Cetin, “SAR moving object imaging using sparsity imposing priors,”
EURASIP J. Adv. Sig. Proc., vol. 2017, 2017.
[60] V. Chen and H. Ling, Time-Frequency Transforms for Radar Imaging and Signal Analysis.
Artech House, 2002.
[61] A. S. Khwaja and M. Cetin, “Compressed sensing ISAR reconstruction considering highly
maneuvering motion,” Electronics, vol. 6, no. 1, 2017.
[62] A. S. Khwaja and X. P. Zhang, “Compressed sensing ISAR reconstruction in the presence of
rotational acceleration,” IEEE J. Sel. Top. Appl. Earth Observ., vol. 7, no. 7, pp. 2957–2970,
2014.
[63] Z. Liu, X. Wei, and X. Li, “Decoupled ISAR imaging using RSFW based on twice compressed
sensing,” IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 4, pp. 3195–3211, 2014.
[64] S. Tomei, A. Bacci, E. Giusti, M. Martorella, and F. Berizzi, “Compressive sensing-based
inverse synthetic radar imaging imaging from incomplete data,” IET Radar Sonar Navig.,
vol. 10, no. 2, pp. 386–397, 2016.
[65] B. Wang, S. Zhang, and W. Q. Wang, “Bayesian inverse synthetic aperture radar imaging
by exploiting sparse probing frequencies,” IEEE Antennas Wirel. Propag. Lett., vol. 14,
pp. 1698–1701, 2015.
[66] S. Li, G. Zhao, W. Zhang, Q. Qiu, and H. Sun, “ISAR imaging by two-dimensional convex
optimization-based compressive sensing,” IEEE Sens. J., vol. 16, no. 19, pp. 7088–7093,
2016.
[67] X. Zhang, T. Bai, H. Meng, and J. Chen, “Compressive sensing based ISAR imaging via the
combination of the sparsity and nonlocal total variation,” IEEE Geosci. Remote Sens. Lett.,
vol. 11, no. 5, pp. 990–994, 2014.
[68] G. Xu, M. Xing, X. G. Xia et al., “High-resolution inverse synthetic aperture radar imaging
and scaling with sparse aperture,” IEEE J. Sel. Top. Appl. Earth Observ., vol. 8, no. 8,
pp. 4010–4027, 2015.
[69] L. Zhao, L. Wang, G. Bi, and L. Yang, “An autofocus technique for high-resolution inverse
synthetic aperture radar imagery,” IEEE Trans. Geosci. Remote Sensing, vol. 52, no. 10,
pp. 6392–6403, 2014.
[70] J. Ender, “Autofocusing ISAR images via sparse representation,” in Proc. European Conf.
on Synthetic Aperture Radar, 2012.
[71] W. Rao, G. Li, X. Wang, and X.-G. Xia, “Adaptive sparse recovery by parametric weighted
l1 minimization for ISAR imaging of uniformly rotating targets,” IEEE J. Select. Topics
Appl. Earth Observ. Remote Sensing, vol. 6, no. 2, pp. 942–952, 2013.
[72] G. Li, H. Zhang, X. Wang, and X. G. Xia, “ISAR 2-D imaging of uniformly rotating targets
via matching pursuit,” IEEE Trans. Aerosp. Electron. Syst., vol. 48, pp. 1838–1846, 2012.
[73] B. Jiu, H. Liu, H. Liu et al., “Joint ISAR imaging and cross-range scaling method based
on compressive sensing with adaptive dictionary,” IEEE Trans. Antennas Propag, vol. 63,
no. 5, pp. 2112–2122, 2015.
[74] H. R. Hashempour and M. A. Masnadi-Shirazi, “Inverse synthetic aperture radar phase
adjustment and cross-range scaling based on sparsity,” Digit. Signal Proc., vol. 68, pp. 93–
101, 2017.
[75] G. Xu, M. Xing, and Z. Bao, “High-resolution inverse synthetic aperture radar imaging of
manoeuvring targets with sparse aperture,” Electron. Lett., vol. 51, no. 3, pp. 287–289, 2015.
[76] L. Stankovic, “ISAR image analysis and recovery with unavailable or heavily corrupted
data,” IEEE Trans. Aerosp. Electron. Syst., vol. 51, no. 3, pp. 2093–2106, 2015.
[77] L. Zhang, J. Duan, Z. Qiao, M. Xing, and Z. Bao, “Phase adjustment and ISAR imaging
of maneuvering targets with sparse apertures,” IEEE Trans. Aerosp. Electron. Syst., vol. 50,
no. 3, pp. 1955–1973, 2014.
[78] G. Xu, M. Xing, L. Zhang, J. Duan, Q. Chen, and Z. Bao, “Sparse apertures ISAR imaging
and scaling for maneuvering targets,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens.,
vol. 7, no. 7, pp. 2942–2956, 2014.
[79] J. Duan, L. Zhang, and M. Xing, “A weighted eigenvector autofocus method for sparse-
aperture ISAR imaging,” EURASIP J. on Adv. Sig. Proc., vol. 92, 2013.
[80] G. Xu, L. Yang, L. Zhao, and G. Bi, “ISAR maneuvering targets imaging and motion esti-
mation from parametric sparse Bayesian learning,” in Proc. IEEE International Geoscience
and Remote Sensing Symposium, 2016.
[81] G. Xu, L. Yang, G. Bi, and M. Xing, “Maneuvering target imaging and scaling by using
sparse inverse synthetic aperture,” Signal Process., vol. 137, pp. 149–159, 2017.
[82] W. Su, Y. Qin, H. Wang, and Q. Yang, “Joint ISAR imaging and phase error correction based
on sparse Bayesian learning,” Int. J. Sig. Proc. Sys., vol. 4, no. 6, pp. 487–493, 2016.
[83] S. Zhang, Y. Liu, X. Li, and G. Bi, “Logarithmic Laplacian prior based Bayesian inverse
synthetic aperture radar imaging,” Sensors, vol. 16, no. 5, 2016.
[84] L. Zhao, L. Wang, G. Bi et al., “Structured sparsity-driven autofocus algorithm for high-
resolution radar imagery,” Signal Process., vol. 125, pp. 376–388, 2016.
[85] L. Sun, X. Lu, and W. Chen, “Joint sparsity-based ISAR imaging for micromotion targets,”
IEEE Geosci. Remote Sens. Lett., vol. 13, no. 11, pp. 1734–1738, 2016.
[86] Q. Hou, Y. Liu, and Z. Chen, “Reducing micro-Doppler effect in compressed sensing ISAR
imaging for aircraft using limited pulses,” Electron. Lett., vol. 51, no. 12, pp. 937–939, 2015.
