A Survey on Deep Learning in Medical Image Analysis
Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi,
Mohsen Ghafoorian, Jeroen A.W.M. van der Laak, Bram van Ginneken, Clara I. Sánchez
Abstract
Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for
analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis
and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of
deep learning for image classification, object detection, segmentation, registration, and other tasks. Concise overviews
are provided of studies per application area: neuro, retinal, pulmonary, digital pathology, breast, cardiac, abdominal,
musculoskeletal. We end with a summary of the current state-of-the-art, a critical discussion of open challenges and
directions for future research.
Keywords: deep learning, convolutional neural networks, medical imaging, survey
Figure 1: Breakdown of the papers included in this survey in year of publication, task addressed (Section 3), imaging modality, and application
area (Section 4). The number of papers for 2017 has been extrapolated from the papers published in January.
techniques and architectures that we found in the medical image analysis papers surveyed in this work.

2.1. Learning algorithms

Machine learning methods are generally divided into supervised and unsupervised learning algorithms, although there are many nuances. In supervised learning, a model is presented with a dataset D = {x_n, y_n}_{n=1}^N of input features x and label y pairs, where y typically represents an instance of a fixed set of classes. In the case of regression tasks y can also be a vector with continuous values. Supervised training typically amounts to finding model parameters Θ that best predict the data based on a loss function L(y, ŷ). Here ŷ denotes the output of the model obtained by feeding a data point x to the function f(x; Θ) that represents the model.

Unsupervised learning algorithms process data without labels and are trained to find patterns, such as latent subspaces. Examples of traditional unsupervised learning algorithms are principal component analysis and clustering methods. Unsupervised training can be performed under many different loss functions. One example is reconstruction loss L(x, x̂), where the model has to learn to reconstruct its input, often through a lower-dimensional or noisy representation.

2.2. Neural Networks

Neural networks are a type of learning algorithm which forms the basis of most deep learning methods. A neural network comprises neurons or units with some activation a and parameters Θ = {W, B}, where W is a set of weights and B a set of biases. The activation represents a linear combination of the input x to the neuron and the parameters, followed by an element-wise non-linearity σ(·), referred to as a transfer function:

a = σ(w^T x + b).    (1)

Typical transfer functions for traditional neural networks are the sigmoid and hyperbolic tangent function. The multi-layered perceptron (MLP), the most well-known of the traditional neural networks, has several layers of these transformations:

f(x; Θ) = σ(W^T σ(W^T ... σ(W^T x + b)) + b).    (2)

Here, W is a matrix comprising columns w_k, associated with activation k in the output. Layers in between the input and output are often referred to as 'hidden' layers. When a neural network contains multiple hidden layers it is typically considered a 'deep' neural network, hence the term 'deep learning'.
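To make Eqs. (1) and (2) concrete, the forward pass of such a network can be written in a few lines of NumPy. This is a minimal illustrative sketch; the layer sizes and the choice of a sigmoid transfer function are our own assumptions, not taken from any surveyed work:

```python
import numpy as np

def sigmoid(z):
    # element-wise non-linearity sigma(.) from Eq. (1)
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights, biases):
    """Forward pass of an MLP, Eq. (2): repeated affine maps,
    each followed by an element-wise transfer function."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W.T @ a + b)  # a = sigma(W^T a + b), Eq. (1)
    return a

# Illustrative sizes: 4 inputs, one hidden layer of 8 units, 3 outputs.
rng = np.random.default_rng(0)
params_W = [rng.normal(size=(4, 8)), rng.normal(size=(8, 3))]
params_b = [np.zeros(8), np.zeros(3)]
print(mlp_forward(rng.normal(size=4), params_W, params_b))
```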
At the final layer of the network the activations are mapped to a distribution over classes P(y|x; Θ) through a softmax function:

P(y|x; Θ) = softmax(x; Θ) = e^{w_i^T x + b_i} / Σ_{k=1}^K e^{w_k^T x + b_k},    (3)

where w_i indicates the weight vector leading to the output node associated with class i. A schematic representation of a three-layer MLP is shown in Figure 2.

Maximum likelihood with stochastic gradient descent is currently the most popular method to fit parameters Θ to a dataset D. In stochastic gradient descent a small subset of the data, a mini-batch, is used for each gradient update instead of the full data set. Optimizing maximum likelihood in practice amounts to minimizing the negative log-likelihood:

arg min_Θ − Σ_{n=1}^N log P(y_n | x_n; Θ).    (4)

This results in the binary cross-entropy loss for two-class problems and the categorical cross-entropy for multi-class tasks. A downside of this approach is that it typically does not optimize the quantity we are interested in directly, such as area under the receiver-operating characteristic (ROC) curve or common evaluation measures for segmentation, such as the Dice coefficient.

For a long time, deep neural networks (DNNs) were considered hard to train efficiently. They only gained popularity in 2006 (Bengio et al., 2007; Hinton and Salakhutdinov, 2006; Hinton et al., 2006) when it was shown that training DNNs layer-by-layer in an unsupervised manner (pre-training), followed by supervised fine-tuning of the stacked network, could result in good performance. Two popular architectures trained in such a way are stacked auto-encoders (SAEs) and deep belief networks (DBNs). However, these techniques are rather complex and require a significant amount of engineering to generate satisfactory results.

Currently, the most popular models are trained end-to-end in a supervised fashion, greatly simplifying the training process. The most popular architectures are convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are currently most widely used in (medical) image analysis, although RNNs are gaining popularity. The following sections will give a brief overview of each of these methods, starting with the most popular ones, and discussing their differences and potential challenges when applied to medical problems.

2.3. Convolutional Neural Networks (CNNs)

There are two key differences between MLPs and CNNs. First, in CNNs weights in the network are shared in such a way that the network performs convolution operations on images. This way, the model does not need to learn separate detectors for the same object occurring at different positions in an image, making the network equivariant with respect to translations of the input. It also drastically reduces the number of parameters (i.e. the number of weights no longer depends on the size of the input image) that need to be learned. An example of a 1D CNN is shown in Figure 2.

At each layer, the input image is convolved with a set of K kernels W = {W_1, W_2, ..., W_K} and added biases B = {b_1, ..., b_K}, each generating a new feature map X_k. These features are subjected to an element-wise non-linear transform σ(·) and the same process is repeated for every convolutional layer l:

X_k^l = σ(W_k^{l−1} ∗ X^{l−1} + b_k^{l−1}).    (5)

The second key difference between CNNs and MLPs is the typical incorporation of pooling layers in CNNs, where pixel values of neighborhoods are aggregated using a permutation invariant function, typically the max or mean operation. This induces a certain amount of translation invariance and again reduces the number of parameters in the network. At the end of the convolutional stream of the network, fully-connected layers (i.e. regular neural network layers) are usually added, where weights are no longer shared. Similar to MLPs, a distribution over classes is generated by feeding the activations in the final layer through a softmax function, and the network is trained using maximum likelihood.

2.4. Deep CNN Architectures

Given the prevalence of CNNs in medical image analysis, we elaborate on the most common architectures and architectural differences among the widely used models.

2.4.1. General classification architectures

LeNet (LeCun et al., 1998) and AlexNet (Krizhevsky et al., 2012), introduced over a decade later, were in essence very similar models. Both networks were relatively shallow, consisting of two and five convolutional layers, respectively, and employed kernels with large receptive fields in layers close to the input and smaller kernels closer to the output. AlexNet did incorporate rectified linear units instead of the hyperbolic tangent as activation function.
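As an illustration, a LeNet-style network of this kind can be expressed compactly in a modern framework. The following PyTorch sketch uses illustrative layer sizes, not the exact LeNet or AlexNet configurations:

```python
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    """LeNet-style classifier: two convolutional layers with pooling,
    followed by a fully-connected layer producing class scores."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(),   # larger receptive field near the input
            nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        h = self.features(x)                  # (N, 16, 4, 4) for 28x28 input
        return self.classifier(h.flatten(1))  # softmax is folded into the training loss

net = SmallConvNet()
scores = net(torch.randn(2, 1, 28, 28))       # two 28x28 single-channel images
```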
After 2012 the exploration of novel architectures took off, and in the last three years there has been a preference for far deeper models. By stacking smaller kernels, instead of using a single layer of kernels with a large receptive field, a similar function can be represented with fewer parameters. These deeper architectures generally have a lower memory footprint during inference, which enables their deployment on mobile computing devices such as smartphones. Simonyan and Zisserman (2014) were the first to explore much deeper networks, and employed small, fixed size kernels in each layer. A 19-layer model, often referred to as VGG19 or OxfordNet, won the ImageNet challenge of 2014.

On top of the deeper networks, more complex building blocks have been introduced that improve the efficiency of the training procedure and again reduce the number of parameters. Szegedy et al. (2014) introduced a 22-layer network named GoogLeNet, also referred to as Inception, which made use of so-called inception blocks (Lin et al., 2013), a module that replaces the mapping defined in Eq. (5) with a set of convolutions of different sizes. Similar to the stacking of small kernels, this allows a similar function to be represented with fewer parameters. The ResNet architecture (He et al., 2015) won the ImageNet challenge in 2015 and consisted of so-called ResNet-blocks. Rather than learning a function, the residual block only learns the residual and is thereby pre-conditioned towards learning mappings in each layer that are close to the identity function. This way, even deeper models can be trained effectively.

Since 2014, the performance on the ImageNet benchmark has saturated and it is difficult to assess whether the small increases in performance can really be attributed to 'better' and more sophisticated architectures. The advantage of the lower memory footprint these models provide is typically not as important for medical applications. Consequently, AlexNet or other simple models such as VGG are still popular for medical data, though recent landmark studies all use a version of GoogLeNet called Inception v3 (Gulshan et al., 2016; Esteva et al., 2017; Liu et al., 2017). Whether this is due to a superior architecture or simply because the model is a default choice in popular software packages is again difficult to assess.

2.4.2. Multi-stream architectures

The default CNN architecture can easily accommodate multiple sources of information or representations of the input, in the form of channels presented to the input layer. This idea can be taken further and channels can be merged at any point in the network. Under the intuition that different tasks require different ways of fusion, multi-stream architectures are being explored. These models, also referred to as dual pathway architectures (Kamnitsas et al., 2017), have two main applications at the time of writing: (1) multi-scale image analysis and (2) 2.5D classification; both relevant for medical image processing tasks.

For the detection of abnormalities, context is often an important cue. The most straightforward way to increase context is to feed larger patches to the network, but this can significantly increase the number of parameters and memory requirements of a network. Consequently, architectures have been investigated where context is added in a down-scaled representation in addition to high resolution local information. To the best of our knowledge, the multi-stream multi-scale architecture was first explored by Farabet et al. (2013), who used it for segmentation in natural images. Several medical applications have also successfully used this concept (Kamnitsas et al., 2017; Moeskops et al., 2016a; Song et al., 2015; Yang et al., 2016c).

As so much methodology is still developed on natural images, the challenge of applying deep learning techniques to the medical domain often lies in adapting existing architectures to, for instance, different input formats such as three-dimensional data. In early applications of CNNs to such volumetric data, full 3D convolutions and the resulting large number of parameters were circumvented by dividing the Volume of Interest (VOI) into slices which are fed as different streams to a network. Prasoon et al. (2013) were the first to use this approach for knee cartilage segmentation. Similarly, the network can be fed with multiple angled patches from the 3D-space in a multi-stream fashion, which has been applied by various authors in the context of medical imaging (Roth et al., 2016b; Setio et al., 2016). These approaches are also referred to as 2.5D classification.

2.4.3. Segmentation Architectures

Segmentation is a common task in both natural and medical image analysis and to tackle this, CNNs can simply be used to classify each pixel in the image individually, by presenting it with patches extracted around the particular pixel. A drawback of this naive 'sliding-window' approach is that input patches from neighboring pixels have huge overlap and the same convolutions are computed many times. Fortunately, the convolution and dot product are both linear operators and thus inner products can be written as convolutions and vice versa. By rewriting the fully connected layers as convolutions, the CNN can take input images larger than it was trained on and produce a likelihood map, rather than an output for a single pixel.
The resulting 'fully convolutional network' (fCNN) can then be applied to an entire input image or volume in an efficient fashion.

However, because of pooling layers, this may result in output with a far lower resolution than the input. 'Shift-and-stitch' (Long et al., 2015) is one of several methods proposed to prevent this decrease in resolution. The fCNN is applied to shifted versions of the input image. By stitching the results together, one obtains a full resolution version of the final output, minus the pixels lost due to the 'valid' convolutions.

Ronneberger et al. (2015) took the idea of the fCNN one step further and proposed the U-net architecture, comprising a 'regular' fCNN followed by an upsampling part where 'up'-convolutions are used to increase the image size, coined the contractive and expansive paths. Although this is not the first paper to introduce learned upsampling paths in convolutional neural networks (e.g. Long et al. (2015)), the authors combined it with so-called skip-connections to directly connect opposing contracting and expanding convolutional layers. A similar approach was used by Çiçek et al. (2016) for 3D data. Milletari et al. (2016b) proposed an extension to the U-net layout that incorporates ResNet-like residual blocks and a Dice loss layer, rather than the conventional cross-entropy, that directly minimizes this commonly used segmentation error measure.

2.5. Recurrent Neural Networks (RNNs)

Traditionally, RNNs were developed for discrete sequence analysis. They can be seen as a generalization of MLPs because both the input and output can be of varying length, making them suitable for tasks such as machine translation where a sentence of the source and target language are the input and output. In a classification setting, the model learns a distribution over classes P(y|x_1, x_2, ..., x_T; Θ) given a sequence x_1, x_2, ..., x_T, rather than a single input vector x.

The plain RNN maintains a latent or hidden state h at time t that is the output of a non-linear mapping from its input x_t and the previous state h_{t−1}:

h_t = σ(W x_t + R h_{t−1} + b),    (6)

where weight matrices W and R are shared over time. For classification, one or more fully connected layers are typically added, followed by a softmax to map the sequence to a posterior over the classes:

P(y|x_1, x_2, ..., x_T; Θ) = softmax(h_T; W_out, b_out).    (7)

Since the gradient needs to be backpropagated from the output through time, RNNs are inherently deep (in time) and consequently suffer from the same problems with training as regular deep neural networks (Bengio et al., 1994). To this end, several specialized memory units have been developed, the earliest and most popular being the Long Short Term Memory (LSTM) cell (Hochreiter and Schmidhuber, 1997). The Gated Recurrent Unit (Cho et al., 2014) is a recent simplification of the LSTM and is also commonly used.

Although initially proposed for one-dimensional input, RNNs are increasingly applied to images. In natural images 'pixelRNNs' are used as autoregressive models, generative models that can eventually produce new images similar to samples in the training set. For medical applications, they have been used for segmentation problems, with promising results (Stollenga et al., 2015) in the MRBrainS challenge.

2.6. Unsupervised models

2.6.1. Auto-encoders (AEs) and Stacked Auto-encoders (SAEs)

AEs are simple networks that are trained to reconstruct the input x on the output layer x' through one hidden layer h. They are governed by a weight matrix W_{x,h} and bias b_{x,h} from input to hidden state, and W_{h,x'} with corresponding bias b_{h,x'} from the hidden layer to the reconstruction. A non-linear function is used to compute the hidden activation:

h = σ(W_{x,h} x + b_{x,h}).    (8)

Additionally, the dimension of the hidden layer |h| is taken to be smaller than |x|. This way, the data is projected onto a lower dimensional subspace representing a dominant latent structure in the input. Regularization or sparsity constraints can be employed to enhance the discovery process. If the hidden layer had the same size as the input and no further non-linearities were added, the model would simply learn the identity function.

The denoising auto-encoder (Vincent et al., 2010) is another solution to prevent the model from learning a trivial solution. Here the model is trained to reconstruct the input from a noise-corrupted version (typically salt-and-pepper noise). SAEs (or deep AEs) are formed by placing auto-encoder layers on top of each other. In medical applications surveyed in this work, auto-encoder layers were often trained individually ('greedily'), after which the full network was fine-tuned using supervised training to make a prediction.
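A minimal sketch of such a denoising auto-encoder, assuming a simple masking variant of the corruption and illustrative layer sizes, could look as follows in PyTorch:

```python
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    """One-hidden-layer auto-encoder, Eq. (8): |h| < |x| forces a
    lower-dimensional representation of the input."""
    def __init__(self, n_in=256, n_hidden=64):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
        self.decode = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        return self.decode(self.encode(x))

def corrupt(x, p=0.3):
    # masking variant of salt-and-pepper corruption: zero a random fraction p
    mask = (torch.rand_like(x) > p).float()
    return x * mask

ae = DenoisingAE()
x = torch.rand(8, 256)                                # mini-batch of vectorized patches
loss = nn.functional.mse_loss(ae(corrupt(x)), x)      # reconstruct the clean input
loss.backward()
```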
2.6.2. Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs)

RBMs (Hinton, 2010) are a type of Markov Random Field (MRF), constituting an input layer or visible layer x = (x_1, x_2, ..., x_N) and a hidden layer h = (h_1, h_2, ..., h_M) that carries the latent feature representation. The connections between the nodes are bi-directional, so given an input vector x one can obtain the latent feature representation h and also vice versa. As such, the RBM is a generative model, and we can sample from it and generate new data points. In analogy to physical systems, an energy function is defined for a particular state (x, h) of input and hidden units:

E(x, h) = h^T W x − c^T x − b^T h,    (9)

with c and b bias terms. The probability of the 'state' of the system is defined by passing the energy to an exponential and normalizing:

p(x, h) = (1/Z) exp{−E(x, h)}.    (10)

Computing the partition function Z is generally intractable. However, conditional inference in the form of computing h conditioned on x, or vice versa, is tractable and results in a simple formula:

P(h_j|x) = 1 / (1 + exp{−b_j − W_j x}).    (11)

Since the network is symmetric, a similar expression holds for P(x_i|h).

2.7. Hardware and Software

One of the main contributors to the steep rise of deep learning has been the widespread availability of GPUs and GPU-computing libraries (CUDA, OpenCL). GPUs are highly parallel computing engines, which have an order of magnitude more execution threads than central processing units (CPUs). With current hardware, deep learning on GPUs is typically 10 to 30 times faster than on CPUs.

Next to hardware, the other driving force behind the popularity of deep learning methods is the wide availability of open source software packages. These libraries provide efficient GPU implementations of important operations in neural networks, such as convolutions, allowing the user to implement ideas at a high level rather than worrying about low-level efficient implementations. At the time of writing, the most popular packages were (in alphabetical order):

• Caffe (Jia et al., 2014). Provides C++ and Python interfaces, developed by graduate students at UC Berkeley.

• Tensorflow (Abadi et al., 2016). Provides C++ and Python interfaces, developed by Google and is used by Google research.

• Theano (Bastien et al., 2012). Provides a Python interface, developed by the MILA lab in Montreal.

• Torch (Collobert et al., 2011). Provides a Lua interface and is used by, among others, Facebook AI research.
Figure 2: Node graphs of 1D representations of architectures commonly used in medical imaging. a) Auto-encoder, b) restricted Boltzmann
machine, c) recurrent neural network, d) convolutional neural network, e) multi-stream convolutional neural network, f) U-net (with a single
downsampling stage).
small compared to those in computer vision (e.g., hundreds/thousands vs. millions of samples). The popularity of transfer learning for such applications is therefore not surprising.

Transfer learning is essentially the use of pre-trained networks (typically on natural images) to try to work around the (perceived) requirement of large data sets for deep network training. Two transfer learning strategies were identified: (1) using a pre-trained network as a feature extractor and (2) fine-tuning a pre-trained network on medical data. The former strategy has the extra benefit of not requiring one to train a deep network at all, allowing the extracted features to be easily plugged into existing image analysis pipelines. Both strategies are popular and have been widely applied. However, few authors perform a thorough investigation into which strategy gives the best result. The two papers that do, Antony et al. (2016) and Kim et al. (2016a), offer conflicting results. In the case of Antony et al. (2016), fine-tuning clearly outperformed feature extraction, achieving 57.6% accuracy in multi-class grade assessment of knee osteoarthritis versus 53.4%. Kim et al. (2016a), however, showed that using a CNN as a feature extractor outperformed fine-tuning in cytopathology image classification accuracy (70.5% versus 69.1%). If any guidance can be given on which strategy might be most successful, we would refer the reader to two recent papers, published in high-ranking journals, which fine-tuned a pre-trained version of Google's Inception v3 architecture on medical data and achieved (near) human expert performance (Esteva et al., 2017; Gulshan et al., 2016). As far as the authors are aware, such results have not yet been achieved by simply using pre-trained networks as feature extractors.
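The two strategies can be contrasted in a short sketch; the backbone (ResNet-18) and the two-class head below are illustrative assumptions, not the models used in the cited studies:

```python
import torch.nn as nn
from torchvision import models

# Strategy 1: pre-trained network as a fixed feature extractor.
backbone = models.resnet18(weights="IMAGENET1K_V1")
for p in backbone.parameters():
    p.requires_grad_(False)          # freeze all pre-trained weights
backbone.fc = nn.Identity()          # expose 512-d features for an external classifier (e.g. an SVM)

# Strategy 2: fine-tuning the pre-trained network on medical data.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(512, 2)         # replace the head for a two-class medical task
# ...then train end-to-end, typically with a small learning rate so the
# pre-trained weights shift only gently.
```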
With respect to the type of deep networks that are commonly used in exam classification, a timeline similar to computer vision is apparent. The medical imaging community initially focused on unsupervised pre-training and network architectures like SAEs and RBMs. The first papers applying these techniques for exam classification appeared in 2013 and focused on neuroimaging. Brosch and Tam (2013), Plis et al. (2014), Suk and Shen (2013), and Suk et al. (2014) applied DBNs and SAEs to classify patients as having Alzheimer's disease based on brain Magnetic Resonance Imaging (MRI). Recently, a clear shift towards CNNs can be observed. Out of the 47 papers published on exam classification in 2015, 2016, and 2017, 36 are using CNNs, 5 are based on AEs and 6 on RBMs. The application areas of these methods are very diverse, ranging from brain MRI to retinal imaging and digital pathology to lung computed tomography (CT).

In the more recent papers using CNNs, authors also often train their own network architectures from scratch instead of using pre-trained networks. Menegola et al. (2016) performed some experiments comparing training from scratch to fine-tuning of pre-trained networks and showed that fine-tuning worked better given a small data set of around 1000 images of skin lesions. However, these experiments are too small scale to draw any general conclusions from.

Three papers used an architecture leveraging the unique attributes of medical data: two use 3D convolutions (Hosseini-Asl et al., 2016; Payan and Montana, 2015) instead of 2D to classify patients as having Alzheimer's disease; Kawahara et al. (2016b) applied a CNN-like architecture to a brain connectivity graph derived from MRI diffusion-tensor imaging (DTI). In order to do this, they developed several new layers which formed the basis of their network, so-called edge-to-edge, edge-to-node, and node-to-graph layers. They used their network to predict brain development and showed that they outperformed existing methods in assessing cognitive and motor scores.

Summarizing, in exam classification CNNs are the current standard techniques. Especially CNNs pre-trained on natural images have shown surprisingly strong results, challenging the accuracy of human experts in some tasks. Last, authors have shown that CNNs can be adapted to leverage the intrinsic structure of medical images.

3.1.2. Object or lesion classification

Object classification usually focuses on the classification of a small (previously identified) part of the medical image into two or more classes (e.g. nodule classification in chest CT). For many of these tasks both local information on lesion appearance and global contextual information on lesion location are required for accurate classification. This combination is typically not possible in generic deep learning architectures. Several authors have used multi-stream architectures to resolve this in a multi-scale fashion (Section 2.4.2). Shen et al. (2015b) used three CNNs, each of which takes a nodule patch at a different scale as input. The resulting feature outputs of the three CNNs are then concatenated to form the final feature vector. A somewhat similar approach was followed by Kawahara and Hamarneh (2016), who used a multi-stream CNN to classify skin lesions, where each stream works on a different resolution of the image. Gao et al. (2015) proposed to use a combination of CNNs and RNNs for grading nuclear cataracts in slit-lamp images, where the CNN filters were pre-trained. This combination allows the processing of all contextual information regardless of image size. Incorporating 3D information is also often a necessity for good performance in object classification tasks in medical imaging. As images in computer vision tend to be 2D natural images, networks developed in those scenarios do not directly leverage 3D information. Authors have used different approaches to integrate 3D in an effective manner with custom architectures. Setio et al. (2016) used a multi-stream CNN to classify points of interest in chest CT as a nodule or non-nodule. Up to nine differently oriented patches extracted from the candidate were used in separate streams and merged in the fully-connected layers to obtain the final classification output. In contrast, Nie et al. (2016c) exploited the 3D nature of MRI by training a 3D CNN to assess survival in patients suffering from high-grade gliomas.

Almost all recent papers prefer the use of end-to-end trained CNNs. In some cases other architectures and approaches are used, such as RBMs (van Tulder and de Bruijne, 2016; Zhang et al., 2016c), SAEs (Cheng et al., 2016a) and convolutional sparse auto-encoders (CSAE) (Kallenberg et al., 2016). The major difference between a CSAE and a classic CNN is the usage of unsupervised pre-training with sparse auto-encoders.
An interesting approach, especially in cases where object annotation to generate training data is expensive, is the integration of multiple instance learning (MIL) and deep learning. Xu et al. (2014) investigated the use of a MIL-framework with both supervised and unsupervised feature learning approaches as well as handcrafted features. The results demonstrated that the performance of the MIL-framework was superior to handcrafted features, which in turn closely approaches the performance of a fully supervised method. We expect such approaches to be popular in the future as well, as obtaining high-quality annotated medical data is challenging.

Overall, object classification sees less use of pre-trained networks compared to exam classification, mostly due to the need for incorporation of contextual or three-dimensional information. Several authors have found innovative solutions to add this information to deep networks with good results, and as such we expect deep learning to become even more prominent for this task in the near future.

3.2. Detection

3.2.1. Organ, region and landmark localization

Anatomical object localization (in space or time), such as of organs or landmarks, has been an important pre-processing step in segmentation tasks or in the clinical workflow for therapy planning and intervention. Localization in medical imaging often requires parsing of 3D volumes. To solve 3D data parsing with deep learning algorithms, several approaches have been proposed that treat the 3D space as a composition of 2D orthogonal planes. Yang et al. (2015) identified landmarks on the distal femur surface by processing three independent sets of 2D MRI slices (one for each plane) with regular CNNs. The 3D position of the landmark was defined as the intersection of the three 2D slices with the highest classification output. de Vos et al. (2016b) went one step further and localized regions of interest (ROIs) around anatomical regions (heart, aortic arch, and descending aorta) by identifying a rectangular 3D bounding box after 2D parsing of the 3D CT volume. Pre-trained CNN architectures, as well as RBMs, have been used for the same purpose (Cai et al., 2016b; Chen et al., 2015b; Kumar et al., 2016), overcoming the lack of data to learn better feature representations. All these studies cast the localization task as a classification task, and as such generic deep learning architectures and learning processes can be leveraged.

Other authors try to modify the network learning process to directly predict locations. For example, Payer et al. (2016) proposed to directly regress landmark locations with CNNs. They used landmark maps, where each landmark is represented by a Gaussian, as ground truth input data, and the network is directly trained to predict this landmark map. Another interesting approach was published by Ghesu et al. (2016a), in which reinforcement learning is applied to the identification of landmarks. The authors showed promising results in several tasks: 2D cardiac MRI and ultrasound (US) and 3D head/neck CT.

Due to its increased complexity, only a few methods addressed the direct localization of landmarks and regions in the 3D image space. Zheng et al. (2015) reduced this complexity by decomposing 3D convolution as three one-dimensional convolutions for carotid artery bifurcation detection in CT data. Ghesu et al. (2016b) proposed a sparse adaptive deep neural network powered by marginal space learning in order to deal with data complexity in the detection of the aortic valve in 3D transesophageal echocardiograms.

CNNs have also been used for the localization of scan planes or key frames in temporal data. Baumgartner et al. (2016) trained CNNs on video frame data to detect up to 12 standardized scan planes in mid-pregnancy fetal US. Furthermore, they used saliency maps to obtain a rough localization of the object of interest in the scan plane (e.g. brain, spine). RNNs, particularly LSTM-RNNs, have also been used to exploit the temporal information contained in medical videos, another type of high dimensional data. Chen et al. (2015a), for example, employed LSTM models to incorporate temporal information of consecutive sequences in US videos for fetal standard plane detection. Kong et al. (2016) combined an LSTM-RNN with a CNN to detect the end-diastole and end-systole frames in cine-MRI of the heart.

Concluding, localization through 2D image classification with CNNs seems to be the most popular strategy overall to identify organs, regions and landmarks, with good results. However, several recent papers expand on this concept by modifying the learning process such that accurate localization is directly emphasized, with promising results. We expect such strategies to be explored further as they show that deep learning techniques can be adapted to a wide range of localization tasks (e.g. multiple landmarks). RNNs have shown promise in localization in the temporal domain, and multi-dimensional RNNs could play a role in spatial localization as well.
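As a side note on the landmark-map regression of Payer et al. (2016) mentioned above, the Gaussian ground-truth map is straightforward to construct; a hedged NumPy sketch, where the map size and sigma are illustrative choices of ours:

```python
import numpy as np

def landmark_map(shape, center, sigma=5.0):
    """Ground-truth map with a Gaussian blob at one landmark position,
    as used as a regression target for a CNN (illustrative sigma)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

target = landmark_map((128, 128), center=(40, 90))
# A network is then trained to regress `target` from the image,
# e.g. with an L2 loss, and the landmark is read off as the argmax.
```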
3.2.2. Object or lesion detection

The detection of objects of interest or lesions in images is a key part of diagnosis and is one of the most labor-intensive tasks for clinicians. Typically, the tasks consist of the localization and identification of small lesions in the full image space. There has been a long research tradition in computer-aided detection systems that are designed to automatically detect lesions, improving the detection accuracy or decreasing the reading time of human experts. Interestingly, the first object detection system using CNNs was already proposed in 1995, using a CNN with four layers to detect nodules in x-ray images (Lo et al., 1995).

Most of the published deep learning object detection systems still use CNNs to perform pixel (or voxel) classification, after which some form of post-processing is applied to obtain object candidates. As the classification task performed at each pixel is essentially object classification, the CNN architectures and methodology are very similar to those in Section 3.1.2. The incorporation of contextual or 3D information is also handled using multi-stream CNNs (Section 2.4.2), for example by Barbu et al. (2016) and Roth et al. (2016b). Teramoto et al. (2016) used a multi-stream CNN to integrate CT and Positron Emission Tomography (PET) data. Dou et al. (2016c) used a 3D CNN to find micro-bleeds in brain MRI. Last, as the annotation burden to generate training data can be similarly significant compared to object classification, weakly-supervised deep learning has been explored by Hwang and Kim (2016), who adopted such a strategy for the detection of nodules in chest radiographs and lesions in mammography.

There are some aspects which are significantly different between object detection and object classification. One key point is that, because every pixel is classified, the class balance is typically skewed severely towards the non-object class in a training setting. To make matters worse, usually the majority of the non-object samples are easy to discriminate, preventing the deep learning method from focusing on the challenging samples. van Grinsven et al. (2016) proposed selective data sampling, in which wrongly classified samples were fed back to the network more often to focus on challenging areas in retinal images. Last, as classifying each pixel in a sliding window fashion results in orders of magnitude of redundant calculation, fCNNs, as used in Wolterink et al. (2016), are an important aspect of an object detection pipeline as well.

Challenges in the meaningful application of deep learning algorithms in object detection are thus mostly similar to those in object classification. Only few papers directly address issues specific to object detection, like class imbalance/hard-negative mining or efficient pixel/voxel-wise processing of images. We expect that more emphasis will be given to those areas in the near future, for example in the application of multi-stream networks in a fully convolutional fashion.

3.3. Segmentation

3.3.1. Organ and substructure segmentation

The segmentation of organs and other substructures in medical images allows quantitative analysis of clinical parameters related to volume and shape, as, for example, in cardiac or brain analysis. Furthermore, it is often an important first step in computer-aided detection pipelines. The task of segmentation is typically defined as identifying the set of voxels which make up either the contour or the interior of the object(s) of interest. Segmentation is the most common subject of papers applying deep learning to medical imaging (Figure 1), and as such has also seen the widest variety in methodology, including the development of unique CNN-based segmentation architectures and the wider application of RNNs.

The most well-known of these novel CNN architectures in medical image analysis is U-net, published by Ronneberger et al. (2015) (Section 2.4.3). The two main architectural novelties in U-net are the combination of an equal amount of upsampling and downsampling layers and, although learned upsampling layers have been proposed before, the so-called skip connections between opposing convolution and deconvolution layers, which concatenate features from the contracting and expanding paths. From a training perspective this means that entire images/scans can be processed by U-net in one forward pass, resulting in a segmentation map directly. This allows U-net to take into account the full context of the image, which can be an advantage in contrast to patch-based CNNs. Furthermore, in an extended paper by Çiçek et al. (2016), it is shown that a full 3D segmentation can be achieved by feeding U-net with a few 2D annotated slices from the same volume. Other authors have also built derivatives of the U-net architecture; Milletari et al. (2016b), for example, proposed a 3D-variant of the U-net architecture, called V-net, performing 3D image segmentation using 3D convolutional layers with an objective function directly based on the Dice coefficient. Drozdzal et al. (2016) investigated the use of short ResNet-like skip connections in addition to the long skip-connections in a regular U-net.
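A deliberately tiny sketch of the U-net idea, with a single downsampling stage as in panel (f) of Figure 2, a learned 'up'-convolution, and one skip connection, might look as follows in PyTorch (channel counts are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """U-net with one contracting step, one expanding step, and a skip
    connection concatenating features from the two paths."""
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.bottom = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # learned 'up'-convolution
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, num_classes, 1))

    def forward(self, x):
        e = self.enc(x)                              # contracting path
        b = self.bottom(self.down(e))
        u = self.up(b)                               # expanding path
        return self.dec(torch.cat([u, e], dim=1))    # skip connection

seg_map = TinyUNet()(torch.randn(1, 1, 64, 64))      # (1, num_classes, 64, 64)
```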
RNNs have recently become more popular for segmentation tasks. For example, Xie et al. (2016b) used a spatial clockwork RNN to segment the perimysium in H&E-histopathology images. This network takes into account prior information from both the row and column predecessors of the current patch. To incorporate bidirectional information from both left/top and right/bottom neighbors, the RNN is applied four times in different orientations, and the end result is concatenated and fed to a fully-connected layer. This produces the final output for a single patch. Stollenga et al. (2015) were the first to use a 3D LSTM-RNN with convolutional layers in six directions. Andermatt et al. (2016) used a 3D RNN with gated recurrent units to segment gray and white matter in a brain MRI data set. Chen et al. (2016d) combined bi-directional LSTM-RNNs with 2D U-net-like architectures to segment structures in anisotropic 3D electron microscopy images. Last, Poudel et al. (2016) combined a 2D U-net architecture with a gated recurrent unit to perform 3D segmentation.

Although these specific segmentation architectures offered compelling advantages, many authors have also obtained excellent segmentation results with patch-trained neural networks. One of the earliest papers covering medical image segmentation with deep learning algorithms used such a strategy and was published by Ciresan et al. (2012). They applied pixel-wise segmentation of membranes in electron microscopy imagery in a sliding window fashion. Most recent papers now use fCNNs (Section 2.4.3) in preference over sliding-window-based classification to reduce redundant computation.

fCNNs have also been extended to 3D and have been applied to multiple targets at once: Korez et al. (2016) used 3D fCNNs to generate vertebral body likelihood maps which drove deformable models for vertebral body segmentation in MR images, Zhou et al. (2016) segmented nineteen targets in the human torso, and Moeskops et al. (2016b) trained a single fCNN to segment brain MRI, the pectoral muscle in breast MRI, and the coronary arteries in cardiac CT angiography (CTA).

One challenge with voxel classification approaches is that they sometimes lead to spurious responses. To combat this, groups have tried to combine fCNNs with graphical models like MRFs (Shakeri et al., 2016; Song et al., 2015) and Conditional Random Fields (CRFs) (Alansary et al., 2016; Cai et al., 2016a; Christ et al., 2016; Dou et al., 2016c; Fu et al., 2016a; Gao et al., 2016c) to refine the segmentation output. In most of the cases, graphical models are applied on top of the likelihood map produced by CNNs or fCNNs and act as label regularizers.

Summarizing, segmentation in medical imaging has seen a huge influx of deep learning related methods. Custom architectures have been created to directly target the segmentation task. These have obtained promising results, rivaling and often improving over results obtained with fCNNs.

3.3.2. Lesion segmentation

Segmentation of lesions combines the challenges of object detection and organ and substructure segmentation in the application of deep learning algorithms. Global and local context are typically needed to perform accurate segmentation, such that multi-stream networks with different scales or non-uniformly sampled patches are used, as in for example Kamnitsas et al. (2017) and Ghafoorian et al. (2016b). In lesion segmentation we have also seen the application of U-net and similar architectures to leverage both this global and local context. The architecture used by Wang et al. (2015), similar to the U-net, consists of the same downsampling and upsampling paths, but does not use skip connections. Another U-net-like architecture was used by Brosch et al. (2016) to segment white matter lesions in brain MRI. However, they used 3D convolutions and a single skip connection between the first convolutional and last deconvolutional layers.

One other challenge that lesion segmentation shares with object detection is class imbalance, as most voxels/pixels in an image are from the non-diseased class. Some papers combat this by adapting the loss function: Brosch et al. (2016) defined it to be a weighted combination of the sensitivity and the specificity, with a larger weight for the specificity to make it less sensitive to the data imbalance. Others balance the data set by performing data augmentation on positive samples (Kamnitsas et al., 2017; Litjens et al., 2016; Pereira et al., 2016).

Thus lesion segmentation sees a mixture of approaches used in object detection and organ segmentation. Developments in these two areas will most likely naturally propagate to lesion segmentation as the existing challenges are also mostly similar.
3.4. Registration

Registration (i.e. spatial alignment) of medical images is a common image analysis task in which a coordinate transform is calculated from one medical image to another. Often this is performed in an iterative framework where a specific type of (non-)parametric transformation is assumed and a pre-determined metric (e.g. the L2-norm) is optimized. Although segmentation and lesion detection are more popular topics for deep learning, researchers have found that deep networks can be beneficial in getting the best possible registration performance. Broadly speaking, two strategies are prevalent in the current literature: (1) using deep-learning networks to estimate a similarity measure for two images to drive an iterative optimization strategy, and (2) directly predicting transformation parameters using deep regression networks.

Wu et al. (2013), Simonovsky et al. (2016), and Cheng et al. (2015) used the first strategy to try to optimize registration algorithms. Cheng et al. (2015) used two types of stacked auto-encoders to assess the local similarity between CT and MRI images of the head. Both auto-encoders take vectorized image patches of CT and MRI and reconstruct them through four layers. After the networks are pre-trained using unsupervised patch reconstruction, they are fine-tuned using two prediction layers stacked on top of the third layer of the SAE. These prediction layers determine whether two patches are similar (class 1) or dissimilar (class 2). Simonovsky et al. (2016) used a similar strategy, albeit with CNNs, to estimate a similarity cost between two patches from differing modalities. However, they also presented a way to use the derivative of this metric to directly optimize the transformation parameters, which are decoupled from the network itself. Last, Wu et al. (2013) combined independent subspace analysis and convolutional layers to extract features from input patches in an unsupervised manner. The resultant feature vectors are used to drive the HAMMER registration algorithm instead of handcrafted features.

Miao et al. (2016) and Yang et al. (2016d) used deep learning algorithms to directly predict the registration transform parameters given input images. Miao et al. (2016) leveraged CNNs to perform 3D model to 2D x-ray registration to assess the pose and location of an implanted object during surgery. In total the transformation has six parameters: two translational, one scaling, and three angular parameters. They parameterize the feature space in steps of 20 degrees for the two angular parameters and train a separate CNN to predict the update to the transformation parameters given a digitally reconstructed x-ray of the 3D model and the actual intra-operative x-ray. The CNNs are trained with artificial examples generated by manually adapting the transformation parameters for the input training data. They showed that their approach has significantly higher registration success rates than traditional, purely intensity-based, registration methods. Yang et al. (2016d) tackled the problem of prior/current registration in brain MRI using the OASIS data set. They used the large deformation diffeomorphic metric mapping (LDDMM) registration methodology as a basis. This method takes as input an initial momentum value for each pixel, which is then evolved over time to obtain the final transformation. However, the calculation of the initial momentum map is often an expensive procedure. The authors circumvent this by training a U-net-like architecture to predict the x- and y-momentum maps given the input images. They obtain visually similar results but with significantly improved execution time: a 1500x speed-up for 2D and a 66x speed-up for 3D.

In contrast to classification and segmentation, the research community seems not to have yet settled on the best way to integrate deep learning techniques in registration methods. Not many papers have yet appeared on the subject and the existing ones each have a distinctly different approach. Thus, giving recommendations on which method is most promising seems inappropriate. However, we expect to see many more contributions of deep learning to medical image registration in the near future.

3.5. Other tasks in medical imaging

3.5.1. Content-based image retrieval

Content-based image retrieval (CBIR) is a technique for knowledge discovery in massive databases and offers the possibility to identify similar case histories, understand rare disorders, and, ultimately, improve patient care. The major challenge in the development of CBIR methods is extracting effective feature representations from the pixel-level information and associating them with meaningful concepts. The ability of deep CNN models to learn rich features at multiple levels of abstraction has elicited interest from the CBIR community.

All current approaches use (pre-trained) CNNs to extract feature descriptors from medical images. Anavi et al. (2016) and Liu et al. (2016b) applied their methods to databases of X-ray images. Both used a five-layer CNN and extracted features from the fully-connected layers. Anavi et al. (2016) used the last layer and a pre-trained network. Their best results were obtained by feeding these features to a one-vs-all support vector machine (SVM) classifier to obtain the distance metric. They showed that incorporating gender information resulted in better performance than just CNN features. Liu et al. (2016b) used the penultimate fully-connected layer and a custom CNN trained to classify X-rays into 193 classes to obtain the descriptive feature vector. After descriptor binarization and data retrieval using Hamming separation values, the performance was inferior to the state of the art, which the authors attributed to the small patch sizes of 96 pixels. The method proposed by Shah et al. (2016) combines CNN feature descriptors with hashing-forests. 1000 features were extracted for overlapping patches in prostate MRI volumes, after which a large feature matrix was constructed over all volumes. Hashing forests were then used to compress this into descriptors for each volume.
Content-based image retrieval as a whole has thus not seen many successful applications of deep learning methods yet, but given the results in other areas it seems only a matter of time. An interesting avenue of research could be the direct training of deep networks for the retrieval task itself.

Figure 3: Collage of some medical imaging applications in which deep learning has achieved state-of-the-art results. From top-left to bottom-right: mammographic mass classification (Kooi et al., 2016), segmentation of lesions in the brain (top ranking in BRATS, ISLES and MRBrains challenges, image from Ghafoorian et al. (2016b)), leak detection in airway tree segmentation (Charbonnier et al., 2017), diabetic retinopathy classification (Kaggle Diabetic Retinopathy challenge 2015, image from van Grinsven et al. (2016)), prostate segmentation (top rank in PROMISE12 challenge), nodule classification (top ranking in LUNA16 challenge), breast cancer metastases detection in lymph nodes (top ranking and human expert performance in CAMELYON16), human expert performance in skin lesion classification (Esteva et al., 2017), and state-of-the-art bone suppression in x-rays (image from Yang et al. (2016c)).

3.5.2. Image Generation and Enhancement

A variety of image generation and enhancement methods using deep architectures have been proposed, ranging from removing obstructing elements in images, normalizing images, and improving image quality to data completion and pattern discovery.

In image generation, 2D or 3D CNNs are used to convert one input image into another. Typically these architectures lack the pooling layers present in classification networks. These systems are then trained with a data set in which both the input and the desired output are present, defining the differences between the generated and desired output as the loss function. Examples are regular and bone-suppressed X-ray in Yang et al. (2016c), 3T and 7T brain MRI in Bahrami et al. (2016), PET from MRI in Li et al. (2014), and CT from MRI in Nie et al. (2016a). Li et al. (2014) even showed that one can use these generated images in computer-aided diagnosis systems for Alzheimer's disease when the original data is missing or not acquired.

With multi-stream CNNs, super-resolution images can be generated from multiple low-resolution inputs (Section 2.4.2). In Oktay et al. (2016), multi-stream networks reconstructed high-resolution cardiac MRI from one or more low-resolution input MRI volumes. Not only can this strategy be used to infer missing spatial information, it can also be leveraged in other domains, for example to infer advanced MRI diffusion parameters from limited data (Golkov et al., 2016). Other image enhancement applications, like intensity normalization and denoising, have seen only limited application of deep learning algorithms. Janowczyk et al. (2016a) used SAEs to normalize H&E-stained histopathology images, whereas Benou et al. (2016) used CNNs to perform denoising in DCE-MRI time-series.

Image generation has seen impressive results with very creative applications of deep networks in significantly differing tasks. One can only expect the number of tasks to increase further in the future.
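Such an image-to-image network can be as simple as a stack of convolutions without pooling, trained with a pixel-wise loss between the generated and desired images; a minimal sketch under these assumptions (depth, channel counts, and the L2 loss are ours):

```python
import torch.nn as nn

# A pooling-free image-to-image CNN: the spatial size is preserved from
# input to output, so the network can map one image onto another.
generator = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
# Training step (x: input image, y: desired output image):
# loss = nn.functional.mse_loss(generator(x), y)
```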
3.5.3. Combining Image Data With Reports

The combination of text reports and medical image data has led to two avenues of research: (1) leveraging reports to improve image classification accuracy (Schlegl et al., 2015), and (2) generating text reports from images (Kisilev et al., 2016; Shin et al., 2015, 2016a; Wang et al., 2016e); the latter inspired by recent caption generation papers from natural images (Karpathy and Fei-Fei, 2015). To the best of our knowledge, the first step towards leveraging reports was taken by Schlegl et al. (2015), who argued that large amounts of annotated data may be difficult to acquire and proposed to add semantic descriptions from reports as labels. The system was trained on sets of images along with their textual descriptions and was taught to predict semantic class labels during test time. They showed that semantic information increases classification accuracy for a variety of pathologies in Optical Coherence Tomography (OCT) images.

Shin et al. (2015) and Wang et al. (2016e) mined semantic interactions between radiology reports and images from a large data set extracted from a PACS system. They employed latent Dirichlet allocation (LDA), a type of stochastic model that generates a distribution over a vocabulary of topics based on words in a document. In a later work, Shin et al. (2016a) proposed a system to generate descriptions from chest X-rays. A CNN was employed to generate a representation of an image one label at a time, which was then used to train an RNN to generate a sequence of MeSH keywords. Kisilev et al. (2016) used a completely different approach and predicted categorical BI-RADS descriptors for breast lesions.
Table 2: Overview of papers using deep learning techniques for retinal image analysis. All works use CNNs.
where each have their own class label. The system was However, the local patches might lack the contextual
fed with the image data and region proposals and pre- information required for tasks where anatomical infor-
dicts the correct label for each descriptor (e.g. for shape mation is paramount (e.g. white matter lesion segmen-
either oval, round, or irregular). tation). To tackle this, Ghafoorian et al. (2016b) used
Given the wealth of data that is available in PACS non-uniformly sampled patches by gradually lowering
systems in terms of images and corresponding diag- sampling rate in patch sides to span a larger context.
nostic reports, it seems like an ideal avenue for future An alternative strategy used by many groups is multi-
deep learning research. One could expect that advances scale analysis and a fusion of representations in a fully-
in captioning natural images will in time be applied to connected layer.
these data sets as well. Even though brain images are 3D volumes in all sur-
4. Anatomical application areas

This section presents an overview of deep learning contributions to the various application areas in medical imaging. We highlight some key contributions and discuss the performance of systems on large data sets and on public challenge data sets. All these challenges are listed on http://www.grand-challenge.org.

4.1. Brain
Table 1: Overview of papers using deep learning techniques for brain image analysis. All works use MRI unless otherwise mentioned.
DNNs have been extensively used for brain image analysis in several different application domains (Table 1). A large number of studies address classification of Alzheimer's disease and segmentation of brain tissue and anatomical structures (e.g. the hippocampus). Other important areas are detection and segmentation of lesions (e.g. tumors, white matter lesions, lacunes, micro-bleeds).
Apart from the methods that aim for a scan-level classification (e.g. Alzheimer diagnosis), most methods learn mappings from local patches to representations and subsequently from representations to labels. However, the local patches might lack the contextual information required for tasks where anatomical information is paramount (e.g. white matter lesion segmentation). To tackle this, Ghafoorian et al. (2016b) used non-uniformly sampled patches, gradually lowering the sampling rate towards the sides of the patch to span a larger context. An alternative strategy used by many groups is multi-scale analysis and a fusion of representations in a fully-connected layer.
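The sketch below gives one hedged reading of such a multi-scale design: two branches process a small patch and a patch downsampled from a larger region, and their representations are fused in a fully-connected layer. Branch depth, feature counts, and patch sizes are arbitrary choices for illustration, not those of any specific surveyed method.

```python
import torch
import torch.nn as nn

class MultiScaleNet(nn.Module):
    """Two patch scales, each with its own convolutional branch; the
    resulting representations are fused in a fully-connected layer."""
    def __init__(self, n_classes=2):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.fine, self.coarse = branch(), branch()
        self.fc = nn.Linear(2 * 16 * 4 * 4, n_classes)  # fusion layer

    def forward(self, patch_fine, patch_coarse):
        fused = torch.cat([self.fine(patch_fine),
                           self.coarse(patch_coarse)], dim=1)
        return self.fc(fused)

net = MultiScaleNet()
# a 32x32 patch at native resolution plus a 32x32 patch downsampled from a
# larger region, so the second branch sees more anatomical context
logits = net(torch.randn(8, 1, 32, 32), torch.randn(8, 1, 32, 32))
```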
Even though brain images are 3D volumes in all surveyed studies, most methods work in 2D, analyzing the 3D volumes slice-by-slice. This is often motivated by either the reduced computational requirements or the thick slices relative to in-plane resolution in some data sets. More recent publications have also employed 3D networks.
DNNs have completely taken over many brain image analysis challenges. In the 2014 and 2015 brain tumor segmentation challenges (BRATS), the 2015 longitudinal multiple sclerosis lesion segmentation challenge, the 2015 ischemic stroke lesion segmentation challenge (ISLES), and the 2013 MR brain image segmentation challenge (MRBrains), the top-ranking teams to date have all used CNNs. Almost all of the aforementioned methods concentrate on brain MR images. We expect that other brain imaging modalities, such as CT and US, can also benefit from deep learning based analysis.

4.2. Eye
Table 2: Overview of papers using deep learning techniques for retinal image analysis. All works use CNNs.
Ophthalmic imaging has developed rapidly over the past years, but only recently are deep learning algorithms being applied to eye image understanding.
As summarized in Table 2, most works employ simple CNNs for the analysis of color fundus imaging (CFI). A wide variety of applications are addressed: segmentation of anatomical structures, segmentation and detection of retinal abnormalities, diagnosis of eye diseases, and image quality assessment.
In 2015, Kaggle organized a diabetic retinopathy detection competition: over 35,000 color fundus images were provided to train algorithms to predict the severity of disease in 53,000 test images. The majority of the 661 teams that entered the competition applied deep learning, and four teams achieved performance above that of humans, all using end-to-end CNNs. Recently, Gulshan et al. (2016) performed a thorough analysis of the performance of a Google Inception v3 network for diabetic retinopathy detection, showing performance comparable to a panel of seven certified ophthalmologists.

4.3. Chest
Table 3: Overview of papers using deep learning techniques for chest X-ray image analysis.
Table 4: Overview of papers using deep learning techniques for chest CT image analysis.
In thoracic image analysis of both radiography and computed tomography, the detection, characterization, and classification of nodules is the most commonly addressed application. Many works add features derived from deep networks to existing feature sets or compare CNNs with classical machine learning approaches using handcrafted features. In chest X-ray, several groups detect multiple diseases with a single system. In CT, the detection of textural patterns indicative of interstitial lung diseases is also a popular research topic.
Chest radiography is the most common radiological exam; several works use a large set of images with text reports to train systems that combine CNNs for image analysis and RNNs for text analysis. This is a branch of research we expect to see more of in the near future.
In a recent challenge for nodule detection in CT, LUNA16, CNN architectures were used by all top-performing systems. This is in contrast with a previous lung nodule detection challenge, ANODE09, where handcrafted features were used to classify nodule candidates. The best systems in LUNA16 still rely on nodule candidates computed by rule-based image processing, but systems that use deep networks for candidate detection also performed very well (e.g. U-net). Estimating the probability that an individual has lung cancer from a CT scan is an important topic: it is the objective of the Kaggle Data Science Bowl 2017, with $1 million in prizes and more than one thousand participating teams.
4.4. Digital pathology and microscopy
Table 5: Overview of papers using deep learning for digital pathology images. The staining and imaging modality abbreviations used in the table are as follows: H&E: hematoxylin and eosin staining, TIL: tumor-infiltrating lymphocytes, BCC: basal cell carcinoma, IHC: immunohistochemistry, RM: Romanowsky, EM: electron microscopy, PC: phase contrast, FL: fluorescent, IFL: immunofluorescent, TPM: two-photon microscopy, CM: confocal microscopy, Pap: Papanicolaou.
The growing availability of large-scale gigapixel whole-slide images (WSI) of tissue specimens has made digital pathology and microscopy a very popular application area for deep learning techniques. The developed techniques applied to this domain focus on three broad challenges: (1) detecting, segmenting, or classifying nuclei, (2) segmentation of large organs, and (3) detecting and classifying the disease of interest at the lesion- or WSI-level. Table 5 presents an overview for each of these categories.
Deep learning techniques have also been applied for normalization of histopathology images. Color normalization is an important research area in histopathology image analysis. In Janowczyk et al. (2016a), a method for stain normalization of hematoxylin and eosin (H&E) stained histopathology images was presented based on deep sparse auto-encoders. Recently, the importance of color normalization was demonstrated by Sethi et al. (2016) for CNN-based tissue classification in H&E stained images.
The introduction of grand challenges in digital pathology has fostered the development of computerized digital pathology techniques. The challenges that evaluated existing and new approaches for analysis of digital pathology images are: the EM segmentation challenge 2012 for the 2D segmentation of neuronal processes, the mitosis detection challenges in ICPR 2012 and AMIDA 2013, GLAS for gland segmentation, and CAMELYON16 and TUPAC for processing breast cancer tissue samples.
In both the ICPR 2012 and the AMIDA13 challenges on mitosis detection, the IDSIA team outperformed other algorithms with a CNN-based approach (Cireşan et al., 2013). The same team had the highest-performing system in EM 2012 (Ciresan et al., 2012) for 2D segmentation of neuronal processes. In their approach, the task of segmenting membranes of neurons was performed by mild smoothing and thresholding of the output of a CNN, which computes pixel probabilities.
GLAS addressed the problem of gland instance segmentation in colorectal cancer tissue samples. Xu et al. (2016d) achieved the highest rank using three CNN models. The first CNN classifies pixels as gland versus non-gland. From each feature map of the first CNN, edge information is extracted using the holistically-nested edge technique, which uses side convolutions to produce an edge map. Finally, a third CNN merges gland and edge maps to produce the final segmentation.
CAMELYON16 was the first challenge to provide participants with WSIs. Contrary to other medical imaging applications, the availability of a large amount of annotated data in this challenge allowed for training very deep models such as the 22-layer GoogLeNet (Szegedy et al., 2014), 16-layer VGG-Net (Simonyan and Zisserman, 2014), and 101-layer ResNet (He et al., 2015). The top-five performing systems used one of these architectures. The best performing solution in the CAMELYON16 challenge was presented in Wang et al. (2016b). This method is based on an ensemble of two GoogLeNet architectures, one trained with and one without hard-negative mining to tackle the challenge. The latest submission of this team, using the WSI standardization algorithm by Ehteshami Bejnordi et al. (2016), achieved an AUC of 0.9935 for task 2, which outperformed the AUC of a pathologist (AUC = 0.966) who independently scored the complete test set.
The recently held TUPAC challenge addressed detection of mitosis in breast cancer tissue and prediction of tumor grading at the WSI level. The top-performing system by Paeng et al. (2016) achieved the highest performance in all tasks. The method has three main components: (1) finding high cell density regions, (2) using a CNN to detect mitoses in the regions of interest, (3) converting the results of mitosis detection to a feature vector for each WSI and using an SVM classifier to compute the tumor proliferation and molecular data scores.

Table 6: Overview of papers using deep learning techniques for breast image analysis. MG = mammography; TS = tomosynthesis; US = ultrasound; ADN = Adaptive Deconvolution Network.
Table 7: Overview of papers using deep learning techniques for cardiac image analysis.

4.8. Musculoskeletal
Table 9: Overview of papers using deep learning for musculoskeletal image analysis.
A surprising number of complete applications with promising results are available; one that stands out is Jamaludin et al. (2016), who trained their system with 12K discs and claimed near-human performance across four different radiological scoring tasks.

4.9. Other
This final section lists papers that address multiple applications (Table 10) and a variety of other applications (Table 11).
Table 10: Overview of papers using a single deep learning approach for different tasks. DQN = Deep Q-Network.
Table 11: Overview of papers using deep learning for various image analysis tasks.
It is remarkable that a single architecture or approach based on deep learning can be applied without modifications to different tasks; this illustrates the versatility of deep learning and its general applicability. In some works, pre-trained architectures are used, sometimes trained with images from a completely different domain. Several authors analyze the effect of fine-tuning a network by training it with a small data set of images from the intended application domain. Combining features extracted by a CNN with 'traditional' features is also commonly seen.
From Table 11, the large number of papers that address obstetric applications stands out. Most papers address the groundwork, such as selecting an appropriate frame from an US stream. More work on automated measurements with deep learning in these US sequences is likely to follow.
The second area where CNNs are rapidly improving the state of the art is dermoscopic image analysis. For a long time, diagnosing skin cancer from photographs was considered very difficult and out of reach for computers. Many studies focused only on images obtained with specialized cameras, and recent systems based on deep networks produced promising results. A recent work by Esteva et al. (2017) demonstrated excellent results by training a recent standard architecture (Google's Inception v3) on a data set of both dermoscopic and standard photographic images. This data set was two orders of magnitude larger than what was used in the literature before. In a thorough evaluation, the proposed system performed on par with 30 board-certified dermatologists.

5. Discussion

Overview
From the 308 papers reviewed in this survey, it is evident that deep learning has pervaded every aspect of medical image analysis. This has happened extremely quickly: the vast majority of contributions, 242 papers, were published in 2016 or the first month of 2017. A large diversity of deep architectures is covered. The earliest studies used pre-trained CNNs as feature extractors. The fact that these pre-trained networks could simply be downloaded and directly applied to any medical image facilitated their use. Moreover, in this approach already existing systems based on handcrafted features could simply be extended. In the last two years, however, we have seen that end-to-end trained CNNs have become the preferred approach for medical imaging interpretation (see Figure 1). Such CNNs are often integrated into existing image analysis pipelines and replace traditional handcrafted machine learning methods. This is the approach followed by the largest group of papers in this survey, and we can confidently state that this is the current standard practice.

Key aspects of successful deep learning methods
After reviewing so many papers, one would expect to be able to distill the perfect deep learning method and architecture for each individual task and application area. Although convolutional neural networks (and derivatives) are now clearly the top performers in most medical image analysis competitions, one striking conclusion we can draw is that the exact architecture is not the most important determinant in getting a good solution. We have seen, for example in challenges like the Kaggle Diabetic Retinopathy Challenge, that many researchers use the exact same architectures, the same type of networks, but have widely varying results. A key aspect that is often overlooked is that expert knowledge about the task to be solved can provide advantages that go beyond adding more layers to a CNN. Groups and researchers that obtain good performance when applying deep learning algorithms often differentiate themselves in aspects outside of the deep network, like novel data preprocessing or augmentation techniques. An example is that the best performing method in the CAMELYON16 challenge improved significantly (AUC from 0.92 to 0.99) by adding a stain normalization pre-processing step to improve generalization, without changing the CNN. Other papers focus on data augmentation strategies to make networks more robust, and they report that these strategies are essential to obtain good performance. An example is the elastic deformations that were applied in the original U-Net paper (Ronneberger et al., 2015).
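As a concrete impression of such an augmentation, the sketch below applies a random elastic deformation to a 2D image in the spirit of the U-Net paper; the smoothing scale and displacement magnitude are arbitrary illustrative values, not the settings used by Ronneberger et al.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, alpha=30.0, sigma=4.0, seed=0):
    """Smooth a random displacement field and resample the image along it."""
    rng = np.random.RandomState(seed)
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    y, x = np.meshgrid(np.arange(image.shape[0]),
                       np.arange(image.shape[1]), indexing="ij")
    return map_coordinates(image, [y + dy, x + dx], order=1, mode="reflect")

augmented = elastic_deform(np.random.rand(128, 128))
```

The same displacement field would be applied to the corresponding segmentation mask so that image and label stay aligned.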
Augmentation and pre-processing are, of course, not the only key contributors to good solutions. Several researchers have shown that designing architectures incorporating unique, task-specific properties can obtain better results than straightforward CNNs. Two examples which we encountered several times are multi-view and multi-scale networks.
Other, often underestimated, parts of network design are the network input size and receptive field (i.e. the area in input space that contributes to a single output unit). Input sizes should be selected considering, for example, the required resolution and context to solve a problem. One might increase the size of the patch to obtain more context, but without changing the receptive field of the network this might not be beneficial. As a standard sanity check, researchers could perform the same task themselves via visual assessment of the network input. If they, or domain experts, cannot achieve good performance, the chance is high that the network input or architecture needs to be modified.
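The receptive field of a stack of convolution and pooling layers follows directly from the kernel sizes and strides; the helper below computes it under the usual simplifying assumptions (square kernels, no dilation).

```python
# Receptive field of a stack of conv/pool layers, using the standard
# recurrences r_out = r_in + (k - 1) * j_in and j_out = j_in * s, where
# k is the kernel size, s the stride, and j the cumulative stride ("jump").
def receptive_field(layers):
    r, j = 1, 1
    for kernel, stride in layers:
        r += (kernel - 1) * j
        j *= stride
    return r

# e.g. two 3x3 convolutions, a 2x2 pooling layer, then another 3x3 convolution
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1)]))  # -> 10
```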
The last aspect we want to touch on is model hyper-parameter optimization (e.g. learning rate, dropout rate), which can help squeeze out extra performance from a network. We believe this is of secondary importance for performance compared with the previously discussed topics and training data quality. Disappointingly, no clear recipe can be given to obtain the best set of hyper-parameters, as it is a highly empirical exercise. Most researchers fall back on an intuition-based random search (Bergstra and Bengio, 2012), which often seems to work well enough. Some basic tips have been covered before by Bengio (2012). Researchers have also looked at Bayesian methods for hyper-parameter optimization (Snoek et al., 2012), but as far as we are aware this has not been applied in medical image analysis.
segmentation system for 3D segmentation using only
Unique challenges in medical image analysis sparse 2D segmentations (Çiçek et al., 2016). Multiple-
It is clear that applying deep learning algorithms to instance or active learning approaches might also of-
medical image analysis presents several unique chal- fer benefit in some cases, and have recently been pur-
lenges. The lack of large training data sets is often men- sued in the context of deep learning (Yan et al., 2016).
tioned as an obstacle. However, this notion is only par- One can also consider leveraging non-expert labels via
tially correct. The use of PACS systems in radiology crowd-sourcing (Rajchl et al., 2016b). Other poten-
has been routine in most western hospitals for at least tial solutions can be found within the medical field it-
a decade and these are filled with millions of images. self; in histopathology one can sometimes use specific
There are few other domains where this magnitude of immunohistochemical stains to highlight regions of in-
imaging data, acquired for specific purposes, are dig- terest, reducing the need for expert experience (Turkki
itally available in well-structured archives. PACS-like et al., 2016).
systems are not as broadly used for other specialties in Even when data is annotated by domain expert, label
medicine, like ophthalmology and pathology, but this noise can be a significant limiting factor in developing
is changing as imaging becomes more prevalent across algorithms, whereas in computer vision the noise in the
disciplines. We are also seeing that increasingly large labeling of images is typically relatively low. To give
public data sets are made available: Esteva et al. (2017) an example, a widely used dataset for evaluating im-
used 18 public data sets and more than 105 training im- age analysis algorithms to detect nodules in lung CT is
ages; in the Kaggle diabetic retinopathy competition a the LIDC-IDRI dataset (Armato et al., 2011). In this
similar number of retinal images were released; and sev- dataset pulmonary nodules were annotated by four ra-
eral chest x-ray studies used more than 104 images. diologists independently. Subsequently the readers re-
The main challenge is thus not the availability of image data itself, but the acquisition of relevant annotations/labeling for these images. Traditionally, PACS systems store free-text reports by radiologists describing their findings. Turning these reports into accurate annotations or structured labels in an automated manner requires sophisticated text-mining methods, which is an important field of study in itself, where deep learning is also widely used nowadays. With the introduction of structured reporting into several areas of medicine, extracting labels from data is expected to become easier in the future. For example, there are already papers appearing which directly leverage BI-RADS categorizations by radiologists to train deep networks (Kisilev et al., 2016) or semantic descriptions in analyzing optical coherence tomography images (Schlegl et al., 2015). We expect the amount of research in optimally leveraging free-text and structured reports for network training to increase in the near future.
Given the complexity of leveraging free-text reports from PACS or similar systems to train algorithms, researchers generally request domain experts (e.g. radiologists, pathologists) to make task-specific annotations for the image data. Labeling a sufficiently large dataset can take a significant amount of time, and this is problematic. For example, to train deep learning systems for segmentation in radiology, often 3D, slice-by-slice annotations need to be made, and this is very time consuming. Thus, learning efficiently from limited data is an important area of research in medical image analysis. A recent paper focused on training a deep learning segmentation system for 3D segmentation using only sparse 2D segmentations (Çiçek et al., 2016). Multiple-instance or active learning approaches might also offer benefit in some cases, and have recently been pursued in the context of deep learning (Yan et al., 2016). One can also consider leveraging non-expert labels via crowd-sourcing (Rajchl et al., 2016b). Other potential solutions can be found within the medical field itself; in histopathology one can sometimes use specific immunohistochemical stains to highlight regions of interest, reducing the need for expert experience (Turkki et al., 2016).
Even when data is annotated by domain experts, label noise can be a significant limiting factor in developing algorithms, whereas in computer vision the noise in the labeling of images is typically relatively low. To give an example, a widely used dataset for evaluating image analysis algorithms to detect nodules in lung CT is the LIDC-IDRI dataset (Armato et al., 2011). In this dataset, pulmonary nodules were annotated by four radiologists independently. Subsequently, the readers reviewed each other's annotations, but no consensus was forced. It turned out that the number of nodules they did not unanimously agree on to be a nodule was three times larger than the number they fully agreed on. Training a deep learning system on such data requires careful consideration of how to deal with noise and uncertainty in the reference standard. One could think of solutions like incorporating labeling uncertainty directly in the loss function, but this is still an open challenge.
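One possible, and deliberately simple, form such a loss could take is sketched below: per-sample confidence weights (e.g. the fraction of readers that agreed on a label) scale a standard cross-entropy term. This is only one hedged interpretation of the open challenge, not an established solution.

```python
import torch
import torch.nn.functional as F

# Down-weight samples on which readers disagreed. 'confidence' could, for
# instance, be the fraction of radiologists that agreed on the label
# (a hypothetical choice for illustration).
def uncertainty_weighted_bce(logits, labels, confidence):
    loss = F.binary_cross_entropy_with_logits(logits, labels,
                                              reduction="none")
    return (confidence * loss).mean()

logits = torch.randn(8)
labels = torch.randint(0, 2, (8,)).float()
confidence = torch.tensor([1.0, 0.25, 0.5, 1.0, 0.75, 1.0, 0.5, 0.25])
print(uncertainty_weighted_bce(logits, labels, confidence))
```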
In medical imaging, classification or segmentation is often presented as a binary task: normal versus abnormal, object versus background. However, this is often a gross simplification, as both classes can be highly heterogeneous. For example, the normal category often consists of completely normal tissue but also several categories of benign findings, which can be rare, and may occasionally include a wide variety of imaging artifacts. This often leads to systems that are extremely good at excluding the most common normal subclasses, but fail miserably on several rare ones. A straightforward solution would be to turn the deep learning system into a multi-class system by providing it with detailed annotations of all possible subclasses. Obviously this again compounds the issue of limited availability of expert time for annotating and is therefore often simply not feasible. Some researchers have specifically looked into tackling this imbalance by incorporating intelligence in the training process itself, by applying selective sampling (van Grinsven et al., 2016) or hard negative mining (Wang et al., 2016b). However, such strategies typically fail when there is substantial noise in the reference standard. Additional methods for dealing with within-class heterogeneity would be highly welcome.
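The core of hard negative mining can be captured in a few lines: after an initial training round, the abundant negatives are scored and only the most difficult ones are kept for the next round. The fraction kept and the scoring are illustrative assumptions.

```python
import numpy as np

# Keep the negatives the current model finds most "positive-looking".
def mine_hard_negatives(model_scores, negatives, fraction=0.1):
    n_keep = max(1, int(fraction * len(negatives)))
    hardest = np.argsort(model_scores)[::-1][:n_keep]
    return [negatives[i] for i in hardest]

negatives = list(range(1000))       # indices of negative patches
scores = np.random.rand(1000)       # current model output per patch
hard_set = mine_hard_negatives(scores, negatives)
# next round: train on all positives plus 'hard_set' instead of all negatives
```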
Another data-related challenge is class imbalance. In medical imaging, images for the abnormal class might be challenging to find, depending on the task at hand. As an example, the implementation of breast cancer screening programs has resulted in vast databases of mammograms that have been established at many locations world-wide. However, the majority of these images are normal and do not contain any suspicious lesions. When a mammogram does contain a suspicious lesion, it is often not cancerous, and even most cancerous lesions will not lead to the death of a patient. Designing deep learning systems that are adept at handling this class imbalance is another important area of research. A typical strategy we encountered in the current literature is the application of specific data augmentation algorithms to just the underrepresented class, for example scaling and rotation transforms to generate new lesions. Pereira et al. (2016) performed a thorough evaluation of data augmentation strategies for brain lesion segmentation to combat class imbalance.
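A sketch of this strategy is shown below: only the minority (lesion) class is oversampled with random rotations and flips. The transform parameters and the five-fold oversampling are arbitrary illustrative choices (scaling is omitted here only to keep patch sizes fixed).

```python
import numpy as np
from scipy.ndimage import rotate

# Oversample the underrepresented (lesion) class so that each training
# epoch sees a roughly balanced number of examples.
def augment(patch, rng):
    patch = rotate(patch, angle=rng.uniform(-20, 20), reshape=False,
                   mode="nearest")
    if rng.rand() > 0.5:
        patch = np.fliplr(patch)
    return patch

rng = np.random.RandomState(0)
lesion_patches = [np.random.rand(32, 32) for _ in range(10)]  # minority class
extra = [augment(p, rng) for p in lesion_patches for _ in range(5)]
```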
In medical image analysis, useful information is not just contained within the images themselves. Physicians often leverage a wealth of data on patient history, age, demographics, and more to arrive at better decisions. Some authors have already investigated combining this information into deep learning networks in a straightforward manner (Kooi et al., 2017). However, as these authors note, the improvements that were obtained were not as large as expected. One of the challenges is to balance the number of imaging features in the deep learning network (typically thousands) with the number of clinical features (typically only a handful) to prevent the clinical features from being drowned out.
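One hedged way to address that balance is to project the few clinical features into a wider representation before concatenating them with the CNN features, as in the sketch below; the feature dimensions and the single projection layer are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Fuse CNN image features with a handful of clinical features; a
    projection layer keeps the clinical features from being drowned out
    by the much larger imaging representation."""
    def __init__(self, n_img_feat=1024, n_clin=5, n_classes=2):
        super().__init__()
        self.clin_proj = nn.Linear(n_clin, 64)  # lift clinical features
        self.classifier = nn.Linear(n_img_feat + 64, n_classes)

    def forward(self, img_feat, clin):
        fused = torch.cat([img_feat, torch.relu(self.clin_proj(clin))], dim=1)
        return self.classifier(fused)

net = FusionNet()
logits = net(torch.randn(4, 1024), torch.randn(4, 5))  # CNN features + age etc.
```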
Physicians often also need to use anatomical information to come to an accurate diagnosis. However, many deep learning systems in medical imaging are still based on patch classification, where the anatomical location of the patch is often unknown to the network. One solution would be to feed the entire image to the deep network and use a different type of evaluation to drive learning, as was done by, for example, Milletari et al. (2016b), who designed a loss function based on the Dice coefficient.
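A soft (differentiable) Dice loss in the spirit of Milletari et al. can be written compactly, as in the sketch below; the squared terms in the denominator follow their formulation, while shapes and data are placeholders.

```python
import torch

# Differentiable overlap measure computed over the whole image, so the
# evaluation that drives learning matches the segmentation objective.
def dice_loss(pred, target, eps=1e-6):
    # pred: sigmoid outputs in [0, 1]; target: binary mask, same shape
    intersection = (pred * target).sum()
    denom = (pred * pred).sum() + (target * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)

pred = torch.rand(1, 1, 64, 64, requires_grad=True)
target = (torch.rand(1, 1, 64, 64) > 0.5).float()
loss = dice_loss(pred, target)
loss.backward()
```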
This also takes advantage of the fact that medical images are often acquired using a relatively static protocol, where the anatomy is always roughly in the same position and at the same scale. However, as mentioned above, if the receptive field of the network is small, feeding in the entire image offers no benefit. Furthermore, feeding full images to the network is not always feasible due to, for example, memory constraints. In some cases this might be solved in the near future due to advances in GPU technology, but in others, for example digital pathology with its gigapixel-sized images, other strategies have to be invented.

Outlook
Although most of the challenges mentioned above have not been adequately tackled yet, several high-profile successes of deep learning in medical imaging have been reported, such as the work by Esteva et al. (2017) and Gulshan et al. (2016) in the fields of dermatology and ophthalmology. Both papers show that it is possible to outperform medical experts in certain tasks using deep learning for image classification. However, we feel it is important to put these papers into context relative to medical image analysis in general, as most tasks can by no means be considered 'solved'. One aspect to consider is that both Esteva et al. (2017) and Gulshan et al. (2016) focus on small 2D color image classification, which is relatively similar to the tasks that have been tackled in computer vision (e.g. ImageNet). This allows them to take advantage of well-explored network architectures like ResNet and VGG-Net, which have been shown to have excellent results in these tasks. However, there is no guarantee that these architectures are optimal in, for example, regression or detection tasks. It also allowed the authors to use networks that were pre-trained on a very well-labeled dataset of millions of natural images, which helps combat the lack of similarly large, labeled medical datasets. In contrast, in most medical imaging tasks 3D gray-scale or multi-channel images are used, for which pre-trained networks or architectures do not exist. In addition, this data typically has very specific challenges, like anisotropic voxel sizes, small registration errors between varying channels (e.g. in multi-parametric MRI), or varying intensity ranges. Although many tasks in medical image analysis can be postulated as a classification problem, this might not always be the optimal strategy, as it typically requires some form of post-processing with non-deep learning methods (e.g. for counting, segmentation or regression tasks). An interesting example is the paper by Sirinukunwattana et al. (2016), which details a method directly predicting the center locations of nuclei and shows that this outperforms classification-based center localization. Nonetheless, the papers by Esteva et al. (2017) and Gulshan et al. (2016) do show what ideally is possible with deep learning methods that are well-engineered for specific medical image analysis tasks.
Looking at current trends in the machine learning community with respect to deep learning, we identify a key area which can be highly relevant for medical imaging and is receiving (renewed) interest: unsupervised learning. The renaissance of neural networks started around 2006 with the popularization of greedy layer-wise pre-training of neural networks in an unsupervised manner. This was quickly superseded by fully supervised methods, which became the standard after the success of AlexNet during the ImageNet competition of 2012, and most papers in this survey follow a supervised approach. However, interest in unsupervised training strategies has remained and has recently regained traction.
Unsupervised methods are attractive as they allow (initial) network training with the wealth of unlabeled data available in the world. Another reason to assume that unsupervised methods will still have a significant role to play is the analogue to human learning, which seems to be much more data-efficient and also happens to some extent in an unsupervised manner; we can learn to recognize objects and structures without knowing the specific label. We only need very limited supervision to categorize these recognized objects into classes. Two novel unsupervised strategies which we expect to have an impact in medical imaging are variational auto-encoders (VAEs), introduced by Kingma and Welling (2013), and generative adversarial networks (GANs), introduced by Goodfellow et al. (2014). The former merges variational Bayesian graphical models with neural networks as encoders/decoders. The latter uses two competing convolutional neural networks, where one generates artificial data samples and the other discriminates artificial from real samples. Both have stochastic components and are generative networks. Most importantly, they can be trained end-to-end and learn representative features in a completely unsupervised manner. As we discussed in previous paragraphs, obtaining large amounts of unlabeled medical data is generally much easier than labeled data, and unsupervised methods like VAEs and GANs could optimally leverage this wealth of information.
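The adversarial training loop at the heart of a GAN fits in a few lines; the sketch below uses toy fully-connected networks in place of the convolutional generator and discriminator, with made-up dimensions, purely to show the two competing updates.

```python
import torch
import torch.nn as nn

# Minimal GAN training step (Goodfellow et al., 2014): the generator maps
# noise to samples, the discriminator separates real from generated data.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))
D = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(8, 32)           # a batch of real samples
fake = G(torch.randn(8, 16))

# discriminator step: push real towards 1, fake towards 0
opt_d.zero_grad()
d_loss = (bce(D(real), torch.ones(8, 1))
          + bce(D(fake.detach()), torch.zeros(8, 1)))
d_loss.backward()
opt_d.step()

# generator step: fool the discriminator (fake towards 1)
opt_g.zero_grad()
g_loss = bce(D(fake), torch.ones(8, 1))
g_loss.backward()
opt_g.step()
```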
Finally, deep learning methods have often been described as 'black boxes'. Especially in medicine, where accountability is important and can have serious legal consequences, it is often not enough to have a good prediction system. This system also has to be able to articulate itself in a certain way. Several strategies have been developed to understand what intermediate layers of convolutional networks are responding to, for example deconvolution networks (Zeiler and Fergus, 2014), guided back-propagation (Springenberg et al., 2014), or deep Taylor decomposition (Montavon et al., 2017). Other researchers have tied prediction to textual representations of the image (i.e. captioning) (Karpathy and Fei-Fei, 2015), which is another useful avenue to understand what a network is perceiving. Last, some groups have tried to combine Bayesian statistics with deep networks to obtain true network uncertainty estimates (Kendall and Gal, 2017). This would allow physicians to assess when the network is giving unreliable predictions. Leveraging these techniques in the application of deep learning methods to medical image analysis could accelerate acceptance of deep learning applications among clinicians and patients. We also foresee that deep learning approaches will be used for related, mostly unexplored tasks in medical imaging, such as image reconstruction (Wang, 2016). Deep learning will thus not only have a great impact in medical image analysis, but in medical imaging as a whole.
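One practical route to such uncertainty estimates is Monte Carlo dropout, which underlies the work of Kendall and Gal; the sketch below keeps dropout active at prediction time and uses the spread of repeated forward passes as an uncertainty estimate. The tiny network and the 50 samples are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Monte Carlo dropout: keep dropout stochastic at test time and treat the
# standard deviation of repeated predictions as a measure of uncertainty.
net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(0.5),
                    nn.Linear(64, 1))
net.train()  # leave dropout enabled during prediction

x = torch.randn(1, 10)
with torch.no_grad():
    samples = torch.stack([net(x) for _ in range(50)])
mean, std = samples.mean().item(), samples.std().item()
print(f"prediction {mean:.3f} +/- {std:.3f}")
```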
Acknowledgments

The authors would like to thank members of the Diagnostic Image Analysis Group for discussions and suggestions. This research was funded by grants KUN 2012-5577, KUN 2014-7032, and KUN 2015-7970 of the Dutch Cancer Society.

Appendix A: Literature selection

PubMed was searched for papers containing "convolutional" OR "deep learning" in any field. We specifically did not include the term neural network here, as this would result in an enormous amount of 'false positive' papers covering brain research. This search initially gave over 700 hits. ArXiv was searched for papers mentioning one of a set of terms related to medical imaging. The exact search string was: 'abs:((medical OR mri OR "magnetic resonance" OR CT OR "computed tomography" OR ultrasound OR pathology OR xray OR x-ray OR radiograph OR mammography OR fundus OR OCT) AND ("deep learning" OR convolutional OR cnn OR "neural network"))'. Conference proceedings for MICCAI (including workshops), SPIE, ISBI and EMBC were searched based on the titles of papers. Again we looked for mentions of 'deep learning' or 'convolutional' or 'neural network'. We went over all these papers and excluded the ones that did not discuss medical imaging (e.g. applications to genetics, chemistry), only used handcrafted features in combination with neural networks, or only referenced deep learning as future work. When in doubt whether a paper should be included, we read the abstract, and when the exact methodology was still unclear, we read the paper itself. We checked references in all selected papers iteratively and consulted colleagues to identify any papers which were missed by our initial search. When largely overlapping work had been reported in multiple publications, only the publication deemed most important was included. A typical example here was arXiv preprints that were subsequently published, or conference contributions which were expanded and published in journals.
References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X., 2016. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv:1603.04467.
Abràmoff, M. D., Lou, Y., Erginay, A., Clarida, W., Amelon, R., Folk, J. C., Niemeijer, M., 2016. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest Ophthalmol Vis Sci 57 (13), 5200–5206.
Akram, S. U., Kannala, J., Eklund, L., Heikkilä, J., 2016. Cell segmentation proposal network for microscopy image analysis. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 21–29.
Akselrod-Ballin, A., Karlinsky, L., Alpert, S., Hasoul, S., Ben-Ari, R., Barkan, E., 2016. A region based convolutional network for tumor detection and classification in breast mammography. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 197–205.
Alansary, A., Kamnitsas, K., Davidson, A., Khlebnikov, R., Rajchl, M., Malamateniou, C., Rutherford, M., Hajnal, J. V., Glocker, B., Rueckert, D., Kainz, B., 2016. Fast fully automatic segmentation of the human placenta from motion corrupted MRI. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 589–597.
Albarqouni, S., Baur, C., Achilles, F., Belagiannis, V., Demirci, S., Navab, N., 2016. AggNet: Deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans Med Imaging 35, 1313–1321.
Anavi, Y., Kogan, I., Gelbart, E., Geva, O., Greenspan, H., 2015. A comparative study for chest radiograph image retrieval using binary texture and deep learning classification. In: Conf Proc IEEE Eng Med Biol Soc. pp. 2940–2943.
Anavi, Y., Kogan, I., Gelbart, E., Geva, O., Greenspan, H., 2016. Visualizing and enhancing a deep learning framework using patients age and gender for chest X-ray image retrieval. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 978510.
Andermatt, S., Pezold, S., Cattin, P., 2016. Multi-dimensional gated recurrent units for the segmentation of biomedical 3D-data. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 142–151.
Anthimopoulos, M., Christodoulidis, S., Ebner, L., Christe, A., Mougiakakou, S., 2016. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imaging 35 (5), 1207–1216.
Antony, J., McGuinness, K., Connor, N. E. O., Moran, K., 2016. Quantifying radiographic knee osteoarthritis severity using deep convolutional neural networks. arXiv:1609.02469.
Apou, G., Schaadt, N. S., Naegel, B., Forestier, G., Schönmeyer, R., Feuerhake, F., Wemmert, C., Grote, A., 2016. Detection of lobular structures in normal breast tissue. Comput Biol Med 74, 91–102.
Arevalo, J., González, F. A., Ramos-Pollán, R., Oliveira, J. L., Guevara Lopez, M. A., 2016. Representation learning for mammography mass lesion classification with convolutional neural networks. Comput Methods Programs Biomed 127, 248–257.
Armato, S. G., McLennan, G., Bidaut, L., McNitt-Gray, M. F., Meyer, C. R., Reeves, A. P., Zhao, B., Aberle, D. R., Henschke, C. I., Hoffman, E. A., Kazerooni, E. A., MacMahon, H., Beek, E. J. R. V., Yankelevitz, D., Biancardi, A. M., Bland, P. H., Brown, M. S., Engelmann, R. M., Laderach, G. E., Max, D., Pais, R. C., Qing, D. P. Y., Roberts, R. Y., Smith, A. R., Starkey, A., Batrah, P., Caligiuri, P., Farooqi, A., Gladish, G. W., Jude, C. M., Munden, R. F., Petkovska, I., Quint, L. E., Schwartz, L. H., Sundaram, B., Dodd, L. E., Fenimore, C., Gur, D., Petrick, N., Freymann, J., Kirby, J., Hughes, B., Casteele, A. V., Gupte, S., Sallamm, M., Heath, M. D., Kuhn, M. H., Dharaiya, E., Burns, R., Fryd, D. S., Salganicoff, M., Anand, V., Shreter, U., Vastagh, S., Croft, B. Y., 2011. The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med Phys 38, 915–931.
Avendi, M., Kheradvar, A., Jafarkhani, H., 2016. A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac MRI. Med Image Anal 30, 108–119.
Azizi, S., Imani, F., Ghavidel, S., Tahmasebi, A., Kwak, J. T., Xu, S., Turkbey, B., Choyke, P., Pinto, P., Wood, B., Mousavi, P., Abolmaesumi, P., 2016. Detection of prostate cancer using temporal sequences of ultrasound data: a large clinical feasibility study. Int J Comput Assist Radiol Surg 11 (6), 947–956.
Bahrami, K., Shi, F., Rekik, I., Shen, D., 2016. Convolutional neural network for reconstruction of 7T-like images from 3T MRI using appearance and anatomical features. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 39–47.
Bao, S., Chung, A. C., 2016. Multi-scale structured CNN with label consistency for brain MR image segmentation. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–5.
Bar, Y., Diamant, I., Wolf, L., Greenspan, H., 2015. Deep learning with non-medical training used for chest pathology identification. In: Medical Imaging. Vol. 9414 of Proceedings of the SPIE. p. 94140V.
Bar, Y., Diamant, I., Wolf, L., Lieberman, S., Konen, E., Greenspan, H., 2016. Chest pathology identification using deep feature selection with non-medical training. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–5.
Barbu, A., Lu, L., Roth, H., Seff, A., Summers, R. M., 2016. An analysis of robust cost functions for CNN in computer-aided diagnosis. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2016, 1–6.
Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I., Bergeron, A., Bouchard, N., Warde-Farley, D., Bengio, Y., 2012. Theano: new features and speed improvements. In: Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop.
Bauer, S., Carion, N., Schäffler, P., Fuchs, T., Wild, P., Buhmann, J. M., 2016. Multi-organ cancer classification and survival analysis. arXiv:1606.00897.
Baumgartner, C. F., Kamnitsas, K., Matthew, J., Smith, S., Kainz, B., Rueckert, D., 2016. Real-time standard scan plane detection and localisation in fetal ultrasound using fully convolutional neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 203–211.
Ben-Cohen, A., Diamant, I., Klang, E., Amitai, M., Greenspan, H., 2016. Fully convolutional network for liver segmentation and lesions detection. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 77–85.
Bengio, Y., 2012. Practical recommendations for gradient-based training of deep architectures. In: Neural Networks: Tricks of the Trade. Springer Berlin Heidelberg, pp. 437–478.
Bengio, Y., Courville, A., Vincent, P., 2013. Representation learning: A review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35 (8), 1798–1828.
Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H., 2007. Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems. pp. 153–160.
Bengio, Y., Simard, P., Frasconi, P., 1994. Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5, 157–166.
Benou, A., Veksler, R., Friedman, A., Raviv, T. R., 2016. De-noising of contrast-enhanced MRI sequences by an ensemble of expert deep neural networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 95–110.
BenTaieb, A., Hamarneh, G., 2016. Topology aware fully convolutional networks for histology gland segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 460–468.
BenTaieb, A., Kawahara, J., Hamarneh, G., 2016. Multi-loss convolutional networks for gland analysis in microscopy. In: IEEE Int Symp Biomedical Imaging. pp. 642–645.
Bergstra, J., Bengio, Y., 2012. Random search for hyper-parameter optimization. J Mach Learn Res 13 (1), 281–305.
Birenbaum, A., Greenspan, H., 2016. Longitudinal multiple sclerosis lesion segmentation using multi-view convolutional neural networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 58–67.
Brosch, T., Tam, R., 2013. Manifold learning of brain MRIs by deep learning. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 633–640.
Brosch, T., Tang, L. Y., Yoo, Y., Li, D. K., Traboulsee, A., Tam, R., 2016. Deep 3D convolutional encoder networks with shortcuts for multiscale feature integration applied to Multiple Sclerosis lesion segmentation. IEEE Trans Med Imaging 35 (5), 1229–1239.
Brosch, T., Yoo, Y., Li, D. K. B., Traboulsee, A., Tam, R., 2014. Modeling the variability in brain morphology and lesion distribution in multiple sclerosis by deep learning. In: Med Image Comput Comput Assist Interv. Vol. 8674 of Lect Notes Comput Sci. pp. 462–469.
Burlina, P., Freund, D. E., Joshi, N., Wolfson, Y., Bressler, N. M., 2016. Detection of age-related macular degeneration via deep learning. In: IEEE Int Symp Biomedical Imaging. pp. 184–188.
Bychkov, D., Turkki, R., Haglund, C., Linder, N., Lundin, J., 2016. Deep learning for tissue microarray image-based outcome prediction in patients with colorectal cancer. In: Medical Imaging. Vol. 9791 of Proceedings of the SPIE. p. 979115.
Cai, J., Lu, L., Zhang, Z., Xing, F., Yang, L., Yin, Q., 2016a. Pancreas segmentation in MRI using graph-based decision fusion on convolutional neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 442–450.
Cai, Y., Landis, M., Laidley, D. T., Kornecki, A., Lum, A., Li, S., 2016b. Multi-modal vertebrae recognition using transformed deep convolution network. Comput Med Imaging Graph 51, 11–19.
Carneiro, G., Nascimento, J. C., 2013. Combining multiple dynamic models and deep learning architectures for tracking the left ventricle endocardium in ultrasound data. IEEE Trans Pattern Anal Mach Intell 35, 2592–2607.
Carneiro, G., Nascimento, J. C., Freitas, A., 2012. The segmentation of the left ventricle of the heart from ultrasound data using deep learning architectures and derivative-based search methods. IEEE Trans Image Process, 968–982.
Carneiro, G., Oakden-Rayner, L., Bradley, A. P., Nascimento, J., Palmer, L., 2016. Automated 5-year mortality prediction using deep learning and radiomics features from chest computed tomography. arXiv:1607.00267.
Cha, K. H., Hadjiiski, L. M., Samala, R. K., Chan, H.-P., Cohan, R. H., Caoili, E. M., Paramagul, C., Alva, A., Weizer, A. Z., 2016. Bladder cancer segmentation in CT for treatment response assessment: Application of deep-learning convolution neural network, a pilot study. Tomography 2, 421–429.
Chang, H., Han, J., Zhong, C., Snijders, A., Mao, J.-H., 2017. Unsupervised transfer learning via multi-scale convolutional sparse coding for biomedical applications. IEEE Trans Pattern Anal Mach Intell.
Charbonnier, J., van Rikxoort, E., Setio, A., Schaefer-Prokop, C., van Ginneken, B., Ciompi, F., 2017. Improving airway segmentation in computed tomography using leak detection with convolutional networks. Med Image Anal 36, 52–60.
Chen, H., Dou, Q., Ni, D., Cheng, J.-Z., Qin, J., Li, S., Heng, P.-A., 2015a. Automatic fetal ultrasound standard plane detection using knowledge transferred recurrent neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9349 of Lect Notes Comput Sci. pp. 507–514.
Chen, H., Dou, Q., Yu, L., Heng, P.-A., 2016a. VoxResNet: Deep voxelwise residual networks for volumetric brain segmentation. arXiv:1608.05895.
Chen, H., Ni, D., Qin, J., Li, S., Yang, X., Wang, T., Heng, P. A., 2015b. Standard plane localization in fetal ultrasound via domain transferred deep neural networks. IEEE J Biomed Health Inform 19 (5), 1627–1636.
Chen, H., Qi, X., Yu, L., Heng, P.-A., 2017. DCAN: Deep contour-aware networks for accurate gland segmentation. Med Image Anal 36, 135–146.
Chen, H., Shen, C., Qin, J., Ni, D., Shi, L., Cheng, J. C. Y., Heng, P.-A., 2015c. Automatic localization and identification of vertebrae in spine CT via a joint learning model with deep neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9349 of Lect Notes Comput Sci. pp. 515–522.
Chen, H., Wang, X., Heng, P. A., 2016b. Automated mitosis detection with deep regression networks. In: IEEE Int Symp Biomedical Imaging. pp. 1204–1207.
Chen, H., Zheng, Y., Park, J.-H., Heng, P.-A., Zhou, S. K., 2016c. Iterative multi-domain regularized deep learning for anatomical structure detection and segmentation from ultrasound images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 487–495.
Chen, J., Yang, L., Zhang, Y., Alber, M., Chen, D. Z., 2016d. Combining fully convolutional and recurrent neural networks for 3D biomedical image segmentation. In: Advances in Neural Information Processing Systems. pp. 3036–3044.
Chen, S., Qin, J., Ji, X., Lei, B., Wang, T., Ni, D., Cheng, J.-Z., 2016e. Automatic scoring of multiple semantic attributes with multi-task feature leverage: A study on pulmonary nodules in CT images. IEEE Trans Med Imaging, in press.
Chen, X., Xu, Y., Wong, D. W. K., Wong, T. Y., Liu, J., 2015d. Glaucoma detection based on deep convolutional neural network. In: Conf Proc IEEE Eng Med Biol Soc. pp. 715–718.
Cheng, J.-Z., Ni, D., Chou, Y.-H., Qin, J., Tiu, C.-M., Chang, Y.-C., Huang, C.-S., Shen, D., Chen, C.-M., 2016a. Computer-aided diagnosis with deep learning architecture: Applications to breast lesions in US images and pulmonary nodules in CT scans. Nat Sci Rep 6, 24454.
Cheng, R., Roth, H. R., Lu, L., Wang, S., Turkbey, B., Gandler, W., McCreedy, E. S., Agarwal, H. K., Choyke, P., Summers, R. M., McAuliffe, M. J., 2016b. Active appearance model and deep learning for more accurate prostate segmentation on MRI. In: Medical Imaging. Vol. 9784 of Proceedings of the SPIE. p. 97842I.
Cheng, X., Zhang, L., Zheng, Y., 2015. Deep similarity learning for multimodal medical images. Computer Methods in Biomechanics and Biomedical Engineering, 1–5.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078.
Choi, H., Jin, K. H., 2016. Fast and robust segmentation of the striatum using deep convolutional neural networks. Journal of Neuroscience Methods 274, 146–153.
Christ, P. F., Elshaer, M. E. A., Ettlinger, F., Tatavarty, S., Bickel, M., Bilic, P., Rempfler, M., Armbruster, M., Hofmann, F., D'Anastasi, M., et al., 2016. Automatic liver and lesion segmentation in CT using cascaded fully convolutional neural networks and 3D conditional random fields. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 415–423.
Christodoulidis, S., Anthimopoulos, M., Ebner, L., Christe, A., Mougiakakou, S., 2017. Multi-source transfer learning with convolutional neural networks for lung pattern analysis. IEEE J Biomed Health Inform 21, 76–84.
Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., Ronneberger, O., 2016. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 424–432.
Cicero, M., Bilbily, A., Colak, E., Dowdell, T., Gray, B., Perampaladas, K., Barfett, J., 2016. Training and validating a deep convolutional neural network for computer-aided detection and classification of abnormalities on frontal chest radiographs. Invest Radiol, in press.
Ciompi, F., Chung, K., van Riel, S. J., Setio, A. A. A., Gerke, P. K., Jacobs, C., Scholten, E. T., Schaefer-Prokop, C. M., Wille, M. M. W., Marchiano, A., Pastorino, U., Prokop, M., van Ginneken, B., 2016. Towards automatic pulmonary nodule management in lung cancer screening with deep learning. arXiv:1610.09157.
Ciompi, F., de Hoop, B., van Riel, S. J., Chung, K., Scholten, E. T., Oudkerk, M., de Jong, P. A., Prokop, M., van Ginneken, B., 2015. Automatic classification of pulmonary peri-fissural nodules in computed tomography using an ensemble of 2D views and a convolutional neural network out-of-the-box. Med Image Anal 26, 195–202.
Cireşan, D. C., Giusti, A., Gambardella, L. M., Schmidhuber, J., 2013. Mitosis detection in breast cancer histology images with deep neural networks. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 411–418.
Ciresan, D., Giusti, A., Gambardella, L. M., Schmidhuber, J., 2012. Deep neural networks segment neuronal membranes in electron microscopy images. In: Advances in Neural Information Processing Systems. pp. 2843–2851.
Codella, N., Cai, J., Abedini, M., Garnavi, R., Halpern, A., Smith, J. R., 2015. Deep learning, sparse coding, and SVM for melanoma recognition in dermoscopy images. In: International Workshop on Machine Learning in Medical Imaging. pp. 118–126.
Collobert, R., Kavukcuoglu, K., Farabet, C., 2011. Torch7: A matlab-like environment for machine learning. In: Advances in Neural Information Processing Systems.
Cruz-Roa, A., Basavanhally, A., González, F., Gilmore, H., Feldman, M., Ganesan, S., Shih, N., Tomaszewski, J., Madabhushi, A., 2014. Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks. In: Medical Imaging. Vol. 9041 of Proceedings of the SPIE. p. 904103.
Cruz-Roa, A. A., Ovalle, J. E. A., Madabhushi, A., Osorio, F. A. G., 2013. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 403–410.
Dalmis, M., Litjens, G., Holland, K., Setio, A., Mann, R., Karssemeijer, N., Gubern-Mérida, A., 2017. Using deep learning to segment breast and fibroglandular tissue in MRI volumes. Medical Physics 44, 533–546.
de Brebisson, A., Montana, G., 2015. Deep neural networks for anatomical brain segmentation. In: Comput Vis Pattern Recognit. pp. 20–28.
de Vos, B. D., Viergever, M. A., de Jong, P. A., Išgum, I., 2016a. Automatic slice identification in 3D medical images with a ConvNet regressor. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 161–169.
de Vos, B. D., Wolterink, J. M., de Jong, P. A., Viergever, M. A., Išgum, I., 2016b. 2D image classification for 3D anatomy localization: employing deep convolutional neural networks. In: Medical Imaging. Vol. 9784 of Proceedings of the SPIE. p. 97841Y.
Demyanov, S., Chakravorty, R., Abedini, M., Halpern, A., Garnavi, R., 2016. Classification of dermoscopy patterns using deep convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 364–368.
Dhungel, N., Carneiro, G., Bradley, A. P., 2016. The automated learning of deep features for breast mass classification from mammograms. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 106–114.
Dou, Q., Chen, H., Jin, Y., Yu, L., Qin, J., Heng, P.-A., 2016a. 3D deeply supervised network for automatic liver segmentation from CT volumes. arXiv:1607.00582.
Dou, Q., Chen, H., Yu, L., Qin, J., Heng, P. A., 2016b. Multi-level contextual 3D CNNs for false positive reduction in pulmonary nodule detection, in press.
Dou, Q., Chen, H., Yu, L., Shi, L., Wang, D., Mok, V. C., Heng, P. A., 2015. Automatic cerebral microbleeds detection from MR images via independent subspace analysis based hierarchical features. Conf Proc IEEE Eng Med Biol Soc, 7933–7936.
Dou, Q., Chen, H., Yu, L., Zhao, L., Qin, J., Wang, D., Mok, V. C., Shi, L., Heng, P.-A., 2016c. Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks. IEEE Trans Med Imaging 35, 1182–1195.
Drozdzal, M., Vorontsov, E., Chartrand, G., Kadoury, S., Pal, C., 2016. The importance of skip connections in biomedical image segmentation. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 179–187.
Dubrovina, A., Kisilev, P., Ginsburg, B., Hashoul, S., Kimmel, R., 2016. Computational mammography using deep neural networks. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–5.
Ehteshami Bejnordi, B., Litjens, G., Timofeeva, N., Otte-Holler, I., Homeyer, A., Karssemeijer, N., van der Laak, J., 2016. Stain specific standardization of whole-slide histopathological images. IEEE Trans Med Imaging 35 (2), 404–415.
Emad, O., Yassine, I. A., Fahmy, A. S., 2015. Automatic localization of the left ventricle in cardiac MRI images using deep learning. In: Conf Proc IEEE Eng Med Biol Soc. pp. 683–686.
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., Thrun, S., 2017. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118.
Farabet, C., Couprie, C., Najman, L., LeCun, Y., 2013. Learning hierarchical features for scene labeling. IEEE Trans Pattern Anal Mach Intell 35 (8), 1915–1929.
Farag, A., Lu, L., Roth, H. R., Liu, J., Turkbey, E., Summers, R. M., 2015. A bottom-up approach for pancreas segmentation using cascaded superpixels and (deep) image patch labeling. arXiv:1505.06236.
Ferrari, A., Lombardi, S., Signoroni, A., 2015. Bacterial colony counting by convolutional neural networks. Conf Proc IEEE Eng Med Biol Soc, 7458–7461.
Fonseca, P., Mendoza, J., Wainer, J., Ferrer, J., Pinto, J., Guerrero, J., Castaneda, B., 2015. Automatic breast density classification using a convolutional neural network architecture search procedure. In: Medical Imaging. Vol. 9413 of Proceedings of the SPIE. p. 941428.
Forsberg, D., Sjöblom, E., Sunshine, J. L., 2017. Detection and labeling of vertebrae in MR images using deep learning with clinical annotations as training data. J Digit Imaging, in press.
Fotin, S. V., Yin, Y., Haldankar, H., Hoffmeister, J. W., Periaswamy, S., 2016. Detection of soft tissue densities from digital breast tomosynthesis: comparison of conventional and deep learning approaches. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97850X.
Fritscher, K., Raudaschl, P., Zaffino, P., Spadea, M. F., Sharp, G. C., Schubert, R., 2016. Deep neural networks for fast segmentation of 3D medical images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 158–165.
Fu, H., Xu, Y., Lin, S., Kee Wong, D. W., Liu, J., 2016a. DeepVessel: Retinal vessel segmentation via deep learning and conditional random field. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 132–139.
Fu, H., Xu, Y., Wong, D. W. K., Liu, J., 2016b. Retinal vessel segmentation via deep learning network and fully-connected conditional random fields. In: IEEE Int Symp Biomedical Imaging. pp. 698–701.
Fukushima, K., 1980. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36 (4), 193–202.
Gao, M., Bagci, U., Lu, L., Wu, A., Buty, M., Shin, H.-C., Roth, H., Papadakis, G. Z., Depeursinge, A., Summers, R. M., Xu, Z., Mollura, D. J., 2016a. Holistic classification of CT attenuation patterns for interstitial lung diseases via deep convolutional neural networks. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–6.
Gao, M., Xu, Z., Lu, L., Harrison, A. P., Summers, R. M., Mollura, D. J., 2016b. Multi-label deep regression and unordered pooling for holistic interstitial lung disease pattern detection. In: Machine Learning in Medical Imaging. Vol. 10019 of Lect Notes Comput Sci. pp. 147–155.
Gao, M., Xu, Z., Lu, L., Nogues, I., Summers, R., Mollura, D., 2016c. Segmentation label propagation using deep convolutional neural networks and dense conditional random field. In: IEEE Int Symp Biomedical Imaging. pp. 1265–1268.
Gao, X., Lin, S., Wong, T. Y., 2015. Automatic feature learning to grade nuclear cataracts based on deep learning. IEEE Trans
Gao, Y., Maraci, M. A., Noble, J. A., 2016d. Describing ultrasound video content using deep convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 787–790.
Gao, Z., Wang, L., Zhou, L., Zhang, J., 2016e. HEp-2 cell image classification with deep convolutional neural networks. Journal of Biomedical and Health Informatics.
Ghafoorian, M., Karssemeijer, N., Heskes, T., Bergkamp, M., Wissink, J., Obels, J., Keizer, K., de Leeuw, F.-E., van Ginneken, B., Marchiori, E., Platel, B., 2017. Deep multi-scale location-aware 3D convolutional neural networks for automated detection of lacunes of presumed vascular origin. NeuroImage: Clinical, in press.
Ghafoorian, M., Karssemeijer, N., Heskes, T., van Uden, I., Sanchez, C., Litjens, G., de Leeuw, F.-E., van Ginneken, B., Marchiori, E., Platel, B., 2016a. Location sensitive deep convolutional neural networks for segmentation of white matter hyperintensities. arXiv:1610.04834.
Ghafoorian, M., Karssemeijer, N., Heskes, T., van Uden, I. W. M., de Leeuw, F.-E., Marchiori, E., van Ginneken, B., Platel, B., 2016b. Non-uniform patch sampling with deep convolutional neural networks for white matter hyperintensity segmentation. In: IEEE Int Symp Biomedical Imaging. pp. 1414–1417.
Ghesu, F. C., Georgescu, B., Mansi, T., Neumann, D., Hornegger, J., Comaniciu, D., 2016a. An artificial agent for anatomical landmark detection in medical images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci.
Ghesu, F. C., Krubasik, E., Georgescu, B., Singh, V., Zheng, Y., Hornegger, J., Comaniciu, D., 2016b. Marginal space deep learning: Efficient architecture for volumetric image parsing. IEEE Trans Med Imaging 35, 1217–1228.
Golan, D., Donner, Y., Mansi, C., Jaremko, J., Ramachandran, M., 2016. Fully automating Graf's method for DDH diagnosis using deep convolutional neural networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 130–141.
Golkov, V., Dosovitskiy, A., Sperl, J., Menzel, M., Czisch, M., Samann, P., Brox, T., Cremers, D., 2016. q-Space deep learning: Twelve-fold shorter and model-free diffusion MRI scans. IEEE Trans Med Imaging 35, 1344–1351.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. arXiv:1406.2661.
Greenspan, H., Summers, R. M., van Ginneken, B., 2016. Deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Trans Med Imaging 35 (5), 1153–1159.
Gulshan, V., Peng, L., Coram, M., Stumpe, M. C., Wu, D., Narayanaswamy, A., Venugopalan, S., Widner, K., Madams, T., Cuadros, J., Kim, R., Raman, R., Nelson, P. C., Mega, J. L., Webster, D. R., 2016. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410.
Gülsün, M. A., Funka-Lea, G., Sharma, P., Rapaka, S., Zheng, Y., 2016. Coronary centerline extraction via optimal flow paths and CNN path pruning. In: Med Image Comput Comput Assist Interv. Vol. 9902 of Lect Notes Comput Sci. Springer, pp. 317–325.
Günhan Ertosun, M., Rubin, D. L., 2015. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. In: AMIA Annual Symposium. pp. 1899–1908.
Guo, Y., Gao, Y., Shen, D., 2016. Deformable MR prostate segmentation via deep feature learning and sparse patch matching. IEEE Trans Med Imaging 35 (4), 1077–1089.
Guo, Y., Wu, G., Commander, L. A., Szary, S., Jewells, V., Lin, W., Shen, D., 2014. Segmenting hippocampus from infant brains by sparse patch matching with deep-learned features. In: Med Image
Biomed Eng 62 (11), 2693–2701. Comput Comput Assist Interv. Vol. 8674 of Lect Notes Comput
Han, X.-H., Lei, J., Chen, Y.-W., 2016. HEp-2 cell classification using K-support spatial pooling in deep CNNs. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 3–11.
Haugeland, J., 1985. Artificial intelligence: the very idea. The MIT Press, Cambridge, Mass.
Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., Pal, C., Jodoin, P.-M., Larochelle, H., 2016a. Brain tumor segmentation with Deep Neural Networks. Med Image Anal 35, 18–31.
Havaei, M., Guizard, N., Chapados, N., Bengio, Y., 2016b. HeMIS: Hetero-modal image segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 469–477.
He, K., Zhang, X., Ren, S., Sun, J., 2015. Deep residual learning for image recognition. arXiv:1512.03385.
Hinton, G., 2010. A practical guide to training restricted Boltzmann machines. Momentum 9 (1), 926.
Hinton, G. E., Osindero, S., Teh, Y.-W., 2006. A fast learning algorithm for deep belief nets. Neural Comput 18, 1527–1554.
Hinton, G. E., Salakhutdinov, R. R., 2006. Reducing the dimensionality of data with neural networks. Science 313, 504–507.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Computation 9 (8), 1735–1780.
Hoffmann, N., Koch, E., Steiner, G., Petersohn, U., Kirsch, M., 2016. Learning thermal process representations for intraoperative analysis of cortical perfusion during ischemic strokes. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 152–160.
Hoogi, A., Subramaniam, A., Veerapaneni, R., Rubin, D., 2016. Adaptive estimation of active contour parameters using convolutional neural networks and texture analysis. IEEE Trans Med Imaging.
Hosseini-Asl, E., Gimel’farb, G., El-Baz, A., 2016. Alzheimer's disease diagnostics by a deeply supervised adaptable 3D convolutional network. arXiv:1607.00556.
Hu, P., Wu, F., Peng, J., Bao, Y., Chen, F., Kong, D., Nov. 2016a. Automatic abdominal multi-organ segmentation using deep convolutional neural network and time-implicit level sets. Int J Comput Assist Radiol Surg.
Hu, P., Wu, F., Peng, J., Liang, P., Kong, D., Dec. 2016b. Automatic 3D liver segmentation based on deep learning and globally optimized surface evolution. Phys Med Biol 61, 8676–8698.
Huang, H., Hu, X., Han, J., Lv, J., Liu, N., Guo, L., Liu, T., 2016. Latent source mining in fMRI data via deep neural network. In: IEEE Int Symp Biomedical Imaging. pp. 638–641.
Huynh, B. Q., Li, H., Giger, M. L., Jul 2016. Digital mammographic tumor classification using transfer learning from deep convolutional neural networks. J Med Imaging 3, 034501.
Hwang, S., Kim, H., 2016. Self-transfer learning for fully weakly supervised object localization. arXiv:1602.01625.
Hwang, S., Kim, H.-E., Jeong, J., Kim, H.-J., 2016. A novel approach for tuberculosis screening based on deep convolutional neural networks. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97852W-1.
Jamaludin, A., Kadir, T., Zisserman, A., 2016. SpineNet: Automatically pinpointing classification evidence in spinal MRIs. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 166–175.
Jamieson, A. R., Drukker, K., Giger, M. L., 2012. Breast image feature learning with adaptive deconvolutional networks. In: Medical Imaging. Vol. 8315 of Proceedings of the SPIE. p. 831506.
Janowczyk, A., Basavanhally, A., Madabhushi, A., 2016a. Stain normalization using sparse autoencoders (StaNoSA): Application to digital pathology. Comput Med Imaging Graph, in press.
Janowczyk, A., Doyle, S., Gilmore, H., Madabhushi, A., 2016b. A resolution adaptive deep hierarchical (RADHicaL) learning scheme applied to nuclear segmentation of digital pathology images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–7.
Janowczyk, A., Madabhushi, A., 2016. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. J Pathol Inform 7, 29.
Jaumard-Hakoun, A., Xu, K., Roussel-Ragot, P., Dreyfus, G., Denby, B., 2016. Tongue contour extraction from ultrasound images based on deep neural network. arXiv:1605.05912.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T., 2014. Caffe: Convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia. pp. 675–678.
Kainz, P., Pfeiffer, M., Urschler, M., 2015. Semantic segmentation of colon glands with deep convolutional neural networks and total variation segmentation. arXiv:1511.06919.
Källén, H., Molin, J., Heyden, A., Lundström, C., Åström, K., 2016. Towards grading Gleason score using generically trained deep convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 1163–1167.
Kallenberg, M., Petersen, K., Nielsen, M., Ng, A., Diao, P., Igel, C., Vachon, C., Holland, K., Karssemeijer, N., Lillholm, M., 2016. Unsupervised deep learning applied to breast density segmentation and mammographic risk scoring. IEEE Trans Med Imaging 35, 1322–1331.
Kamnitsas, K., Ledig, C., Newcombe, V. F., Simpson, J. P., Kane, A. D., Menon, D. K., Rueckert, D., Glocker, B., 2017. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med Image Anal 36, 61–78.
Karpathy, A., Fei-Fei, L., June 2015. Deep visual-semantic alignments for generating image descriptions. In: Comput Vis Pattern Recognit. arXiv:1412.2306.
Kashif, M. N., Raza, S. E. A., Sirinukunwattana, K., Arif, M., Rajpoot, N., 2016. Handcrafted features with convolutional neural networks for detection of tumor cells in histology images. In: IEEE Int Symp Biomedical Imaging. pp. 1029–1032.
Kawahara, J., BenTaieb, A., Hamarneh, G., 2016a. Deep features to classify skin lesions. In: IEEE Int Symp Biomedical Imaging. pp. 1397–1400.
Kawahara, J., Brown, C. J., Miller, S. P., Booth, B. G., Chau, V., Grunau, R. E., Zwicker, J. G., Hamarneh, G., 2016b. BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment. NeuroImage.
Kawahara, J., Hamarneh, G., 2016. Multi-resolution-tract CNN with hybrid pretrained and skin-lesion trained layers. In: Machine Learning in Medical Imaging. Vol. 10019 of Lect Notes Comput Sci. pp. 164–171.
Kendall, A., Gal, Y., 2017. What uncertainties do we need in Bayesian deep learning for computer vision? arXiv:1703.04977.
Kim, E., Cortre-Real, M., Baloch, Z., 2016a. A deep semantic mobile application for thyroid cytopathology. In: Medical Imaging. Vol. 9789 of Proceedings of the SPIE. p. 97890A.
Kim, H., Hwang, S., 2016. Scale-invariant feature learning using deconvolutional neural networks for weakly-supervised semantic segmentation. arXiv:1602.04984.
Kim, J., Calhoun, V. D., Shim, E., Lee, J.-H., 2016b. Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: Evidence from whole-brain resting-state functional connectivity patterns of schizophrenia. NeuroImage 124, 127–146.
Kingma, D. P., Welling, M., 2013. Auto-encoding variational Bayes. arXiv:1312.6114.
Kisilev, P., Sason, E., Barkan, E., Hashoul, S., 2016. Medical image description using multi-task-loss CNN. In: International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis. Springer, pp. 121–129.
Kleesiek, J., Urban, G., Hubert, A., Schwarz, D., Maier-Hein, K., Bendszus, M., Biller, A., 2016. Deep MRI brain extraction: A 3D convolutional neural network for skull stripping. NeuroImage 129, 460–469.
Kong, B., Zhan, Y., Shin, M., Denny, T., Zhang, S., 2016. Recognizing end-diastole and end-systole frames via deep temporal regression network. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 264–272.
Kooi, T., Litjens, G., van Ginneken, B., Gubern-Mérida, A., Sánchez, C. I., Mann, R., den Heeten, A., Karssemeijer, N., 2016. Large scale deep learning for computer aided detection of mammographic lesions. Med Image Anal 35, 303–312.
Kooi, T., van Ginneken, B., Karssemeijer, N., den Heeten, A., 2017. Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network. Medical Physics.
Korez, R., Likar, B., Pernuš, F., Vrtovec, T., 2016. Model-based segmentation of vertebral bodies from MR images with 3D CNNs. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 433–441.
Krizhevsky, A., Sutskever, I., Hinton, G., 2012. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems. pp. 1097–1105.
Kumar, A., Sridar, P., Quinton, A., Kumar, R. K., Feng, D., Nanan, R., Kim, J., 2016. Plane identification in fetal ultrasound images using saliency maps and convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 791–794.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278–2324.
Lekadir, K., Galimzianova, A., Betriu, A., Del Mar Vila, M., Igual, L., Rubin, D. L., Fernandez, E., Radeva, P., Napel, S., Jan. 2017. A convolutional neural network for automatic characterization of plaque composition in carotid ultrasound. IEEE J Biomed Health Inform 21, 48–55.
Lessmann, N., Isgum, I., Setio, A. A., de Vos, B. D., Ciompi, F., de Jong, P. A., Oudkerk, M., Mali, W. P. T. M., Viergever, M. A., van Ginneken, B., 2016. Deep convolutional neural networks for automatic coronary calcium scoring in a screening study with low-dose chest CT. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. pp. 978511-1–978511-6.
Li, R., Zhang, W., Suk, H.-I., Wang, L., Li, J., Shen, D., Ji, S., 2014. Deep learning based imaging data completion for improved brain disease diagnosis. In: Med Image Comput Comput Assist Interv. Vol. 8675 of Lect Notes Comput Sci. pp. 305–312.
Li, W., Cao, P., Zhao, D., Wang, J., 2016a. Pulmonary nodule classification with deep convolutional neural networks on computed tomography images. Computational and Mathematical Methods in Medicine, 6215085.
Li, W., Jia, F., Hu, Q., 2015. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. Journal of Computer and Communications 3 (11), 146–151.
Li, W., Manivannan, S., Akbar, S., Zhang, J., Trucco, E., McKenna, S. J., 2016b. Gland segmentation in colon histology images using hand-crafted features and convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 1405–1408.
Liao, S., Gao, Y., Oto, A., Shen, D., 2013. Representation learning: A unified deep learning framework for automatic prostate MR segmentation. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 254–261.
Lin, M., Chen, Q., Yan, S., 2013. Network in network. arXiv:1312.4400.
Litjens, G., Sánchez, C. I., Timofeeva, N., Hermsen, M., Nagtegaal, I., Kovacs, I., Hulsbergen-van de Kaa, C., Bult, P., van Ginneken, B., van der Laak, J., 2016. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Nat Sci Rep 6, 26286.
Liu, J., Wang, D., Wei, Z., Lu, L., Kim, L., Turkbey, E., Summers, R. M., 2016a. Colitis detection on computed tomography using regional convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 863–866.
Liu, X., Tizhoosh, H. R., Kofman, J., 2016b. Generating binary tags for fast medical image retrieval based on convolutional nets and Radon transform. In: International Joint Conference on Neural Networks. arXiv:1604.04676.
Liu, Y., Gadepalli, K., Norouzi, M., Dahl, G. E., Kohlberger, T., Boyko, A., Venugopalan, S., Timofeev, A., Nelson, P. Q., Corrado, G. S., Hipp, J. D., Peng, L., Stumpe, M. C., 2017. Detecting cancer metastases on gigapixel pathology images. arXiv:1703.02442.
Lo, S.-C., Lou, S.-L., Lin, J.-S., Freedman, M. T., Chien, M. V., Mun, S. K., 1995. Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans Med Imaging 14, 711–718.
Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation. arXiv:1411.4038.
Lu, F., Wu, F., Hu, P., Peng, Z., Kong, D., Feb. 2017. Automatic 3D liver location and segmentation via convolutional neural network and graph cut. Int J Comput Assist Radiol Surg 12, 171–182.
Lu, X., Xu, D., Liu, D., 2016. Robust 3D organ localization with dual learning architectures and fusion. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 12–20.
Ma, J., Wu, F., Zhu, J., Xu, D., Kong, D., Jan 2017. A pre-trained convolutional neural network based method for thyroid nodule diagnosis. Ultrasonics 73, 221–230.
Mahapatra, D., Roy, P. K., Sedai, S., Garnavi, R., 2016. Retinal image quality classification using saliency maps and CNNs. In: Machine Learning in Medical Imaging. Vol. 10019 of Lect Notes Comput Sci. pp. 172–179.
Malon, C. D., Cosatto, E., 2013. Classification of mitotic figures with convolutional neural networks and seeded blob features. J Pathol Inform.
Maninis, K.-K., Pont-Tuset, J., Arbeláez, P., Gool, L., 2016. Deep retinal image understanding. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 140–148.
Mansoor, A., Cerrolaza, J., Idrees, R., Biggs, E., Alsharid, M., Avery, R., Linguraru, M. G., 2016. Deep learning guided partitioned shape model for anterior visual pathway segmentation. IEEE Trans Med Imaging 35 (8), 1856–1865.
Mao, Y., Yin, Z., 2016. A hierarchical convolutional neural network for mitosis detection in phase-contrast microscopy images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 685–692.
Menegola, A., Fornaciali, M., Pires, R., Avila, S., Valle, E., 2016. Towards automated melanoma screening: Exploring transfer learning schemes. arXiv:1609.01228.
Merkow, J., Kriegman, D., Marsden, A., Tu, Z., 2016. Dense volume-to-volume vascular boundary detection. arXiv:1605.08401.
Miao, S., Wang, Z. J., Liao, R., 2016. A CNN regression approach for real-time 2D/3D registration. IEEE Trans Med Imaging 35 (5), 1352–1363.
Milletari, F., Ahmadi, S.-A., Kroll, C., Plate, A., Rozanski, V., Maiostre, J., Levin, J., Dietrich, O., Ertl-Wagner, B., Bötzel, K., Navab, N., 2016a. Hough-CNN: Deep learning for segmentation of deep brain regions in MRI and ultrasound. arXiv:1601.07014.
Milletari, F., Navab, N., Ahmadi, S.-A., 2016b. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. arXiv:1606.04797.
Mishra, M., Schmitt, S., Wang, L., Strasser, M. K., Marr, C., Navab, N., Zischka, H., Peng, T., 2016. Structure-based assessment of cancerous mitochondria using deep networks. In: IEEE Int Symp Biomedical Imaging. pp. 545–548.
Moeskops, P., Viergever, M. A., Mendrik, A. M., de Vries, L. S., Benders, M. J. N. L., Isgum, I., 2016a. Automatic segmentation of MR brain images with a convolutional neural network. IEEE Trans Med Imaging 35 (5), 1252–1262.
Moeskops, P., Wolterink, J. M., Velden, B. H. M., Gilhuijs, K. G. A., Leiner, T., Viergever, M. A., Isgum, I., 2016b. Deep learning for multi-task medical image segmentation in multiple modalities. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 478–486.
Montavon, G., Lapuschkin, S., Binder, A., Samek, W., Müller, K.-R., 2017. Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognition 65, 211–222.
Moradi, M., Guo, Y., Gur, Y., Negahdar, M., Syeda-Mahmood, T., 2016a. A cross-modality neural network transform for semi-automatic medical image annotation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 300–307.
Moradi, M., Gur, Y., Wang, H., Prasanna, P., Syeda-Mahmood, T., 2016b. A hybrid learning approach for semantic labeling of cardiac CT slices and recognition of body position. In: IEEE Int Symp Biomedical Imaging.
Näppi, J. J., Hironaka, T., Regge, D., Yoshida, H., 2016. Deep transfer learning of virtual endoluminal views for the detection of polyps in CT colonography. In: Medical Imaging. Proceedings of the SPIE. p. 97852B.
Nascimento, J. C., Carneiro, G., 2016. Multi-atlas segmentation using manifold learning with deep belief networks. In: IEEE Int Symp Biomedical Imaging. pp. 867–871.
Ngo, T. A., Lu, Z., Carneiro, G., 2017. Combining deep learning and level set for the automated segmentation of the left ventricle of the heart from cardiac cine magnetic resonance. Med Image Anal 35, 159–171.
Nie, D., Cao, X., Gao, Y., Wang, L., Shen, D., 2016a. Estimating CT image from MRI data using 3D fully convolutional networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 170–178.
Nie, D., Wang, L., Gao, Y., Shen, D., 2016b. Fully convolutional networks for multi-modality isointense infant brain image segmentation. In: IEEE Int Symp Biomedical Imaging. pp. 1342–1345.
Nie, D., Zhang, H., Adeli, E., Liu, L., Shen, D., 2016c. 3D deep learning for multi-modal imaging-guided survival time prediction of brain tumor patients. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 212–220.
Nogues, I., Lu, L., Wang, X., Roth, H., Bertasius, G., Lay, N., Shi, J., Tsehay, Y., Summers, R. M., 2016. Automatic lymph node cluster segmentation using holistically-nested neural networks and structured optimization in CT images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 388–397.
Oktay, O., Bai, W., Lee, M., Guerrero, R., Kamnitsas, K., Caballero, J., Marvao, A., Cook, S., O'Regan, D., Rueckert, D., 2016. Multi-input cardiac image super-resolution using convolutional neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9902 of Lect Notes Comput Sci. pp. 246–254.
Ortiz, A., Munilla, J., Górriz, J. M., Ramírez, J., 2016. Ensembles of deep learning architectures for the early diagnosis of the Alzheimer's disease. International Journal of Neural Systems 26, 1650025.
Paeng, K., Hwang, S., Park, S., Kim, M., Kim, S., 2016. A unified framework for tumor proliferation score prediction in breast histopathology. arXiv:1612.07180.
Pan, Y., Huang, W., Lin, Z., Zhu, W., Zhou, J., Wong, J., Ding, Z., 2015. Brain tumor grading based on neural networks and convolutional neural networks. In: Conf Proc IEEE Eng Med Biol Soc. pp. 699–702.
Payan, A., Montana, G., 2015. Predicting Alzheimer's disease: a neuroimaging study with 3D convolutional neural networks. arXiv:1502.02506.
Payer, C., Stern, D., Bischof, H., Urschler, M., 2016. Regressing heatmaps for multiple landmark localization using CNNs. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 230–238.
Pereira, S., Pinto, A., Alves, V., Silva, C. A., 2016. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans Med Imaging.
Phan, H. T. H., Kumar, A., Kim, J., Feng, D., 2016. Transfer learning of a convolutional neural network for HEp-2 cell image classification. In: IEEE Int Symp Biomedical Imaging. pp. 1208–1211.
Pinaya, W. H. L., Gadelha, A., Doyle, O. M., Noto, C., Zugman, A., Cordeiro, Q., Jackowski, A. P., Bressan, R. A., Sato, J. R., Dec. 2016. Using deep belief network modelling to characterize differences in brain morphometry in schizophrenia. Nat Sci Rep 6, 38897.
Plis, S. M., Hjelm, D. R., Salakhutdinov, R., Allen, E. A., Bockholt, H. J., Long, J. D., Johnson, H. J., Paulsen, J. S., Turner, J. A., Calhoun, V. D., 2014. Deep learning for neuroimaging: a validation study. Frontiers in Neuroscience.
Poudel, R. P. K., Lamata, P., Montana, G., 2016. Recurrent fully convolutional neural networks for multi-slice MRI cardiac segmentation. arXiv:1608.03974.
Prasoon, A., Petersen, K., Igel, C., Lauze, F., Dam, E., Nielsen, M., 2013. Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 246–253.
Prentasic, P., Heisler, M., Mammo, Z., Lee, S., Merkur, A., Navajas, E., Beg, M. F., Sarunic, M., Loncaric, S., 2016. Segmentation of the foveal microvasculature using deep learning networks. Journal of Biomedical Optics 21, 75008.
Prentasic, P., Loncaric, S., 2016. Detection of exudates in fundus photographs using deep neural networks and anatomical landmark detection fusion. Comput Methods Programs Biomed 137, 281–292.
Qiu, Y., Wang, Y., Yan, S., Tan, M., Cheng, S., Liu, H., Zheng, B., 2016. An initial investigation on developing a new method to predict short-term breast cancer risk based on deep learning technology. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 978521.
Quinn, J. A., Nakasi, R., Mugagga, P. K. B., Byanyima, P., Lubega, W., Andama, A., 2016. Deep convolutional neural networks for microscopy-based point of care diagnostics. arXiv:1608.02989.
Rajchl, M., Lee, M. C., Oktay, O., Kamnitsas, K., Passerat-Palmbach, J., Bai, W., Kainz, B., Rueckert, D., 2016a. DeepCut: Object segmentation from bounding box annotations using convolutional neural networks. IEEE Trans Med Imaging, in press.
Rajchl, M., Lee, M. C., Schrans, F., Davidson, A., Passerat-Palmbach, J., Tarroni, G., Alansary, A., Oktay, O., Kainz, B., Rueckert, D., 2016b. Learning under distributed weak supervision. arXiv:1606.01100.
Rajkomar, A., Lingam, S., Taylor, A. G., Blum, M., Mongan, J., 2017. High-throughput classification of radiographs using deep convolutional neural networks. J Digit Imaging 30, 95–101.
Ravi, D., Wong, C., Deligianni, F., Berthelot, M., Andreu-Perez, J., Lo, B., Yang, G.-Z., Jan. 2017. Deep learning for health informatics. IEEE J Biomed Health Inform 21, 4–21.
Ravishankar, H., Prabhu, S. M., Vaidya, V., Singhal, N., 2016a. Hybrid approach for automatic segmentation of fetal abdomen from ultrasound images using deep learning. In: IEEE Int Symp Biomedical Imaging. pp. 779–782.
Ravishankar, H., Sudhakar, P., Venkataramani, R., Thiruvenkadam, S., Annangi, P., Babu, N., Vaidya, V., 2016b. Understanding the mechanisms of deep transfer learning for medical images. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 188–196.
Rezaeilouyeh, H., Mollahosseini, A., Mahoor, M. H., 2016. Microscopic medical image classification framework via deep learning and shearlet transform. Journal of Medical Imaging 3 (4), 044501.
Romo-Bucheli, D., Janowczyk, A., Gilmore, H., Romero, E., Madabhushi, A., Sep 2016. Automated tubule nuclei quantification and correlation with Oncotype DX risk categories in ER+ breast cancer whole slide images. Nat Sci Rep 6, 32706.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9351 of Lect Notes Comput Sci. pp. 234–241.
Roth, H. R., Lee, C. T., Shin, H.-C., Seff, A., Kim, L., Yao, J., Lu, L., Summers, R. M., 2015a. Anatomy-specific classification of medical images using deep convolutional nets. In: IEEE Int Symp Biomedical Imaging. pp. 101–104.
Roth, H. R., Lu, L., Farag, A., Shin, H.-C., Liu, J., Turkbey, E. B., Summers, R. M., 2015b. DeepOrgan: Multi-level deep convolutional networks for automated pancreas segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9349 of Lect Notes Comput Sci. pp. 556–564.
Roth, H. R., Lu, L., Farag, A., Sohn, A., Summers, R. M., 2016a. Spatial aggregation of holistically-nested networks for automated pancreas segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 451–459.
Roth, H. R., Lu, L., Liu, J., Yao, J., Seff, A., Cherry, K., Kim, L., Summers, R. M., 2016b. Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Trans Med Imaging 35 (5), 1170–1181.
Roth, H. R., Lu, L., Seff, A., Cherry, K. M., Hoffman, J., Wang, S., Liu, J., Turkbey, E., Summers, R. M., 2014. A new 2.5D representation for lymph node detection using random sets of deep convolutional neural network observations. In: Med Image Comput Comput Assist Interv. Vol. 8673 of Lect Notes Comput Sci. pp. 520–527.
Roth, H. R., Wang, Y., Yao, J., Lu, L., Burns, J. E., Summers, R. M., 2016c. Deep convolutional networks for automated detection of posterior-element fractures on spine CT. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97850P.
Roth, H. R., Yao, J., Lu, L., Stieger, J., Burns, J. E., Summers, R. M., 2015c. Detection of sclerotic spine metastases via random aggregation of deep convolutional neural network classifications. In: Recent Advances in Computational Methods and Clinical Applications for Spine Imaging. Vol. 20 of Lecture Notes in Computational Vision and Biomechanics. pp. 3–12.
Rupprecht, C., Huaroc, E., Baust, M., Navab, N., 2016. Deep active contours. arXiv:1607.05074.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., Fei-Fei, L., 2014. ImageNet large scale visual recognition challenge. Int J Comput Vis 115 (3), 1–42.
Sahiner, B., Chan, H.-P., Petrick, N., Wei, D., Helvie, M. A., Adler, D. D., Goodsitt, M. M., 1996. Classification of mass and normal breast tissue: a convolution neural network classifier with spatial domain and texture images. IEEE Trans Med Imaging 15, 598–610.
Samala, R. K., Chan, H.-P., Hadjiiski, L., Cha, K., Helvie, M. A., 2016a. Deep-learning convolution neural network for computer-aided detection of microcalcifications in digital breast tomosynthesis. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97850Y.
Samala, R. K., Chan, H.-P., Hadjiiski, L., Helvie, M. A., Wei, J., Cha, K., 2016b. Mass detection in digital breast tomosynthesis: Deep convolutional neural network with transfer learning from mammography. Medical Physics 43 (12), 6654–6666.
Sarraf, S., Tofighi, G., 2016. Classification of Alzheimer's disease using fMRI data and deep learning convolutional neural networks. arXiv:1603.08631.
Schaumberg, A. J., Rubin, M. A., Fuchs, T. J., 2016. H&E-stained whole slide deep learning predicts SPOP mutation state in prostate cancer. bioRxiv:064279.
Schlegl, T., Waldstein, S. M., Vogl, W.-D., Schmidt-Erfurth, U., Langs, G., 2015. Predicting semantic descriptions from medical images with convolutional neural networks. In: Inf Process Med Imaging. Vol. 9123 of Lect Notes Comput Sci. pp. 437–448.
Sethi, A., Sha, L., Vahadane, A. R., Deaton, R. J., Kumar, N., Macias, V., Gann, P. H., 2016. Empirical comparison of color normalization methods for epithelial-stromal classification in H and E images. J Pathol Inform 7, 17.
Setio, A. A. A., Ciompi, F., Litjens, G., Gerke, P., Jacobs, C., van Riel, S., Wille, M. W., Naqibullah, M., Sanchez, C., van Ginneken, B., 2016. Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks. IEEE Trans Med Imaging 35 (5), 1160–1169.
Sevetlidis, V., Giuffrida, M. V., Tsaftaris, S. A., Jan. 2016. Whole image synthesis using a deep encoder-decoder network. In: Simulation and Synthesis in Medical Imaging. Vol. 9968 of Lect Notes Comput Sci. pp. 127–137.
Shah, A., Conjeti, S., Navab, N., Katouzian, A., 2016. Deeply learnt hashing forests for content based image retrieval in prostate MR images. In: Medical Imaging. Vol. 9784 of Proceedings of the SPIE. p. 978414.
Shakeri, M., Tsogkas, S., Ferrante, E., Lippe, S., Kadoury, S., Paragios, N., Kokkinos, I., 2016. Sub-cortical brain structure segmentation using F-CNNs. In: IEEE Int Symp Biomedical Imaging. pp. 269–272.
Shen, D., Wu, G., Suk, H.-I., Mar. 2017. Deep learning in medical image analysis. Annu Rev Biomed Eng.
Shen, W., Yang, F., Mu, W., Yang, C., Yang, X., Tian, J., 2015a. Automatic localization of vertebrae based on convolutional neural networks. In: Medical Imaging. Vol. 9413 of Proceedings of the SPIE. p. 94132E.
Shen, W., Zhou, M., Yang, F., Dong, D., Yang, C., Zang, Y., Tian, J., 2016. Learning from experts: Developing transferable deep features for patient-level lung cancer prediction. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 124–131.
Shen, W., Zhou, M., Yang, F., Yang, C., Tian, J., 2015b. Multi-scale convolutional neural networks for lung nodule classification. In: Inf Process Med Imaging. Vol. 9123 of Lect Notes Comput Sci. pp. 588–599.
Shi, J., Zheng, X., Li, Y., Zhang, Q., Ying, S., Jan. 2017. Multimodal neuroimaging feature learning with multimodal stacked deep polynomial networks for diagnosis of Alzheimer's disease. IEEE J Biomed Health Inform, in press.
Shin, H.-C., Lu, L., Kim, L., Seff, A., Yao, J., Summers, R. M., 2015. Interleaved text/image deep mining on a very large-scale radiology database. In: Comput Vis Pattern Recognit. pp. 1090–1099.
Shin, H.-C., Orton, M. R., Collins, D. J., Doran, S. J., Leach, M. O., 2013. Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data. IEEE Trans Pattern Anal Mach Intell 35, 1930–1943.
Shin, H.-C., Roberts, K., Lu, L., Demner-Fushman, D., Yao, J., Summers, R. M., 2016a. Learning to read chest x-rays: Recurrent neural cascade model for automated image annotation. arXiv:1603.08486.
Shin, H.-C., Roth, H. R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., Summers, R. M., 2016b. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35 (5), 1285–1298.
Shkolyar, A., Gefen, A., Benayahu, D., Greenspan, H., 2015. Automatic detection of cell divisions (mitosis) in live-imaging microscopy images using convolutional neural networks. In: Conf Proc IEEE Eng Med Biol Soc. pp. 743–746.
Simonovsky, M., Gutiérrez-Becker, B., Mateus, D., Navab, N., Komodakis, N., 2016. A deep metric for multimodal registration. In: Med Image Comput Comput Assist Interv. Vol. 9902 of Lect Notes Comput Sci. pp. 10–18.
Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.
Sirinukunwattana, K., Raza, S. E. A., Tsang, Y.-W., Snead, D. R., Cree, I. A., Rajpoot, N. M., 2016. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Trans Med Imaging 35 (5), 1196–1206.
Smistad, E., Løvstakken, L., 2016. Vessel detection in ultrasound images using deep convolutional neural networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 30–38.
Snoek, J., Larochelle, H., Adams, R. P., 2012. Practical Bayesian optimization of machine learning algorithms. In: Advances in Neural Information Processing Systems. pp. 2951–2959.
Song, Y., Tan, E.-L., Jiang, X., Cheng, J.-Z., Ni, D., Chen, S., Lei, B., Wang, T., Sep 2017. Accurate cervical cell segmentation from overlapping clumps in Pap smear images. IEEE Trans Med Imaging 36, 288–300.
Song, Y., Zhang, L., Chen, S., Ni, D., Lei, B., Wang, T., 2015. Accurate segmentation of cervical cytoplasm and nuclei based on multiscale convolutional network and graph partitioning. IEEE Trans Biomed Eng 62 (10), 2421–2433.
Spampinato, C., Palazzo, S., Giordano, D., Aldinucci, M., Leonardi, R., Feb. 2017. Deep learning for automated skeletal bone age assessment in X-ray images. Med Image Anal 36, 41–51.
Springenberg, J. T., Dosovitskiy, A., Brox, T., Riedmiller, M., 2014. Striving for simplicity: The all convolutional net. arXiv:1412.6806.
Štern, D., Payer, C., Lepetit, V., Urschler, M., 2016. Automated age estimation from hand MRI volumes using deep learning. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 194–202.
Stollenga, M. F., Byeon, W., Liwicki, M., Schmidhuber, J., 2015. Parallel multi-dimensional LSTM, with application to fast biomedical volumetric image segmentation. In: Advances in Neural Information Processing Systems. pp. 2998–3006.
Suk, H.-I., Lee, S.-W., Shen, D., 2014. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage 101, 569–582.
Suk, H.-I., Lee, S.-W., Shen, D., 2015. Latent feature representation with stacked auto-encoder for AD/MCI diagnosis. Brain Struct Funct 220, 841–859.
Suk, H.-I., Shen, D., 2013. Deep learning-based feature representation for AD/MCI classification. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 583–590.
Suk, H.-I., Shen, D., 2016. Deep ensemble sparse regression network for Alzheimer's disease diagnosis. In: Med Image Comput Comput Assist Interv. Vol. 10019 of Lect Notes Comput Sci. pp. 113–121.
Suk, H.-I., Wee, C.-Y., Lee, S.-W., Shen, D., 2016. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. NeuroImage 129, 292–307.
Sun, W., Tseng, T.-L. B., Zhang, J., Qian, W., 2016a. Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data. Comput Med Imaging Graph.
Sun, W., Zheng, B., Qian, W., 2016b. Computer aided lung cancer diagnosis with deep learning algorithms. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97850Z.
Suzani, A., Rasoulian, A., Seitel, A., Fels, S., Rohling, R., Abolmaesumi, P., 2015. Deep learning for automatic localization, identification, and segmentation of vertebral bodies in volumetric MR images. In: Medical Imaging. Vol. 9415 of Proceedings of the SPIE. p. 941514.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., 2014. Going deeper with convolutions. arXiv:1409.4842.
Tachibana, R., Näppi, J. J., Hironaka, T., Kim, S. H., Yoshida, H., 2016. Deep learning for electronic cleansing in dual-energy CT colonography. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97851M.
Tajbakhsh, N., Gotway, M. B., Liang, J., 2015a. Computer-aided pulmonary embolism detection using a novel vessel-aligned multi-planar image representation and convolutional neural networks. In: Med Image Comput Comput Assist Interv. Vol. 9350 of Lect Notes Comput Sci. pp. 62–69.
Tajbakhsh, N., Gurudu, S. R., Liang, J., 2015b. A comprehensive computer-aided polyp detection system for colonoscopy videos. In: Inf Process Med Imaging. Vol. 9123 of Lect Notes Comput Sci. pp. 327–338.
Tajbakhsh, N., Shin, J. Y., Gurudu, S. R., Hurst, R. T., Kendall, C. B., Gotway, M. B., Liang, J., 2016. Convolutional neural networks for medical image analysis: Fine tuning or full training? IEEE Trans Med Imaging 35 (5), 1299–1312.
Tarando, S. R., Fetita, C., Faccinetto, A., Yves, P., 2016. Increasing CAD system efficacy for lung texture analysis using a convolutional network. In: Medical Imaging. Vol. 9785 of Proceedings of the SPIE. p. 97850Q.
Teikari, P., Santos, M., Poon, C., Hynynen, K., 2016. Deep learning convolutional networks for multiphoton microscopy vasculature segmentation. arXiv:1606.02382.
Teramoto, A., Fujita, H., Yamamuro, O., Tamaki, T., 2016. Automated detection of pulmonary nodules in PET/CT images: Ensemble false-positive reduction using a convolutional neural network technique. Med Phys 43, 2821–2827.
Thong, W., Kadoury, S., Piché, N., Pal, C. J., 2016. Convolutional networks for kidney segmentation in contrast-enhanced CT scans. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–6.
Tran, P. V., 2016. A fully convolutional neural network for cardiac segmentation in short-axis MRI. arXiv:1604.00494.
Turkki, R., Linder, N., Kovanen, P. E., Pellinen, T., Lundin, J., 2016. Antibody-supervised deep learning for quantification of tumor-infiltrating immune cells in hematoxylin and eosin stained breast cancer samples. J Pathol Inform 7, 38.
Twinanda, A. P., Shehata, S., Mutter, D., Marescaux, J., de Mathelin, M., Padoy, N., 2017. EndoNet: A deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging 36, 86–97.
van der Burgh, H. K., Schmidt, R., Westeneng, H.-J., de Reus, M. A., van den Berg, L. H., van den Heuvel, M. P., 2017. Deep learning predictions of survival based on MRI in amyotrophic lateral sclerosis. NeuroImage: Clinical 13, 361–369.
van Ginneken, B., Setio, A. A., Jacobs, C., Ciompi, F., 2015. Off-the-shelf convolutional neural network features for pulmonary nodule detection in computed tomography scans. In: IEEE Int Symp Biomedical Imaging. pp. 286–289.
van Grinsven, M. J. J. P., van Ginneken, B., Hoyng, C. B., Theelen, T., Sánchez, C. I., 2016. Fast convolutional neural network training using selective data sampling: Application to hemorrhage detection in color fundus images. IEEE Trans Med Imaging 35 (5), 1273–1284.
van Tulder, G., de Bruijne, M., 2016. Combining generative and discriminative representation learning for lung CT analysis with convolutional Restricted Boltzmann Machines. IEEE Trans Med Imaging 35 (5), 1262–1272.
Veta, M., van Diest, P. J., Pluim, J. P. W., 2016. Cutting out the middleman: measuring nuclear area in histopathology slides without segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 632–639.
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.-A., 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11, 3371–3408.
Vivanti, R., Ephrat, A., Joskowicz, L., Karaaslan, O., Lev-Cohain, N., Sosna, J., 2015. Automatic liver tumor segmentation in follow-up CT studies using convolutional neural networks. In: Proc. Patch-Based Methods in Medical Image Processing Workshop, MICCAI 2015. pp. 54–61.
Wang, C., Elazab, A., Wu, J., Hu, Q., Nov. 2016a. Lung nodule classification using deep feature fusion in chest radiography. Comput Med Imaging Graph.
Wang, C., Yan, X., Smith, M., Kochhar, K., Rubin, M., Warren, S. M., Wrobel, J., Lee, H., 2015. A unified framework for automatic wound segmentation and analysis with deep convolutional neural networks. In: Conf Proc IEEE Eng Med Biol Soc. pp. 2415–2418.
Wang, D., Khosla, A., Gargeya, R., Irshad, H., Beck, A. H., 2016b. Deep learning for identifying metastatic breast cancer. arXiv:1606.05718.
Wang, G., 2016. A perspective on deep imaging. IEEE Access 4, 8914–8924.
Wang, H., Cruz-Roa, A., Basavanhally, A., Gilmore, H., Shih, N., Feldman, M., Tomaszewski, J., Gonzalez, F., Madabhushi, A., 2014. Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features. J Med Imaging 1, 034003.
Wang, J., Ding, H., Azamian, F., Zhou, B., Iribarren, C., Molloi, S., Baldi, P., 2017. Detecting cardiovascular disease from mammograms with deep learning. IEEE Trans Med Imaging.
Wang, J., MacKenzie, J. D., Ramachandran, R., Chen, D. Z., 2016c. A deep learning approach for semantic segmentation in histology tissue images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 176–184.
Wang, S., Yao, J., Xu, Z., Huang, J., 2016d. Subtype cell detection with an accelerated deep convolution neural network. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 640–648.
Wang, X., Lu, L., Shin, H.-C., Kim, L., Nogues, I., Yao, J., Summers, R., 2016e. Unsupervised category discovery via looped deep pseudo-task optimization using a large scale radiology image database. arXiv:1603.07965.
Wolterink, J. M., Leiner, T., de Vos, B. D., van Hamersvelt, R. W., Viergever, M. A., Isgum, I., 2016. Automatic coronary artery calcium scoring in cardiac CT angiography using paired convolutional neural networks. Med Image Anal 34, 123–136.
Worrall, D. E., Wilson, C. M., Brostow, G. J., 2016. Automated retinopathy of prematurity case detection with convolutional neural networks. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 68–76.
Wu, A., Xu, Z., Gao, M., Buty, M., Mollura, D. J., 2016. Deep vessel tracking: A generalized probabilistic approach via deep learning. In: IEEE Int Symp Biomedical Imaging. pp. 1363–1367.
Wu, G., Kim, M., Wang, Q., Gao, Y., Liao, S., Shen, D., 2013. Unsupervised deep feature learning for deformable registration of MR brain images. In: Med Image Comput Comput Assist Interv. Vol. 8150 of Lect Notes Comput Sci. pp. 649–656.
Xie, W., Noble, J. A., Zisserman, A., 2016a. Microscopy cell counting and detection with fully convolutional regression networks. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–10.
Xie, Y., Kong, X., Xing, F., Liu, F., Su, H., Yang, L., 2015a. Deep voting: A robust approach toward nucleus localization in microscopy images. In: Med Image Comput Comput Assist Interv. Vol. 9351 of Lect Notes Comput Sci. pp. 374–382.
Xie, Y., Xing, F., Kong, X., Su, H., Yang, L., 2015b. Beyond classification: Structured regression for robust cell detection using convolutional neural network. In: Med Image Comput Comput Assist Interv. Vol. 9351 of Lect Notes Comput Sci. pp. 358–365.
Xie, Y., Zhang, Z., Sapkota, M., Yang, L., 2016b. Spatial clockwork recurrent neural network for muscle perimysium segmentation. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 185–193.
Xing, F., Xie, Y., Yang, L., 2016. An automatic learning-based framework for robust nucleus segmentation. IEEE Trans Med Imaging 35 (2), 550–566.
Xu, J., Luo, X., Wang, G., Gilmore, H., Madabhushi, A., 2016a. A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing 191, 214–223.
Xu, J., Xiang, L., Liu, Q., Gilmore, H., Wu, J., Tang, J., Madabhushi, A., 2016b. Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Trans Med Imaging 35, 119–130.
Xu, T., Zhang, H., Huang, X., Zhang, S., Metaxas, D. N., 2016c. Multimodal deep learning for cervical dysplasia diagnosis. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 115–123.
Xu, Y., Li, Y., Liu, M., Wang, Y., Lai, M., Chang, E. I.-C., 2016d. Gland instance segmentation by deep multichannel side supervision. arXiv:1607.03222.
Xu, Y., Mo, T., Feng, Q., Zhong, P., Lai, M., Chang, E. I. C., 2014. Deep learning of feature representation with multiple instance learning for medical image analysis. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 1626–1630.
Xu, Z., Huang, J., 2016. Detecting 10,000 cells in one second. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 676–684.
Xue, D.-X., Zhang, R., Feng, H., Wang, Y.-L., 2016. CNN-SVM for microvascular morphological type recognition with data augmentation. J Med Biol Eng 36, 755–764.
Yan, Z., Zhan, Y., Peng, Z., Liao, S., Shinagawa, Y., Zhang, S., Metaxas, D. N., Zhou, X. S., 2016. Multi-instance deep learning: Discover discriminative local anatomies for bodypart recognition. IEEE Trans Med Imaging 35 (5), 1332–1343.
Yang, D., Zhang, S., Yan, Z., Tan, C., Li, K., Metaxas, D., 2015. Automated anatomical landmark detection on distal femur surface using convolutional neural network. In: IEEE Int Symp Biomedical Imaging. pp. 17–21.
Yang, H., Sun, J., Li, H., Wang, L., Xu, Z., 2016a. Deep fusion net for multi-atlas segmentation: Application to cardiac MR images. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 521–528.
Yang, L., Zhang, Y., Guldner, I. H., Zhang, S., Chen, D. Z., 2016b. 3D segmentation of glial cells using fully convolutional networks and k-terminal cut. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. Springer, pp. 658–666.
Yang, W., Chen, Y., Liu, Y., Zhong, L., Qin, G., Lu, Z., Feng, Q., Chen, W., 2016c. Cascade of multi-scale convolutional neural networks for bone suppression of chest radiographs in gradient domain. Med Image Anal 35, 421–433.
Yang, X., Kwitt, R., Niethammer, M., 2016d. Fast predictive image registration. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 48–57.
Yao, J., Wang, S., Zhu, X., Huang, J., 2016. Imaging biomarker discovery for lung cancer survival prediction. In: Med Image Comput Comput Assist Interv. Vol. 9901 of Lect Notes Comput Sci. pp. 649–657.
Yoo, Y., Tang, L. W., Brosch, T., Li, D. K. B., Metz, L., Traboulsee, A., Tam, R., 2016. Deep learning of brain lesion patterns for predicting future disease activity in patients with early symptoms of multiple sclerosis. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 86–94.
Ypsilantis, P.-P., Siddique, M., Sohn, H.-M., Davies, A., Cook, G., Goh, V., Montana, G., 2015. Predicting response to neoadjuvant chemotherapy with PET imaging using convolutional neural networks. PLoS ONE 10 (9), 1–18.
Yu, L., Chen, H., Dou, Q., Qin, J., Heng, P. A., 2016a. Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans Med Imaging, in press.
Yu, L., Guo, Y., Wang, Y., Yu, J., Chen, P., Nov. 2016b. Segmentation of fetal left ventricle in echocardiographic sequences based on dynamic convolutional neural networks. IEEE Trans Biomed Eng, in press.
Yu, L., Yang, X., Chen, H., Qin, J., Heng, P. A., 2017. Volumetric ConvNets with mixed residual connections for automated prostate segmentation from 3D MR images. In: Thirty-First AAAI Conference on Artificial Intelligence.
Zeiler, M. D., Fergus, R., 2014. Visualizing and understanding convolutional networks. In: European Conference on Computer Vision. pp. 818–833.
Zhang, H., Li, L., Qiao, K., Wang, L., Yan, B., Li, L., Hu, G., 2016a. Image prediction for limited-angle tomography via deep learning with convolutional neural network. arXiv:1607.08707.
Zhang, L., Gooya, A., Dong, B., Hu, R., Petersen, S. E., Medrano-Gracia, P., Frangi, A. F., 2016b. Automated quality assessment of cardiac MR images using convolutional neural networks. In: SASHIMI. Vol. 9968 of Lect Notes Comput Sci. pp. 138–145.
Zhang, Q., Xiao, Y., Dai, W., Suo, J., Wang, C., Shi, J., Zheng, H., 2016c. Deep learning based classification of breast tumors with shear-wave elastography. Ultrasonics 72, 150–157.
Zhang, R., Zheng, Y., Mak, T. W. C., Yu, R., Wong, S. H., Lau, J. Y. W., Poon, C. C. Y., Jan. 2017. Automatic detection and classification of colorectal polyps by transferring low-level CNN features from nonmedical domain. IEEE J Biomed Health Inform 21, 41–47.
Zhang, W., Li, R., Deng, H., Wang, L., Lin, W., Ji, S., Shen, D., 2015. Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage 108, 214–224.
Zhao, J., Zhang, M., Zhou, Z., Chu, J., Cao, F., Nov. 2016. Automatic detection and classification of leukocytes using convolutional neural networks. Medical & Biological Engineering & Computing.
Zhao, L., Jia, K., 2016. Multiscale CNNs for brain tumor segmentation and diagnosis. Computational and Mathematical Methods in Medicine 2016, 8356294.
Zheng, Y., Liu, D., Georgescu, B., Nguyen, H., Comaniciu, D., 2015. 3D deep learning for efficient and robust landmark detection in volumetric data. In: Med Image Comput Comput Assist Interv. Vol. 9349 of Lect Notes Comput Sci. pp. 565–572.
Zhou, X., Ito, T., Takayama, R., Wang, S., Hara, T., Fujita, H., 2016. Three-dimensional CT image segmentation by combining 2D fully convolutional network with 3D majority voting. In: DLMIA. Vol. 10008 of Lect Notes Comput Sci. pp. 111–120.
Zhu, Y., Wang, L., Liu, M., Qian, C., Yousuf, A., Oto, A., Shen, D., Jan. 2017. MRI based prostate cancer detection with high-level representation and hierarchical classification. Med Phys, in press.
Zilly, J., Buhmann, J. M., Mahapatra, D., 2017. Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation. Comput Med Imaging Graph 55, 28–41.
Zreik, M., Leiner, T., de Vos, B., van Hamersvelt, R., Viergever, M., Isgum, I., 2016. Automatic segmentation of the left ventricle in cardiac CT angiography using convolutional neural networks. In: IEEE Int Symp Biomedical Imaging. pp. 40–43.