Malaria Journal: Computer Vision For Microscopy Diagnosis of Malaria
Address: 1Applied DSP & VLSI Research Group, University of Westminster, London, UK and 2School of Surveying & Spatial Information Systems,
University of New South Wales, Sydney, Australia
Email: F Boray Tek* - [email protected]; Andrew G Dempster - [email protected]; Izzet Kale - [email protected]
* Corresponding author
Abstract
This paper reviews computer vision and image analysis studies aiming at automated diagnosis or
screening of malaria infection in microscope images of thin blood film smears. Existing works
interpret the diagnosis problem differently or propose partial solutions to the problem. A critique
of these works is furnished. In addition, a general pattern recognition framework to perform
diagnosis, which includes image acquisition, pre-processing, segmentation, and pattern classification
components, is described. The open problems are addressed and a perspective of the future work
for realization of automated microscopy diagnosis of malaria is provided.
Malaria Journal 2009, 8:153 http://www.malariajournal.com/content/8/1/153
methods concerning the problem; 2) describe a general computer vision framework to perform the
diagnosis task; 3) resolve some ambiguities of different perspectives regarding the problem,
and 4) point out some future works for potential research studies.

Microscopy diagnosis is performed by manual visual examination of blood smears. The whole
process requires an ability to differentiate between non-parasitic stained components/bodies
(e.g. red blood cells, white blood cells, platelets, and artefacts) and the malarial parasites
using visual information. If the blood sample is diagnosed as positive (i.e. parasites present)
an additional capability of differentiating species and life-stages (i.e. identification) is
required to specify the infection.

From the computer vision point of view, diagnosis of malaria is a multi-part problem. A complete
system must be equipped with functions to perform: image acquisition, pre-processing,
segmentation (candidate object localization), and classification tasks. Hence, the complete
diagnosis system also requires some functions such as microscope slide positioning, an
automated, fast, and reliable focus, and image acquisition. Some studies concerning image
acquisition are examined in section Image acquisition. Usually, the acquired images from a
microscope have several variations which may affect the process. These are usually addressed by
pre-processing functions which are discussed in section Image variations. An important step in
automated analysis is to obtain/locate possibly infected cells (i.e. candidates) which are the
stained objects in the images. Detection of staining and localization of these objects are
discussed in sections Segmentation and Stained pixels and objects.

In order to perform diagnosis on peripheral blood samples, the system must be capable of
differentiating between malarial parasites, artefacts, and healthy blood components. The
majority of existing malaria-related image analysis studies (e.g. [8-11,14,15,17]) do not
address this requirement. This results in over-simplified solutions, which are not applicable
to diagnosis directly. On the other hand, the few methods which address the differentiation
(e.g. [12,13,16]) have limited experimental results to show that their proposed solutions are
comparable to manual microscopy diagnosis or able to replace it. To this effect the requirements
for proper experimental data and set-up are discussed in section Discussion. In order to set the
scene, a brief introduction about the malarial parasite, its species, and life-cycle stages is
provided in the next section, followed by a short description of microscopy diagnosis.

Malarial parasite
The genus Plasmodium has four species that can cause human infection: falciparum, vivax, ovale,
and malariae. During the life cycle in peripheral blood, the different species may be observable
in the four different life-cycle-stages which are generally morphologically distinguishable:
ring, trophozoite, schizont, gametocyte. The species differ in the changes of the shape of the
infected (occupied) cell, presence of some characteristic dots (Schüffner's dots, Maurer's
clefts, Ziemann's Stippling) and the morphology of the parasite in some of the life-cycle-stages
[3]. The life-cycle-stage of the parasite is defined by its morphology, size (i.e. maturity),
and the presence or absence of malarial pigment (i.e. Haemozoin). Illustrations can be found in
various sources, e.g. [3,18].

Microscopy diagnosis
The WHO practical microscopy guide for malaria provides detailed procedures for laboratory
practitioners [3]. Diagnosis initially requires determining the presence (or absence) of
malarial parasites in the examined specimen. Then, if parasites are present, two more tasks must
be performed: 1) identification of the species and life-cycle stages causing the infection and
2) calculation of the degree of infection, by counting the ratio of parasites vs. healthy
components (i.e. parasitaemia). However, these tasks are not necessarily performed separately
or hierarchically.

Using a microscope, visual detection and identification of the Plasmodium is possible and
efficient via a chemical process called staining. A popular stain, Giemsa, slightly colors red
blood cells (RBCs) but highlights the parasites, white blood cells (WBC), platelets, and various
artefacts (Figure 1). In order to detect the infection it could be sufficient to divide stained
objects into two groups such as parasite/non-parasite and differentiate between them. However,
to specify the infection and to perform a detailed quantification, all four species of
Plasmodium at four life-cycle-stages must be differentiated (Figure 2). Although the term
"artefact" is not very definitive, any stained object that is not a regular blood component or a
parasite is referred to here using this term: these include bacteria, spores, crystallized stain
chemicals, and particles due to dirt [3]. It must be noted that other peripheral blood parasites
and RBC anomalies (e.g. Howell-Jolly bodies, iron deficiency, reticulocytes) are included in
this artefact class definition. They could be examined in individual dedicated classes if their
identification is also required.

A specimen for manual microscopy diagnosis can be prepared (on a glass slide) in two different
forms: 1) a thick blood film enables examination of a larger volume of blood, hence it is more
sensitive in detecting parasites (as low as 50 parasites/μl [19]). However, the thick film
preparation process destroys RBCs and thus makes identification of species difficult. 2) On the
other hand, a thin blood film preserves RBC shapes and parasites and is thus more suitable for
species identification. A common practice in
Methods
There are many different paradigms of computer vision, which can be utilized to build an
automated visual analysis/recognition system. Existing works on malaria commonly use
mathematical morphology for image processing since it suits well to the analysis of blob-like
objects such as blood cells. On the other hand, to differentiate between observed patterns
statistical learning based approaches are very popular. The reader may find in this paper many
technical terms that are used to explain different problems or approaches. Additional file 1
provides a brief definition for some of the image processing related terms (e.g. pixel,
histogram, gradient), mathematical morphological operators (e.g. erosion, dilation, opening,
granulometry), pattern classification concepts (e.g. feature, classifier, and training). More
detailed information can be found in following sources: on mathematical

Figure 1. Examples of stained objects. (a, b) white blood cells, (c, d) platelets, (e)-(h)
artefacts, (i)-(l) P. falciparum ring, trophozoite, gametocyte, schizont, (m, n) P. malariae
ring and schizont, (o, p) P. ovale and P. vivax trophozoites, (q, r) P. vivax ring and
gametocyte, (s) P. vivax ring, (t) extracted stained pixel group (S, green region(s)) and the
stained object (Sb, red region including the green one).
Image acquisition
In [34] the number of images required to capture a 2 cm² region of specimen at 20×
magnification is calculated to be nearly 1,300 images using a 1,300 × 1,030 pixel, 2/3 inch
charge-coupled device (CCD) sensor camera. Diagnosis of malaria requires 100× objective
magnification (recommended for manual examination). Since the imaged field area shrinks with
the square of the magnification ratio (5× here), the number of captured images would be 25
times higher. Hence, it roughly corresponds to over 30,000 slide movements, focus, and CCD
sensor shutter operations, which requires a very fast technique. In order to reduce the time
requirements, Wetzel et al [34] propose to capture the images while the slide is continuously
moving, which introduces the problem of image blurring. They propose to use Xenon strobe lights
instead of conventional lights to solve this problem, which probably raises the cost
substantially.
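
The image count quoted above can be checked with a short calculation. The sketch below is only an illustration: it assumes the same camera and sensor at both magnifications and non-overlapping fields, and the function name `images_needed` is hypothetical.

```python
import math


def images_needed(base_images: int, base_mag: float, target_mag: float) -> int:
    """Scale an image count from one objective magnification to another.

    Assumption: the field of view shrinks linearly with magnification along
    each axis, so the number of fields covering a fixed area grows with the
    square of the magnification ratio.
    """
    ratio = target_mag / base_mag
    return math.ceil(base_images * ratio ** 2)


# Figures taken from the text: about 1,300 images at 20x, diagnosis at 100x.
n = images_needed(1300, 20, 100)
print(n)  # 32500, i.e. "over 30,000" slide movements
```

Under these assumptions the 25-fold increase follows directly from the squared 5× ratio; any overlap between neighbouring fields would raise the count further.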
Image variations
An image acquired from a stained blood sample (thick or thin) using a conventional light
microscope is subject to several conditions which may affect the observed colors of the cells,
plasma (background), and stained objects. These conditions may be due to the microscope
components, such as the color characteristics of the light source, intensity adjustments, or
color filters. They may also be due to the use of different cameras or different settings in
the same camera: exposure, aperture diaphragm, or white balance settings. The differences in
specimen preparation can cause variations as often as the imaging conditions [35]. For example,
acidity (pH) of the stain solution can seriously affect the appearance of the parasites [3].
Addressing these variations can simplify the main analysis and contribute to the robustness of
the system. Beyond the necessity of reducing these variations for the local process, if exchange
of images and training samples could be made possible, then the different diagnosis
laboratories which may employ the system in the future could benefit from uniform diagnosis
expertise.

Figure 3. Examples of Giemsa-stained (a) thin and (b) thick blood film smear images, (c) a
concentrated (thick) field of a thin blood film smear.

background histograms from which two separate threshold values are found. In the final step,
the morphological double threshold operation [28] is employed to obtain a refined binary
foreground mask. However, it was shown in [37] that due to the final global threshold operation
even this method is not immune to uneven illumination, and that the illumination must be
corrected prior to any global (thresholding) operation.
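
As a rough illustration of the double threshold operation mentioned above, the sketch below keeps only those connected regions of a permissive ("wide") threshold mask that contain at least one pixel passing a strict ("narrow") threshold. It is a minimal pure-Python version written for this text, not the implementation used in the cited works; the function name, the 4-connectivity, and the threshold values in the example are assumptions.

```python
from collections import deque


def double_threshold(image, narrow, wide):
    """Reconstruct the `wide` threshold mask from the `narrow` threshold marker.

    image  : list of rows of grey levels
    narrow : (lo, hi) strict range, used as marker (assumed to lie inside `wide`)
    wide   : (lo, hi) permissive range, used as mask
    Returns a boolean mask of the same shape.
    """
    rows, cols = len(image), len(image[0])

    def in_range(value, bounds):
        return bounds[0] <= value <= bounds[1]

    out = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the reconstruction with every pixel inside the narrow range.
    for y in range(rows):
        for x in range(cols):
            if in_range(image[y][x], narrow):
                out[y][x] = True
                queue.append((y, x))
    # Grow the seeds through 4-connected neighbours inside the wide range.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols and not out[ny][nx]
                    and in_range(image[ny][nx], wide)):
                out[ny][nx] = True
                queue.append((ny, nx))
    return out


# Toy image: the left blob contains one strong pixel (90), the right blob does not.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 60, 90, 10, 60, 10],
    [10, 60, 60, 10, 60, 10],
    [10, 10, 10, 10, 10, 10],
]
mask = double_threshold(image, narrow=(80, 255), wide=(50, 255))
print(sum(map(sum, mask)))  # 4: only the left blob survives
```

Because both thresholds are global, a slow illumination gradient across the field shifts which pixels fall inside either range, which is consistent with the remark that illumination should be corrected first.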
Figure 4. Size granulometry vs. area granulometry. (a) negative image of the grey level sickle
cell image, (b) granulometry using disk shaped structuring elements, (c) area granulometry,
(d) area granulometry based cell size estimation varies in different fields of a thin film
although the magnification is constant.
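
To make the idea of an area granulometry and its peak index concrete, the following sketch computes an area pattern spectrum for a binary foreground mask and takes the area with the largest contribution as the "average cell" estimate. This is a deliberate simplification: the methods discussed in this section operate on grey level images through area openings, whereas here each connected component simply contributes its pixel count at its own area. All function names are hypothetical.

```python
import math
from collections import Counter, deque


def component_areas(mask):
    """Areas (pixel counts) of the 4-connected foreground components."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, area = deque([(y, x)]), 0
                while queue:
                    cy, cx = queue.popleft()
                    area += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                areas.append(area)
    return areas


def area_pattern_spectrum(mask):
    """Map each area to the number of pixels an area opening removes at that scale."""
    spectrum = Counter()
    for area in component_areas(mask):
        spectrum[area] += area
    return spectrum


def estimate_cell_size(mask):
    """Return (peak area, pseudo radius) from the peak of the pattern spectrum."""
    spectrum = area_pattern_spectrum(mask)
    peak_area = max(spectrum, key=spectrum.get)
    return peak_area, math.sqrt(peak_area / math.pi)


# Three isolated 2x2 "cells" and one pair of touching cells (a 2x4 blob).
rows = [
    "1101101100",
    "1101101100",
    "0000000000",
    "1111000000",
    "1111000000",
]
mask = [[c == "1" for c in row] for row in rows]
peak_area, pseudo_radius = estimate_cell_size(mask)
print(peak_area)  # 4
```

In this toy field the isolated cells dominate the spectrum, so the peak sits at the single-cell area. If most cells were touching, the peak would move to the merged area instead, which mirrors the variation across fields shown in Figure 4(d).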
turing elements, could not produce an informative result. On the other hand, area granulometry
is more accurate because there is no assumption on the shape of the cells, and it has a
computational advantage because it can be computed within a single pass, independent of the
number of scales [63]. Therefore, it should be preferred to granulometry with fixed shape
structuring elements.

Average cell size estimation
A common practice is to estimate average cell size with the peak index of the granulometry
(which can be an area or radius index). This assumes that the thin blood film image is covered
by resolvable individual RBCs of similar size. However, the RBC size variation in normal blood
and the disorders which cause abnormal RBC sizes are neglected. In addition, the thickness of
the thin film varies through a slide and this results in varying focus depths, which can also
change the calculated average cell area. Figure 4(d) shows the average cell pseudo radius
(estimated by the peak index of area granulometry) distributions that are calculated from
images of 140 different fields of a single thin blood film specimen. The optical magnification
is the same for all the images; however, the estimated average cell size varies remarkably.
This is simply caused by the overlapping cells in the dense image fields and the differences of
the field thickness. Hence, the size estimation that is based only on the area granulometry
peak is not very precise for thin blood images. Existing malaria diagnosis methods concentrate
only on using size or area granulometries. However, the granulometry concept has more potential
to explore, which may be applicable to blood film image analysis. In [59], Breen and Jones
extended the definition of granulometry to be calculated with any set of attribute openings or
non-increasing opening-like operations: thinnings [28]. In [62] Urbach et al proposed an
implementation of shape pattern spectrum which was later extended to the calculation of 2D
granulometries (Shape × Area) in [64] and to the vector granulometries in [65].

A final remark is that it is possible to calculate size distribution in thick blood film images
(Figure 3(b)) using area granulometry. However, it is difficult to use an "average object
concept" for these images because RBCs are destroyed and not observable. Furthermore, it is not
guaranteed that the observed field will contain any well-defined structure, e.g. WBCs,
platelets.

Segmentation
Probably one of the most common shared tasks in image analysis systems is segmentation.
Segmentation aims to partition the image plane into meaningful regions. The definition of the
meaningful regions and partitioning method is usually application specific. For example, the
methods can be aimed at separating foreground-background, moving-still regions or objects with
specific properties from the scene. The segmentation strategy can be a hierarchical
partitioning that operates deductively to define first a higher level of object plane, then the
objects, and then sub-object components. The inductive approaches define first the objects of
interest with a specific property, then perform higher levels of partitioning(s) if necessary.
In order to localize highlighted (stained) objects, either inductive or deductive segmentation
approaches can be followed. In some studies [10,16,37] first the stained objects were
identified by their intensity and color properties; then only the RBC regions containing the
stained objects were segmented from the image. On the other hand, in some studies, e.g. [8], a
deductive strategy was followed: the image was first separated into foreground and background
regions; then foreground regions were segmented to obtain individual RBC regions; then these
were further analyzed to detect the presence of staining. The global segmentation procedure is
usually applied if a deductive approach is proposed.

The common problem associated with the segmentation of thin blood film images is the
under/over-segmentation of the cells. Under-segmentation, i.e. including two or more cells in
one region, is usually caused by unresolvable cell boundaries of contacting or overlapped blood
cells. On the other hand, over-segmentation, i.e. dividing a single cell into more than one
region, can be related to heterogeneity of the cell region or incorrect assumptions of the cell
size. Several techniques have been proposed to prevent under/over-segmentations in thin blood
film images: morphological gradient [53]; morphological area closing and distance transform
[66]; area top-hats [8]; Bayesian color segmentation and watershed segmentation [67]; minimum
area watershed transform and the circle Radon transformation [56]; template (ellipse) matching
[14]; multi-dimensional Otsu thresholding [68], and clump splitting [69].

Unfortunately, none of these methods are applicable to highly concentrated fields of thin blood
film images (e.g. overlapping cells, see Figure 3(c)). Hence, either the whole analysis must be
constrained to process only the "segmentable" (i.e. lightly concentrated) fields or global
segmentation must be totally avoided. It is possible to evaluate the image's (field's) cell
concentration or density using the granulometry-based method in [54]; however, eliminating some
fields from the analysis may degrade the sensitivity of the overall system.

A global segmentation approach can be replaced with localized analysis which is discussed in
the following section. However, another purpose of the segmentation is to count individual
RBCs, especially for the quantification of infection, i.e. parasitaemia calculation. It may be
possible to estimate the RBC count without performing a perfect segmentation of the image [54].
Alternatively, parasitaemia can be calculated with respect to WBC count rather than RBC's [3]
if they can be identified.

Stained pixels and objects
The staining process highlights the parasites, platelets, WBCs, and artefacts in a thin blood
(peripheral) film image. In order to analyze the highlighted bodies it is essential to identify
the pixels and thence locate the object regions. However, it must be noted that other blood
parasites [70] and some disorders of blood, e.g. iron deficiency, are also highlighted by the
Giemsa-stain.

Some methods in the literature name and describe this step as "Parasite Detection" (or parasite
extraction). This results in over-simplistic solutions which are not applicable to diagnosis of
malaria, because diagnosis must be performed on actual peripheral blood specimens of the
patients which are certain to contain other stained bodies (WBCs, platelets and artefacts) and
may be infected by other parasites or may have other disorders (e.g. iron deficiency). This may
be related to the use of in vitro samples as the experimental data. Usually in vitro culture
images consist of samples grown in a laboratory environment. Hence, they are cleaner of
artefacts and do not contain platelets or WBCs. In [13,16,37] the necessity of differentiating
parasites and other stained bodies was addressed. Therefore, since it defines the process more
clearly, the term "stained objects" is used; and different methods to find and extract them are
discussed here (instead of "parasite detection"). A simple example of the stained pixels and
stained object relationship is shown in Figure 1(s–t).

Di Ruberto et al [10] employed morphological regional extrema [28] to detect (i.e. mark) the
stained pixels, then used morphological opening to extract the object regions marked by these
pixels. However, they identified the WBCs, platelets, and schizonts by comparing their size to
the average cell size obtained from granulometry and excluded these from further processing.
Hence, their method can be regarded as addressing the detection issue. However, detection of
stained pixels with regional extrema is error prone because it will locate some pixels even if
the image does not contain any stained pixels. Moreover, eliminating WBCs and platelets with
respect to
the average area value can eliminate some parasite species which enlarge the RBCs that they
occupy. For example, Plasmodium vivax infected cells can enlarge up to 2.5 times [3]. Ross et al
[16] used a similar approach: they used a two level thresholding (global and local) to locate
stained pixels, then used morphological opening to recover the object binary masks. Both of the
methods rely on opening and disk shaped structuring elements, which creates problems because
the cells are rarely perfect and flat circles.

Rao et al [8] used thresholding to detect stained pixels; however, they pre-processed the
images to remove a global bias color value that is caused by staining, in order to prevent
false pixel detections if the image does not contain any stained pixels. Since they use global
segmentation to locate individual RBCs, the stained objects are defined by the regions which
contain stained pixels. As stated in the previous section, global segmentation is error prone,
unless examined fields are limited to the lightly concentrated fields. In addition, it must be
noted that employing a thresholding operation to detect stained pixels assumes an ordered
relation between stained and un-stained pixels, e.g. "stained pixels are darker than others".

The authors [13] proposed to detect stained pixels according to their likelihood, where a
pixel's red-green-blue color triple was used as the features and the stained and un-stained
classes were modelled using 3-d histograms. This removes the limitation of the "stained pixels
are darker/brighter" definition. Using the detected stained pixels as markers, they located the
objects by using morphological area top-hats and reconstruction [28]. This approach prevented
over-segmenting of stained bodies, which could be caused by employing global segmentation based
on area heuristics.

Detection of stained pixels is not a very complex problem, especially with the use of color
correction algorithms. However, as pointed out in [37], one of the biggest problems of thin
blood film analysis is to locate the stained objects and define their boundaries, because the
stained pixels which are used as markers may be due to a variety of objects, e.g. to an
artefact which can be any size or shape [3]. In addition, even for the defined blood
components, the process is error prone because the boundaries are not always resolvable,
especially in highly concentrated image fields (Figure 3(c)).

One alternative, which may be worth an investigation, is to locate the stained pixels by some
method and, avoiding object localization, to use directly a sliding-window approach on these
regions to produce queries to a classifier. The sliding-window approach, usually in multi-scale,
is used successfully in many general pattern recognition applications, e.g. face detection
[71]. The size of the sliding window can be determined with respect to the physical scale
information; alternatively, area granulometry based average cell size estimation can be
utilized. In addition, the sliding-window approach may be a generalized solution to both thin
and thick film analysis problems. Thick blood films do not have resolvable RBCs. Hence, the
detected stained pixels are isolated and do not allow further defining operations based on
shape and size assumptions. Moreover, it may be more practical to adapt this method for
detection of other blood borne parasites and disorders, which may be of any size or shape.

Classification
There are only a few studies which propose a classification procedure [13,16,37] to
differentiate between parasites and other stained components or artefacts. The method described
in [14] also proposes a classification to differentiate between a healthy RBC and an "infected"
RBC. However, from the diagnosis point of view the essential task is to identify parasites in
the presence of other stained structures and artefacts, and then finally identify the species.
As in Di Ruberto's [10], the approach to the classification task in a recent work was also
limited to detection of white blood cells and gametocytes by area information, for the purpose
of excluding these from parasitaemia calculation [72].

However, although they do not address the parasite/non-parasite differentiation, some automated
diagnosis of malaria studies rather focused on the life-cycle stage classification. Di Ruberto
et al [10] proposed to use the criteria of circularity (measured by the number of morphological
skeleton endpoints [28]) and color histogram to classify the life-stages into two categories:
immature and mature trophozoites. Their test set contained 12 images. Rao et al [8] proposed a
rule-based scheme (area and haemozoin existence) to differentiate five life-stages. They
experimented on a set of Plasmodium falciparum in vitro samples which contain immature-mature
trophozoite, early-mature schizont but no gametocyte class or other types of stained object.

Ross et al [16] proposed a consecutive (detection-species recognition) two-stage classification
for the problem. They proposed to use two different sets of features for parasite detection and
species recognition. The initial feature sets were comprised of many color- and geometry-based
features. For example, they used average intensity, peak intensity, skewness, kurtosis and
similar abstract calculations from the red green blue channels together with the same
calculations from the hue-saturation-intensity channel images. For geometrical features, they
identified roundness ratio, bending energy, and size information, i.e. area, in their feature
set. For parasite detection and following species recognition tasks, the initial feature sets
were comprised of 75 and 117 features, respectively.
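
A few of the color and geometry features named above can be sketched as follows. The definitions used here (population moments for skewness and kurtosis, and 4πA/P² for roundness) are common textbook forms and are an assumption; the exact formulas used in [16] are not given in this text, and the function names are hypothetical.

```python
import math


def intensity_features(values):
    """Mean, skewness and kurtosis of the pixel intensities of one channel."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    return {
        "mean": mean,
        "skewness": m3 / std ** 3 if std else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 else 0.0,
    }


def roundness(area, perimeter):
    """Roundness ratio 4*pi*A / P^2; equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2


# One channel of a hypothetical stained object, plus its area and perimeter.
features = intensity_features([1, 2, 3, 4, 5])
features["roundness"] = roundness(area=math.pi * 2 ** 2, perimeter=2 * math.pi * 2)
print(features)
```

Repeating such calculations over several color channels and color spaces is one way a feature vector can reach the sizes quoted above.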
Using principal component analysis [32] they reduced the number of features to 37 and 38, for
detection and species recognition respectively. They trained a two level Back Propagation
Neural Network for parasite detection and species recognition. The results for detection were
reported as: sensitivity (SE) of 85.1% with a positive prediction value (PPV) of 80.8%. The
specificity value or false detection rate was not reported. See additional file 2 for
descriptions of these measures. For the species recognition task the SE-PPV results were: P.
falciparum 57%–81%, P. vivax 64%–54%, P. ovale 85%–56%, P. malariae 29%–28%. The life-stage
recognition problem was not investigated. Their experiments used a training set comprised of
350 images containing 950 objects and a similar test set.

The authors proposed a KNN based parasite detection scheme in [13]. Later, the study was
extended for a combined analysis of detection, species, and life-stage recognition [37]. They
studied more generic features such as indexed color image histogram, correlogram [73], Hu
moments [74], and localized area granulometry (3), and proposed a concatenated feature vector.
The results for detection were SE:72.1%, PPV:85.1%, specificity SP:97.45%, and negative
prediction value NPV:94.52%. However, to reduce the effects of class imbalance on the results
they proposed to use a biased KNN classifier [75]. Using receiver operating characteristics
(ROC) analysis [76] they showed that an adjustable sensitivity-specificity detection
performance can be provided. The adjustable scheme is valuable because the methods [37] (also
as in [16]) report their results based on per-object accuracies rather than the per-specimen
accuracy which would be expected of a medical diagnostic test. This difference is discussed in
more detail in the Discussion section. For an expert microscopist, the tasks of parasite
detection, life-stage, and species recognition are not necessarily hierarchical, sequential,
or independent. The diagnosis expert can perform all these tasks in a single classification or
sequentially or even partially, depending on the discriminative information that exists in the
observed object. For example, the expert can recognize a P. falciparum ring-stage parasite
directly; or recognize a ring stage parasite and then seek more discriminative parasites to
decide its species category. Moreover, it may not be required to determine the life-stage of
every single parasite because a thorough examination of the whole slide can reveal the most
frequent life-stage and the condition of a single/mixed species infection. However, in manual
practice, parasitaemia calculation requires counting of parasites that are not gametocytes.
Therefore, if one considers the diagnosis of a whole slide, the detection, species, and
life-stage recognition tasks can be regarded as contextual. It may be possible to incorporate
the contextual information into the classification for malaria diagnosis, as proposed in [77]
for WBC disorder detection.

For species recognition and life-stage recognition, the authors [37] followed a different
approach and compared joint and separate classification schemes, which concluded that parasite
detection can also be performed with a joint classification (20 class all or 16 class parasite
only classes) instead of a separate two-step classification scheme (e.g. binary detection
followed by four class species recognition). In the 20- and 16-class classification schemes,
species and life-stage recognition results were comparable to manual microscopy [4,78].
However, it must be noted again that these results were based on a single observed object, not
a whole specimen which may have thousands of objects. 630 images containing 4,000 objects were
used in hold-out evaluation for detection and leave-one-out evaluation for species and
life-stage recognition experiments.

Nevertheless, the joint classification scheme, removing the necessity for a binary detection
(parasite/non-parasites classification), may improve the expandability and scalability of a
diagnosis system by preventing a narrow reference to "parasite" and "non-parasite" classes. For
example, if restricted to perform a binary detection, a malaria diagnosis system will have a
different notion of "parasites" than a diagnosis system for Babesiosis or Trypanosomiasis which
are examples of other peripheral blood parasites [70]. However, a multi-class joint
classification scheme will treat each species and life-stage as separate and allow other
parasites or conditions to be handled by the system. This should be supported by the use of
generalized features instead of the optimized features.

Discussion
Imaging
In order to be feasible for mass screening or diagnosis, a computerized inspection system must
be provided with automatic slide positioning and image capture facilities. In performing
diagnosis of a single sample, the slide must be re-positioned at least 100 times, focused, and
captured. The system would be highly impractical if manual assistance was required. Some of the
state-of-the-art microscopes that are located in well-equipped laboratories already can provide
these functionalities. However, one of the general aims of malaria diagnosis research should be
to produce a cost-effective diagnosis method which can be used especially in the economically
weaker areas where malaria is endemic and causing a serious number of deaths. A possible
solution to this problem can be the dedicated slide scanning boxes which already have some
examples in the market. A customized slide scanning system does not require many of the
general-purpose functions of a microscope, but it does require a highly sophisticated automated
focus technology, and hence at the moment existing products are far from meeting the criteria
of being low cost. However, for some other applications which
can be performed semi-automatically, such as training, research, or tele-diagnosis, these requirements may be easier to meet. Any optical compound microscope with a consumer digital camera can be used as an imaging system to acquire blood smear images [37].

Stained object detection and localization
Existing methods based on segmentation are not applicable to all fields of a blood slide (they are negatively affected by relative thickness). Hence, either only the lightly concentrated fields should be processed or this strategy should be altered. The alternative stained object extraction methods based on local morphological processing are mainly heuristic and are partial solutions. In addition, they are highly specific to malarial parasites. A possible unified solution could be an adaptation of the sliding-window-based approach which has been successfully used in other general object detection problems [71,79]. This can be a generalized solution which is applicable to thin and thick blood films and to the detection of other blood parasites. In this case, a multi-scale search can be performed [79,80]. Otherwise, a difficult problem arises: how to determine the search window size. Area granulometry can be used for this purpose in thin blood films; however, as shown, it may not be very precise in detecting scale. Nevertheless, some works which concern granulometries of other (and joint) attributes [63,64] can guide future research efforts to improve scale determination in thin blood films.

Sample independence
The choice of the testing procedure and error measures can significantly affect the results [81]. In practical pattern recognition, it is common practice to estimate the accuracy of the overall procedure by performing tests on a concrete set of samples. The factors affecting the results are complex and unfortunately the true error rate is unknown. Among several commonly used testing procedures in pattern recognition, the hold-out and leave-one-out evaluations ensure the independence of the training and testing samples, and thus may indicate how well a method generalizes in real applications [32]. In the hold-out evaluation, the data is randomly separated into two sets; training and optimization are performed on one set and the generalization performance is tested on the other. In the leave-one-out evaluation, in order to test each sample of a set of N samples, the remaining N - 1 samples are used for training. The procedure is repeated N times, once for each sample in the set. Leave-one-out is the limiting case of m-fold evaluation in which m = N. A detailed discussion of the hold-out and leave-one-out evaluation methods can be found in [81].

However, the sample independence should be examined more carefully in order to discuss the capabilities of a potential system. Specific to diagnosis, it is possible to define different levels of sample independence [37]. Ideally, a system should be tested with different samples which are obtained from (1) different images, (2) different specimens (blood films), and (3) different imaging sources (e.g. laboratories, hospitals) to simulate the diagnostic generalization capability. In addition, the test set should be allowed to contain completely healthy specimens (negatives) and specimens of other conditions, e.g. iron deficiency, WBC disorders, or other parasites. Even with a sufficient degree of independence between training and test samples, one should be careful to avoid repetitive tuning and optimization on a fixed sample set [82].

Per-object vs. per-specimen results
The average sensitivity threshold for manual microscopy (by an expert microscopist, using a good microscope in good working order) of thick blood film examination is reported as 50 parasites/μl of blood [19], whereas thin blood films are reported to be 1/11 as sensitive [83]. Let us assume the thin blood film sensitivity of expert microscopy to be 500 parasites/μl. This corresponds to a parasitaemia of 0.01%, based on the fact that an average blood sample contains 5 million RBCs per 1 μl. The expert sensitivity threshold (500 parasites/μl) is based on the assumption that an expert microscopist works with 100% per-object sensitivity (i.e. the expert is assumed to always recognize the parasite correctly); however, he/she can examine only a limited number of fields in a limited time. For example, if the microscopist examines 100 fields with an average of 200 RBCs each, he/she would be able to see only 20,000 RBCs, which gives an 86.52% probability of seeing an infected RBC in a specimen of 500 parasites/μl. To ensure a higher probability (e.g. P > 0.999) of observing an infected RBC, the expert should observe at least 45,837 RBCs (i.e. 229 fields). From the same perspective, if the per-object sensitivity of an automated parasite detector (e.g. [37]) is ~72.37%, at least six parasites must be observed to ensure that at least one parasite is detected, which can be found in 45,889 (P > 0.999) RBCs if the specimen has 500 parasites/μl. On the other hand, the specificity value (~97.45%, [37]) of the same detector would produce ~114 false positives (detections) where 45,889 RBCs are observed. Therefore, the per-object performances may need to be higher to be comparable to manual microscopy. Another interpretation is that a classifier of sensitivity 72.32% and specificity ~97.45% would be limited to operating only at the higher parasitaemia levels.

It should be noted that routine diagnosis is not performed on thin blood films. A study of routine malaria diagnosis in the UK showed that the average detection sensitivity for microscopy was around 500 parasites/μl [78]. This would correspond to 5,000 parasites/μl in thin blood films if the 1/11 sensitivity ratio given in [83] is considered. It should also be considered that although it
was assumed that the expert microscopist works with 100% per-object sensitivity, a recent study shows that the agreement rate among even reputed expert microscopists is not 100% and is negatively affected by lower parasitaemia levels [84].

Nevertheless, a large-scale test which contains many specimens (positive-negative, mixed, other parasites, other blood disorders) can provide a useful evaluation of the diagnostic tests. In practice, the evaluation and sample independence requirements for proving the practical clinical value of a medical diagnostic test are much higher [85].

Thick film analysis
Within the scope of this survey, only a preliminary study [86] on thick blood film analysis was found in the literature. The thick film examination sensitivity in microscopy diagnosis of malaria is higher than for thin films: 50 parasites/μl [19]. However, species recognition is more difficult due to destroyed RBCs and deformed parasites. Hence, thin blood film examination is used for this task. If detection in thin blood films could be made fast enough, it could screen more fields and reach a higher detection sensitivity, matching that of thick films. For example, hypothetically, processing 500 fields (each including an average of 200 RBCs) instead of 50 (recommended for manual microscopy) can reduce the detection sensitivity threshold by a factor of 10. This, which can be empirically tested by a large-scale test, can show that thick blood film analysis may not be essential and eventually remove the necessity to prepare and examine a different blood film. On the other hand, it would be a great improvement to microscopy diagnosis of malaria if the same processing speed could be achieved in thick film analysis, so that the sensitivity threshold could be reduced to a level below the expert microscopist's performance (i.e. ~50 parasites/μl [19]).

Conclusion
This paper provides a good basis for researchers who are starting to investigate automated blood film analysis for diagnosis or screening of malaria or similar blood-borne infectious diseases. In this paper, a review and critique of computer vision and image analysis studies which address the automated diagnosis of malaria on thin blood film smears and its necessary auxiliary functions is provided. The computerized diagnosis of malaria is addressed at system level; its practicality is discussed by pointing at the issues of imaging, and its interoperability is emphasized by addressing variations which can be caused by different imaging set-ups or differences in specimen preparations. A system would benefit from the capability of processing images from external sources or allowing exchange of images and learned parasite models; at the same time, its functions may be calibrated for the imaging equipment that it operates on. An open problem is the automated analysis of thick films, which, despite being more sensitive in detection, has not been investigated from a computer vision perspective. In the existing thin film analysis literature there are some works which propose oversimplified solutions that are not applicable to diagnosis. For the studies which are methodically applicable to diagnosis, some limitations arise from relying on global segmentation of the image. An alternative sliding-window-based detection approach [79] (avoiding segmentation) could be a generalized and possibly better solution to the problem and may be applicable to both thin and thick film analysis.

In addition, the difference between the per-object and per-specimen detection results is emphasized. The evaluations which are currently based on per-object results are not necessarily meaningful from the clinical perspective. The existing and prospective methods must be evaluated on large-scale specimen sets, and results should be reported based on per-specimen (e.g. per-film) sensitivity and false positive detection rates. Moreover, sample independence of the experimental data must be taken into account, and the test data should include a variety of peripheral blood samples, including both negative and positive specimens with different levels of parasitaemia, preferably acquired using different imaging sources. Finally, future work should also consider expandability and thus the applicability to other blood-borne parasites and disorders.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
FBT structured the review and wrote the paper. AGD and IK reviewed the submitted manuscript and contributed to the writing of the paper. All authors read and approved the final version.

Additional material

Additional file 1
Description of image analysis and pattern recognition terms. This file provides brief explanations of image processing and pattern recognition related terms.
[http://www.biomedcentral.com/content/supplementary/1475-2875-8-153-S1.pdf]

Additional file 2
Description of the performance measures. This file provides explanations of the accuracy related terms.
[http://www.biomedcentral.com/content/supplementary/1475-2875-8-153-S2.pdf]
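As an illustration of the multi-scale sliding-window search suggested in the "Stained object detection and localization" section, the following minimal Python sketch enumerates square search windows over a field image at several scales and passes each patch to a per-window classifier. It is not the method of [71,79]: the function names, the base window size, the scale step, the stride, and the classify callback are all assumptions made here for illustration, and a practical detector would add a trained classifier and a step that merges overlapping detections.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int]  # (x, y, size) of a square search window


def sliding_windows(width: int, height: int, base: int = 24,
                    scale_step: float = 1.25,
                    stride_frac: float = 0.5) -> Iterator[Window]:
    """Yield square windows covering a width x height image at several scales.

    base, scale_step and stride_frac are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if scale_step <= 1.0:
        raise ValueError("scale_step must be greater than 1")
    size = float(base)
    while int(size) <= min(width, height):
        s = int(size)
        stride = max(1, int(s * stride_frac))
        for y in range(0, height - s + 1, stride):
            for x in range(0, width - s + 1, stride):
                yield (x, y, s)
        size *= scale_step  # move on to the next, larger scale


def detect(image: Sequence[Sequence[float]],
           classify: Callable[[List[Sequence[float]]], bool],
           **window_args) -> List[Window]:
    """Keep every window whose patch the (hypothetical) classifier accepts."""
    height, width = len(image), len(image[0])
    hits = []
    for x, y, s in sliding_windows(width, height, **window_args):
        patch = [row[x:x + s] for row in image[y:y + s]]
        if classify(patch):
            hits.append((x, y, s))
    return hits
```

Because the window size is swept over a range of scales, no separate scale estimate (for example from area granulometry) is needed in this sketch; the cost is that the classifier is evaluated many more times.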
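The evaluation procedures described in the "Sample independence" section can be sketched in a few lines of Python. The sketch below shows a random hold-out split, leave-one-out, and a grouped variant in which all samples from one specimen (or one imaging source) are held out together, which corresponds to the stronger levels of independence discussed there. The function names and the idea of passing a list of group labels are assumptions made for illustration; they are not taken from [37] or [81].

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def hold_out(n: int, test_fraction: float = 0.3,
             seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split sample indices 0..n-1 into (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(n * test_fraction)))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def leave_one_out(n: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield N folds; each fold tests one sample and trains on the other N - 1."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def leave_one_group_out(
        groups: Sequence[Hashable]
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Hold out all samples of one group (e.g. one specimen) at a time.

    groups[i] is the specimen or imaging-source label of sample i, so no
    specimen contributes objects to both the training and the test set.
    """
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    for g, test in members.items():
        train = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train, test
```

Splitting by specimen rather than by object matters when many objects are cropped from the same blood film: a purely random split would place near-identical objects on both sides and could overstate generalization.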
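The arithmetic in the "Per-object vs. per-specimen results" section can be checked with a short Python sketch. The paper does not state the probability model behind its figures, so the model below is an assumption: 1 μl of blood holds 5 million RBCs, and each infected RBC independently falls among the examined cells with probability equal to the fraction of cells examined. The function names are hypothetical. Under this assumption the sketch gives approximately the quoted 86.52% for 20,000 examined RBCs at 500 parasites/μl, and six as the number of parasites a detector with ~72.37% per-object sensitivity must encounter to detect at least one with P > 0.999.

```python
def p_see_infected(cells_examined: int, parasites_per_ul: float,
                   rbc_per_ul: float = 5_000_000) -> float:
    """Probability that at least one infected RBC is among the examined cells.

    Assumed model (not stated in the paper): each infected RBC in 1 ul is
    included in the examined sample independently with probability
    cells_examined / rbc_per_ul.
    """
    frac = cells_examined / rbc_per_ul
    return 1.0 - (1.0 - frac) ** parasites_per_ul


def min_parasites_for_detection(sensitivity: float,
                                target: float = 0.999) -> int:
    """Smallest k such that 1 - (1 - sensitivity)**k exceeds target."""
    if not 0.0 < sensitivity <= 1.0 or not 0.0 < target < 1.0:
        raise ValueError("sensitivity must be in (0, 1], target in (0, 1)")
    k = 1
    while 1.0 - (1.0 - sensitivity) ** k <= target:
        k += 1
    return k


# 100 fields of about 200 RBCs each, specimen with 500 parasites/ul
print(round(p_see_infected(100 * 200, 500), 4))  # 0.8652
# Detector with ~72.37% per-object sensitivity
print(min_parasites_for_detection(0.7237))       # 6
```

The two functions separate the two effects the section discusses: how many cells must be examined before a parasite is likely to be in view at all, and how many parasites must be in view before an imperfect detector is likely to flag one.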
49. Maragos P: Pattern spectrum and multiscale shape representation. IEEE Trans Pattern Anal Mach Intell 1989, 2:701-716.
50. Vincent L: Granulometries and opening trees. Fundamenta Informaticae 2000, 41:57-90.
51. Di Ruberto C, Dempster AG, Khan S, Jarra B: Automatic thresholding of infected blood images using granulometry and regional extrema. In Proc Int Conf on Pattern Recognit Barcelona, Spain; 2000.
52. Dempster AG, Di Ruberto C: Using granulometries in processing images of malarial blood. Proc ISCAS, Sydney 2001.
53. Di Ruberto C, Dempster A, Khan S, Jarra B: Segmentation of blood images using morphological operators. In Proc Int Conf Pattern Recognit Barcelona, Spain; 2000.
54. Angulo J, Flandrin G: Automated detection of working area of peripheral blood smears using mathematical morphology. Anal Cell Pathol 2003, 25:37-49.
55. Rao KNRM, Dempster A: Use of area-closing to improve granulometry performance. In Proc Int Symp on Video/Image Process and Multimed Commun Zadar, Croatia; 2002.
56. Tek FB, Dempster AG, Kale I: Blood cell segmentation using minimum area watershed and circle radon transformations. Proc Int Symp on Math Morphol, Paris, France 2005.
57. Vincent L: Morphological area openings and closings for grey-scale images. In Proc NATO Shape in Picture Workshop Driebergen, The Netherlands; 1992.
58. Tek FB, Dempster AG, Kale I: Noise sensitivity of watershed segmentation for different connectivity: experimental study. IEE Short Lett 2004, 40:1332-1333.
59. Breen E, Jones R: Attribute openings, thinnings, and granulometries. Comput Vis Image Underst 1996, 64:377-389.
60. Meijster A, Wilkinson M: Fast computation of morphological area pattern spectra. In Proc Int Conf on Image Process Thessaloniki, Greece; 2001.
61. Salembier P, Oliveras A, Garrido L: Anti-extensive connected operators for image and sequence processing. IEEE Trans Image Process 1998, 7:555-570.
62. Urbach ER, Wilkinson MHF: Shape-only granulometries and grey-scale shape filters. In Proc Int Symp on Math Morphol New South Wales, Australia; 2002.
63. Urbach E, Roerdink J, Wilkinson M: Connected shape-size pattern spectra for rotation and scale-invariant classification of gray-scale images. IEEE Trans Pattern Anal Mach Intell 2007, 29:272-285.
64. Urbach ER, Roerdink JBTM, Wilkinson MHF: Connected rotation-invariant size-shape granulometries. Proc Int Conf on Pattern Recognit, Cambridge, UK 2004.
65. Urbach ER, Boersma NJ, Wilkinson MHF: Vector-attribute filters. Proc Int Symp on Math Morphol, Paris, France 2005.
66. Rao KNRM, Dempster A: Modification on distance transform to avoid over-segmentation and under-segmentation. In Proc Int Symp on Video/Image Process and Multimed Commun Zadar, Croatia; 2002.
67. Cosio A, Flores FM, Castaneda JAP, Solano MA, Tato S: Automatic counting of immunocytochemically stained cells. In Proc 25th Ann Int Conf IEEE EMBS Cancun, Mexico; 2003.
68. Buxton BF, Abdallahi H, Fernandez-Reyes D, Jaffa W: Development of an extension of the otsu algorithm for multidimensional image segmentation of thin-film blood slides. Proc Int Conf on Comput: Theory and Appl, Calcutta, India 2007.
69. Diaz G, Gonzalez F, Romero E: Automatic clump splitting for cell quantification in microscopical images. In Proc Progress in Pattern Recognit Image Anal and Appl, LNCS Germany: Springer-Verlag; 2007.
70. Garcia L: Diagnostic Medical Parasitology 4th edition. Herndon, USA: ASM Press; 2001.
71. Viola P, Jones M: Rapid object detection using a boosted cascade of simple features. Proc IEEE Conf on Comput Vis and Pattern Recognit 2001.
72. Le MT, Bretschneider TR, Kuss C, Preiser PR: A novel semi-automatic image processing approach to determine Plasmodium falciparum parasitemia in Giemsa-stained thin blood smears. BMC Cell Biol 2008, 9:15.
73. Huang J, Kumar S, Mitra M, Zhu WJ, Zabih R: Spatial color indexing and applications. Int J Comput Vis 1999, 35:245-268.
74. Hu MK: Visual pattern recognition by moment invariants. IEEE Trans Inf Theory 1962, 8:179-187.
75. Maloof MA, Langley P, Binford TO, Nevatia R, Sage S: Improved rooftop detection in aerial images with machine learning. Machine Learning 2003, 53:157-191.
76. Flach P: The geometry of ROC space: understanding machine learning metrics through ROC isometrics. Proc Int Conf on Mach Learn, Washington DC, USA 2003.
77. Song XY, Abu-Mostafa JS, Kasdan H: Incorporating contextual information in white blood cell identification. Adv Neural Inf Process Syst 1997, 10:950-956.
78. Milne LM, Kyi MS, Chiodini PL, Warhurst DC: Accuracy of routine laboratory diagnosis of malaria in the United Kingdom. J Clin Pathol 1994, 47:740-742.
79. Viola P, Jones MJ: Robust real-time face detection. Int J Comput Vis 2004, 57:137-154.
80. Rowley HA, Baluja S, Kanade T: Neural network-based face detection. IEEE Trans Pattern Anal Mach Intell 1998, 20:23-38.
81. Fukunaga K, Hayes R: Estimation of classifier performance. IEEE Trans Pattern Anal Mach Intell 1989, 11:1087-1101.
82. Salzberg SL: On comparing classifiers: pitfalls to avoid and a recommended approach. Data Min and Knowl Discov 2004, 1:317-328.
83. Warhurst DC, Williams JE: Laboratory diagnosis of malaria. J Clin Pathol 1996, 49:533-538.
84. Maguire JD, Lederman ER, Barcus MJ, O'Meara WAP, Jordon RG, Duong S, Muth S, Sismadi P, Bangs MJ, RoyPrescot W, Baird JK, Wongsrichanalai C: Production and validation of durable, high quality standardized malaria microscopy slides for teaching, testing and quality assurance during an era of declining diagnostic proficiency. Malar J 2006, 5:92.
85. Bell D, Peeling RW: Evaluation of rapid diagnostic tests: malaria. Nat Rev Microbiol 2006, 4(9 Suppl):34-38.
86. Toha SF, Ngah U: Computer aided medical diagnosis for the identification of malaria parasites. Int Conf on Signal Process Commun Netw, Chennai, India 2007.