Hudson 1994
Abstract: We define ordered subset processing for standard algorithms (such as Expectation Maximization, EM) for image restoration from projections. Ordered subsets methods group projection data into an ordered sequence of subsets (or blocks). An iteration of ordered subsets EM is defined as a single pass through all the subsets, in each subset using the current estimate to initialize application of EM with that data subset. This approach is similar in concept to block-Kaczmarz methods introduced by Eggermont et al. [1] for iterative reconstruction. Simultaneous iterative reconstruction (SIRT) and multiplicative algebraic reconstruction (MART) techniques are well known special cases. Ordered subsets EM (OS-EM) provides a restoration imposing a natural positivity condition and with close links to the EM algorithm. OS-EM is applicable in both single photon (SPECT) and positron emission tomography (PET). In simulation studies in SPECT, the OS-EM algorithm provides an order-of-magnitude acceleration over EM, with restoration quality maintained.

I. INTRODUCTION

fast Fourier transforms. Related approaches to the solution of linear systems have been used in tomography (see Section IV). With data acquired in time order, sequential processing is also an option. General approaches to recursive estimation in processing sequences of images are discussed by Green and Titterington [7]. Titterington [8] has provided a recursive EM algorithm for sequential acquisition of data.

The purpose of this paper is to introduce and assess performance of the OS-EM algorithm and a regularized form (GP in Section V). We aim to show the acceleration of convergence attained with OS. Section II defines OS-EM. Section III discusses choice of subsets and order of processing. Section IV provides a parallel with iterative methods in transmission tomography, particularly MART; EM is a simultaneous form of OS-EM. Section V contains simulation study description and results. Section VI provides discussion. The Appendix contains a proof of convergence of OS-EM to a feasible solution with exact projection data.
and Segman's generalization of MART is contained in their Theorem 1. For linear equations for which there exists a feasible (nonnegative) exact solution, mild conditions on the weights (Section II) and starting value suffice for convergence of block-iterative MART to the maximum entropy solution.

Block-iterative estimates may be contrasted with simultaneous iterative reconstruction techniques, introduced by Gilbert [13]. Simultaneous techniques include all equations relating projection data to parameters simultaneously, whereas MART solves the same equations by introducing subsets of the data in subiterations (see Censor [14]). EM can be regarded as a simultaneous form of OS-EM.

EM solves (consistent) nonlinear normal equations derived by maximizing the Poisson likelihood function of the data or similar regularized equations. OS-EM successively increases components of the likelihood corresponding to successive data subsets.

In consistent linear systems (including exact projections), assume there exist feasible solutions to the projection equations $y = Ax$. Maximum entropy and maximum likelihood solutions then satisfy this linear system. OS-EM or block iterative methods provide solutions based on a partition $A^T = [A_1^T, \ldots, A_n^T]$.

OS-EM iterates towards simultaneous nonnegative solution of the equations $A_i x = y^i$, the equations for the projections defining block $i$, for $i = 1, \ldots, n$. We prove, in the Appendix, that under simple conditions on the weights, OS-EM will converge to a feasible solution of the linear system.

EM is known to exhibit linear local convergence with a rate determined by the maximum eigenvalue (spectral norm) of the iteration matrix $I - DH$, where $D$ is a diagonal matrix whose components are the solution provided by the iterations and $H$ is the Hessian of the likelihood function of the full data set. (See Green [15], who proves all eigenvalues of this matrix lie in the interval $[0, 1)$.) In certain cases (strong subset balance), we can compute the local rate of convergence of OS-EM explicitly.

Strong subset balance requires equality of the Hessian (equivalently, Fisher-information) matrices of each data subset's likelihood function, i.e., $H_1 = \cdots = H_n$ and $H = nH_1$. Strong subset balance implies that each subset contains equal information about image activity parameters. Since OS-EM is formed by $n$ EM iterations, it is readily verified that convergence is again linear, with iteration matrix $[I - nDH_n] \cdots [I - nDH_1] = [I - DH]^n$. Strong subset balance provides an iteration matrix whose eigenvalues are precisely the $n$-th power of the previous eigenvalues. The largest eigenvalue of this matrix is therefore very small in comparison to the largest eigenvalue of EM's iteration matrix. With strong subset balance, convergence of OS-EM will therefore imply linear convergence at a geometrically improved rate (with exponent the number of subsets).

The results above for exact projection data provide some confidence in the convergence of the algorithm in ideal circumstances. With noisy data though, inconsistent equations result. The results of the Appendix are not applicable. While it seems likely from our experiments that OS-EM cycles through a number of distinct limit points, as with MART and many block iterative methods, we conclude this section by providing an argument suggesting that these distinct limit points cannot be very different in many SPECT and PET applications.

OS-EM resolves the linear system determined by $A$ through a partition $A_1, \ldots, A_n$, as above. In the geometry applying to SPECT and PET systems (associated with projection symmetries), this partition has special orthogonality properties.

Subsets should be selected so that $\sum_{t \in S_i} a_{tj}$ is independent of $i$, so that pixel activity contributes equally to any subset. In SPECT, with subsets comprised of equal numbers of projections, this condition is satisfied if there is no attenuation or if the subsets chosen provide this balance. We may then assume $\sum_{t \in S_i} a_{tj} = 1$ without essential loss of generality.

Suppose also that the matrix $A D A^T$ is block diagonal for any diagonal matrix $D$ of positive elements. In particular, for $D = \mathrm{diag}(x^i)$, where $x^i$ is the current estimate, assume $A_k D A_i^T = 0$ for $k \neq i$, with $i, k \in \{1, \ldots, n\}$. Then, subiteration $i$ of OS-EM has no effect on fitted values for any projection $k \neq i$, since on substituting for $x^{i+1}$, according to (2), we obtain

$$A_k (x^{i+1} - x^i) = A_k D A_i^T z = 0,$$

where $z$ is the vector with components $(y_t - \mu_t^i)/\mu_t^i$, for $t \in S_i$. Hence, the Poisson likelihood function for all count data following subiteration $i$ is (apart from a constant)

$$
\begin{aligned}
L(y, x^{i+1}) &= \sum_t \{ y_t \log \mu_t^{i+1} - \mu_t^{i+1} \} \\
&= \sum_k \sum_{t \in S_k} \{ y_t \log \mu_t^{i+1} - \mu_t^{i+1} \} \\
&= L(y, x^i) + \sum_{t \in S_i} \{ y_t \log \mu_t^{i+1} - \mu_t^{i+1} \} - \sum_{t \in S_i} \{ y_t \log \mu_t^i - \mu_t^i \}.
\end{aligned}
$$

The second equality assumes nonoverlapping subsets. The third equality follows since the subiteration affects only fitted values for subset $S_i$. But the final two terms of the right hand side provide the increase in the likelihood function of the data subset $\{y_t, t \in S_i\}$, resulting from subiteration $i + 1$. This subiteration applies one standard EM iteration for the model $\mu^i = A_i x$ to this data subset, and hence has the EM property of increasing the likelihood function of this data subset so that the sum of these terms is positive. In the circumstances above, this implies that the likelihood function of the full set of projection data is also increased and OS-EM increases this likelihood function within each subiteration. By adapting an argument of Vardi, Shepp, and Kaufman [16], this implies convergence of the OS-EM algorithm.

The orthogonality condition above cannot apply exactly in tomography, but our experience is that it is adequate as an approximation. In particular, it is rare to observe a decrease in the likelihood function in any subiteration and we have never observed a decrease in likelihood over a complete iteration of OS-EM. An argument given in Barnett et al. [17] for a similar orthogonality condition based on the geometry of the detectors is applicable here, too.
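The subset-by-subset processing discussed in this section can be illustrated with a minimal NumPy sketch. It assumes that update (2), which is not reproduced in this part of the text, takes the usual multiplicative EM form applied to one data subset at a time; the function names `os_em` and `poisson_loglik`, the random test system, and the choice of four interleaved subsets are all hypothetical and serve only as an illustration.

```python
import numpy as np

def os_em(A, y, subsets, n_iter=10, x0=None):
    """Ordered-subsets EM sketch: one iteration is a single pass through all subsets."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    x = np.ones(A.shape[1]) if x0 is None else np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        for rows in subsets:
            A_i = A[rows]
            mu = A_i @ x                          # fitted projections for this subset
            ratio = np.divide(y[rows], mu, out=np.zeros_like(mu), where=mu > 0)
            sens = A_i.sum(axis=0)                # sum over t in S_i of a_tj
            back = A_i.T @ ratio                  # backprojected data/fit ratios
            # Pixels unseen by this subset (sens == 0) are left unchanged.
            x = x * np.divide(back, sens, out=np.ones_like(sens), where=sens > 0)
    return x

def poisson_loglik(A, y, x):
    """Poisson likelihood function (up to a constant): sum of y log(mu) - mu."""
    mu = np.asarray(A, dtype=float) @ x
    return float(np.sum(y * np.log(mu) - mu))

# Small consistent system: exact projections of a known nonnegative image.
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(16, 6))
x_true = rng.uniform(1.0, 5.0, size=6)
y = A @ x_true

all_rows = [np.arange(16)]                              # level 1: standard EM
four_subsets = [np.arange(i, 16, 4) for i in range(4)]  # level 4: nonoverlapping

x_em = os_em(A, y, all_rows, n_iter=5)
x_os = os_em(A, y, four_subsets, n_iter=5)
print(poisson_loglik(A, y, x_em), poisson_loglik(A, y, x_os))
```

With a single subset the function reduces to standard EM, so the two calls compare level 1 and level 4 at equal numbers of passes through the data. The printed likelihood values can be compared with `poisson_loglik(A, y, x_true)`, which is the maximum attainable for exact data.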
604 IEEE TRANSACTIONS ON MEDICAL IMAGING. VOL. 13, NO. 4, DECEMBER 1994
Fig. 1. Computer simulated chest cross-section: (a) chest phantom emitter activity; (b) attenuation map; (c) sinogram.
V. SIMULATION STUDY

To study the properties of ordered subsets, we conducted simulation studies employing computer generated projection data based on a model ("chest phantom") for activity. Generation incorporated attenuation and Poisson noise. Sixty-four projections were generated over 360°. Counts were recorded in 64 bins per projection. Counts recorded on all projections totalled approximately 410000. Fig. 1 shows the chest phantom activity and attenuation map and the simulated projection data (sinogram). Chest phantom activity is concentrated in a ring of high activity ("myocardium"). There are two regions of very low activity ("lungs") with otherwise uniform low activity within an elliptical body region, of cross section 40 x 32 cm. Activities in myocardium, background, and lungs were specified to be in the ratio 8:1:0. Attenuation coefficients were 0.03/cm in low activity ("lung") regions, 0.12/cm elsewhere within the body ellipse, and 0.00/cm outside the body.

Ordered subsets were applied with two algorithms, providing OS-EM and OS-GP algorithms. OS-EM is the adaption of Shepp-Vardi EM described in Section II. OS-GP is the OS adaption of Green's one-step-late (OSL) reconstruction.

GP provides MAP estimation based on Gibbs priors, for which a penalized likelihood criterion is maximized. This criterion is

[...]

where $\beta$ and $\sigma$ are parameters of the procedure and $\phi$ is a log-cosh function which penalizes discrepancies between pixel neighbors $s, r$ in a manner determined by the fixed weights $w$. Maximizing the criterion function $L^*$ balances two objectives: increasing the likelihood function (the first sum of the expression), and reducing the roughness penalty (the second sum).

GP was specified with $\phi$ and $w$ defined as in Green [4], with parameters $\beta = 0.006$, $\sigma = 2.00$. These parameter values had been established as suitable for standard Gibbs prior reconstructions of chest phantom data.

OS-EM was defined by (1) and (2), using nonoverlapping subsets. Results for cumulative subsets are not reported here because of the similarity of its results with those of standard EM.

The variants of OS-EM and OS-GP used in the simulation are distinguished by a trailing number indicating the OS level (number of subsets). Levels considered were 1 (standard EM or OSL), 2, 4, 8, 16, and 32. Note that all variants of OS-EM take equal time to compute a single iteration.

Ordering of subsets was designed to introduce independent information about the image in successive subsets. The sequence used with OS-EM level 32 introduced projection pairs in the order 0°, 90°, 45°, 135°, 22.5°, 112.5°, 67.5°, etc.

The reconstruction was scaled after each iteration, so that total expected counts agreed with total counts (a property of the ML solution). This step is unnecessary; it has no effect on qualitative appearance of the reconstruction or on the subsequent iterations, as rescaling by any constant multiplier has no effect on EM. It was applied to provide consistency of scaling with ML, so that criteria used to compare solutions were not affected by simple scaling effects.

The scaling may be conducted as follows. The image $\hat{x}$ obtained at step 2(c) of the OS-EM definition in Section II is replaced by $c\hat{x}$, where $c = (\sum_t y_t)/(\sum_j \hat{x}_j a_j)$, with $a_j = \sum_t a_{tj}$. Projection weights $a_{tj}$ were precalculated for computational efficiency in our code. As a consequence, the weights $a_j$ were available and scaling was a trivial operation.

Chi-square and MSE measures of error were then calculated. Chi-square (otherwise known as deviance) is defined as $G = 2 \sum_t [y_t \log(y_t/\mu_t) - (y_t - \mu_t)]$, where $\{\mu_t\}$ are the fitted projections. Chi-square measures discrepancy between fitted projections provided by a reconstruction and the counts on all projections; MSE, defined as $\sum_j (\hat{x}_j - x_j)^2 / J$, measures an average discrepancy between a reconstruction and the chest phantom image. Note that, for given data $y$, $G$ equals $-2$ times the Poisson likelihood function $L$, up to a fixed constant. Therefore, chi-square decrease ensures a corresponding likelihood function increase in the context of this study.

Fig. 2 compares the reconstructions after a single iteration of each of the OS-EM variants. It can be seen clearly that higher OS levels provide better definition for small numbers of iterations, levels 16 and 32 providing low MSE after a single iteration.

Fig. 3 compares the reconstructions after matched iterations of each of the OS-EM variants. The reconstructions have very similar chi-square and mean square error and are visually
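The rescaling step and the two error measures used in this section (chi-square, i.e. deviance, and MSE) can be sketched in a few lines of NumPy. The function names `rescale`, `deviance`, and `mse` are hypothetical, as is the small random test system. Two conventions are assumptions rather than statements from the text: terms with $y_t = 0$ contribute no logarithmic term to the deviance, and MSE averages over the number of pixels.

```python
import numpy as np

def rescale(x_hat, A, y):
    """Scale the image so that total expected counts equal total observed counts."""
    a = np.asarray(A, dtype=float).sum(axis=0)      # a_j = sum over t of a_tj
    c = float(np.sum(y)) / float(np.asarray(x_hat, dtype=float) @ a)
    return c * np.asarray(x_hat, dtype=float)

def deviance(y, mu):
    """G = 2 * sum[y log(y/mu) - (y - mu)], taking 0 log 0 as 0 (assumed convention)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    log_term = np.zeros_like(y)
    pos = y > 0
    log_term[pos] = y[pos] * np.log(y[pos] / mu[pos])
    return 2.0 * float(np.sum(log_term - (y - mu)))

def mse(x_hat, x_ref):
    """Mean squared difference between a reconstruction and a reference image."""
    diff = np.asarray(x_hat, dtype=float) - np.asarray(x_ref, dtype=float)
    return float(np.mean(diff ** 2))

# Hypothetical toy data: noisy projections of a known image.
rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, size=(12, 5))
x_ref = rng.uniform(5.0, 20.0, size=5)
y = rng.poisson(A @ x_ref)

x_hat = np.full(5, 3.0)             # an arbitrary, badly scaled estimate
x_scaled = rescale(x_hat, A, y)
print(deviance(y, A @ x_scaled), mse(x_scaled, x_ref))
```

After `rescale`, the fitted projections `A @ x_scaled` sum to the same total as `y`, which is the property the scaling step is meant to enforce.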
HUDSON AND LARKIN: ACCELERATED IMAGE RECONSTRUCTION 605
Fig. 2. OS-EM reconstructions after one iteration, by level: 1, 2, and 4 (a)-(c); 8, 16, and 32 (d)-(f).
Fig. 7. OS-GP iterations after matched iterations. Reconstructions are: (a) level 1, 32 iterations; (b) level 2, 16 iterations; (c) level 4, 8 iterations; ...; (f) level 32, 1 iteration.
[Figure: two panels, each with horizontal axis Chi Square (approximately 2000 to 5000).]
Fig. 8. Chi-square versus computation time for 50 iterations of OS-GP. Levels are: 1 (0); 8 (+); 16 (x); 32 (0).

If an ML solution is required, a composite method may be defined in which a single iteration comprises successive application of OS-EM at a sequence of levels descending to level 1. For example, one iteration of the composite method could employ levels 32, 16, 8, 4, 2, and 1. Our computational experience is that this algorithm increases the likelihood function at each iteration, and hence convergence to an ML solution is attained. At the same time, overall computations are greatly reduced.

Without such adaptions, the most appropriate level of data subdivision depends on a number of factors. These are attenuation density, subset balance, and level of noise in projection data.

With no attenuation, use of individual projections as subsets is inefficient, as opposite pairs of projections provide estimates of the same ray sum and should be included together in the same subset.

Subset balance is a second factor. In our simulations, subset imbalance (variability in the probability of a pixel emission being detected in different subsets) will be greatest for subsets composed of an individual projection, but balance will improve as the number of subsets is reduced.

With 64 projections of data available, both factors motivated our use of at most 32 data subdivisions. With fewer subdivisions, slightly better likelihood values are eventually obtained,
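The grouping of opposite projections into the same subset, and the level-32 processing order reported in Section V (0°, 90°, 45°, 135°, 22.5°, ...), can be sketched as follows. The reported order coincides with a bit-reversed ordering of the 32 projection pairs, which is the rule used here. The function names are hypothetical. The construction for levels below 32 (equally spaced pairs in each subset) and the restriction to power-of-two levels are assumptions, since the text gives the order for level 32 only.

```python
def bit_reverse(i, n_bits):
    """Reverse the lowest n_bits bits of the integer i."""
    r = 0
    for _ in range(n_bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def ordered_pair_subsets(n_proj=64, n_subsets=32):
    """Subsets of projection indices; each holds opposite views, ordered by bit reversal.

    Assumes n_subsets is a power of two dividing n_proj // 2.
    """
    half = n_proj // 2
    pairs_per_subset = half // n_subsets
    n_bits = n_subsets.bit_length() - 1
    subsets = []
    for k in range(n_subsets):
        first = bit_reverse(k, n_bits)
        views = []
        for m in range(pairs_per_subset):
            p = first + m * n_subsets        # equally spaced views within a subset
            views += [p, p + half]           # a view and the view 180 degrees opposite
        subsets.append(views)
    return subsets

# Level 32 with 64 projections over 360 degrees: leading angle of each subset.
level32 = ordered_pair_subsets(64, 32)
angles = [s[0] * 360.0 / 64 for s in level32[:7]]
print(angles)

# Subset lists for one iteration of the composite method (levels 32 down to 1).
composite = [ordered_pair_subsets(64, level) for level in (32, 16, 8, 4, 2, 1)]
```

Each entry of `composite` is a list of row-index subsets for one pass at the given level, so a composite iteration would apply one OS-EM pass per entry in turn.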
(3)
Proposition 2: With assumptions (A1), (A2), and algorithm (3):

For any subset $T_i$ used infinitely often ("i.o.") in the sequence $\{S_k\}$, $\mu_t^k \to y_t$ for $t \in T_i$ (i.e., convergence of fitted values occurs for $t \in T_i$) along any subsequence of iterations using $T_i$ exclusively.

If subsets are selected from a stock $\{T_1, \ldots, T_n\}$ in such a way ("nearly cyclic control") that $\exists N$ for which the set $\{S_k, \ldots, S_{k+N}\}$ contains (for any $k$) all members of the stock, then $x^k \to \hat{x}$, where $\hat{x}$ is a feasible solution of the linear system for $\{y_t : t \in T = \bigcup_{i=1}^{n} T_i\}$.

Proof: Consider the subsequence $K_1$ of integers, indicating iterations using a particular subset, say $T_1$, occurring i.o. Then, by the remark following Proposition 1, the deviance for this subset, $\sum_{t \in T_1} y_t \log(y_t/\mu_t^k) \to 0$ for $k \in K_1$. But, the condition for equality in (5), and continuity of the deviance function, together imply $\mu_t^k \to y_t$ for $k \in K_1$, for all $t \in T_1$.

We now show $\mu_s^k \to y_s$ for $k \in K_1$, for all $s \in T$, i.e., convergence occurs for all the data subsets.

Assume subsets $S_k$ are selected from the stock $\{T_1, \ldots, T_n\}$, and each element of the stock is selected i.o. Then, by the argument just given, $\mu_t^k \to y_t$, for all $t \in T_i$, along any subsequence using $T_i$ exclusively. Hence, for $k_1 \in K_1$ and $s \in T_i$, for $i \in \{2, \ldots, n\}$

[...]

where $k_i$ is the largest integer less than or equal to $k_1$ such that $S_{k_i} = T_i$. Since all subsets occur i.o., the first term approaches 0 as $k_1$ (and hence $k_i$) approaches $\infty$. Assuming nearly cyclic control, there are at most $N$ terms in the sum of logarithms, and each term approaches 0 since $\mu_t^k \to y_t$, for all $t \in S_k$. Hence, the RHS of the inequality approaches 0 along subsequence $K_1$, completing the proof that $\mu_s^k \to y_s$ for $k \in K_1$, for all $s \in T_i$.

Consider a point of accumulation $\hat{x}$ of the subsequence $\{x^k : k \in K_1\}$. Since every member of the sequence belongs to the closed bounded region $\{x : x \geq 0, \ldots\}$, $\hat{x}$ exists and belongs to this region, and is feasible. There exists a subsequence of $K_1$ along which $x^k \to \hat{x}$. Then, along this subsequence $\mu_s^k \to \lim \sum_j a_{sj} x_j^k = \sum_j a_{sj} \hat{x}_j$, and by the result above, this limit along subsequence $K_1$ is $y_s$ for $s \in T$. Hence, the limit $\hat{x}$ is a solution of the linear system for $t \in T$. But then, $\lim_{k \to \infty} L(\hat{x}, x^k)$, which exists by Proposition 1, is the limit along the subsequence, evaluated as $L(\hat{x}, \hat{x}) = 0$. By the continuity of $L$, $\hat{x}$ is the only point of accumulation of $x^k$, implying convergence to $\hat{x}$.

Corollary: The OS-EM algorithm with subsets selected from a fixed stock of exhaustive (but possibly overlapping) subsets of the index set of the data by cyclic control (as in Section II) or by almost cyclic control converges, under (A1) and (A2), to a feasible solution of the full linear system.

ACKNOWLEDGMENT

Brian Hutton, and colleagues in the Department of Nuclear Medicine, Royal Prince Alfred Hospital, have assisted in this formulation and added much to our understanding of SPECT. Our thanks to Jeanne Young who introduced us to the links with ART and SIRT. Also to Adrian Baddeley, Jun Ma, and Victor Solo for helpful discussions. This paper was prepared in part while Hudson was a visitor to the Division of Mathematics and Statistics, CSIRO.

REFERENCES

[1] P. P. B. Eggermont, G. T. Herman, and A. Lent, "Iterative algorithms for large partitioned linear systems, with applications to image reconstruction," Linear Algebra and Its Applicat., vol. 40, pp. 37-67, 1981.
[2] L. A. Shepp and Y. Vardi, "Maximum likelihood reconstruction for emission tomography," IEEE Trans. Med. Imag., vol. MI-2, pp. 113-122, 1982.
[3] T. Hebert and R. Leahy, "A generalized EM algorithm for 3-D Bayesian reconstruction from Poisson data using Gibbs priors," IEEE Trans. Med. Imag., vol. 8, pp. 194-202, 1989.
[4] P. J. Green, "Bayesian reconstruction from emission tomography data using a modified EM algorithm," IEEE Trans. Med. Imag., vol. 9, pp. 84-93, 1990.
[5] E. S. Chornoboy, C. J. Chen, M. I. Miller, T. R. Miller, and D. L. Snyder, "An evaluation of maximum likelihood reconstruction for SPECT," IEEE Trans. Med. Imag., vol. 9, pp. 99-110, 1990.
[6] L. Kaufman, "Implementing and accelerating the EM algorithm for Positron Emission Tomography," IEEE Trans. Med. Imag., vol. MI-6, pp. 37-51, 1987.
[7] P. J. Green and D. M. Titterington, "Recursive methods in image processing," in Proc. 46th Session ISI, 1990.
[8] D. M. Titterington, "Recursive parameter estimation using incomplete data," J. Royal Statist. Soc., B, vol. 46, pp. 257-267, 1984.
[9] R. Gordon, R. Bender, and G. T. Herman, "Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography," J. Theor. Biol., vol. 29, pp. 471-481, 1970.
[10] J. N. Darroch and D. Ratcliff, "Generalized iterative scaling for log-linear models," Ann. Math. Statist., vol. 43, pp. 1470-1480, 1972.
[11] Y. Censor and J. Segman, "On block-iterative entropy maximization," J. Inform. Optimization Sci., vol. 8, pp. 275-291, 1987.
[12] A. N. Iusem and M. Teboulle, "A primal-dual iterative algorithm for a maximum likelihood estimation problem," Computational Statist. and Data Analysis, vol. 14, pp. 443-456, 1992.
[13] P. Gilbert, "Iterative methods for the three-dimensional reconstruction of an object from its projections," J. Theor. Biol., vol. 36, pp. 105-117, 1972.
[14] Y. Censor, "Finite series-expansion reconstruction methods," Proc. IEEE, vol. 71, no. 3, pp. 409-419, 1983.
[15] P. J. Green, "On use of the EM algorithm for penalized likelihood estimation," J. Royal Statist. Soc., B, vol. 52, no. 3, pp. 443-452, 1990.
[16] Y. Vardi, L. A. Shepp, and L. Kaufman, "A statistical model for Positron Emission Tomography (with discussion)," J. Amer. Statist. Assoc., vol. 80, pp. 8-20, 1985.
[17] G. Barnett, S. Crowe, M. Hudson, P. Leung, K. Notodiputro, R. Proudfoot, and J. Sims, "The use of small scale prototypes in image reconstructions from projections," J. Applied Statist., vol. 16, pp. 223-242, 1989.
[18] H. M. Hudson, B. F. Hutton, and R. Larkin, "Accelerated EM reconstruction using ordered subsets," J. Nucl. Med., vol. 33, (abs), p. 960, 1992.
[19] R. R. Fulton, B. F. Hutton, M. Braun, B. Ardekani, and R. S. Larkin, "Use of 3-D reconstruction to correct for patient motion in SPECT," Phys. Med. Biol., vol. 39, no. 3, pp. 563-574, 1994.