PARAM 10000
Centre for Development of Advanced Computing, Pune University Campus, Pune 411007, India
SPG 4th Conference & Exposition on Petroleum Geophysics — Mumbai, India, 7 - 9 January 2002 Page 1
maintain efficient operation (Oldfield et al., 1998; Poole, 1994). In the present study we have used both MPI and MPI I/O to improve the performance and efficiency of the codes (Bhardwaj et al., 2000).

Conceptually, MPI consists of distributed support software that executes on participating UNIX/LINUX hosts on a network, allowing them to interconnect and cooperate in a parallel distributed computing environment. MPI offers an inexpensive platform for developing and running applications. Heterogeneous machines can be used in a networked environment. The MPI model is a set of message passing routines which allows data to be exchanged between tasks by sending and receiving messages.

Seismic Data Processing (SDP) occupies a significant role in the exploration of oil and natural gas. Over the last two decades the computational requirements of SDP activities have grown many-fold due to the increase in data volume as well as developments in the mathematical algorithms. Three-dimensional data acquisition has become routine, as it has become necessary to look at the minor details of the underground geology.

Wave equation based methods (Phadke et al., 1998) are gaining more and more popularity in recent years, as they provide finer detail of geological features than other conventional methods and they preserve amplitude information. Advanced techniques are distinguished primarily by their use of the wave equation; the most common advanced techniques are seismic migration and forward modelling. Finite difference methods are the most suitable for migration and modelling, as they offer the most direct solution to the problem in terms of the basic equation and the initial and boundary conditions.

By nature most seismic problems carry an inherent parallelism, admitting subdivision by source, receiver, frequency or wavenumber; indeed, problem decomposition is possible in several domains. With the change in demand it has become very difficult for a processing facility built around a serial architecture machine to cope with the increase in data volume. The I/O problems are also better solved in parallel processing. The wave equation based methods are computationally more expensive but well suited for parallelization. Seismic processing industries all over the world have found parallel processing to be the only solution to the challenges in probing the earth's interior for natural resources.

The digital data that needs to be processed before obtaining an interpretable image of the subsurface geological structures is enormous, amounting to hundreds of gigabytes (GB) or a few terabytes (TB) for a 3D acquisition. All this numerical input may be passed 10 to 20 times through a major computer facility, and only after these complex numerical operations are the final processed sections examined by geophysicists and geologists to formulate an initial or penultimate interpretation. Parallel processing is the only answer to cope with the increase in data volume and changes in processing methodology. We are fortunate that Seismic Data Processing is an ideal application for parallel architecture machines.

Migration Algorithms

The stacking of seismic data is a form of data compression which improves the signal-to-noise ratio and produces idealized seismic traces simulating a coincident source-receiver experiment. Migration of the resultant data set, called the zero-offset seismic section or the post-stack time section, is known as post-stack migration. Migration can also be carried out in the prestack domain, and the results obtained are more accurate than those of the poststack domain. However, the computational requirements of prestack migration algorithms are orders of magnitude greater than those of poststack migration algorithms. Processor speed, memory and I/O play a crucial role in the implementation of these algorithms.

Most migration methods comprise two steps, extrapolation and imaging. In the extrapolation step the wavefield is downward continued using some form of the acoustic wave equation. At each depth the image is formed at t = 0. The extrapolation of the wavefield can be carried out in the t-x-y, ω-x-y or ω-kx-ky domain. Here we shall describe the implementation of migration in the ω-x-y and ω-kx-ky domains. Another technique, Reverse Time Migration (RTM), which makes use of the full wave equation, has also been developed and implemented on PARAM.

3D Depth Migration in ω-x-y domain

For 3D depth migration, the extrapolation equation in the ω-x-y domain is a parabolic partial differential equation (Claerbout, 1985) consisting of a diffraction term and a thin lens term. The thin lens term, which accounts for lateral velocity variations, is usually ignored in time migration. The diffraction term is numerically solved by the method of splitting, which is the basis for the one-pass approach. A Crank-Nicolson finite difference scheme with absorbing boundary conditions on the sides of the model is used for the solution. The thin lens term is solved analytically. Imaging is the summation of all the frequencies at t = 0 for each depth.

Figure 1: (a) Zero-offset section of a line from the 3D volume of the SEG/EAGE overthrust model. (b) 3D depth migrated section. The velocity model is superimposed on the migrated section.

The depth migration algorithm in the ω-x-y domain is inherently parallel in terms of frequencies. The parabolic approximation of the wave equation in the frequency-space domain decomposes the wavefield into monochromatic plane waves that propagate downwards. Therefore, each frequency harmonic can be extrapolated in depth independently on each processor and there is no need for inter-task communication.
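This frequency-parallel scheme can be sketched in plain Python. The sketch below is a serial simulation of the decomposition; the round-robin rank assignment and the one-term phase-shift extrapolator are our illustrative assumptions, not the actual PARAM code:

```python
import cmath

def extrapolate(h, omega, velocity, dz):
    # Toy vertical phase shift exp(-i*omega*dz/v); it stands in for the
    # Crank-Nicolson diffraction and thin lens operators of the paper.
    return h * cmath.exp(-1j * (omega / velocity) * dz)

def migrate(harmonics, omegas, velocity, dz, nz, nranks):
    # Each simulated "rank" owns a round-robin share of the frequencies,
    # extrapolates its harmonics in depth independently, and contributes
    # a partial image; summing the partials plays the role of MPI_Reduce.
    partial = [0.0] * nranks
    for rank in range(nranks):
        for i in range(rank, len(omegas), nranks):
            h = harmonics[i]
            for _ in range(nz):          # depth loop
                h = extrapolate(h, omegas[i], velocity, dz)
            partial[rank] += h.real      # imaging condition at t = 0
    return sum(partial)                  # the "reduce" step

omegas = [2.0 * cmath.pi * f for f in (5.0, 10.0, 20.0, 40.0)]
waves = [1.0 + 0.0j] * len(omegas)
image_serial = migrate(waves, omegas, 2500.0, 25.0, 161, nranks=1)
image_parallel = migrate(waves, omegas, 2500.0, 25.0, 161, nranks=4)
```

Because the harmonics never interact, the image does not depend on how the frequencies are divided among ranks; in the real code the final summation is performed by MPI_Reduce rather than a Python loop.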
One can introduce parallel task allocation into each frequency harmonic component, with the ultimate goal being to have as many processors as frequencies. At each depth step all frequency components after extrapolation are summed up (the imaging condition) to give the migrated image. The summation is carried out by automatic merging using MPI_Reduce. MPI I/O is used for reading and writing input data, velocity data and output data.

We first tested the migration algorithm on the data set of the SEG/EAGE (1997) Overthrust model. The original data had 101 x 25 CDP traces with inline spacing of 100 m and crossline spacing of 100 m. We interpolated this data volume to 401 x 97 CDP traces to make both inline and crossline spacing 25 m, to avoid spatial aliasing. The input Fourier-transformed data size was of the order of 46 MB. This data set was migrated with a depth step of 25 m for 161 depth steps. Figure 1 shows the zero-offset section for one of the lines and the 3D migrated data for the same line. The velocity model is superimposed on the migrated data to show the accuracy of the migration algorithm. Figure 2 illustrates the execution time as a function of the number of processors. Since the problem size is small, the speedup is not linear.

The second data set used for testing comprised 950 x 665 CDPs. The inline spacing was 25 m, the crossline spacing was 37.5 m, and the depth step size was 12.5 m. The data was migrated for 480 depth steps. Table 1 shows all the other parameters and the time required to migrate this data set with 64 processors. It is not possible to carry out a speedup analysis on this data volume, since there is not enough memory available on a smaller number of processors and the execution time required would also be very large.

Figure 2: Number of processors versus execution time chart for the SEG/EAGE Overthrust model.

Table 1: Problem size for the second data set and the execution time on 64 processors.
  Size of FFT data: 1.3 GB
  Size of velocity model: 1.2 GB
  Frequency band: 5 - 40 Hz
  Number of processors: 64
  Total execution time with MPI-IO: 7 hrs 44 mins

3D Depth Migration with PSPI Algorithm

The phase-shift migration method (Gazdag, 1978) downward continues the wavefield in the wavenumber-frequency domain under the assumption of a horizontally layered velocity model. If the migration velocity has no horizontal variations, the phase-shift method extrapolates the wavefield exactly by rotating the phases of each Fourier component. In the presence of lateral velocity variations the exact extrapolation equation is no longer valid. The PSPI (Phase Shift Plus Interpolation) method circumvents the problem of lateral changes in migration velocity by downward extrapolating the wavefield with several reference velocities and then interpolating the wavefield for the correct velocity (Gazdag and Sguazzero, 1984).

The parallel implementation of the PSPI method is also straightforward. The method is inherently parallel in terms of frequency. Here also the data is first Fourier transformed, and then different processors read and migrate their share of frequencies. At each depth step phase shifts are applied for the reference velocities and then the wavefield is interpolated for the actual velocity. One of the processors, which acts as the master, collects and images the data. The method was developed and implemented on PARAM 10000 and was tested by applying it to both synthetic and real data sets.

Reverse Time Migration (RTM)

The reverse time migration technique solves the full wave equation by extrapolation in time, allowing both the upgoing and downgoing waves to propagate. The full wave equation is solved using finite differences, and the wavefield recorded at the surface is used as a boundary condition. McMechan (1983) has described the method in detail and demonstrated its ability to image all dips with great accuracy. Time marching of the wavefield is similar to any modelling algorithm. The parallelization is carried out using a domain decomposition scheme. A good description of wave propagation using finite differences is given in the next section on modelling algorithms.

RTM has the same problems with stability and numerical dispersion that finite-difference (FD) modelling has, and it is straightforward (but computationally expensive) to control these problems. We have implemented a central difference FD scheme for RTM on PARAM 10000 using domain decomposition. The application of the method to both synthetic and real data sets will be shown during the presentation.

Modelling Algorithms

A basic problem in theoretical seismology is to determine the wave response of a given earth model to the excitation of an impulsive source by solving the wave equation. In the scalar approximation, the acoustic wave equation may be solved to evaluate the waveform, but only compressional waves are considered. A more complete approach is to study the vector displacement field using the full elastic wave equation for modelling both compressional and shear waves. However, important wave properties such as attenuation and dispersion require a more sophisticated set of equations; these properties will be incorporated in future versions of the codes.

2D Acoustic / Elastic Wave Modelling

The mathematical model for elastic wave propagation in 2D heterogeneous media consists of coupled second order partial differential equations governing motions in the x- and z-directions,
ρ ∂u̇/∂t = ∂σxx/∂x + ∂σxz/∂z    (1)

ρ ∂ẇ/∂t = ∂σxz/∂x + ∂σzz/∂z    (2)

and the stress-strain relations are given by

σxx = (λ + 2µ) ∂u/∂x + λ ∂w/∂z    (3)

σxz = µ (∂u/∂z + ∂w/∂x)    (4)

σzz = λ ∂u/∂x + (λ + 2µ) ∂w/∂z    (5)

where u and w are the horizontal and vertical displacements, u̇ and ẇ are the horizontal and vertical particle velocities, σxx, σzz and σxz are the stress components, λ and µ are the Lamé parameters, and ρ is the density.

Instead of solving these second order coupled partial differential equations, we formulate them as a first order hyperbolic system (Virieux, 1986; Vafidis, 1988; Dai et al., 1996):

∂Q/∂t = A ∂Q/∂x + B ∂Q/∂z    (6)

where Q = (u̇, ẇ, σxx, σzz, σxz)ᵀ,

        |  0     0   ρ⁻¹   0    0  |            |  0     0    0    0   ρ⁻¹ |
        |  0     0    0    0   ρ⁻¹ |            |  0     0    0   ρ⁻¹   0  |
    A = | λ+2µ   0    0    0    0  |   and  B = |  0     λ    0    0    0  |
        |  λ     0    0    0    0  |            |  0   λ+2µ   0    0    0  |
        |  0     µ    0    0    0  |            |  µ     0    0    0    0  |

When we move from elastic to acoustic media, the value of µ becomes zero. By substituting µ = 0 in the above equations we get a first order system of hyperbolic partial differential equations which governs acoustic wave propagation, with Q = (p, u̇, ẇ)ᵀ,

        | 0    K   0 |            | 0    0   K |
    A = | ρ⁻¹  0   0 |   and  B = | 0    0   0 |    (7)
        | 0    0   0 |            | ρ⁻¹  0   0 |

where p is the negative pressure wavefield and K = λ is the incompressibility.

For solving the first order hyperbolic system (6) we use the method of splitting in time (Vafidis, 1988). An explicit finite difference method based on the MacCormack scheme is used for the numerical solution (Mitchell and Griffiths, 1981). This scheme is fourth order accurate in space and second order accurate in time. The model discretization is based upon a regular grid. Sponge boundary conditions are used for attenuating the reflected energy from the left, right and bottom edges of the model (Sochacki et al., 1987). A free-surface boundary condition is used for the top edge.

The parallel implementation of an algorithm involves the division of the total workload into a number of smaller tasks which can be assigned to different processors and executed concurrently. This allows us to solve a large problem more quickly. The most important part in parallelization is to map the problem onto a multiprocessor environment. The choice of an approach to problem decomposition depends upon the computational scheme. Here we have implemented a domain decomposition scheme.

The idea of this scheme is simple. First, the problem domain is divided into a number of subdomains that are assigned to separate processors. The upper part of Figure 3 shows an example of the division of the problem domain into nine subdomains. Depending upon the number of available processors and the problem, one can divide the problem domain into any number of subdomains. Since the MacCormack scheme uses a nine-point difference star, the calculation of the wavefield at an advanced time level for any grid point requires the knowledge of the wavefield at 9 grid points of the current time level. For grid points along the boundaries of a subdomain, the information about the neighbouring grid points comes from the adjacent subdomains. Therefore after each time step the subdomains have to exchange some wavefield data. The lower part of Figure 3 shows the required memory space for each 2D array of the subdomain and the communication between two adjacent subdomains. The data in the darker region is sent to the lighter region of the neighbouring subdomain using MPI message passing calls.

The two most important issues in this implementation are (1) to balance the workload and (2) to minimize the communication time. In a homogeneous multiprocessor environment, as in our case, load balancing is assured if all the subdomains are of the same size. Minimizing the perimeters of the subdomain boundaries minimizes communication.

Figure 3: The upper picture shows the division of the problem domain into a number of subdomains (a 3 x 3 grid of subdomains 1 to 9). The lower picture shows the communication between two adjacent tasks.

In the MPI implementation of the modelling codes there is a master task and there are a number of worker tasks. The main job of the master task is to divide the model domain into
subdomains and distribute them to worker tasks. The worker tasks perform time marching and communicate after each time step. As per the requirements of the user, the snapshot and synthetic seismogram data are collected by the master and written out to disk.

The wave propagation described by equation (6) is valid for both acoustic and elastic media, because when Poisson's ratio becomes 0.5 the medium becomes acoustic (Phadke et al., 2000). The upper part of Figure 4 shows the P-wave velocity model used for calculating the synthetic data in a marine environment. There is a water layer at the top, and the water bottom is quite undulating. Poisson's ratio and density in the other layers are 0.25 and 2.2 gm/cc respectively. Snapshots of the wave propagation through this model are also shown in Figure 4. The synthetic seismogram data for this

... shortest wavelength. The finite difference approximation (2) is stable if

∆t ≤ min(∆x, ∆y, ∆z) / (2 Vmax)    (10)

where V = √(K/ρ) and Vmax is the maximum wave velocity in the medium.
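The stability bound (10) translates directly into a mechanical pre-run check. The sketch below is our illustration with made-up grid values, not the production code; it returns the largest ∆t the condition allows:

```python
def max_stable_dt(dx, dy, dz, vmax):
    # Largest time step permitted by the stability condition
    # dt <= min(dx, dy, dz) / (2 * vmax).
    return min(dx, dy, dz) / (2.0 * vmax)

# Illustrative values: 25 m grid spacing, 4000 m/s maximum velocity.
dt = max_stable_dt(25.0, 25.0, 25.0, 4000.0)   # 0.003125 s
```

Any modelling run whose time step exceeds this value will blow up numerically, so the check belongs before the time-marching loop.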
attenuate the energy. The free-surface condition is applied to the top boundary.

[Figure: synthetic seismogram section; axes Distance (km), 0.4 to 1.2, and Two-way time (sec), 0 to 0.7.]

... workers, compiles it and writes it to disk in a proper manner.

Finite-difference computation of snapshots can help in our understanding of wave propagation in the medium. We have used a constant velocity model as a numerical example for generating snapshots of 3D acoustic wave propagation. The source is placed at the center of the cubic model. For simplicity's sake there is no density variation within the model; however, the algorithm can handle density variations. The source wavelet used for calculation of the snapshots is the second derivative of a Gaussian function with a dominant frequency of 30 Hz. Figure ...

[Snapshot label: t = 0.07 sec.]
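The source wavelet described above (the second derivative of a Gaussian, often called a Ricker wavelet) can be sampled as follows. The 30 Hz dominant frequency matches the text; the sampling interval and peak normalization are our assumptions:

```python
import math

def ricker(t, f0):
    # Second derivative of a Gaussian with dominant frequency f0,
    # normalized to peak amplitude 1 at t = 0.
    a = (math.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)

f0 = 30.0          # dominant frequency in Hz, as stated in the text
dt = 0.001         # 1 ms sampling interval, an illustrative choice
wavelet = [ricker((i - 100) * dt, f0) for i in range(201)]
```

The wavelet is symmetric about t = 0, peaks at 1, and has the negative side lobes characteristic of a band-limited impulsive source.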
A speedup analysis for the two model sizes (Figure 8b) shows a sub-linear speedup as we increase the number of processors. For a fixed model size the compute-to-communication ratio decreases with the increase in the number of processors. Therefore, if we increase the size of the problem, better speedup can be achieved for a large number of processors.

Figure 8: Comparison of execution time for Stripe, Hybrid-Stripe and Checkerboard partitioning for 3-D acoustic wave modeling for two model sizes, viz. (a) 400 x 400 x 400, (b) 200 x 200 x 200. Execution time (sec) is plotted against the number of processors (8, 16, 32, 64).

Conclusions

In this paper we have presented several migration and modelling algorithms for seismic imaging on a parallel distributed computer. The PSPI algorithm and the ω-x-y algorithm are both parallelized in the frequency domain; the RTM algorithm is parallelized by domain decomposition. Highly efficient and scalable codes were developed for these algorithms and implemented on PARAM 10000. The algorithms were tested on both synthetic and real data sets. Modelling algorithms for wave propagation in heterogeneous media were developed and parallelized using a domain decomposition scheme, and efficient codes for both acoustic and elastic wave propagation were developed. These codes form an integral part of seismic inversion algorithms for estimating the physical properties of the subsurface.

Acknowledgements

The authors wish to thank the Executive Director, C-DAC, for providing the computational facility on PARAM 10000 and for permission to publish this work. We are also thankful to the Department of Science and Technology, Government of India, for funding a part of this study under the DCS (Deep Continental Studies) program. Discussions with the scientists of GEOPIC, ONGC, Dehradun, were also helpful in improving the quality of the codes.

References

Bhardwaj, D., Yerneni, S., and Phadke, S., 2000, Efficient parallel I/O for seismic imaging in a distributed computing environment: Proc. 3rd Conference and Exposition on Petroleum Geophysics (SPG'2000), 105-108.

Bhardwaj, D., Phadke, S., and Yerneni, S., 2000, On improving performance of migration algorithms using MPI and MPI-IO: Expanded Abstracts, Society of Exploration Geophysicists.

Claerbout, J. F., 1985, Imaging the Earth's interior: Blackwell Scientific Publications.

Dai, N., Vafidis, A., and Kanasewich, E. R., 1996, Seismic migration and absorbing boundaries with a one-way wave system for heterogeneous media: Geophysical Prospecting, 44, 719-739.

Gazdag, J., 1978, Wave equation migration with the phase-shift method: Geophysics, 43, 1342-1351.

Gazdag, J., and Sguazzero, P., 1984, Migration of seismic data by phase shift plus interpolation: Geophysics, 49, 124-131.

McMechan, G. A., 1983, Migration by extrapolation of time-dependent boundary values: Geophysical Prospecting, 31, 413-420.

Mitchell, A. R., and Griffiths, D. F., 1981, The finite difference method in partial differential equations: John Wiley & Sons.

Oldfield, R. A., Womble, D. E., and Ober, C. C., 1998, Efficient parallel I/O in seismic imaging: The International Journal of High Performance Computing Applications, 12, no. 3, 333-344.

Phadke, S., Bhardwaj, D., and Yerneni, S., 1998, Wave equation based migration and modelling algorithms on parallel computers: Proc. SPG (Society of Petroleum Geophysicists) 2nd Conference, 55-59.

Phadke, S., Bhardwaj, D., and Yerneni, S., 2000, 3D seismic modeling in a message passing environment: Proc. 3rd Conference and Exposition on Petroleum Geophysics (SPG'2000), 168-172.

Phadke, S., Bhardwaj, D., and Yerneni, S., 2000, Marine synthetic seismograms using elastic wave equation: Expanded Abstracts, Society of Exploration Geophysicists.

Poole, J., 1995, Preliminary survey of I/O intensive applications: Technical Report CCSF-38, Scalable I/O Initiative, Caltech Concurrent Supercomputing Facilities, California Institute of Technology, Pasadena.

SEG/EAGE 3-D Modeling Series No. 1, 1997, 3-D salt and overthrust models: SEG Publications.

Sochacki, J., Kubichek, R., George, J., Fletcher, W. R., and Smithson, S., 1987, Absorbing boundary conditions and surface waves: Geophysics, 52, 60-71.

Vafidis, A., 1988, Supercomputer finite difference methods for seismic wave propagation: Ph.D. thesis, University of Alberta, Edmonton, Canada.

Virieux, J., 1986, P-SV wave propagation in heterogeneous media: velocity-stress finite-difference method: Geophysics, 51, 889-901.